PhantomJsCloud.com API Documentation

PhantomJsCloud.com HTTP Endpoint API Client Library Documentation

This documentation covers the HTTP Endpoint, which accepts GET and POST requests.

Using Curl, C#, PHP, Python, or Node.js?

For API Libraries and Examples in these or other languages, refer to the Docs Index Page.

Examples! (Quick Start)

These examples use GET requests because they are the simplest way of showing examples and getting started, but it is strongly suggested you use the POST endpoint for "real" usage.

Basic Examples:

Advanced Examples:

Other Examples: See the Docs Index Page Usage FAQ for other examples (such as Page Automation / Button Clicking)

Usage

To use this API, you need to submit a GET or POST request to the API endpoint:

GET:

GET requests are the simplest way of showing examples and getting started, but it is strongly suggested you use the POST endpoint for "real" usage.

  • Url (uriComponent encoded requests)
    • http(s)://PhantomJsCloud.com/api/browser/v2/[YOUR-KEY]/?request=[REQUEST-JSON]
    • The entirety of your [REQUEST-JSON] should be encoded via encodeURIComponent(). Do not encode the parts individually.
  • Url (base64 encoded requests)
    • http(s)://PhantomJsCloud.com/api/browser/v2/[YOUR-KEY]/?requestBase64=[REQUEST-JSON]
    • The entirety of your [REQUEST-JSON] should be Base64 encoded. Do not encode the parts individually, and do not urlEncode before you Base64 encode.
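
The two GET url forms above can be sketched as follows. This is a minimal illustration for Node.js, not an official client: the helper names (buildGetUrl, buildGetUrlBase64) and the API_BASE constant are hypothetical, and only the url shapes are taken from the list above. Whether the Base64 text needs any further escaping in the querystring is not stated in this documentation, so the sketch leaves it untouched.

```javascript
// Hypothetical helpers; only the url shapes come from the documentation above.
const API_BASE = "https://PhantomJsCloud.com/api/browser/v2";

// uriComponent form: the whole request JSON is passed through encodeURIComponent() once.
function buildGetUrl(apiKey, request) {
  const json = JSON.stringify(request);
  return `${API_BASE}/${apiKey}/?request=${encodeURIComponent(json)}`;
}

// Base64 form: the whole request JSON is Base64 encoded, with no urlEncoding beforehand.
function buildGetUrlBase64(apiKey, request) {
  const json = JSON.stringify(request);
  const base64 = Buffer.from(json, "utf8").toString("base64");
  return `${API_BASE}/${apiKey}/?requestBase64=${base64}`;
}

const pageRequest = { url: "http://www.example.com", renderType: "jpg" };
console.log(buildGetUrl("YOUR-KEY", pageRequest));
console.log(buildGetUrlBase64("YOUR-KEY", pageRequest));
```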

POST:

  • Url
    • http(s)://PhantomJsCloud.com/api/browser/v2/[YOUR-KEY]/
  • Post Body
    • [REQUEST-JSON]
    • You should not Base64 encode or urlEncode your Post Body, or any of its parts.
  • Content Type
    • application/json
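
A POST call following the three points above might be assembled like this. It is a sketch under stated assumptions: buildPostCall is a hypothetical helper name, and the actual send (via fetch, available in browsers and Node 18+) is left as a comment so the example runs without network access.

```javascript
// Hypothetical helper: returns the url and fetch() options for a POST call.
function buildPostCall(apiKey, request) {
  return {
    url: `https://PhantomJsCloud.com/api/browser/v2/${apiKey}/`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // The body is the raw JSON text: no Base64 and no urlEncoding.
      body: JSON.stringify(request),
    },
  };
}

const call = buildPostCall("YOUR-KEY", {
  url: "http://www.example.com",
  renderType: "jpg",
  outputAsJson: true,
});
// To actually send it: fetch(call.url, call.init).then((res) => res.json())
console.log(call.url);
console.log(call.init.body);
```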

Input Parameters

[YOUR-KEY]

Your key used for creditBalance/billing. For the Stage2 Preview, this can be anything you want. Later you can obtain this by signing up.

[REQUEST-JSON]

The details of your request as a JSON object. This JSON can take one of the following three forms, each of which is described on the io-datatypes doc page:

  • A single PageRequest object (the simplest approach)
    • Example: {url:"http://www.example.com",renderType:"jpg",outputAsJson:true}
  • Array of PageRequest objects
    • Example: [{url:"http://www.google.com"},{url:"https://www.google.com/doodles/",renderType:"jpg",outputAsJson:true}]
  • A UserRequest object (Most complex: required if you want to use Proxy Servers).
    • Example: {proxy:false,pages:[{url:"http://www.google.com"},{url:"https://www.google.com/doodles/",renderType:"jpg",outputAsJson:true}]}
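
To show how the three forms relate, here is a small sketch that wraps any of them into the UserRequest shape. The helper name toUserRequest is hypothetical, and treating the three forms as interchangeable in exactly this way is an assumption; the documentation only states that all three are accepted.

```javascript
// Hypothetical helper: wraps any of the three accepted forms into a UserRequest shape.
function toUserRequest(requestJson) {
  if (Array.isArray(requestJson)) {
    // Form 2: an array of PageRequest objects
    return { pages: requestJson };
  }
  if (requestJson !== null && typeof requestJson === "object" && Array.isArray(requestJson.pages)) {
    // Form 3: already a UserRequest
    return requestJson;
  }
  // Form 1: a single PageRequest
  return { pages: [requestJson] };
}

console.log(JSON.stringify(toUserRequest({ url: "http://www.example.com", renderType: "jpg" })));
console.log(JSON.stringify(toUserRequest([{ url: "http://www.google.com" }])));
```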

PageRequest Default Values

While the url is required, everything else is optional. Default values are applied to any parameters you leave blank.

CORS/JSONP (Browser API Support)

You can use CORS (preferred) or JSONP to integrate PhantomJs Cloud api calls directly into your webpage.

  • CORS: All endpoints are CORS enabled by default.
  • JSONP: To obtain results in JSONP format, add the ?callback=CALLBACK_FUNCTION_NAME querystring to your request (for either GET or POST requests). You MUST set outputAsJson:true for JSONP to work. See the Advanced Examples section for a demo.
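
As an illustration of the JSONP rule, the sketch below builds a GET url for a single PageRequest and forces outputAsJson:true. The helper name buildJsonpUrl is hypothetical, and joining the callback to the existing request querystring with "&" is an assumption, since the documentation only names the callback parameter.

```javascript
// Hypothetical helper: GET url for a JSONP call with a single PageRequest.
function buildJsonpUrl(apiKey, pageRequest, callbackName) {
  // JSONP requires outputAsJson:true, so it is set regardless of the input.
  const request = { ...pageRequest, outputAsJson: true };
  const encoded = encodeURIComponent(JSON.stringify(request));
  return (
    `https://PhantomJsCloud.com/api/browser/v2/${apiKey}/` +
    `?request=${encoded}&callback=${encodeURIComponent(callbackName)}`
  );
}

const jsonpUrl = buildJsonpUrl("YOUR-KEY", { url: "http://www.example.com" }, "handleResult");
// In a browser, this url would be used as the src of a <script> element,
// with a global function named handleResult defined beforehand.
console.log(jsonpUrl);
```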

Additional Documentation

Please refer to the main Docs Index Page for Basic Troubleshooting, Testing and Performance Optimization, Usage FAQ, and language-specific samples.

External modules

"io-datatypes"

"io-datatypes":

ICookie

ICookie:

Various methods in the phantom object, as well as in WebPage instances, utilize phantom.cookies objects. These are best created via object literals.

domain

domain: string

expires

expires: number

unix epoch timestamp (in ms). Javascript Example: (new Date()).getTime() + (1000 * 60 * 60) // <-- expires in 1 hour

httponly

httponly: boolean

name

name: string

path

path: string

secure

secure: boolean

value

value: string
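
A cookie object literal with all of the fields above could be built as in this sketch. The helper name makeCookie and the chosen values for path, httponly, and secure are illustrative assumptions, not documented defaults.

```javascript
// Hypothetical helper: builds an ICookie-shaped object literal.
// nowMs is injectable so the expiry can be computed deterministically.
function makeCookie(name, value, domain, ttlMs, nowMs = Date.now()) {
  return {
    name,
    value,
    domain,
    path: "/",        // assumption: cookie applies to the whole site
    httponly: false,  // assumption
    secure: false,    // assumption
    expires: nowMs + ttlMs, // unix epoch timestamp in ms
  };
}

const oneHourMs = 1000 * 60 * 60;
const cookie = makeCookie("session", "abc123", "example.com", oneHourMs);
console.log(cookie);
```

Such an object would then go into the pageRequest.requestSettings.cookies array.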

IPageFrame

IPageFrame:

information about the frames of the page

childCount

childCount: number

number of children contained by this frame

childFrames

childFrames: IPageFrame[]

the children of this frame (a hierarchy of frames)

content

content: string

the html content of the frame

name

name: string

the name of the frame. use this when requesting the frame to be rendered

url

url: string

the url of the frame

IPageRequest

IPageRequest:

The parameters for requesting and rendering a page. When you submit an array of IPageRequests, they are loaded in order, and only the last one is rendered. All variables except 'url' are optional.

content

content: string

if specified, will be used as the content of the page you are loading (no network request will be made for the url). However, the url property is still required, as that will be used as the page's "pretend" url. example: content:"<h1>Hello, World!</h1>",url:"about:blank"

outputAsJson

outputAsJson: boolean

TRUE to return the page contents and metadata as a JSON object (see IUserResponse). if FALSE, we return the rendered content in its native form.

renderSettings

renderSettings: IRenderSettings

settings related to rendering of the last page of your request. See the IRenderSettings documentation (below) for details

renderType

renderType: string

  • "html": returns the html text
  • "jpeg" | "jpg": the default. renders page as jpeg. transparency not supported (use png for transparency)
  • "png": renders page as png
  • "pdf": renders page as a pdf
  • "script": returns the contents of window['_pjscMeta'].scriptOutput. see the scripts parameter for more details
  • "plainText": returns the text without html tags (page plain text)

requestSettings

requestSettings: IRequestSettings

settings related to requesting internet resources (your page and resources referenced by your page)

scripts

scripts: IScripts

Execute your own custom JavaScript inside the page being loaded. see IScripts docs for more details.

suppressJson

suppressJson: string[]

add the nodes from your pageResponse that you do not wish to transmit. This reduces the size of your data, thus reducing cost and transmission time. if you need the data in these nodes, simply remove it from this array.

url

url: string

required. the target page you wish to load

urlSettings

urlSettings: IUrlSettings

adjustable parameters for when making network requests to the url specified

IPageResponse

IPageResponse:

Information about the page transaction (request and its response).

cookies

cookies: ICookie[]

cookies set at the moment the page transaction completed.

events

events: Array<object>

events that occurred during requesting and loading of the page and its content

frameData

frameData: IPageFrame

the Frames contained in the page. The first is always the main page itself, even if no other frames are present.

headers

headers: object[]

headers for the primary resource (the url requested). for headers of other resources, inspect the pageResponse.events (key='resourceReceived')

Type declaration

  • name: string
  • value: string

metrics

metrics: object

information about the processing of your request

Type declaration

  • elapsedMs: number
  • endTime: string
  • pageStatus: string
  • startTime: string

pageRequest

pageRequest: IPageRequest

the request you sent, including defaults for any parameters you did not include

scriptOutput

scriptOutput: any

statusCode

statusCode: number

the status code for the page, a shortcut to metrics.targetUrlReceived.value.status

IPdfHeaderFooter

IPdfHeaderFooter:

options for specifying headers or footers in a pdf render.

firstPage

firstPage: string

if specified, this is used for the first page (instead of the repeating)

height

height: string

required. Supported dimension units are: 'mm', 'cm', 'in', 'px'. No unit means 'px'.

lastPage

lastPage: string

if specified, this is used for the last page (instead of the repeating)

onePage

onePage: string

if specified, this is used for single page pdfs (instead of the repeating)

repeating

repeating: string

specify a header used for each page. use wildcards for pageNum,numPages as shown in this example: repeating:"<h1><span style='float:right'>%pageNum%/%numPages%</span></h1>"

IPdfOptions

IPdfOptions:

options specific to rendering pdfs. IMPORTANT NOTE: we strongly recommend using px as your units of measurement.

border

border: string

Border is optional and defaults to 0. A non-uniform border can be specified in the form {left: '2cm', top: '2cm', right: '2cm', bottom: '3cm'} Use of px is strongly recommended.

footer

settings for footers of the pdf

format

format: string

Supported formats are: 'A3', 'A4', 'A5', 'Legal', 'Letter', 'Tabloid'. Internally we convert this to a width+height using 150dpi.

header

settings for headers of the pdf

height

height: string

height and width are optional if format is specified. Use of px is strongly recommended. Supported dimension units are: 'mm', 'cm', 'in', 'px'. No unit means 'px'.

orientation

orientation: string

optional. either 'portrait' or 'landscape'. defaults to 'portrait'

width

width: string

height and width are optional if format is specified. Use of px is strongly recommended. Supported dimension units are: 'mm', 'cm', 'in', 'px'. No unit means 'px'.

IProxyCustomOptions

IProxyCustomOptions:

auth

auth: string

authentication information for the proxy. ex: username:password

host

host: string

the address and port of the proxy server to use. ex: 192.168.1.42:8080 If your proxy requires an IP to whitelist, use api-static.phantomjscloud.com for your requests.

type

type: string

type of the proxy server. default is http. available types are http, socks5, and none

IProxyOptions

IProxyOptions:

allows specifying a proxy for your userRequest (all the pageRequests it contains). To use the built-in proxy servers, you must set the geolocation parameter. Alternatively, you may use your own custom proxy server by setting the custom parameter.

custom

allows you to use a custom proxy server. if you set this, the built-in proxy will not be used. default=NULL

geolocation

geolocation: string

specify the geographic region of the builtin proxy server you use. defaults to any. possible values are any, us (usa), de (germany), gb (great britain), ca (canada), sg (singapore). IMPORTANT: Not yet implemented, so for now all values are treated as any.

instanceId

instanceId: number

specify which builtin proxy server you use. by default, the auto-proxy system will randomly pick from the available proxy servers. If you want to use a specific (fixed) proxy server, set this instanceId to a number, and all requests will be directed to the same builtin server. If you want to use the proxy servers in a round-robin style (recommended!), each request should increment this instanceId by one.
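
The round-robin style described for instanceId can be sketched with a small counter. The helper names (createProxyRotator, wrapWithProxy) are hypothetical; the field names come from IProxyOptions and IUserRequest, and geolocation is set to "any" because the built-in proxies require that parameter.

```javascript
// Hypothetical helper: returns a function that wraps pages in a UserRequest
// and increments proxy.instanceId by one on every call (round-robin style).
function createProxyRotator(startId = 0) {
  let nextId = startId;
  return function wrapWithProxy(pages) {
    const userRequest = {
      pages,
      proxy: { geolocation: "any", instanceId: nextId },
    };
    nextId += 1;
    return userRequest;
  };
}

const wrapWithProxy = createProxyRotator();
const first = wrapWithProxy([{ url: "http://www.example.com" }]);
const second = wrapWithProxy([{ url: "http://www.example.com" }]);
console.log(first.proxy.instanceId, second.proxy.instanceId); // 0 1
```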

IRenderSettings

IRenderSettings:

when a page is rendered, use these settings.

clipRectangle

clipRectangle: object

This property defines the rectangular area of the web page to be rasterized when using the renderType of png or jpeg. If no clipping rectangle is set, the entire web page is captured. Beware: if you capture too large an image, it can cause your request to fail (out of memory). You can choose any dimensions you wish as long as you do not exceed 32M pixels.

Type declaration

  • height: number
  • left: number
  • top: number
  • width: number
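
A client-side check against the pixel limit might look like the sketch below. The helper name fitsPixelLimit is hypothetical, and reading "32M pixels" as 32,000,000 is an assumption; the exact threshold used by the service is not stated here.

```javascript
// Assumption: "32M pixels" is read as 32,000,000 (the exact threshold is not documented).
const MAX_CLIP_PIXELS = 32 * 1000 * 1000;

// Hypothetical helper: true when the clip rectangle stays within the pixel limit.
function fitsPixelLimit(clipRectangle) {
  return clipRectangle.width * clipRectangle.height <= MAX_CLIP_PIXELS;
}

const renderSettings = {
  clipRectangle: { top: 0, left: 0, width: 1280, height: 1024 },
};
console.log(fitsPixelLimit(renderSettings.clipRectangle)); // true (1,310,720 pixels)
```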

passThroughHeaders

passThroughHeaders: boolean

default false. If true, we will pass through all headers received from the target URL, with the exception of "Content-Type" (unless the renderType=html)

pdfOptions

pdfOptions: IPdfOptions

pdf specific settings. Example:

{
  border: "0",
  footer: {
    firstPage: "", height: "1cm", lastPage: "", onePage: "", repeating: "<h1><span style='float:right'>%pageNum%/%numPages%</span></h1>"
  },
  format: "letter",
  header: {
    firstPage: "", height: "0cm", lastPage: "", onePage: "", repeating: ""
  },
  height: "11in",
  orientation: "portrait",
  width: "8.5in",
}

quality

quality: number

jpeg quality. 0 to 100. default 70. ignored for png

renderIFrame

renderIFrame: string

specify an IFrame to render instead of the full page. must be the frame's name.

viewport

viewport: object

size of the browser in pixels

Type declaration

  • height: number

    height is not used when taking screenshots (png/pdf). The image will be as tall as required to fit the content. To set your screenshot's dimensions, use the pageRequest.renderSettings.clipRectangle property.

  • width: number

zoomFactor

zoomFactor: number

This property specifies the scaling factor for the screenshot (renderType png/pdf). The default is 1, i.e. 100% zoom.

IRequestSettings

IRequestSettings:

settings related to requesting internet resources (your page and resources referenced by your page)

authentication

authentication: object

username/password for simple HTTP authentication

Type declaration

  • password: string
  • userName: string

clearCache

clearCache: boolean

if true, will clear the browser memory cache before processing the request. Good for expiring data, and very important if blacklisting resources (see resourceModifier parameter). Default is false.

clearCookies

clearCookies: boolean

if true, will clear cookies before processing the request. Default is false. IMPORTANT NOTE: to protect your privacy, we always clear cookies after completing your transaction. This option is only useful if making multiple requests in one transaction (i.e. multiple pageRequests in a userRequest API call).

cookies

cookies: ICookie[]

Set Cookies for any domain, prior to loading this pageRequest. If a cookie already exists with the same domain+path+name combination, it will be replaced. See ICookie for documentation on the cookie parameters.

customHeaders

customHeaders: object

specify additional request headers here. They will be sent to the server for every request issued (the page and resources). Unicode is not supported (ASCII only). example: customHeaders:{"myHeader":"myValue","yourHeader":"someValue"} If you want to set headers for just the target page (and not every sub-request), use the pageRequest.urlSettings.headers parameter.

Type declaration

  • [key: string]: string

deleteCookies

deleteCookies: string[]

delete any cookie with a matching "name" property before processing the request.

disableJavascript

disableJavascript: boolean

set to true to disable all Javascript from being processed on your page.

ignoreImages

ignoreImages: boolean

set to true to skip loading of inlined images

maxWait

maxWait: number

the maximum amount of time (timeout) you wish to wait for the page to finish loading. When rendering a page, we will give you whatever is ready at this time (page may be incompletely loaded). Can be increased up to 5 minutes, but that only should be used as a last resort, as it is a relatively expensive page render.

resourceModifier

resourceModifier: IResourceModifier[]

array of regex + adjustment parameters for modifying or rejecting resources being loaded by the webpage. Example: "resourceModifier": [{regex:".*css.*",isBlacklisted:true},{"regex": "http://mydomain.com.*","setHeader": {"hello": "world","Accept-encoding": "tacos"}}] IMPORTANT NOTE: If you use this to blacklist resources, it is strongly recommended you also set the clearCache parameter. This is because cached resources are not requested, and thus cannot be blacklisted.
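
Putting the blacklist advice together, a text-only page request could be sketched as follows. The helper name buildTextOnlyRequest is hypothetical; the property names are the ones documented in IPageRequest, IRequestSettings, and IResourceModifier.

```javascript
// Hypothetical helper: a PageRequest that skips css and images when only text is needed.
function buildTextOnlyRequest(url) {
  return {
    url,
    renderType: "plainText",
    requestSettings: {
      // Cached resources are not requested, so clear the cache for the blacklist to apply.
      clearCache: true,
      // Images are blocked with ignoreImages rather than with a resourceModifier.
      ignoreImages: true,
      resourceModifier: [{ regex: ".*\\.css.*", isBlacklisted: true }],
    },
  };
}

console.log(JSON.stringify(buildTextOnlyRequest("http://www.example.com"), null, 2));
```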

resourceTimeout

resourceTimeout: number

maximum amount of time to wait for each external resource to load. we kill the request if it exceeds this amount.

resourceWait

resourceWait: number

maximum amount of time to wait for each external resource to load (.js, .png, etc). if the time exceeds this, we don't cancel the resource request, but we don't delay rendering the page if everything else is done.

stopOnError

stopOnError: boolean

if true, will stop page load upon the first error detected, and move to next phase (render or next page)

userAgent

userAgent: string

default useragent is "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/534.34 (KHTML, like Gecko) Safari/534.34 PhantomJS/2.0.0 (PhantomJsCloud.com/2.0.1)"

waitInterval

waitInterval: number

Milliseconds to delay rendering after the last resource is finished loading (default is 1000ms). This is useful in case there are any AJAX requests or animations that need to finish up. If additional network requests are made while we are waiting, the waitInterval will restart once finished again. This can safely be set to 0 if you know there are no AJAX or animations you need to wait for (decreasing your billed costs)

webSecurityEnabled

webSecurityEnabled: boolean

set to true to enable web security. default is false

xssAuditingEnabled

xssAuditingEnabled: boolean

set to true to prohibit cross-site scripting attempts (XSS)

IResourceModifier

IResourceModifier:

regex + adjustment parameters for modifying or rejecting resources being loaded by the webpage. Example: {regex:".*css.*",isBlacklisted:true}

changeUrl

changeUrl: string

changes the current URL of the network request. This is an excellent (and the only) way to provide an alternative implementation of a remote resource. you can even use a dataURI so that you can set the contents directly. Example: data:,Hello%2C%20World!

isBlacklisted

isBlacklisted: boolean

if true, blacklists the request unless a later matching resourceModifier changes it back to false (we process in a FIFO fashion). by default, we don't blacklist anything. You should keep it this way when rendering jpeg (where the visuals matter). if processing text/data, blacklisting .css files ['.*\.css.*'] will work fine. check the response.metrics for other resources you could blacklist (example: facebook, google analytics, ad networks)
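
The FIFO behavior described here can be modeled locally to reason about a modifier list before sending it. This is only a sketch of the described semantics, not the service's implementation: the function name resolveBlacklisted is hypothetical, and matching the pattern anywhere in the url with RegExp.test is an assumption.

```javascript
// Hypothetical local model: modifiers are processed first to last,
// and a later matching entry overrides an earlier one.
function resolveBlacklisted(resourceUrl, modifiers) {
  let blacklisted = false; // by default nothing is blacklisted
  for (const modifier of modifiers) {
    if (typeof modifier.isBlacklisted !== "boolean") continue;
    if (new RegExp(modifier.regex).test(resourceUrl)) {
      blacklisted = modifier.isBlacklisted;
    }
  }
  return blacklisted;
}

const modifiers = [
  { regex: ".*\\.css.*", isBlacklisted: true },
  { regex: ".*critical\\.css.*", isBlacklisted: false },
];
console.log(resolveBlacklisted("http://example.com/site.css", modifiers));     // true
console.log(resolveBlacklisted("http://example.com/critical.css", modifiers)); // false
console.log(resolveBlacklisted("http://example.com/app.js", modifiers));       // false
```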

regex

regex: string

pattern used to match a resource's url. examples: it really depends what the site is and what you are wanting to block, but for example to block anything with the text "facebook" or "linkedin" in the url:


It's especially useful if you just need the text, as you can block all css files from loading, such as: ".*\.css.*"

Don't use this to block images. instead, images are blocked by using the requestSettings.ignoreImages:true property

setHeader

setHeader: object

optional key/value pairs for adjusting the headers of this resource's request. example: {"Accept-encoding":"gzip", "hello":"world"}

Type declaration

  • [key: string]: string

IScriptPjscMeta

IScriptPjscMeta:

properties exposed to your custom scripts via window._pjscMeta

manualWait

manualWait: boolean

set to false by default. if true, will delay rendering until you set it back to false. good if you are waiting on an AJAX event.

pageResponse

pageResponse: IPageResponse

Scripts can access (readonly) details about the page being loaded via window._pjscMeta.pageResponse. See IPageResponse for more details.

scriptOutput

scriptOutput: object

Your scripts can return data to you in the pageResponse.scriptOutput object. You can access this directly via window._pjscMeta.scriptOutput, or your script can simply return a value and it will be set as the scriptOutput (not available on external, url loaded scripts).

Type declaration

scriptsExecuted

scriptsExecuted: number

how many custom scripts have been loaded so far
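
The manualWait and scriptOutput properties above could be used together as in the following sketch. The function name holdRenderUntilDone and the fakeWindow stand-in are hypothetical and exist only so the logic can run outside the service; inside a real page the script would use window._pjscMeta directly.

```javascript
// Hypothetical pattern: delay rendering until asynchronous work has finished.
function holdRenderUntilDone(win, startWork) {
  win._pjscMeta.manualWait = true; // delay rendering
  startWork(function onDone(result) {
    win._pjscMeta.scriptOutput = result;
    win._pjscMeta.manualWait = false; // allow rendering to proceed
  });
}

// Stand-in for the page's window object, so the sketch can run in Node.js.
const fakeWindow = { _pjscMeta: { manualWait: false, scriptOutput: {} } };
holdRenderUntilDone(fakeWindow, (done) => setTimeout(() => done({ loaded: true }), 10));
console.log(fakeWindow._pjscMeta.manualWait); // true while the work is still pending
```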

IScripts

IScripts:

Execute your own custom JavaScript inside the page being loaded.

INPUT: You can pass in either the url to a script to load, or the text of the script itself. Example: scripts:{domReady:["//cdnjs.cloudflare.com/ajax/libs/jquery/2.1.0/jquery.js","return 'Hello, World!';"]}

OUTPUT: Your scripts can return data to you in the pageResponse.scriptOutput object. You can access this directly via window._pjscMeta.scriptOutput, or your script can simply return a value and it will be set as the scriptOutput (not available on external, url loaded scripts). Also, if you use the pageRequest.renderType="script" setting, your response will be the scriptOutput itself (in JSON format), which allows you to construct your own custom API. A very powerful feature!

domReady

domReady: string[]

triggers when the dom is ready for the current page. Please note that the page may still be loading.

loadFinished

loadFinished: string[]

triggers when we determine the page has been completed. If your page is being rendered, this occurs immediately before then. IMPORTANT NOTE: Generally you do NOT want to load external scripts (url based) here, as it will hold up rendering. Consider putting your external scripts in domReady.
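
A request that turns a page into a small custom API, as described for renderType "script", might be assembled like this. The helper name buildScriptRequest is hypothetical, and the script text is an illustrative example of the "return a value" style.

```javascript
// Hypothetical helper: a PageRequest whose response is the scriptOutput itself.
function buildScriptRequest(url, scriptText) {
  return {
    url,
    renderType: "script",
    // Inline script text; its return value becomes the scriptOutput.
    scripts: { domReady: [scriptText] },
  };
}

const scriptRequest = buildScriptRequest("http://www.example.com", "return document.title;");
console.log(JSON.stringify(scriptRequest));
```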

IUrlSettings

IUrlSettings:

adjustable parameters for when making network requests to the url specified. used by PageRequest.

data

data: any

submitted in POST BODY of your request.

encoding

encoding: string

defaults to 'utf8'

headers

headers: object

custom headers for the target page. if you want to set headers for every sub-resource requested, use the pageRequest.requestSettings.customHeaders parameter instead.

Type declaration

  • [key: string]: string

operation

operation: string

GET (default) or POST
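
As an illustration of these settings, the sketch below builds a PageRequest that loads its target url with a POST. The helper name buildFormPostPage is hypothetical, and sending the data as a form-encoded string with a matching Content-Type header is an assumption; the documentation only says that data is submitted in the POST body.

```javascript
// Hypothetical helper: a PageRequest whose target url is loaded via POST.
function buildFormPostPage(url, formFields) {
  return {
    url,
    urlSettings: {
      operation: "POST",
      // Assumption: the target page expects a form-encoded body.
      data: new URLSearchParams(formFields).toString(),
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
    },
  };
}

const formPage = buildFormPostPage("http://www.example.com/login", { user: "demo", lang: "en" });
console.log(formPage.urlSettings.data); // user=demo&lang=en
```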

IUserRequest

IUserRequest:

The 'main' form of user request, allows specifying pages to load in order. Later will provide other 'global' options such as geolocation choices.

pages

pages: IPageRequest[]

array of pages you want to load, in order. Only the last successfully loaded page will be rendered.

proxy

proxy: boolean | IProxyOptions

Use proxy servers for your request. default=false. set to true to enable our builtin proxy servers, or use the parameters found at IProxyOptions for more control/options, including the ability to specify your own custom proxy server.

IMPORTANT: for now, to use the builtin proxy servers, you must use the api endpoints found at api-static.phantomjscloud.com. This is because our proxy provider requires whitelisting us by static IP addresses. This requirement will be removed after we exit Beta.

Additionally, when you use proxy servers, be aware that requests will be slower, so consider increasing the pageRequest.requestSettings.resourceTimeout parameter like the Proxy Example does.

IUserResponse

IUserResponse:

This is returned to you when "outputAsJson=true".

content

content: object

the rendered output of the last pageRequest

Type declaration

  • data: string

    data in either base64 or utf8 format

  • encoding: string

    utf8 or base64

  • Optional headers?: object[]

    headers of the target url, only set if pageRequest.renderSettings.passThroughHeaders===true

    • name: string
    • value: string
  • name: string

    filename you could use if saving the content to disk. this will be something like 'content.text', 'content.jpeg', 'content.pdf' thus this informs you of the content type

  • size: number

    the size of data, in bytes

  • statusCode: number
  • url: string

    the final url of the page after redirects
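
Reading the content node usually means decoding data according to encoding. A minimal Node.js sketch, where the helper name decodeContent and the sample object are illustrative assumptions:

```javascript
// Hypothetical helper: turns IUserResponse.content into a Buffer of raw bytes.
function decodeContent(content) {
  const encoding = content.encoding === "base64" ? "base64" : "utf8";
  return Buffer.from(content.data, encoding);
}

// Illustrative sample only, not real service output.
const sampleContent = { data: "aGVsbG8=", encoding: "base64", name: "content.text" };
console.log(decodeContent(sampleContent).toString("utf8")); // hello
// For binary renders (jpeg, png, pdf) the Buffer could be written to disk,
// for example with fs.writeFileSync(sampleContent.name, decodeContent(sampleContent)).
```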

meta

meta: object

metadata about the transaction

Type declaration

  • Optional backend?: object

    information about the PhantomJsCloud.com system processing this transaction

    • id: string

      identifier of the system, for troubleshooting purposes

    • os: string
    • platform: string

      PhantomJs

    • platformVersion: any

      version of phantomjs. (major/minor/point)

    • requestsProcessed: number

      number of requests processed by this backend

  • Optional billing?: object

    how much this transaction costs. NOTE: the creditCost, prepaidCreditsRemaining, and dailySubscriptionCreditsRemaining are also returned in the HTTP Response Headers via the keys pjsc-credit-cost, pjsc-prepaid-credits-remaining, and pjsc-daily-subscription-credits-remaining

    • bytes: number
    • Optional creditCost?: number

      the total cost of this response

    • Optional dailySubscriptionCreditsRemaining?: number

      estimation of your remaining daily creditBalance. This is incrementally refilled hourly.

    • elapsedMs: number
    • Optional prepaidCreditsRemaining?: number
  • Optional outputAsJson?: boolean

    hint written by our pjsc-be-phantom backend so the api endpoint knows whether it should send back only the content.
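
When outputAsJson is false, the billing figures are still available from the response headers named in the billing note. The sketch below reads them from any object with a get(name) method, such as a fetch Headers instance or a Map. The helper name readBilling and the sample values are illustrative assumptions.

```javascript
// Hypothetical helper: collects the billing figures from response headers.
function readBilling(headers) {
  const toNumber = (name) => {
    const raw = headers.get(name);
    return raw == null ? undefined : Number(raw);
  };
  return {
    creditCost: toNumber("pjsc-credit-cost"),
    dailySubscriptionCreditsRemaining: toNumber("pjsc-daily-subscription-credits-remaining"),
    prepaidCreditsRemaining: toNumber("pjsc-prepaid-credits-remaining"),
  };
}

// Illustrative values only; with fetch, response.headers would be passed instead.
const sampleHeaders = new Map([
  ["pjsc-credit-cost", "0.25"],
  ["pjsc-prepaid-credits-remaining", "100"],
]);
console.log(readBilling(sampleHeaders));
```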

originalRequest

originalRequest: IUserRequest

the original request, without defaults applied. to see the request with defaults, see pageResponses.pageRequest

pageResponses

pageResponses: IPageResponse[]

a collection of load/processing information for each page you requested.

statusCode

statusCode: number

the HTTP Status Code PhantomJsCloud returns to you

pageRequestDefaultsGet