Interface: Finder

Finder

A blueprint for link discovery classes.

Methods

getRunnable() → {function}

Returns a function that is evaluated within the web page to discover links to follow. The returned function must report its result by calling `window.callPhantom(error, urlArray)`, where `urlArray` is an array of discovered URLs. If no error occurred, pass `null` as the first argument. The timeout of the returned function is controlled via Finder#timeout. The returned function is called immediately after page load.
Returns:
A function to be evaluated within the crawled webpage
Type
function
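
As an illustration of the contract described above, here is a minimal sketch of a Finder whose runnable collects the `href` attribute of every anchor on the page. The object name `anchorFinder` and the choice of selector are assumptions made for this example; only `getRunnable()` and the `window.callPhantom(error, urlArray)` callback come from the interface itself.

```javascript
// Hypothetical Finder implementation (name and selector are illustrative).
var anchorFinder = {
  getRunnable: function () {
    // The returned function is evaluated inside the crawled page, so it
    // should rely only on what is available there (document, window).
    return function () {
      try {
        var anchors = document.querySelectorAll('a[href]');
        var urls = [];
        for (var i = 0; i < anchors.length; i++) {
          urls.push(anchors[i].getAttribute('href'));
        }
        // No error occurred: pass null as the first argument.
        window.callPhantom(null, urls);
      } catch (e) {
        // Report the failure as a string (an assumption: a plain string is
        // the simplest value to hand back out of the page).
        window.callPhantom(String(e), []);
      }
    };
  }
};
```

Relative `href` values are passed through untouched here, since the documentation for `urlFilter` states that relative URLs are resolved against the page on which they were found.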

urlFilter(toBeAddedUrl, discoveredOnUrl) → {boolean|String}

Optional. A method that allows you to filter and rewrite discovered URLs. This method is run in node space and so can use all features and closures available there.
Parameters:
Name             Type    Description
toBeAddedUrl     String  The URL that is about to be added.
discoveredOnUrl  String  The URL of the page on which the new URL was found.
Returns:
Return `false` to discard the URL (i.e. not add it to the queue at all). Any other return value is used in place of the discovered URL, as long as it is a valid URL. A relative return value is resolved to an absolute URL against the URL of the page where it was found. Invalid URLs (e.g. `javascript:;`, `mailto:`) are ignored.
Type
boolean | String
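
A minimal sketch of a `urlFilter` that both filters and rewrites: it keeps the crawl on a single host, skips PDF files, and strips fragments so the same page is not queued twice. The host name in `ALLOWED_HOST` and the filtering rules are assumptions for illustration, and the sketch assumes a Node version that provides the global WHATWG `URL` class.

```javascript
// Hypothetical filter rules; ALLOWED_HOST is an illustrative assumption.
var ALLOWED_HOST = 'example.com';

function urlFilter(toBeAddedUrl, discoveredOnUrl) {
  var parsed;
  try {
    // Resolve against the page where the link was found, so that
    // relative URLs can be inspected like absolute ones.
    parsed = new URL(toBeAddedUrl, discoveredOnUrl);
  } catch (e) {
    return false; // cannot be parsed: discard
  }
  if (parsed.hostname !== ALLOWED_HOST) return false; // stay on one host
  if (/\.pdf$/i.test(parsed.pathname)) return false;  // skip PDF files
  parsed.hash = ''; // rewrite: drop the fragment
  return parsed.href;
}
```

For example, `urlFilter('/docs/page#top', 'http://example.com/index.html')` returns `'http://example.com/docs/page'`, while a link to any other host returns `false` and is never queued.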