Quickstart & Example
The following steps show how to use phpcrawl.
To start a crawling process, do the following:
1. Include the phpcrawl main class in your script or project. It is located in the "classes" directory of the package.
include("classes/phpcrawler.class.php");
2. Extend the PHPCrawler class and override the handlePageData() method with your own code to handle
the information about every page or file the crawler finds.
class MyCrawler extends PHPCrawler
{
    function handlePageData(&$page_data)
    {
        // your code, do something with the array $page_data
        // that contains the page/file information
    }
}
3. Create an instance of that class, configure the behaviour of the crawler with the
methods it provides, and start crawling.
$crawler = new MyCrawler(); // "= &new" is deprecated in PHP 5 and a syntax error in PHP 7
$crawler->setURL("www.foo.com");
$crawler->addReceiveContentType("/text\/html/");
// ...
$crawler->go();
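The steps above can be tried out as a dry run that does not need phpcrawl at all. The sketch below is only an illustration under stated assumptions: contentTypeMatches() and summarizePageData() are hypothetical helpers, the sample documents are made up, and the array keys "url", "http_status_code" and "content_type" are assumptions that should be checked against the phpcrawl version in use. It also assumes that the pattern given to addReceiveContentType() is applied as a regular expression to the Content-Type of each document.

```php
<?php
// Dry run without phpcrawl: everything here is a stand-in for illustration.

// Hypothetical helper: true if a Content-Type value matches the pattern
// that step 3 passes to addReceiveContentType().
function contentTypeMatches($pattern, $content_type)
{
    return preg_match($pattern, $content_type) === 1;
}

// Hypothetical helper: one thing a handlePageData() body might do with a
// $page_data array. The keys "url" and "http_status_code" are assumptions.
function summarizePageData(array $page_data)
{
    $url = isset($page_data["url"]) ? $page_data["url"] : "(unknown url)";
    $status = isset($page_data["http_status_code"]) ? $page_data["http_status_code"] : "?";
    return $url . " [" . $status . "]";
}

// Made-up sample documents standing in for what a crawl might find.
$found = array(
    array("url" => "http://www.foo.com/", "http_status_code" => 200,
          "content_type" => "text/html; charset=UTF-8"),
    array("url" => "http://www.foo.com/logo.png", "http_status_code" => 200,
          "content_type" => "image/png"),
);

foreach ($found as $page_data) {
    // Same pattern as in step 3; only matching documents would be received.
    $matches = contentTypeMatches("/text\/html/", $page_data["content_type"]);
    echo ($matches ? "receive" : "skip") . ": " . summarizePageData($page_data) . "\n";
}
```

Inside a real handlePageData() override, the same helper could be called as echo summarizePageData($page_data), provided the keys match what the crawler actually delivers.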