
PHP Web Crawler is a program that searches the web for
links. It stores each link, along with some extra data, in a database
and displays the results as HTML output.

Features:

- The crawler can be run as multiple instances.
- It can be run by a cron job.
- Crawl results are saved in a MySQL database. It creates the table "urls" to store the crawls.
- For each link it saves the source URL, the destination URL and the anchor text.
- It validates URLs with a regular expression and skips links to
static files on the site, including unnecessary media files. Even so,
I can't guarantee that the crawler skips every media file; that would
be more complex to validate.
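
To give a rough idea of how the validation and storage features fit together, here is a minimal sketch in Python. It is not the product's actual PHP code. The regular expressions, the extension list, the column names (source, destination, anchor) and the helper names is_crawlable and store_links are all assumptions made for illustration, and SQLite stands in for MySQL so the example runs without a database server.

```python
import re
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed pattern: accept absolute http(s) URLs only.
URL_RE = re.compile(r"^https?://[^\s/$.?#][^\s]*$", re.IGNORECASE)
# Assumed list of static/media extensions to skip (certainly incomplete).
MEDIA_RE = re.compile(
    r"\.(?:jpe?g|png|gif|css|js|ico|pdf|zip|mp3|mp4|avi)$", re.IGNORECASE
)


def is_crawlable(url):
    """Return True for a valid http(s) URL whose path is not a media file."""
    if not URL_RE.match(url):
        return False
    # Check only the path so query strings like ?size=2 do not hide the extension.
    return not MEDIA_RE.search(urlparse(url).path)


class LinkParser(HTMLParser):
    """Collect (absolute_url, anchor_text) pairs from <a href="..."> tags."""

    def __init__(self, source_url):
        super().__init__()
        self.source_url = source_url
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page they were found on.
                self._href = urljoin(self.source_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def store_links(conn, source_url, html):
    """Save every crawlable link found in html; return how many were saved."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS urls "
        "(source TEXT, destination TEXT, anchor TEXT)"
    )
    parser = LinkParser(source_url)
    parser.feed(html)
    parser.close()
    rows = [
        (source_url, dest, text)
        for dest, text in parser.links
        if is_crawlable(dest)
    ]
    conn.executemany("INSERT INTO urls VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Filtering by file extension is cheap but, as noted above, it cannot catch every media file; a URL with no extension can still serve an image, so a stricter check would have to look at the Content-Type of the response.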

http://www.binpress.com/app/php-web-crawler/113

thanks

donjoe1