Best Blackhat Forum

Full Version: [REQ] PHP Web Crawler
PHP Web Crawler is a program that searches for
links on the web. It stores the links and some extra data in a database
and shows them as HTML output.


Features:

  • The crawler can be run as multiple instances.
  • It can be run by a cron job.
  • Crawl results are saved in a MySQL database. It creates the table "urls" to store the crawls.
  • For each link it saves the source URL, the destination URL and the anchor text.
    - It validates URLs via a regular expression and skips links to
    static data on the site, including unnecessary media files. Even so,
    I can't guarantee that the crawler avoids every media file; that
    would be more complex to validate.
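
For anyone wondering how the features above fit together, here is a rough sketch. It is not the actual PHP Web Crawler code. It is written in Python rather than PHP, uses an in-memory SQLite database as a stand-in for MySQL, and the regular expressions, the extension list, the column names and every function name are my own assumptions. It only shows the general idea: pull each link out of a page, check it against a regex, skip obvious media files, and save the source URL, destination URL and anchor text in a table called "urls".

```python
import re
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed pattern: accept only absolute http(s) URLs without whitespace.
URL_RE = re.compile(r"^https?://[^\s/$.?#][^\s]*$", re.IGNORECASE)
# Assumed list of static/media extensions to skip; certainly not exhaustive.
MEDIA_RE = re.compile(
    r"\.(?:jpe?g|png|gif|css|js|ico|pdf|zip|mp3|mp4)(?:[?#].*)?$",
    re.IGNORECASE,
)


def is_crawlable(url):
    """Return True if the URL looks valid and does not point at a media file."""
    return bool(URL_RE.match(url)) and not MEDIA_RE.search(url)


class LinkParser(HTMLParser):
    """Collect (source, destination, anchor text) for each <a href> on a page."""

    def __init__(self, source_url):
        super().__init__()
        self.source_url = source_url
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page they were found on.
                self._href = urljoin(self.source_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            anchor = "".join(self._text).strip()
            if is_crawlable(self._href):
                self.links.append((self.source_url, self._href, anchor))
            self._href = None


def store_links(conn, links):
    """Create the "urls" table if needed and insert the crawled links."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS urls "
        "(source TEXT, destination TEXT, anchor TEXT)"
    )
    conn.executemany("INSERT INTO urls VALUES (?, ?, ?)", links)
    conn.commit()


def demo():
    """Parse a tiny hard-coded page and return the rows that were stored."""
    html = '<a href="/about.php">About us</a> <a href="/logo.png">Logo</a>'
    parser = LinkParser("http://example.com/")
    parser.feed(html)
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
    store_links(conn, parser.links)
    return conn.execute(
        "SELECT source, destination, anchor FROM urls"
    ).fetchall()
```

Calling demo() stores only the /about.php link, because /logo.png matches the media filter. An extension check like this misses media served from URLs without a file extension, which fits the caveat above that not every media file can be caught this way.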
http://www.binpress.com/app/php-web-crawler/113

thanks