bot-trap

bot-trap allows your Web site to automatically ban bad Web robots (a.k.a. Web spiders) that ignore the robots.txt file.

bot-trap Ranking & Summary

  • License: Public Domain
  • Price: FREE
  • Publisher Name: Daniel M. Webb
  • Publisher web site: http://danielwebb.us/software/bot-trap/

bot-trap Description

bot-trap allows your Web site to automatically ban bad Web robots (a.k.a. Web spiders) that ignore the robots.txt file. This does not include Googlebot and other well-behaved robots. The main advantage over other implementations of this concept is that bot-trap has a manual "unban" feature, so humans can unban themselves but robots can't.

How It Works:

- You place a small "web-bug" strategically in your web pages. This bug is just a tiny image link that points to /bot-trap/index.php. Normal people don't see this link, but web bots do.
- You create a /robots.txt file that tells web bots not to go to the /bot-trap directory.

When a bad robot visits /bot-trap/index.php anyway, /bot-trap/index.php adds the IP address of the bad bot to a block list in /.htaccess. That address is blocked from the site from then on. You can also be emailed when this happens.

Safeguards

It is possible that someone is banned who shouldn't be. Perhaps a previous user of an IP address in a DHCP pool ran a bad bot, and now the new user of that address is banned. Not to worry: the custom "403 Forbidden" page allows any user to unban themselves by typing a requested word into a form box. Real people can easily do this, but bots can't!

Installation:

1. Unpack the tarball in your web page root directory:

   # tar -xzf bot-trap-x.x.tar.gz

2. Either add a line like the following to your root .htaccess file, or copy the premade one (bot-trap/htaccess-root-example):

   ErrorDocument 403 /bot-trap/forbid.php

   Note that once an IP is banned, it can't access anything in /, so the 403 page should be in /bot-trap, and /bot-trap/.htaccess should say only "Allow from all". Look at the forbid.php file in the distribution to see how to do this, or just use it as-is.

3. Make sure .htaccess controls are allowed in your Apache configuration (especially the "AllowOverride" directive). This allows bot-trap to ban IP addresses using the htaccess mechanism.

4. Create the empty file blacklist.dat in your web root directory, and make blacklist.dat, .htaccess, and the bot-trap directory in your web root directory owned by the www user with write permission. If the web server uses a group (like the group "www-data" on Debian GNU/Linux), set these files and directories group-writable.

5. Edit bot-trap/settings.php to hold the correct email addresses to send alerts to.

6. Add "web-bugs" to your main web page to catch the bad bots. This is the XHTML code:

   <!-- Bad robot trap: Don't go here or your IP will be banned! -->
   <a href="/bot-trap/"><img src="bot-trap/pixel.gif" border="0" alt=" " width="1" height="1"/></a>

7. Add the bot-trap directory to your robots.txt file, or copy the example robots.txt file (bot-trap/robots.txt.example) to the root directory.

8. Make sure /.htaccess and all other files have the correct permissions and ownership for your site.
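
The ban and unban steps described under "How It Works" and "Safeguards" can be illustrated with a short sketch. This is not bot-trap's own code, which is written in PHP. It is a minimal Python illustration under stated assumptions: the function names ban_ip and unban_ip are hypothetical, blacklist.dat is assumed to hold one address per line, and the block rule is assumed to be a "Deny from" line in the same access-control style as the "Allow from all" rule mentioned in the installation steps. The real file formats may differ.

```python
import ipaddress
from pathlib import Path


def ban_ip(ip: str, htaccess: Path, blacklist: Path) -> bool:
    """Record ip and append a deny rule. Return True if it was newly banned."""
    # Validate first so a forged value can never inject extra directives.
    # ip_address() raises ValueError for anything that is not a single address.
    addr = str(ipaddress.ip_address(ip))

    # Assumed layout: one banned address per line in blacklist.dat.
    known = set(blacklist.read_text().split()) if blacklist.exists() else set()
    if addr in known:
        return False  # already banned, so do not add a second rule

    with blacklist.open("a") as fh:
        fh.write(addr + "\n")
    # Assumed rule format: a "Deny from" line appended to the root .htaccess.
    with htaccess.open("a") as fh:
        fh.write(f"Deny from {addr}\n")
    return True


def unban_ip(ip: str, htaccess: Path, blacklist: Path) -> bool:
    """Remove ip from both files. Return True if anything was removed."""
    addr = str(ipaddress.ip_address(ip))
    removed = False
    # Each file is filtered for the exact line this sketch would have written.
    for path, target in ((blacklist, addr), (htaccess, f"Deny from {addr}")):
        if not path.exists():
            continue
        lines = path.read_text().splitlines()
        kept = [line for line in lines if line.strip() != target]
        if len(kept) != len(lines):
            removed = True
            path.write_text("".join(line + "\n" for line in kept))
    return removed
```

In this sketch the trap page would call ban_ip with the visitor's address, and the 403 page would call unban_ip only after the visitor has typed the requested word correctly. Both functions rewrite files that the web server must be able to write, which is why the installation steps ask for write permission on blacklist.dat and .htaccess.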

