.htaccess bot protection rule
-
I've been trying to create a group of rules in .htaccess that will deny bots access to specific groups of pages on my site. Specifically, my site is a wiki (running MediaWiki), and I would like to prevent bots from accessing any pages in the "User", "Talk" or "Special" namespaces, while allowing them to spider the rest of the site. Below is my attempt... but it's not working. The basic approach I'm trying to use is this:

1. Set an environment variable if the REQUEST_URI is one of the pages I want to exclude.
2. Set an environment variable if the user agent is a bot.
3. If both environment variables are set, deny the request.

Can anyone give me a tip as to why the code below is NOT doing what I describe above?

RewriteEngine on
# Identify non-bot pages with an environment variable
RewriteCond %{REQUEST_URI} ^/reference/index.php?title=User:.* [OR]
RewriteCond %{REQUEST_URI} ^/reference/index.php?title=Talk:.* [OR]
RewriteCond %{REQUEST_URI} ^/reference/index.php?title=Special:.*
RewriteRule ^.* - [E=PAGE_NO_BOT:1]

# Identify bot user agents with an environment variable
RewriteCond %{HTTP_USER_AGENT} ^.*Googlebot.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*robot.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Slurp.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Scooter.*
RewriteRule ^.* - [E=CLIENT_IS_BOT:1]

# If it is a bot AND it is looking at a non-bot page, deny
RewriteCond %{ENV:PAGE_NO_BOT} ^1$
RewriteCond %{ENV:CLIENT_IS_BOT} ^1$
RewriteRule ^.* - [F,L]
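
For what it's worth, here is a variant of the first block that I've been wondering about, in case the title parameter needs to be matched against %{QUERY_STRING} rather than %{REQUEST_URI}. This is only a sketch using the same /reference/index.php path and namespaces as above, and I haven't verified that it's the right fix:

# Sketch: same "mark the excluded pages" step, but matching the title
# parameter in the query string, on the assumption that REQUEST_URI only
# carries the path portion (/reference/index.php)
RewriteCond %{REQUEST_URI} ^/reference/index\.php$
RewriteCond %{QUERY_STRING} (^|&)title=(User|Talk|Special): [NC]
RewriteRule ^.* - [E=PAGE_NO_BOT:1]

The bot-detection block and the final deny block would stay as written above.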