Alternate robots.txt files with Htaccess
Here's the basic idea: there are two sites, a development site and a live site. They are essentially mirrors of each other, serving the same files. You need to block all search engine robots from crawling and indexing the development site while allowing full crawling of the live site. Htaccess to the rescue!
Create a robots-off.txt
You should already have a robots.txt file; now you just need to create a robots-off.txt file in the same directory as the robots.txt file. Its contents tell every search engine robot that honors robots.txt not to crawl anything on the site:
User-agent: *
Disallow: /
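As a quick sanity check, the rules above can be run through the robots.txt parser in Python's standard library. This is only a sketch: the helper name `is_allowed` and the sample user agents and paths are made up for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Contents of robots-off.txt as described above.
ROBOTS_OFF = """\
User-agent: *
Disallow: /
"""

# An allow-everything file, used here only for comparison (assumption:
# the live robots.txt could look like this; the article does not show it).
ROBOTS_ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_text: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for agent in ("Googlebot", "Bingbot"):
        print(agent, "robots-off:", is_allowed(ROBOTS_OFF, agent, "/index.html"))
        print(agent, "allow-all: ", is_allowed(ROBOTS_ALLOW_ALL, agent, "/index.html"))
```

With robots-off.txt every path is reported as disallowed for every user agent, which is the behavior wanted on the development site.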
Htaccess Rewrite for Alternate robots.txt
The code below is simple! It just checks the HTTP_HOST to see if it starts with "development" (for example, development.site.com), and if so it internally rewrites (not redirects) requests for /robots.txt to /robots-off.txt.
###
### Alt robots.txt ala askapache.com/htaccess/alternate-robots-txt-rewrite.html
###
RewriteCond %{HTTP_HOST} ^development.*$ [NC]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*robots\.txt.*\ HTTP/ [NC]
RewriteRule ^robots\.txt /robots-off.txt [NC,L]
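To see which requests the rule set would catch, the three patterns can be modeled with Python's `re` module. This is a rough sketch and not how mod_rewrite is implemented: the function name `served_path` is hypothetical, and it assumes the rules live in an .htaccess file in the document root, so the leading slash is stripped before the RewriteRule pattern is tested.

```python
import re

# The same three patterns as the rewrite rules; [NC] becomes re.IGNORECASE.
# The escaped spaces ("\ ") from the Apache syntax are plain spaces here.
HOST_RE = re.compile(r"^development.*$", re.IGNORECASE)
REQUEST_RE = re.compile(r"^[A-Z]{3,9} /.*robots\.txt.* HTTP/", re.IGNORECASE)
RULE_RE = re.compile(r"^robots\.txt", re.IGNORECASE)


def served_path(host: str, request_line: str) -> str:
    """Return the path that would be served for a request, per this model."""
    # Request line looks like: "GET /robots.txt HTTP/1.1"
    url = request_line.split(" ")[1]
    path = url.split("?", 1)[0]
    # Assumption: per-directory context, so the leading "/" is removed
    # before the RewriteRule pattern is matched.
    relative = path[1:] if path.startswith("/") else path
    if (
        RULE_RE.search(relative)
        and HOST_RE.search(host)
        and REQUEST_RE.search(request_line)
    ):
        return "/robots-off.txt"  # internal rewrite, the URL does not change
    return path


if __name__ == "__main__":
    print(served_path("development.site.com", "GET /robots.txt HTTP/1.1"))
    print(served_path("www.site.com", "GET /robots.txt HTTP/1.1"))
```

Only requests for /robots.txt on a host beginning with "development" are swapped to /robots-off.txt; every other host and path passes through unchanged.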