Worried about accidentally publishing your test site’s robots.txt to your live site and blocking the search engines from it?

It’s an easy mistake to make, and it can be costly if not noticed right away. What if your robots.txt file adjusted itself automatically depending on whether it is served from the test site or the live site?

Here’s an easy way to do it with Apache’s .htaccess file and PHP.

.htaccess

Add this to your .htaccess file:

RewriteEngine on
RewriteRule ^robots\.txt$     /robots.php [L]

If your .htaccess already contains RewriteEngine on, omit the first line. The rule sends every request for robots.txt to /robots.php; you can point it at any other PHP file instead.

PHP

Now create robots.php in the site root (or at whatever path you chose in the rewrite rule) and add this PHP code:

<?php

// Serve the output as plain text
header('Content-Type: text/plain');

// Choose the output based on the requested HTTP host
if ($_SERVER['HTTP_HOST'] === 'testsite.mydomain.com') {

    // Enter your test site robots.txt here

?>
User-agent: *
Disallow: /
<?php

}
else {

    // Enter your live site robots.txt here

?>
User-agent: *
Disallow: /cms/
<?php

}

?>

The above code should be self-explanatory and can be adapted to handle multiple hosts if you need to, but this simple example should be sufficient for most cases.
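If you do need more than one non-live host, one way to adapt the idea is sketched below. It is only an illustration: the function name robots_for_host and the host staging.mydomain.com are made up for this example, and stripping a port from the host is an extra assumption about how your test sites are reached. As in the original code, any host not on the list gets the live rules.

```php
<?php

// Hypothetical helper: returns the robots.txt body for a given host name.
function robots_for_host(string $host): string
{
    // Normalise the host: lower-case it and drop a trailing port such as ":8080".
    $host = strtolower(preg_replace('/:\d+$/', '', $host));

    // Placeholder list of hosts that search engines should not crawl.
    $blockedHosts = [
        'testsite.mydomain.com',
        'staging.mydomain.com',
    ];

    if (in_array($host, $blockedHosts, true)) {
        // Test-style hosts: block everything.
        return "User-agent: *\nDisallow: /\n";
    }

    // Every other host is treated as the live site.
    return "User-agent: *\nDisallow: /cms/\n";
}
```

In robots.php you would send the text/plain header as before and then echo robots_for_host($_SERVER['HTTP_HOST'] ?? ''). Keeping the rules in a function also makes it easy to check each host's output without deploying anything.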

Once this is set up, visiting robots.txt on either site will show the rules for that site. You need never worry about uploading the wrong robots.txt file again!

Tim Bennett is a Leeds-based web designer from Yorkshire. He has a First Class Honours degree in Computing from Leeds Metropolitan University and currently runs his own one-man web design company, Texelate.