When it comes to search engines, most of the time the question is “how do I make my site more visible?” Occasionally, however, the opposite need arises. Whether it is out-of-date content archived on the server or new drafts that you’re not ready for the world to see, the question becomes “how do I keep search engines from indexing certain parts of my site?” Fortunately, there is an easy solution: use a robots.txt file. This plain text file, when placed in the root of your site, tells search engines which pages or directories not to crawl and index.
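To make this concrete, here is a minimal sketch of what such a file might contain and how a well-behaved crawler would read it. The directory names `/archive/` and `/drafts/` and the helper `is_allowed` are made up for illustration, and the check uses Python's standard `urllib.robotparser` module rather than any particular search engine's crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. The directory names are assumptions
# chosen for illustration; substitute the paths you want to keep out of
# search results.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
Disallow: /drafts/
"""


def is_allowed(path: str, user_agent: str = "*") -> bool:
    """Return True if a rule-following robot may fetch the given path."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/archive/old-news.html", "/drafts/new-post.html"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

With these rules, `/index.html` is reported as allowed, while anything under `/archive/` or `/drafts/` is reported as blocked. On a real site, the three lines in `ROBOTS_TXT` would simply be saved as `robots.txt` in the site root.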
It is important to note that although robots.txt will work for search engines that play by the rules, it is not foolproof. Nefarious robots, such as those used to harvest email addresses, may ignore your robots.txt file. Furthermore, robots.txt does not actually hide pages; it only asks crawlers not to index them, and the pages remain reachable by anyone who has the address. If you have sensitive information on your site that you don’t want anyone to see, you’re better off keeping it offline or implementing some form of password protection.
For technical specifics on creating and using a robots.txt file, visit http://www.robotstxt.org/robotstxt.html

