Wednesday, September 21, 2011

SEO Best Practices: Make effective use of robots.txt

Best Practices
Use more secure methods for sensitive content
You shouldn't rely on robots.txt to block sensitive or confidential material. One reason is that search engines could still reference the URLs you block (showing just the URL, with no title or snippet) if there happen to be links to those URLs somewhere on the Internet (for example, in referrer logs). Also, non-compliant or rogue crawlers that don't acknowledge the Robots Exclusion Standard could simply ignore the instructions in your robots.txt. Finally, a curious user could examine the directories or subdirectories listed in your robots.txt file and guess the URL of the content you don't want seen. Encrypting the content or password-protecting it with .htaccess are more secure alternatives.
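As a rough illustration of the .htaccess approach, password-protecting a directory on an Apache server might look like the sketch below. The file paths and the "Restricted area" label are hypothetical, and the exact setup depends on your hosting environment:

    # .htaccess placed in the directory you want to protect (hypothetical paths)
    AuthType Basic
    AuthName "Restricted area"
    # Password file created beforehand with: htpasswd -c /home/example/.htpasswd someuser
    AuthUserFile /home/example/.htpasswd
    Require valid-user

Unlike a robots.txt entry, this actually stops visitors and crawlers that don't have credentials, rather than just asking them politely to stay away.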
Avoid:
  • allowing search result-like pages to be crawled: users dislike leaving one search result page only to land on another search result page that adds little value for them
  • allowing URLs created as a result of proxy services to be crawled (see the robots.txt sketch after this list)
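To make this concrete, a robots.txt that keeps compliant crawlers out of internal search result pages and proxy-generated URLs might look roughly like this. The /search/ and /proxy/ paths are placeholders; use whatever paths your site actually generates:

    # robots.txt at the root of your site (placeholder paths)
    User-agent: *
    Disallow: /search/
    Disallow: /proxy/

Remember that this only affects crawling by well-behaved search engines; as noted above, it is not a security mechanism.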
