6 September 2010 Gerard Harris

The robots.txt file checker

robots.txt history

Web robots, also known as web crawlers or web spiders, are programs that search engines use to roam the web automatically and index its content. In the early 90s there were occasions when robots visited servers where they weren’t welcome, for a number of reasons:

•    swamping servers with requests, reducing performance
•    retrieving the same files repeatedly
•    indexing duplicated content or temporary information
•    scanning for email addresses to send spam to

These incidents highlighted a need to develop the means to restrict what the robots could index.

This led to the creation of a file on your server which tells robots where they shouldn’t go. The file has to be accessible via HTTP at the local URL “/robots.txt”. This is easy to implement, and a robot can tell where it’s not allowed to go by retrieving a single document. Note that the file is a request rather than a lock: well-behaved robots honour it, but nothing forces a badly behaved one to.
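To make this concrete, here is a minimal sketch in Python of how a well-behaved robot might consult the file before fetching a page. It uses the standard library's urllib.robotparser module. The robots.txt content, the "ExampleBot" name and the example.com URLs are made up for illustration, and the rules are parsed from a string rather than fetched over HTTP so the sketch runs offline.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /tmp/
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# "ExampleBot" is an assumed robot name; it falls under the "*" rules.
print(parser.can_fetch("ExampleBot", "http://www.example.com/index.html"))
print(parser.can_fetch("ExampleBot", "http://www.example.com/tmp/cache.html"))
```

Against a live site, a robot would instead call set_url() with the site's /robots.txt address followed by read(), then ask can_fetch() before each request.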

robots.txt basics, covered in an earlier post, can be found here: Robots.txt – A Rough Guide

robots.txt checker

To ensure there are no errors in your robots.txt file, instead of wading through it line by line, you can use this handy robots.txt checker tool from Motoricerca, an Italian non-profit SEO site.

In their own words: ‘This robots.txt checker is a “validator” that analyzes the syntax of a robots.txt file to see if its format is valid as established by Robot Exclusion Standard (please read the documentation and the tutorial to learn the basics) or if it contains errors. The validation process takes in account both Robots Exclusion Standard rules and spider-specific (Google, Inktomi, etc.) extensions (including the new “Sitemap” command).’
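To give a rough idea of what a syntax check like this involves, here is a minimal sketch in Python. It is not how the tool above works internally; the function name check_robots_txt, the list of accepted fields and the two rules it enforces (every line must be "field: value" with a recognised field, and Disallow/Allow/Crawl-delay must come after a User-agent line) are simplifying assumptions for illustration.

```python
from typing import List

# Assumed set of fields to accept; real validators know more extensions.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}
# Fields that only make sense inside a User-agent group.
GROUP_FIELDS = {"disallow", "allow", "crawl-delay"}


def check_robots_txt(text: str) -> List[str]:
    """Return a list of human-readable problems found in robots.txt text."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        # Drop comments and surrounding whitespace; skip blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if ":" not in line:
            problems.append(f"line {number}: missing ':' separator")
            continue
        field, _, value = line.partition(":")
        field = field.strip().lower()
        value = value.strip()
        if field not in KNOWN_FIELDS:
            problems.append(f"line {number}: unknown field '{field}'")
        elif field == "user-agent":
            if not value:
                problems.append(f"line {number}: User-agent has no value")
            seen_user_agent = True
        elif field in GROUP_FIELDS and not seen_user_agent:
            problems.append(
                f"line {number}: '{field}' appears before any User-agent line"
            )
    return problems


# Hypothetical broken file: rule before User-agent, missing colon, typo field.
for problem in check_robots_txt("Disallow: /tmp/\nUser agent *\nFoo: bar\n"):
    print(problem)
```

An empty list means the sketch found nothing to complain about, which is a much weaker guarantee than a full validator gives.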

The checker is a neat little tool that will save you some time checking your robots.txt file. If it does uncover errors, that could mean your site is not being indexed correctly, which could be affecting your presence on the web. Unless you are comfortable with robots.txt syntax, I wouldn’t recommend fiddling with the file yourself; rather, get in touch with a professional.


Written by Gerard Harris

As Digital Group Account Director, Gerard is responsible for setting the strategic direction of our agency's SEO work. His day-to-day work involves managing his SEO team and maintaining client relationships to make sure we are meeting and exceeding targets.
