Don’t Block Google From Spidering Site Files

Google has recently made some nice improvements in how and what it can spider to get a fuller picture of your website. With this technology upgrade, its robot no longer sees the web as little more than a text version of each page; it now sees pages almost the way a browser renders them.

As a result, Google is letting webmasters know not to block spider access to CSS files, JavaScript, and image files. Read the full Google release on this subject.

Personally, I think Google is also looking at CSS for hidden text and other spammy, black hat uses, but they are couching this “enhancement” as a way to provide “optimal indexing” of your website.

It has been common practice for webmasters to block search engine spiders from certain sections of their website using Disallow rules in the robots.txt file at the root of the hosting server. Google now clearly wants to “see it all” and is instructing webmasters not to block its access.
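
For illustration only, these are the kinds of rules Google is now asking webmasters to take out of robots.txt; the folder names here are hypothetical, not taken from any particular site:

# Old-style rules that hide page assets from Googlebot
# Google now advises removing rules like these
User-agent: *
Disallow: /css/
Disallow: /js/
Disallow: /images/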

There is still a place for blocking search engine robots with the robots.txt file in this fashion:

User-agent: *
Disallow: /folder-name/

One such case may be your draft file folder. If you work with a team and create page-change drafts, you may want to block those working files and old files so they are not indexed in error.
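
As a sketch of how the two ideas fit together, the example below blocks a hypothetical /drafts/ working folder while using the Allow directive and the wildcard patterns Googlebot supports to spell out that style sheets, scripts, and images elsewhere on the site stay crawlable for rendering. The folder name and file patterns are illustrative only; adjust them to your own site structure:

# Keep working drafts out of the index (folder name is an example)
User-agent: *
Disallow: /drafts/

# Googlebot-specific group: make it explicit that page assets stay open
User-agent: Googlebot
Disallow: /drafts/
Allow: /*.css$
Allow: /*.js$
Allow: /*.png$
Allow: /*.jpg$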