Google Focuses on Ridding Their Index of Duplicate Content

In a recent blog post, Matt Cutts, a Google engineer and high-profile personality in my industry, stated:

“My post mentioned that ‘we’re evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others’ content and sites with low levels of original content.’ That change was approved at our weekly quality launch meeting last Thursday and launched earlier this week.” Read the full article.

That’s big news for legitimate businesses like mine, whose content is commonly scraped or stolen and used on other websites without approval. Several years ago Google introduced a patent that had identification of duplicate content as its core technology. I wondered at that time how Google was going to determine who the original content owner was, and whether a new meta tag would be introduced that allowed us to tag ourselves as owners of original content.

As of today, no new meta tag has surfaced, but we do advocate one best practice to our clients: post the content on your own website first.

What Matt Cutts of Google is addressing here is not duplicate content on article networks, but rather what we in my industry call scrapers. Scrapers are robot tools, or disreputable website owners, that simply steal your content and place it on their own websites.

In many cases these scrapers are really stealing content to beef up AdSense advertisement websites. That means they are not really in competition with you and your services; they hope to make money when someone who visits their site to read your stolen content clicks an ad.

Thank goodness Google is finally addressing this issue. I for one believe that my site will benefit from this action.

What is Google Base?

Setting up a Google Base account is the first step you must take to have your items for sale shown in Google Products. What exactly is Google Base? Let’s explore that a bit in easy terms.

First, here is the link to set up your Google Base account. When you load your products, which you will do in a very special syntax, they will then have the opportunity to be shown on Google Products pages as well as in the organic results under a listing called Shopping.

Sometimes a pay per click customer will see a competitor’s item shown with an image in the organic results and think that this is an AdWords vehicle, and that if they set up AdWords image ads, their product images will show in the organic results of Google.com. This is not the case. Google Base is a unique, free search vehicle that is not associated with participation in Google AdWords, although Google AdWords ads may show on the Google Products shopping page, but only as text ads.

You can submit your product listing to Google Base as either a tab-delimited text spreadsheet or an XML data feed. I personally like the XML data feed. For more information on the specifics of how to do this and the format, please visit this Google Base information page on data feeds.
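To give a feel for the tab-delimited option, here is a minimal Python sketch that writes a product list as tab-separated text with a header row. The column names and the sample product are assumptions made up for illustration; the attribute names Google actually expects are defined on its data feed information page, not here.

```python
import csv
import io

# Hypothetical column names, chosen only for illustration. The real
# attribute names come from Google's data feed documentation.
FIELDS = ["id", "title", "description", "link", "image_link", "price"]


def build_feed(products):
    """Return the products as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(products)
    return buffer.getvalue()


# A made-up product used only to exercise the function.
products = [
    {
        "id": "sku-001",
        "title": "Blue Widget",
        "description": "A sample widget",
        "link": "http://www.example.com/widgets/blue",
        "image_link": "http://www.example.com/img/blue.jpg",
        "price": "19.99",
    },
]

feed_text = build_feed(products)
```

Using the csv module rather than joining strings by hand means a stray tab or quote inside a product description gets quoted instead of silently breaking the column layout.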

Once your feed is created, you will load the feed using FTP to a special location that Google provides. After that, it is up to Google whether it will show your products, images, pricing and information when they match a search.
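The FTP step can be scripted as well. The sketch below uses Python's standard ftplib module; the host, user name, password and remote file name are all placeholders you would replace with the details Google provides, and nothing here is specific to Google's servers.

```python
import io
from ftplib import FTP


def upload_feed(ftp, feed_text, remote_name):
    """Send feed text over an already logged-in FTP connection."""
    payload = io.BytesIO(feed_text.encode("utf-8"))
    # STOR asks the server to store the uploaded bytes under remote_name.
    return ftp.storbinary(f"STOR {remote_name}", payload)


def connect_and_upload(host, user, password, feed_text, remote_name):
    """Open a connection, log in, upload the feed, then disconnect.

    host, user and password are placeholders for the account details
    you are given; they are not real values.
    """
    with FTP(host) as ftp:
        ftp.login(user, password)
        return upload_feed(ftp, feed_text, remote_name)
```

Keeping the upload separate from the connection makes it easy to try the upload logic against a stand-in object before pointing it at a real server.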

For e-commerce sellers Google Base provides another great way to showcase your products using Google.

The Canonical Issue for SEO Performance – Do You Need To Worry?

This month I am seeing client concern about the canonical link issue, and about how Google and other engines are indexing their websites, rise to almost a level of paranoia. Let me take a few moments to talk about this issue to help you understand more about it and what you can do to keep your website search engine friendly.

First, for those of you who don’t know what I mean, here is an example of how the way you link within your website is a factor in which URLs search engines should recognize in their index.

http://mccordweb.com or http://www.mccordweb.com – Note the first URL has no www and the second does. This is the whirlwind that surrounds the canonical topic.

Now, my site and the sites that I have designed have never had issues here, as I am nitpickingly consistent in how I link within sites: I always use the www. It is a standard that I never waver on. If your website or blog has used both link styles, you may have an issue. What is at stake is which page the search engine should index. As more engines, and particularly Google, want to properly index your site and remove peripheral pages, you want to help Google know which page link style to use.
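If you want to check your own links for this kind of consistency, the idea can be sketched in a few lines of Python. This is only an illustration: it assumes the www form is the one you have chosen, and the two host constants are taken from the example above rather than from any tool Google provides.

```python
from urllib.parse import urlsplit, urlunsplit

# The preferred (www) host and its bare form, from the example above.
PREFERRED_HOST = "www.mccordweb.com"
BARE_HOST = "mccordweb.com"


def to_preferred_host(url):
    """Rewrite a bare-domain URL to the www form; leave other URLs alone."""
    parts = urlsplit(url)
    if parts.netloc.lower() == BARE_HOST:
        parts = parts._replace(netloc=PREFERRED_HOST)
    return urlunsplit(parts)
```

Running every internal link in a template through a helper like this is one way to make sure only a single link style ever appears on the site.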

To understand more on this topic you can watch the video on the new canonical link element from Google Engineer Matt Cutts. If you do not know who Matt Cutts is, he is a star in the SEO industry and the Google mouthpiece on technology and Google indexing to the professional community. We all hang on what he says!

The bottom line is that the big three engines are now recognizing a new head tag link element in the source code that allows you to control the page that Google will index. Here is the syntax:

<link rel="canonical" href="http://www.mccordweb.com/weblogs/index.php" />

This relates to the home page of my blog. The element appears in the head section of your page’s source code. If you are using header and footer includes, you can even create code in your template to pull in the preferred canonical link for each page.

For many sites this is really not a make or break issue and will not significantly impact organic placement. For sites whose URLs contain session information and other dynamic elements, however, you can now tell the search engines to index your SEO friendly URL instead of the URLs that contain session information. Now that has importance! Especially if your URL has been well crafted to contain a product name, model number or SEO keywords.
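For a dynamic site, the template code could work along these lines. This Python sketch strips session parameters from a URL and builds the canonical link element from what is left. The parameter names in SESSION_PARAMS are assumptions for illustration only; your own platform may use different ones.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session parameter names; adjust to match your own site.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}


def canonical_url(url):
    """Drop session parameters and any fragment, keeping everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept), fragment=""))


def canonical_link_tag(url):
    """Build the link element for the head section of the page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'
```

The same clean URL is produced no matter which session value a visitor arrives with, so every variation of the page points the engines back to one address.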

For normal websites you simply do not need to get spun up on the topic. You can now add the link element in your head tag, and you have always been able to tell Google which canonical link to use through the Google Webmaster control panel. For dynamic sites, though, this is important news for optimization, and it could potentially be used as just another tool to get better organic placement.