Check to See If Someone Has Snatched Your Blog Content

A reader of my other blog, Web-World Watch, left this link, http://www.copyscape.com/, on a post about Google dinging sites for showing duplicate content.

I entered my own blog address in this tool and found sites that had snatched my blog content verbatim without supplying a link back or even identifying me as the author. In fact, they had passed the content off as their own and had picked some of my highest-traffic posts!

I have notified them of copyright infringement! You should check your own content to see if you have a similar problem. If you are like me, you don’t mind if others quote you, show one or two paragraphs of your post with a link back to the full content, or contact you for approval. But to simply snatch content, provide no link back, and pass it off as their own intellectual property? Very bad form!

The duplicate content issue that Google is targeting in one of its most recent patent disclosures is exactly this case: who should get the credit for duplicate content? Google is developing a way to identify the author of content in a case just like this. I would imagine that this will revolve around the initial post date recorded by the web server, combined with how well the content matches the other content and writing style on the site. Eventually, I expect to see a trust certification that site owners can embed on their pages to tag their content for Google.

In the meantime, if you are scraping someone else’s content from their blog, please stop! It’s time to create your own. And if you aren’t scraping, check Copyscape.com to see whether someone is scraping you.

Check to See If Your Content Has Been Copied

In a previous post, I noted that Google is really cracking down on duplicate content. All site owners should clean up their sites so that duplicate pages, such as printer-friendly versions of pages, are blocked from spidering with the robots.txt file. This will prevent Google from dinging your site for duplicate content.
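
As a rough sketch of the robots.txt check, the Python snippet below uses the standard library's robots.txt parser to confirm that printer-friendly pages are blocked while regular posts stay crawlable. The /print/ folder, the sample paths, and the blocked_paths helper are assumptions made up for illustration; substitute whatever URL pattern your own printer-friendly pages use.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: assumes printer-friendly copies live under /print/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def blocked_paths(robots_txt, paths, user_agent="Googlebot"):
    """Return the subset of paths that the given robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    # Made-up example paths: one regular post and its printer-friendly copy.
    sample = ["/2007/05/my-post.html", "/print/2007/05/my-post.html"]
    print(blocked_paths(ROBOTS_TXT, sample))
```

Only the printer-friendly copy should show up in the result. If a regular post appears there too, the Disallow rule is too broad.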

I did get a comment from a reader that pointed to a site where you can check whether someone has snatched your content or duplicated what you have done. Click my post title to visit CopyScape.com.

When I ran my own site through the tool, I found another site that had scraped several blog posts verbatim from my site and passed the content off as its own. Hmm, that’s a copyright violation. I have notified the site! I do not mind if you mention my content or show one or two paragraphs, but you must link back to the full article on my site. To simply snatch my content and say it is your intellectual property is wrong.
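
If you already have a suspect page in hand, a quick way to tell a verbatim copy from a short quote is to compare overlapping word sequences. The Python sketch below is only an illustration of that idea. It is not how Copyscape or Google works, and the function names and the five-word window are assumptions chosen for the example.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences (shingles) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one window: treat the whole thing as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect text."""
    mine = shingles(original, size)
    if not mine:
        return 0.0
    return len(mine & shingles(suspect, size)) / len(mine)


if __name__ == "__main__":
    post = "Google is really cracking down on duplicate content across the web"
    print(overlap(post, post))  # a verbatim copy scores 1.0
```

A score near 1.0 suggests the whole post was lifted, while a fair quote of a paragraph or two from a long post should score much lower.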

This is what the Google duplicate content algorithm change is all about: identifying the legitimate owner and blocking from the index the other sites that show this content. In some cases, Google is identifying the rightful owner by post date and by authority. I believe that in the next year, or even in the months to come, we will see a digital authority head tag, tied to domains, that Google will pick up to verify the site owner.

In the meantime, watch your site for duplicate content and check to see who has scraped your content. If you have scraped my content, please remove it or give me credit with a link back to my site.

An Interesting Article on the Supplemental Index

Click our post title to read this interesting article on how to keep your blog out of Google’s Supplemental Index. The writer offers an interesting tip on how to update your .htaccess file to turn all URLs into www’s. However, you can only consider doing this if you publish by FTP to a server you control, an option on many different platforms. If your blog is hosted at Blogspot, you don’t have access to the server.
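
To make the www tip concrete, here is a small Python sketch of the mapping such a redirect would perform: every non-www address gets exactly one www equivalent. It is not the article's .htaccess rule, which is not reproduced here. The canonical_www function and the example.com addresses are hypothetical, and the sketch assumes your site lives on a bare domain rather than another subdomain.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_www(url):
    """Return the www form of url, or None if the host already starts with www."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host or host.startswith("www."):
        return None  # nothing to redirect
    netloc = "www." + host
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    for address in ("http://example.com/2007/05/post.html",
                    "http://www.example.com/2007/05/post.html"):
        print(address, "->", canonical_www(address))
```

You could run a list of your own inbound links through a helper like this to see which addresses would be redirected once the server rule is in place.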

Google Releases New Webmaster Guidelines

Click our post title to see a clickable list of the new items that Google has released today in its Webmaster Guidelines section.

Of note is the mention of WebPosition Gold. Using this product now violates Google’s terms of service. Google has also really spelled out details for duplicate content, affiliate sites, printer-friendly page versions, loading keywords in link title tags and image alt tags, and many other important issues. If you are in the business, you really need to take a few minutes to make sure that you are up to date.

One thing I would like to point out, because Google will clearly be looking for it, is the use of false anchor tags in text. This technique may get your site banned from Google, as it really falls into the category of hidden text. I have seen it used on several sites that I webmaster that were optimized by other firms. The anchor goes nowhere but allows keyword stuffing in the source code. Another trick that is sure to give you problems is to include, in HTML comments, bogus links to websites whose domain names contain your keywords. I don’t mean one link or two; I mean something like 50 links in one commented section.
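
If you want to spot-check a page for these two tricks yourself, a rough audit can be sketched in Python with the standard library's HTML parser. The TrickAudit class, the audit helper, and the rule for what counts as a "dead" anchor are assumptions for illustration only. This is not how Google detects hidden text.

```python
import re
from html.parser import HTMLParser


class TrickAudit(HTMLParser):
    """Count anchors that go nowhere and links tucked inside HTML comments."""

    def __init__(self):
        super().__init__()
        self.dead_anchors = 0
        self.commented_links = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = (attributes.get("href") or "").strip()
        # Anchors with a name or id are legitimate in-page targets, so skip them.
        is_named_target = "name" in attributes or "id" in attributes
        if href in ("", "#") and not is_named_target:
            self.dead_anchors += 1

    def handle_comment(self, data):
        # Each href= inside a comment is counted as one commented-out link.
        self.commented_links += len(re.findall(r"href\s*=", data, re.I))


def audit(html):
    """Return (dead_anchors, commented_links) for one page of HTML source."""
    parser = TrickAudit()
    parser.feed(html)
    parser.close()
    return parser.dead_anchors, parser.commented_links


if __name__ == "__main__":
    page = ('<p><a href="#">cheap widgets</a> <a href="/about.html">About</a></p>'
            '<!-- <a href="http://a.example/">x</a> <a href="http://b.example/">y</a> -->')
    print(audit(page))
```

A handful of dead anchors may be harmless leftovers, but dozens of commented-out links on one page is the pattern described above and worth raising with whoever optimized the site.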

Be careful. Make sure that your site is being webmastered by a reputable firm to ensure that these tricks, which Google has clearly identified and is targeting for removal from its index, are not being used on your website.