A reader at my other blog, Web-World Watch, left this link, http://www.copyscape.com/, on a post about Google dinging sites for showing duplicate content.
I entered my own blog address in this tool and found sites that had snatched my blog content verbatim without supplying a link back or even identifying me as the author. In fact, they had passed the content off as their own, and had picked some of my highest-traffic posts!
I have notified them of copyright infringement! You should check your own content to see if you have a similar problem. If you are like me, you don’t mind if others quote you, show one or two paragraphs of your post and link back to the full content, or contact you for approval. But to simply snatch content, provide no link back, and pass it off as their own intellectual property? Very bad form!
This is exactly the duplicate content issue that Google is targeting in one of its most recent patent disclosures. Who should get the credit for duplicate content? Google is developing a way to identify the author of content in a case just like this. I would imagine that this will revolve around the initial post date recorded by the web server, combined with how well the content matches the other content and writing style on the site. Eventually I expect to see a trust certification that site owners can embed on their pages to tag their content for Google.
In the meantime, if you are scraping someone else’s content from their blog, please stop! It’s time to create your own. And if you aren’t scraping, check Copyscape.com to see whether someone is scraping you.
In a previous post, I noted that Google is really cracking down on duplicate content. All site owners should clean up their sites so that duplicate pages, such as printer-friendly versions of pages, are blocked from spidering using the robots.txt file. This should help keep Google from dinging your site for duplicate content.
I did get a comment from a reader who pointed to a site where you can check whether someone has snatched your content or duplicated what you have done. Click my post title to visit CopyScape.com.
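If you want to confirm that your robots.txt really does block the printer-friendly copies, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The `/print/` directory and the `yoursite.com` URLs are assumptions for illustration only; substitute wherever your own site keeps its printer-friendly pages.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules. The /print/ directory is an assumed
# location for printer-friendly pages, not something from the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows `url` for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.yoursite.com/print/article-1.html",  # duplicate copy
        "http://www.yoursite.com/article-1.html",        # main page
    ):
        print(url, "blocked" if is_blocked(ROBOTS_TXT, url) else "allowed")
```

The idea is simply that the duplicate copy should come back blocked while the main page stays crawlable.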
When I ran my own site through the tool, I found another site that had scraped several blog posts verbatim from my site and passed the content off as theirs. Hmm, that’s a copyright violation. I have notified the sites! I do not mind if you mention my content or show one or two paragraphs, but you must link back to the full article on my site. To simply snatch my content and say it is your intellectual property is wrong.
This is what the Google duplicate content algorithm change is all about: identifying the legitimate owner and blocking from the index other sites that show the same content. In some cases Google is identifying the rightful owner by the post date and by authority. I believe that in the coming year, or even the coming months, we will see a digital authority head tag tied to domains that Google will pick up to verify the site owner.
In the meantime, watch your site for duplicate content, check to see who has scraped your content, and if you have scraped my content please remove it or link back to my site and give me credit with a link.
Click our post title to read this interesting article on how to keep your blog out of Google’s Supplemental Index. The writer offers an interesting tip on how to update your .htaccess file to redirect all URLs to their www versions. However, you can only consider doing this if you are using FTP blogging, which many different platforms offer. If your blog is hosted at Blogspot, you don’t have access to the server.
Click our post title to see a clickable list of the new items that Google has released today in its Webmaster Guidelines section.
Of note is the mention of WebPosition Gold. Using this product now violates Google’s terms of service. Google has also really spelled out details for duplicate content, affiliate sites, printer-friendly page versions, loading keywords in link title tags and image alt tags, and many other important issues. If you are in the business, you really need to take a few minutes and make sure that you are up to date.
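To make the www tip concrete, here is a rough Python sketch of the mapping such a redirect performs: any URL on the bare domain is rewritten to the same URL on the www host. This is not the .htaccess rule from the linked article, just an illustration of the logic. The function name and the example domain are my own assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_www_url(url: str) -> str:
    """Return the www form of `url`, i.e. the target a 301 redirect would use.

    URLs that already use a www host, or that have no host, are returned
    unchanged. Any user:password part of the URL is dropped in this sketch.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host or host.startswith("www."):
        return url
    netloc = "www." + host
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(canonical_www_url("http://yoursite.com/2007/05/my-post.html"))
```

The point of the redirect is that Google then sees one address per page instead of a www copy and a non-www copy of the same content.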
One thing I would like to point out, because Google will clearly be looking for it, is the use of false anchor tags in text. This technique may get your site banned from Google, as it really falls into the category of hidden text. I have seen it used on several sites that I webmaster that were optimized by other firms. The anchor goes nowhere but allows keyword stuffing in the source code. Another trick that is sure to give you problems is to include, inside HTML comments, bogus links to websites whose domain names contain your keywords. I don’t mean one link or two; I mean something like 50 links in one commented section.
Be careful. Make sure that your site is being webmastered by a reputable firm, so that these tricks, which Google has clearly identified and is targeting for removal from its index, are not being used on your website.
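If you inherit a site and want to look for these two tricks yourself, a small script can give you a first pass. The sketch below uses Python's standard `html.parser` module to count anchors that go nowhere (no `href`, an empty `href`, or just `#`) and links tucked inside HTML comments. It is only a heuristic of my own, not how Google detects anything, and the class and function names are made up for this example. Legitimate named anchors will also be counted, so review the results by hand.

```python
import re
from html.parser import HTMLParser


class StuffingAudit(HTMLParser):
    """Counts dead anchors and links hidden inside HTML comments."""

    def __init__(self):
        super().__init__()
        self.dead_anchors = 0
        self.commented_links = 0

    def handle_starttag(self, tag, attrs):
        # An <a> with no href, an empty href, or "#" leads nowhere.
        if tag == "a":
            href = (dict(attrs).get("href") or "").strip()
            if href in ("", "#"):
                self.dead_anchors += 1

    def handle_comment(self, data):
        # Count every href= that appears inside a comment block.
        self.commented_links += len(re.findall(r"href\s*=", data, flags=re.IGNORECASE))


def audit(html: str) -> tuple:
    """Return (dead_anchor_count, commented_link_count) for a page's HTML."""
    parser = StuffingAudit()
    parser.feed(html)
    parser.close()
    return parser.dead_anchors, parser.commented_links


if __name__ == "__main__":
    sample = (
        '<p><a href="#">cheap widgets</a> <a href="/real.html">ok</a></p>'
        '<!-- <a href="http://a.example/">x</a><a href="http://b.example/">y</a> -->'
    )
    print(audit(sample))
```

A page with dozens of commented-out links, like the 50-link case I mentioned, would stand out immediately in the second number.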
Just what is that? Find out if your site is included in Google’s supplemental index by running this Google search:
site:www.yoursite.com *** -sjpked
Of course, insert your correct domain name.
I have a few sites that we designed but did not provide content for (the client sent us content) that just have not placed well. I have really struggled to understand why those sites were not well placed, as we typically have great success in moving a client in or up. Well, now I have the answer and can help future clients. When a client simply snatches content from other sites and passes it off to you, the webmaster, as original content, they are cutting their own throat. The site will typically go right into Google’s supplemental index, as Google is too smart to simply let the duplicate content go into the main index.
There are times when content across the Web for a specific topic may be similar. Let’s say, for example, that you are a merchandise broker and you and others are promoting the same set of products to the world. This may happen in the franchise business or if you are an affiliate marketeer. The information may be similar. Take the time to make your site’s content interesting and to create unique features. Consider introducing a question-and-answer section on the product, top tips, or special reasons for consideration. Do not simply copy content and pass it off as your own. One, it is wrong to do this and can cause a copyright issue, and two, Google will catch you and your new site will end up in the supplemental index. Sometimes we, the webmasters, simply do not know where you have gotten your content. If you have copied it, please let us know so we can rework it (for an extra fee, unfortunately) and keep you out of the supplemental index.
Once you are in the supplemental index, it is hard to move out. Some webmasters recommend abandoning the page URL completely and re-creating the content under a new URL. This requires a full revamping of content and typically a rework of the navigation, plus 301 redirects. Ka-ching. Why not help your webmaster and web designer by being up front initially and investing in unique content? The remediation to repair a problem can be expensive!
Google is really cracking down on duplicate content. So for my part, I will have a more frank talk about this specific topic with all new clients. If you are a site owner, please don’t take the easy way out by snatching content, changing a few words, and passing it off to your web designer as your own. Invest time in making your site unique by creating interesting content that is not a match for what is already out there.
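To see why changing a few words does not make copied text unique, here is a small Python sketch that compares two passages by their overlapping three-word runs (often called shingles). This is a generic similarity measure I am using for illustration. It is not Google's or Copyscape's actual method, and the sample sentences and function names are invented.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of consecutive `size`-word runs in `text` (lowercased)."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    original = ("our widgets are built by hand in small batches "
                "and shipped within two days of your order")
    # Same sentence with a single word swapped.
    tweaked = original.replace("built", "made")
    print(round(overlap(original, tweaked), 2))
```

Swapping one word still leaves most of the three-word runs identical, so the two passages score as largely the same text. Genuinely rewritten content shares few or none.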
We offer inexpensive content creation charges when we design your website. Invest in your placement on the Web with content that will win with Google, bring you new customers, and keep you out of the supplemental index!