Search engines like Google have a problem. It’s called ‘duplicate content’. Duplicate content means that the same or very similar content appears at multiple locations (URLs) on the web. As a result, search engines don’t know which URL to show in the search results. This can hurt the ranking of a webpage, and the problem gets bigger when people start linking to all the different versions of the content. This article will help you understand the different causes of duplicate content and find the solution for each of them.
You can compare it to standing at a crossroads where the road signs point in two different directions for the same final destination: which road should you take? And to make it ‘worse’, the final destination is different too, but only ever so slightly. As a reader, you don’t mind: you get the content you came for. But a search engine has to pick which one to show in the search results, as it doesn’t want to show the same content twice.
Let’s say your article about ‘keyword x’ appears on http://www.example.com/keyword-x/ and the exact same content also appears on http://www.example.com/article-category/keyword-x/. This situation is not so fictitious: it happens in lots of modern Content Management Systems. Your article has been picked up by several bloggers, and some of them link to the first URL while others link to the second. This is when the search engine’s problem shows its real nature: it’s your problem. This duplicate content is your problem because those links promote two different URLs instead of one. If they were all linking to the same URL, your chance of ranking in the top 10 for ‘keyword x’ would be much higher.
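To make the split concrete, here is a minimal Python sketch of the scenario above. It is an illustration only: the list of inbound links and the `CANONICAL` mapping are made up for this example, and counting links is a big simplification of how a search engine weighs a page. It simply shows how the same five links look when they are spread over two URLs, and how they look when every duplicate is counted towards one preferred URL.

```python
from collections import Counter

# Hypothetical inbound links: each entry is the URL one blogger linked to.
inbound_links = [
    "http://www.example.com/keyword-x/",
    "http://www.example.com/article-category/keyword-x/",
    "http://www.example.com/keyword-x/",
    "http://www.example.com/article-category/keyword-x/",
    "http://www.example.com/keyword-x/",
]

# Hypothetical mapping from a duplicate URL to the URL you prefer to rank.
CANONICAL = {
    "http://www.example.com/article-category/keyword-x/":
        "http://www.example.com/keyword-x/",
}


def count_links(links):
    """Count inbound links per URL exactly as they were made."""
    return Counter(links)


def count_links_consolidated(links, canonical_map):
    """Count inbound links after mapping each duplicate to its preferred URL."""
    return Counter(canonical_map.get(url, url) for url in links)


split = count_links(inbound_links)
merged = count_links_consolidated(inbound_links, CANONICAL)

print("Split across duplicates:", dict(split))
print("Consolidated on one URL:", dict(merged))
```

In the split count, neither URL gets credit for all five links. In the consolidated count, a single URL does, which is the situation you want to end up in.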