The power of links – non-indexed pages outranking optimised ones + a robots.txt flaw.
Tuesday, 10 April 2007
My friend Mike Grehan often talks in his linking presentations about how a page that isn't even in Google can outrank one that has actually been indexed and optimised. Well, here's a good example of exactly that. The links in positions 2 and 3 are on Google's own website and explicitly blocked by its robots.txt file. Notice there is no cache or description snippet; that's because the pages themselves have never been crawled (not indexed, not in Google). All Google has to go on is the fact that someone else linked to the page and the text they used.
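If you want to check the blocking for yourself, here's a minimal sketch using Python's standard-library robots.txt parser (the exact path tested below is just an illustration; any disallowed path behaves the same way):

    # Minimal sketch: confirming a URL is disallowed by robots.txt
    # using Python's standard-library parser.
    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.set_url("http://www.google.com/robots.txt")
    rp.read()

    # A compliant crawler never fetches a disallowed path, so the engine
    # has no title or snippet of its own -- only inbound anchor text.
    print(rp.can_fetch("*", "http://www.google.com/search?q=example"))  # False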
The anomaly this highlights, though, is that when a page links to another that is blocked by robots.txt, Google opts to display the link text from the linking page as the result title. When you think about it, that is actually quite a serious flaw in their treatment of the robots.txt protocol.
It means Google has an exploitable loophole that allows one website to control how another website is represented within the algorithmic results page, effectively owning the title space of another website's results, for content that website did not want included in the index. Google is giving priority to the linker's decision to link over the content owner's wish to be excluded. In a worst-case scenario this could be defamatory or libellous text, and it would appear to be coming from the website being linked to. In the examples below, the title text comes from links on a review page of UK search engine marketing agencies, http://www.sci7.com/cms/62/uk-google-qualified-professionals.html. Supposedly we are one of the trendy ones, so it's definitely relevant to the query ;-) but there is potential that this would not always be the case.
It's my opinion that Google would be much better off displaying only the URL, and not third-party link text, in the results page.
Great example Teddie. This also happens a lot with affiliate links: with enough links they outrank the non-parameter versions of landing pages, even when those are blocked via robots.txt or the meta robots tag.
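For anyone unfamiliar, the two blocking mechanisms I mean look roughly like this (the /landing path and aff parameter are made up for illustration):

    # robots.txt -- a Disallow rule is a prefix match, so this blocks
    # crawling of any URL beginning with the parameterised path:
    User-agent: *
    Disallow: /landing?aff=

    <!-- meta robots -- the page-level alternative: the crawl is allowed
         but the engine is asked not to index the page -->
    <meta name="robots" content="noindex">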
Posted by: algoholic | Thursday, 12 April 2007 at 08:49 AM
If I had a large affiliate network that was causing that particular problem, I'd be more inclined to set a server-side trigger in my pages, activated by the presence of the affiliate tracking ID, that registered the affiliate, dropped the cookie and then forced a 301 permanent redirect to the page I wanted to rank.
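Roughly like this minimal WSGI sketch (the aff parameter, cookie name and canonical URL are all placeholders, not a real implementation):

    # Minimal sketch of the server-side trigger: spot the affiliate ID,
    # set the tracking cookie, then 301 to the canonical landing page.
    from urllib.parse import parse_qs
    from wsgiref.simple_server import make_server

    CANONICAL = "http://www.example.com/landing-page"  # the page you want to rank

    def app(environ, start_response):
        params = parse_qs(environ.get("QUERY_STRING", ""))
        affiliate_id = params.get("aff", [None])[0]
        if affiliate_id:
            # Record the referral here (database write omitted), then
            # redirect so link equity consolidates on one URL.
            headers = [
                ("Location", CANONICAL),
                ("Set-Cookie", f"aff={affiliate_id}; Max-Age=2592000; Path=/"),
            ]
            start_response("301 Moved Permanently", headers)
            return [b""]
        start_response("200 OK", [("Content-Type", "text/html")])
        return [b"<h1>Landing page</h1>"]

    if __name__ == "__main__":
        make_server("", 8000, app).serve_forever()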
More sales + the desired rankings + more control of the brand. Problem fixed :-)
Posted by: Teddie | Sunday, 15 April 2007 at 09:01 PM
Someone needs to come in and change the whole way it works; that would shake things up!!
Lucy
Posted by: Spanish Property | Tuesday, 09 December 2008 at 05:07 PM
I'm so amazed things on Google are still this way. It was set up to help small companies outrank big companies, but like everything, you can pay to get ahead of the rest. Great article.
Posted by: Properties in Spain | Monday, 14 January 2013 at 02:03 PM