Yesterday we made a discovery regarding one of Google's advanced search parameters. When you exclude a keyword from a Google search by putting a minus sign in front of it (e.g. -keyword), you are not only excluding results that contain that word in their on-page copy: we now believe you are also excluding any webpage that has at least one in-bound link pointing to it with that word in the anchor text.
Liberty team member Steve made this discovery when he responded to a tweet by Rand Fishkin, founder of SEOmoz. Rand had seen the "Weirdest SERP" (Search Engine Results Page) for the search query ["johnnie walker" -johnny] and couldn't figure out why the third result was ranking (here is a screenshot that Rand posted). The third result was an Edinburgh web design company that didn't have the words "Johnnie" or "Walker" anywhere in the text on its homepage, let alone the full phrase, yet it was ranking for that query. Steve examined the site's backlink profile and found that quite a few in-bound links pointed to it with "Johnnie Walker" in the anchor text. It turns out that Johnnie Walker is a member of staff at the company.
Steve offered this explanation to Rand. While Rand agreed that the "anchor text likely helps" with the site's high ranking, he believed that it "just doesn't seem like it would be enough for such a tough-to-rank SERP."
This was a fair enough point. A typical search for ["johnnie walker"] in the US would show Johnnie Walker whisky's official website, Wikipedia pages for the whisky brand and for the BBC Radio DJ, and other pages relating to the two. While some of them have both spellings, "Johnnie" and "Johnny", on the page, others only have the former. So why are all of them removed when the word "Johnny" is excluded from the search query?
The answer lies in the anchor text of in-bound links pointing to the pages. Steve's next step was to look at the backlinks of the official site's homepage, the relevant Wikipedia pages and other sites that ranked normally for ["johnnie walker"] (i.e. without excluding the keyword "Johnny"). He discovered that other sites were linking to those pages - perhaps naturally - with the incorrect spelling of the whisky and the radio DJ: "Johnny Walker." Regardless of whether or not the word "Johnny" appeared within the text on the page, if the word appeared in the anchor text of at least one in-bound link to the page, Google would remove that page from the results when "-johnny" was included in the search query. Therefore, sites such as the Edinburgh web design company were ranking for the ["johnnie walker" -johnny] search because the word "Johnny" appeared in neither place: not in the on-page text, and not in the anchor text of any in-bound link pointing to the page.
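To make the behaviour we think we are seeing concrete, here is a minimal Python sketch of it. This is only a toy model of our hypothesis, not Google's actual implementation: the Page class, the function names, the example URLs and the sample anchor texts are all invented for illustration. It assumes a page can match a phrase through either its on-page text or the anchor text of an in-bound link, and that an excluded word removes the page if it appears in either place.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Page:
    """Hypothetical stand-in for an indexed page (not a real Google structure)."""
    url: str
    body: str                                   # on-page copy
    inbound_anchors: List[str] = field(default_factory=list)  # anchor text of links pointing here


def words(text: str) -> set:
    """Lower-cased set of the words in a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches_phrase(page: Page, phrase: str) -> bool:
    """Assumption: a phrase can match via on-page text OR in-bound anchor text."""
    phrase = phrase.lower()
    return phrase in page.body.lower() or any(
        phrase in anchor.lower() for anchor in page.inbound_anchors
    )


def is_excluded(page: Page, term: str) -> bool:
    """Our hypothesis: -term removes a page if the word is on the page
    OR in the anchor text of at least one in-bound link."""
    term = term.lower()
    return term in words(page.body) or any(
        term in words(anchor) for anchor in page.inbound_anchors
    )


def search(pages: List[Page], phrase: str, excluded: Optional[str] = None) -> List[str]:
    """Return the URLs that match the phrase and survive the exclusion."""
    return [
        p.url
        for p in pages
        if matches_phrase(p, phrase)
        and not (excluded and is_excluded(p, excluded))
    ]


# Invented sample data, loosely mirroring the situation described above.
pages = [
    Page("https://whisky.example/", "Johnnie Walker blended Scotch whisky",
         ["Johnnie Walker", "Johnny Walker"]),          # misspelt in-bound anchor
    Page("https://agency.example/", "Web design in Edinburgh",
         ["Johnnie Walker"]),                           # name appears only in anchors
    Page("https://fanpage.example/", "Johnny or Johnnie Walker? A spelling guide",
         ["Johnnie Walker fans"]),                      # "Johnny" is on the page itself
]

print(search(pages, "johnnie walker"))            # all three pages match
print(search(pages, "johnnie walker", "johnny"))  # only the agency page survives
```

In this toy data the whisky page is dropped purely because of one misspelt in-bound anchor, while the agency page survives because "Johnny" appears neither on the page nor in any anchor, which is the pattern Steve observed.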
What is interesting to note is that in its Web Search Help section, Google explains how the keyword exclusion parameter works and claims that:
"Attaching a minus sign immediately before a word indicates that you do not want pages that contain this word to appear in your results." (Our emphasis)
Perhaps this statement needs updating - it is potentially misleading, as "pages that contain this word" suggests that the word must be present somewhere on the page itself, whether in the copy or in its metadata. Obviously, the anchor text of an in-bound link is not actually contained within the page - it appears elsewhere on the Web.
So what does this all mean for businesses? Is there any way to take advantage of this discovery? Sadly, not that we can currently think of. Only a small minority of people use minus keywords when conducting searches - those who do are often SEOs, perhaps conducting research and using advanced search parameters to make deductions. Even so, it is still something to be aware of, as it could affect the results and conclusions of that research. In that case, it can affect a business via its in-house staff or the online marketing agency that it employs.