PageRank Is Dead: Long Live The Knowledge Graph

That title echoes a more frequently heard saying: "The King is dead. Long live the King." This is a traditional proclamation made following the accession of a new monarch in various countries, such as the United Kingdom.

This post has been in the making for some time, since Google PageRank has been dead for a while.  However, only today can we proclaim that Google may have a new monarch to replace the dead king.  Perhaps a little explanation of all this is appropriate.

PageRank Defined

The Google website still describes what it believes PageRank is:

PageRank is Google’s view of the importance of a webpage.  Web pages with a higher PageRank are more likely to appear at the top of Google search results.

That’s a somewhat circular definition.  However, back in 2007, Danny Sullivan offered a more precise explanation of PageRank:

PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page’s value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at considerably more than the sheer volume of votes, or links a page receives; for example, it also analyzes the page that casts the vote. Votes cast by pages that are themselves “important” weigh more heavily and help to make other pages “important.” Using these and other factors, Google provides its views on pages’ relative importance.
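The "votes weighted by the importance of the voter" idea can be made concrete with a small sketch. This is only an illustration of the published PageRank iteration, not Google's production code; the function name, the toy link graph and the damping value of 0.85 are assumptions introduced here.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank sketch.

    links maps each page to the list of pages it links to.
    damping=0.85 is an assumed, commonly quoted value.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal importance

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: D and F each receive exactly one link, but D's
# link comes from B, which is itself linked to by two other pages.
toy_web = {"A": ["B"], "C": ["B"], "B": ["D"], "E": ["F"], "D": [], "F": []}
ranks = pagerank(toy_web)
d_outranks_f = ranks["D"] > ranks["F"]
```

In this toy graph D ends up with a higher score than F even though both have a single inbound link, because D's one vote is cast by a more "important" page.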

That seems straightforward but it really has created a veritable Pandora’s box of associated problems.

PageRank Was A Good Idea – If It Had Remained a Secret

It’s interesting that Danny Sullivan used that phrase, “the uniquely democratic nature of the web.”  If only it were so.  Democracy, as Google defines it, means “a system of government by the whole population or all the eligible members of a state, typically through elected representatives.”  In other words, one person, one vote.  It wasn’t a good description of the web in 2007 and it is even less so now.

As Google has grown into the dominant way to search for online information, everyone wants their website to rank first in keyword searches.  If backlinks (inbound links) to a web page signal its importance, then the game is to get as many such links as is humanly possible.  Quite naturally, it is no longer one person, one vote; what counts is the largest number of links that your computer system can generate.

PageRank As A Quality Guidelines Factor

Googlers still seem to accept as ‘gospel’ that PageRank can be a useful measure of the importance of a web page.  Indeed they seem to feel it is part of the quality of a web page.  That is increasingly not a view that is shared by the general population.  As a ‘for instance,’ Barry Schwartz expressed surprise that a Google AdWords Manager could be happy over PageRank:

Eric, a Google AdWords representative, said he noticed that in the May 2012 PageRank update, the PageRank score for the new AdWords forum had increased to a PageRank of 7. He wrote that this is "quite an achievement" for a "new AdWords Community from scratch."  (He seemed to be unaware that this is just mathematics if you are linked to from several pages.)

Google provides Webmaster Guidelines, which are “best practices to help Google find, crawl, and index your site.”  This includes what they call Quality guidelines.  However, this is not what it seems.  Just read the introductory paragraph.

These quality guidelines cover the most common forms of deceptive or manipulative behavior, but Google may respond negatively to other misleading practices not listed here (e.g. tricking users by registering misspellings of well-known websites). It’s not safe to assume that just because a specific deceptive technique isn’t included on this page, Google approves of it. Webmasters who spend their energies upholding the spirit of the basic principles will provide a much better user experience and subsequently enjoy better ranking than those who spend their time looking for loopholes they can exploit.

In summary, this is not at all about the Quality of a website.  Instead it is a heavy-handed FUD (fear, uncertainty and doubt) attempt to warn webmasters not to take actions that will invalidate the PageRank theory.  For some websites, many links may undermine their authority rather than building it up.

PageRank Cannot Be Fixed

There is an inescapable logic that Google cannot circumvent.

  1. PageRank assumes that more links imply greater authority.
  2. Google has widely publicized this and dominates the search process so its message is highly important.
  3. Everyone wants to create links to generate this greater authority.
  4. When they do this, more links no longer imply greater authority, so PageRank fails.
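The four steps above can be illustrated with a toy experiment. This is a sketch under stated assumptions, not a model of Google's real ranking: it uses a simplified PageRank iteration, an assumed damping value of 0.85, and hypothetical page names. One page earns two genuine links; another is pointed at by fifty auto-generated pages.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank sketch; links maps a page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            # Pages without outgoing links spread their vote over every page.
            targets = links.get(page) or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical web: two real fans link to the honest page,
# fifty machine-generated pages link to the spam page.
web = {"fan1": ["honest_page"], "fan2": ["honest_page"]}
for i in range(50):
    web[f"farm{i}"] = ["spam_page"]

ranks = pagerank(web)
spam_wins = ranks["spam_page"] > ranks["honest_page"]
```

In this setup the spam page comes out ahead purely because of link volume, which is the failure described in step 4: once links can be manufactured, the count of links stops measuring authority.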

That inescapable logic should mean that Google buries the concept of PageRank.  Instead Google has taken on the Herculean task (think manure and the Augean Stables) of

  • trying to get everyone to remove the auto-generated links, and
  • de-indexing links they believe have been generated purely for PageRank effect.

The effort and manpower (Matt Cutts, his team, et al.) involved in tackling this enormous task are staggering.

Google Search Algorithm Updates

The effect on the online community is also massive and causes incredible anguish.  Practices which were thought to be acceptable suddenly become completely abhorrent.  The pace of the effort is ramping up. 

If you were unaware of what is happening, you might wish to consult the SEOmoz Google Algorithm Change History.  This gives a complete history of updates going back to 2002.  The introduction describes the current frenetic pace.

Each year, Google changes its search algorithm up to 500 – 600 times. While most of these changes are minor, every few months Google rolls out a “major” algorithmic update that affects search results in significant ways.

Many will be aware of two of these that have excited much discussion: Panda and Penguin.  There have been other major updates that did not get named.

The most recent Search Algorithm Update Targets Web Spam.  Distinguished Google Engineer Matt Cutts, head of the web spam team, specifically noted the following:

Websites that are likely to lose rankings are those that practice keyword stuffing and sites that have “unusual linking patterns,” such as links from spun content with anchor text that is completely unrelated to the actual on-page content.

During the month, Google was sending warnings about “artificial” or “unnatural” links.  Some dubbed this the Unnatural Link Algorithm.

The goal of many of these updates is to separate out those websites that have completely "natural" links from those that have links that were generated to produce an elevated PageRank value.  You might describe this as Google’s attempt to separate out the sheep from the goats.

Whenever you attempt to make such a separation with such a woolly concept as "natural" links, then you are very likely to mis-identify: some sheep will be called goats and vice-versa.  In this Google case, the goats will be happy to be mislabeled as sheep but many mislabeled sheep will rightly feel that Google is treating them unfairly.

The surprising thing is that despite all this, the search results seem no better.  The black-hat SEO websites still find a way of ensuring their spam results feature in the keyword listings.

What Google Should Do About PageRank

PageRank by now is part of the Google brand and changing a brand is always heart-wrenching and involves risk.  However, PageRank is dead and all these efforts to keep it alive are doomed to failure.

flogging a dead horse

In my opinion, Google would be just as strong (or perhaps even stronger) without PageRank.  A state funeral would be in order.  Normally this would be a time of sadness. 

The reality is quite different.  Google has amassed a huge amount of information on what people are doing as they search for answers or information.  This allows them now to unveil a new beginning for Google Search, which does not have the inherent self-destructive quality that PageRank has shown.

Enter The Knowledge Graph

Google is incorporating all it knows about individuals and the search process into the Knowledge Graph.

Here is their summary description.

The Knowledge Graph enables you to search for things, people or places that Google knows about – landmarks, celebrities, cities, sports teams, buildings, geographical features, movies, celestial objects, works of art and more – and instantly get information that’s relevant to your query. This is a critical first step towards building the next generation of search, which taps into the collective intelligence of the web and understands the world a bit more like people do.

Google’s Knowledge Graph isn’t just rooted in public sources such as Freebase, Wikipedia and the CIA World Factbook. It’s also augmented at a much larger scale – because we’re focused on comprehensive breadth and depth. It currently contains more than 500 million objects, as well as more than 3.5 billion facts about and relationships between these different objects. And it’s tuned based on what people search for, and what we find out on the web.

It is based on a much more complete process that involves 3 steps:

  1. Find the right thing
  2. Get the best summary
  3. Go deeper and broader
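The three steps above can be pictured with a toy graph of objects and facts. This is a minimal sketch, not a description of how Google stores or queries the Knowledge Graph; the entity identifiers, relation names and helper functions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, relation, object) facts.
FACTS = [
    ("taj_mahal_monument", "name", "Taj Mahal"),
    ("taj_mahal_monument", "type", "monument"),
    ("taj_mahal_monument", "located_in", "agra"),
    ("taj_mahal_musician", "name", "Taj Mahal"),
    ("taj_mahal_musician", "type", "musician"),
    ("taj_mahal_musician", "genre", "blues"),
    ("agra", "name", "Agra"),
    ("agra", "type", "city"),
]


def build_index(facts):
    """Group facts by the entity they describe."""
    index = defaultdict(list)
    for subject, relation, obj in facts:
        index[subject].append((relation, obj))
    return index


def find_entities(index, query):
    """Step 1: find every distinct thing that the query text could mean."""
    return sorted(e for e, props in index.items() if ("name", query) in props)


def summary(index, entity):
    """Step 2: collect the facts stored directly on one entity."""
    return dict(index[entity])


def related(index, entity):
    """Step 3: follow relationships to other entities in the graph."""
    return sorted(obj for _, obj in index[entity] if obj in index)


index = build_index(FACTS)
candidates = find_entities(index, "Taj Mahal")
monument = summary(index, "taj_mahal_monument")
neighbours = related(index, "taj_mahal_monument")
```

Here the search string "Taj Mahal" resolves to two separate things, a monument and a musician, each with its own summary, and the monument's relationships lead on to the city of Agra. The point of the sketch is that ranking depends on stored facts about things rather than on counting links between pages.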

In some ways the Knowledge Graph is akin to the synaptic circuits that function in our brains and link together a myriad of facts and concepts.  It has affinities with the Mind Maps that some of us use to capture the diverse thoughts we may have around a given subject.  In other words it’s strengthening the thinking processes that we all use to develop ideas and solutions.  This is an exciting time for Google and for us all.

So let us not waste too much time, energy and emotion in the funeral process for PageRank.  Instead let us exult in the birth of a new Google search process that will undoubtedly stand the test of time.

Credit:  Image courtesy of Micky Aldridge via Flickr

