The Google Tango

What image does the Google Tango call to mind?  Perhaps it is of Google co-founder Sergey Brin ordering three electric Tango vehicles.  Brin and others have been heavily into electric cars recently. Brin is invested in Tesla and has ordered three Tangos from Commuter Cars, the manufacturer of the Tango (all the luxury T600 model, which cost $148,000 each).

Courtesy of Camille Cusumano

What we had in mind was the other Tango.  For those who are not into ballroom dancing, that’s the evocative South American dance with the rhythm, Slow, Slow, Quick, Quick, Slow.  That seemed an appropriate description of Google’s speed of action on a variety of operations.  Of course Google prides itself on delivering search results on complex keyword searches in a fraction of a second. 

Google can also react fast to signals that are sent directly to it.  This means that for blogs, indexing of blog posts can be very fast given that RSS news feeds provide an immediate signal when new posts have been added.

That is a process that Google finds very effective.  That is why Google is pushing for a new system that will allow the Google Index to Go Real Time.

Google is developing a system that will enable web publishers of any size to automatically submit new content to Google for indexing within seconds of that content being published. The PubSubHubbub (PuSH) real-time syndication protocol could be used by Google for indexing the web instead of crawling the links.  PuSH is a syndication system based on the Atom format whereby a publisher tells the world about a Hub that it will notify every time new content is published. Google would ask every website to declare which Hub it pushes to at the top of each document.
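For readers who want to see what that declaration and notification look like, here is a minimal sketch in Python. It assumes the PubSubHubbub convention of a link with rel="hub" in the Atom feed and a form-encoded publish ping carrying hub.mode and hub.url. The feed, the URLs and the function names are made up for illustration, and nothing here reflects how Google itself would implement indexing.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed: the publisher declares its Hub near the top of the document.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <link rel="hub" href="https://hub.example.com/"/>
  <link rel="self" href="https://blog.example.com/feed.atom"/>
  <entry><title>New post</title></entry>
</feed>"""


def find_hubs(atom_xml):
    """Return the href of every top-level <link rel="hub"> in an Atom feed."""
    root = ET.fromstring(atom_xml)
    return [
        link.get("href")
        for link in root.findall(ATOM_NS + "link")
        if link.get("rel") == "hub" and link.get("href")
    ]


def build_publish_ping(topic_url):
    """Build the form-encoded body a publisher would POST to its Hub
    to say that the feed at topic_url has new content."""
    return urlencode({"hub.mode": "publish", "hub.url": topic_url})


if __name__ == "__main__":
    print(find_hubs(SAMPLE_FEED))
    print(build_publish_ping("https://blog.example.com/feed.atom"))
```

In a real setup the publisher would send that body to the Hub as an HTTP POST with the content type application/x-www-form-urlencoded, and the Hub would then fetch the feed and pass the new entries on to its subscribers.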

So much for the Quick, Quick, but why the Slow, Slow for Google?  This is because there are some processes that operate on a much slower time cycle. Perhaps one of the most extreme is Google Maps.  Google can partially blame the map database sources it uses. However there are some examples that are almost ludicrous.  The biggest local example of that is hard to miss.  The Golden Ears Bridge across the Fraser River had been in operation for almost nine months before Tele Atlas updated its map index as of March 31.  Mapquest picked up the change immediately.  At the time of writing some 12 days later, Google Maps still has not picked it up.

The other area where Slow, Slow applies is the speed at which new web pages not included in RSS news feeds get into the Google index. In some cases, this can be measured in months.  Here the enormous and explosively growing size of the Internet limits what is possible.  Even if a URL to a web page is found, it may be some time before the spiders or crawlers can revisit to fully identify what is located at that URL.

In this case, Google had a choice on whether its index should be Big and/or Fast and/or Accurate.  In practice given the Internet dynamics, only two of these are attainable at the same time.  Google has chosen Big and Accurate and the result is as fast as they can make it, which is still very slow. 

We are now promised that a new process, Google Caffeine, is being slowly rolled out.  However this will probably deal with the way search results are developed rather than the way web pages are added to the index.  It seems likely that we must stay satisfied with the Slow, Slow rhythm for the speed at which web pages are included in the index.

Nevertheless Google offers enough processes that go at the Quick, Quick pace that most of us will continue to be happy with the Google Tango.


2 thoughts on “The Google Tango”

  1. Very interesting, I wasn’t aware of this PuSH plan that Google has in the works. It seems like it will really make a huge difference in how quickly many websites are indexed. But it also sounds like it’s opening the door for a lot of abuse too. Like, what if someone has their website constantly making minor changes automatically and constantly pushing them into the index?
