Is Search Rank Group-think?

Way back in April 1997, Jakob Nielsen tried to educate us on Zipf distributions and the power law, and their relationship to the web. This is where discussions of Chris Anderson’s Long Tail start, but Nielsen’s emphasis is on the whole picture, not just the many economic opportunities at the end of the tail.


Here’s how it works with hits to websites:

  • a few sites become popular and form the “big head” at the left
  • a few more sites form the slope
  • a huge number of websites score very low and form the “long tail”
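
To make the shape concrete, here is a minimal sketch in Python. It assumes the simplest textbook form of Zipf's law, where the site at rank r gets hits proportional to 1/r; the function name, the 1,000-site count, and the hit figures are all made up for illustration and are not Nielsen's data.

```python
def zipf_hits(num_sites, top_hits, exponent=1.0):
    """Hypothetical hit counts: the site at rank r gets top_hits / r**exponent."""
    return [top_hits / rank ** exponent for rank in range(1, num_sites + 1)]


# Illustrative numbers only: 1,000 sites, with the top site getting a million hits.
hits = zipf_hits(num_sites=1000, top_hits=1_000_000)
total = sum(hits)

head_share = sum(hits[:10]) / total    # the "big head": ranks 1-10
tail_share = sum(hits[100:]) / total   # the "long tail": ranks 101-1000

print(f"top 10 sites: {head_share:.0%} of all hits")
print(f"sites ranked 101-1000: {tail_share:.0%} of all hits")
```

Under these assumptions the ten most popular sites together pull in more traffic than the 900 sites at the bottom, even though each tail site still gets some hits. Raising `exponent` makes the head steeper; lowering it fattens the tail.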

Nielsen adds these examples:

  • a language has a few words (“the”, “and”, etc.) that are used extremely often, and a library has a few books that everybody wants to borrow (current bestsellers)
  • a language has quite a lot of words (“dog”, “house”, etc.) that are used relatively much, and a library has a good number of books that many people want to borrow (crime novels and such)
  • a language has an abundance of words (“Zipf”, “double-logarithmic”, etc.) that are almost never used, and a library has piles and piles of books that are only checked out every few years (reference manuals for Apple II word processors, etc.)

But the point here is about Google (or Yahoo, etc.) search results ranking, which places enormous value on the number of incoming links to a page. It turns out that these links also follow a power-law distribution, and it is not uncommon to find complaints that Google’s PageRank rewards popularity over other factors.

So it’s worth wondering: is popularity bad? Are popularity and quality mutually exclusive? Do search rankings represent some sort of global group-think?

Now put this in an academic library context and consider a student Googling for background for a research paper (think university freshmen the night before it’s due). Is it possible that linking patterns work like Wikipedia and tend to favor quality, or do they simply represent lowest-common-denominator popularity? Do search results reflect the sum of our altruistic linking intentions or our base crudity?

More about search ranking and libraries: