Search Engines: Pageranks and the Incredible Lightness Of Being

Yahoo has just released its Top searches of 2006 list, leaving us with the impression that people are using it only to find information on Britney, Shakira or Paris Hilton. Nevertheless, George W., the N.Y. Yankees, Spider-Man and American Idol score near the top as well.

The American Mathematical Society recently featured an article with an in-depth explanation of the mathematics behind Google’s PageRank. The story, titled How Google Finds Your Needle in the Web’s Haystack, points out that because roughly 95% of the text in the 25 billion web pages indexed by Google is composed of a mere 10,000 words, determining relevance requires an extremely sophisticated set of methods.
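To give a feel for the kind of computation involved, here is a minimal sketch of the PageRank idea as a power iteration over a toy web of four pages. It is an illustration only, not Google's actual implementation: the function name `pagerank`, the `toy_web` data and the damping value of 0.85 (a commonly cited choice) are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Rank pages by link structure.

    links maps each page to the list of pages it links to.
    The damping value is an assumption for this sketch.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere

    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Every page passes its rank on in equal shares to the pages it links to.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: page C is linked to by A, B and D.
toy_web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this toy web, page C collects links from three other pages and ends up ranked first, while D, which nobody links to, is left with only the small baseline share.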

…Brin and Page introduced Google in 1998, a time when the pace at which the web was growing began to outstrip the ability of current search engines to yield usable results. At that time, most search engines had been developed by businesses who were not interested in publishing the details of how their products worked. In developing Google, Brin and Page wanted to “push more development and understanding into the academic realm.” That is, they hoped, first of all, to improve the design of search engines by moving it into a more open, academic environment. In addition, they felt that the usage statistics for their search engine would provide an interesting data set for research. It appears that the federal government, which recently tried to gain some of Google’s statistics, feels the same way.

There are other algorithms that use the hyperlink structure of the web to rank the importance of web pages. One notable example is the HITS algorithm, produced by Jon Kleinberg, which forms the basis of the Teoma search engine. In fact, it is interesting to compare the results of searches sent to different search engines as a way to understand why some complain of a Googleopoly…
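For comparison, the HITS idea mentioned in the excerpt can be sketched in a few lines as well. HITS gives each page two scores: a good hub links to good authorities, and a good authority is linked to by good hubs. The sketch below is a simplified illustration, not Teoma's implementation; the function name `hits`, the page names and the iteration count are assumptions for this example.

```python
import math


def hits(links, iterations=50):
    """Return (hub, authority) score dicts for a link graph.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority score: sum of the hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]

        # Hub score: sum of the authority scores of the pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}

        # Normalize both vectors so the scores do not grow without bound.
        auth_norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        hub_norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        auth = {p: v / auth_norm for p, v in auth.items()}
        hub = {p: v / hub_norm for p, v in hub.items()}
    return hub, auth


# Hypothetical toy web: three link collections pointing at two content pages.
toy_web = {
    "H1": ["X", "Y"],
    "H2": ["X", "Y"],
    "H3": ["X"],
}

hub, auth = hits(toy_web)
print("best authority:", max(auth, key=auth.get))
print("best hub:", max(hub, key=hub.get))
```

Unlike the single score PageRank assigns, this approach separates pages that are good at pointing to content from pages that are the content, which is one reason two link-based engines can return noticeably different results for the same query.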

Encouraged by the article, we had a look at how our most popular story so far (Top 10 time waster games) is ranked by Google.

It turned out that it has a PageRank of 3 (out of more than 1 million pages), and we are also featured in the links at positions one and seven.

Not bad for a web site that has been public for a mere 1.5 months. And we do not use any specific SEO tricks or tools; it’s content only.

[Image: page ranks]

Update 07-December-2006 6.15am
Some people commenting on the AMS article suggested the following “slightly” simplified version of the Google algorithm:

SELECT advertiser, description, link, adcost
FROM tblAdvertisers
WHERE adword LIKE '%searchstring%';

