Friday, January 5, 2007

Development of More Sophisticated Ranking Algorithms

Google was started by two PhD students at Stanford University, Sergey Brin and Larry Page, and brought a new concept to evaluating web pages. This concept, called PageRank, has been important to the Google algorithm from the start. PageRank is an algorithm that weights a page's importance based on its incoming links. It estimates the likelihood that a given page will be reached by a web user who surfs the web at random, following links from one page to another. In effect, this means that some links are more valuable than others: a page with a higher PageRank is more likely to be reached by the random surfer, so a link from it is more likely to be followed.
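
The random-surfer idea can be made concrete with a small sketch. The code below is not Google's implementation; it is a minimal power-iteration version of the published PageRank idea, run on a made-up four-page web. The function name `pagerank`, the `toy_web` graph, the damping factor of 0.85 (the chance that the surfer follows a link instead of jumping to a random page) and the handling of pages without outgoing links are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank for a link graph given as {page: [pages it links to]}.

    Illustrative sketch only: 'damping' and 'iterations' are assumed values.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the surfer equally likely to be on any page.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Otherwise the surfer follows one of the page's links at random.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links: assume the surfer jumps anywhere.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" receives links from "a", "b" and "d".
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = pagerank(toy_web)
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {value:.3f}")
```

In this toy graph, page "c" ends up with the highest score because it has the most incoming links, and page "d", which nothing links to, ends up with the lowest. Pages "b" and "d" both link to "c", but "b" has the higher score of the two, so its link passes more weight along, which is the sense in which some links are more valuable than others.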

The PageRank algorithm proved very effective, and Google began to be perceived as serving the most relevant search results. On the back of strong word of mouth from programmers, Google quickly became the most popular and successful search engine. Because PageRank measured an off-site factor, Google felt it would be more difficult to manipulate than on-page factors.

Although PageRank was more difficult to game, webmasters had already developed link building tools and schemes to influence the Inktomi search engine, and these methods proved similarly applicable to gaining PageRank. Many sites focused on exchanging, buying, and selling links, often on a massive scale. This spawned an online industry that survives to this day, focused on selling links designed to improve PageRank and link popularity rather than to drive human site visitors, with links from higher PageRank pages selling for the most money.

A proxy for the PageRank metric is still displayed in the Google Toolbar, though the displayed value is rounded to an integer, and the toolbar is believed to be updated less frequently than, and independently of, the value used internally by Google. In 2002 a Google spokesperson stated that PageRank is only one of more than 100 algorithms used in ranking pages, and that while the toolbar PageRank is interesting for users and webmasters, "the value to search engine optimization professionals is limited" because the value is only an approximation. Many experienced SEOs recommend ignoring the displayed PageRank.

Google and other search engines have, over the years, developed a wider range of off-site factors they use in their algorithms. The Internet was reaching a vast population of non-technical users who were often unable to use advanced querying techniques to reach the information they were seeking, and the sheer volume and complexity of the indexed data was vastly different from that of the early days. These pressures, combined with increases in processing power, led search engines to begin developing predictive, semantic, linguistic and heuristic algorithms. Around the same time as the work that led to Google, IBM had begun work on the Clever Project, and Jon Kleinberg was developing the HITS algorithm.

A search engine may use hundreds of factors in ranking the listings on its search engine results pages (SERPs). The factors themselves and the weight each carries can change continually, and algorithms can differ widely: a web page that ranks #1 in one search engine may rank #200 in another, or even in the same search engine a few days later.

Google, Yahoo, Microsoft and Ask.com do not disclose the algorithms they use to rank pages. Some SEOs have carried out controlled experiments to gauge the effects of different approaches to search optimization. Based on these experiments, often shared through online forums and blogs, professional SEOs attempt to form a consensus on what methods work best, although consensus is rarely, if ever, actually reached.

SEOs widely agree that the signals that influence a page's rankings include:

1- Keywords in the title tag.

2- Keywords in links pointing to the page.

3- Keywords appearing in visible text.

4- Link popularity (PageRank for Google) of the page.
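
As a purely illustrative sketch of how the four signals above might be combined, the code below scores pages with a weighted sum and ranks them. No search engine discloses its formula, so everything here is an assumption: the function names `score` and `rank_pages`, the page fields, the two weight sets and the weighted-sum approach itself are all hypothetical.

```python
def score(page, query, weights):
    """Combine four toy signals into one number using hypothetical weights."""
    q = query.lower()
    anchors = page["anchors"]
    signals = {
        # 1- keyword in the title tag
        "title": float(q in page["title"].lower()),
        # 2- fraction of incoming link texts that contain the keyword
        "anchor": sum(q in a.lower() for a in anchors) / max(len(anchors), 1),
        # 3- keyword in the visible text
        "body": float(q in page["body"].lower()),
        # 4- link popularity, assumed to be pre-scaled to the range 0..1
        "link_popularity": page["link_popularity"],
    }
    return sum(weights[name] * value for name, value in signals.items())


def rank_pages(pages, query, weights):
    """Return page names ordered from highest to lowest score."""
    return sorted(pages, key=lambda name: score(pages[name], query, weights),
                  reverse=True)


# Two made-up pages competing for the made-up query "widgets".
pages = {
    "x": {"title": "Cheap Widgets Online", "body": "We sell widgets.",
          "anchors": [], "link_popularity": 0.1},
    "y": {"title": "Acme Corp", "body": "Acme makes widgets and gears.",
          "anchors": ["best widgets", "acme home"], "link_popularity": 0.9},
}

# Two hypothetical engines that weight the same signals differently.
engine_a = {"title": 3.0, "anchor": 1.0, "body": 1.0, "link_popularity": 1.0}
engine_b = {"title": 0.5, "anchor": 2.0, "body": 1.0, "link_popularity": 3.0}

print(rank_pages(pages, "widgets", engine_a))
print(rank_pages(pages, "widgets", engine_b))
```

With the first weight set the page with the keyword in its title ranks first; with the second, which favours off-site signals, the order flips. This mirrors the earlier point that the same page can rank very differently from one search engine to another.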


Many other signals, such as historical data, may also affect a page's ranking, as indicated in a number of patents held by various search engines.
