What Is the Concept of the Google PageRank Algorithm?
Google claims to apply over 200 factors (or metrics, as we call them here) to the pages it considers in order to make a ranking decision. PageRank is one of those 200+ factors, and an important one at that.
There are so many definitions of PageRank around the web trying to capture the whole essence of it. Even Google doesn’t give out much information about it, perhaps understandably so. If it isn’t such a clear concept, why should we learn about it?
Because Google itself admits that the PageRank system is at the heart of its search algorithm. While that shows how important a factor PageRank is in Google’s search algorithm, an SEO should always remember that it is still not the only factor that influences the ranking potential of a web page.
Now, what exactly is the PageRank?
First off, as many SEO beginners may tend to think from looking at its name, it is not the ranking position of a web page in the Google search results. It is but Google's way of calculating the importance of a given page. In other words, it is a score that Google assigns to every web page it has in its index, based on a complex calculation.
Okay, what do you think is the core of a democratic system?
Wait, what? Yeah, you heard me. You’ll understand the PageRank algorithm better when you consider the World Wide Web as a democratic system. Now, the basis of any democracy lies in the people’s power to vote, doesn’t it? So does it on the web, at least for Google. Google considers every link on the web as a vote.
So, the relative importance of a given page is simply the accumulation of the votes cast for it by other pages on the web. Of course, it isn’t as simple as it looks. But that gives you a basic understanding of the PageRank algorithm to begin with, which will help when you want to learn about PageRank in detail later.
Now, let’s get another perspective that could help you have a closer look at how PageRank works.
The Random Surfer Model
There is also another way of looking at the PageRank system. It goes like this:
Consider a random surfer sitting in front of his computer and browsing the web. Say he starts on a given web page and browses on by clicking a random link that leads to another web page, without any particular rationale or intent. He just keeps clicking a random link on whichever page the previous link led him to.
The PageRank algorithm is a patented process developed by Google’s founders Larry Page and Sergey Brin as part of their research project when they were at Stanford University. The patent is assigned to Stanford and Google has bought the exclusive license rights on the patent. Wise, isn’t it?
Now, the PageRank of a web page is the likelihood that the random surfer arrives at that web page through random browsing of the web. Put more simply: out of all the clicks he makes, what share of them lands him on the web page in question?
And there is also a chance that the random surfer gets bored at a point of time, breaks the chain and starts over with a completely different page. This mimics a contingency factor (which we will see in a bit) that is also taken into account when calculating the PageRank.
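The random surfer can be sketched as a short simulation. This is only an illustration of the model, not Google's implementation: the function name `simulate_surfer`, the four-page `toy_web`, and the 0.85 chance of continuing to click are all assumptions made for the example. The share of visits each page receives approximates its score.

```python
import random


def simulate_surfer(links, steps=100_000, d=0.85, seed=0):
    """Estimate page scores by letting a random surfer browse a tiny web.

    links: dict mapping each page to the list of pages it links to
           (every linked page must also appear as a key).
    d:     assumed probability that the surfer keeps clicking links.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    pages = list(links)
    visits = {page: 0 for page in pages}
    current = rng.choice(pages)

    for _ in range(steps):
        visits[current] += 1
        outgoing = links[current]
        if outgoing and rng.random() < d:
            # Keep browsing: follow a random link on the current page
            current = rng.choice(outgoing)
        else:
            # Bored (or stuck on a page with no links): start over anywhere
            current = rng.choice(pages)

    # Share of all visits spent on each page
    return {page: count / steps for page, count in visits.items()}


# Hypothetical four-page web: B, C and D all link to A
toy_web = {"A": ["B"], "B": ["A", "C"], "C": ["A"], "D": ["A"]}
scores = simulate_surfer(toy_web)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy web, page A collects the most visits because three pages link to it, while D, which nobody links to, is reached only when the surfer starts over.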
Both of these perspectives may prove to be very important in understanding how PageRank is actually calculated.
Before starting on the specifics of the PageRank calculation, one should also know that this is not necessarily how Google does it today. A few research papers related to PageRank are publicly available, hard as they may be for an average SEO to follow, but Google may be using a far more advanced version that the company has never disclosed.
Now, let’s dive in.
The Calculation of PageRank
The first thing an SEO should remember is that it is absolutely fine if one doesn’t have a clue about what the PageRank formula is trying to do. You don’t have to be a statistics ninja to understand the role of PageRank in SEO. Just grasp the crux of the concept, and you’re well on your way to being a great SEO.
The PageRank formula goes like this:
PR(A) = (1-d)/N + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
PR(A): The PageRank of page A
N: The total number of pages in the collection being ranked
n: The number of pages linking to page A
PR(Ti): The PageRank of the page Ti which links to page A
C(Ti): The number of all the outbound links on page Ti
d: A “damping factor” which can be set between 0 and 1 (remember the contingency factor?)
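As a rough sketch, the formula can be written as a small Python function. The name `page_rank_of`, its parameters, and the sample numbers are assumptions made for illustration. The (1-d) share is divided by the total number of pages in the collection (`total_pages`), as in the commonly published normalized form of the formula.

```python
def page_rank_of(inbound, total_pages, d=0.85):
    """Apply the PageRank formula to a single page.

    inbound:     list of (pr, out_links) pairs, one per page linking to
                 the target page: that page's PR and its outbound link count.
    total_pages: number of pages in the whole collection (assumed known).
    d:           damping factor between 0 and 1 (0.85 is an assumed default).
    """
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    if not 0 <= d <= 1:
        raise ValueError("d must be between 0 and 1")

    # Each linking page passes on an equal slice of its own PR
    link_share = sum(pr / out_links for pr, out_links in inbound)
    return (1 - d) / total_pages + d * link_share


# Hypothetical case: two pages link to the target in a 4-page collection
score = page_rank_of([(0.25, 1), (0.25, 2)], total_pages=4)
print(score)
```

A page with no inbound links still receives the (1-d)/total_pages share, which is why no page ends up with a score of exactly zero.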
Now, let’s take it slow. We’ll refer to PageRank as PR hereafter. For the convenience of learning, let’s consider four imaginary pages: A, B, C and D where A is the page which is linked to by the three other pages – B, C and D.
Let’s also say:
- The PR of B is 4 and it links to 4 pages in total (including A)
- The PR of C is 2 and it links to only one page in total (which is A)
- The PR of D is 6 and it links to 3 pages in total (including A)
So the formula for calculating the PR of A more or less looks like:
PR(A) = PR(B) / C(B) + PR(C) / C(C) + PR(D) / C(D)
The PageRank formula essentially says that when a web page links to a bunch of other web pages, the PR of the linking web page is divided and distributed evenly among all the pages it links to, which means with our example:
PR(A) = 4/4 + 2/1 + 6/3
which puts the PR of web page A at 5, as a rough calculation. We have missed something, haven’t we? Yes, the damping factor.
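The undamped sum for page A can be checked with a few lines of Python. The variable names are made up for this sketch; the numbers are the ones given for pages B, C and D in the example.

```python
# (PR, number of outbound links) for each page that links to A
linking_pages = {"B": (4, 4), "C": (2, 1), "D": (6, 3)}

# Each linking page passes PR / outbound-links on to A
shares = {name: pr / links for name, (pr, links) in linking_pages.items()}
pr_a = sum(shares.values())

print(shares)  # {'B': 1.0, 'C': 2.0, 'D': 2.0}
print(pr_a)    # 5.0
```

Note that C, with the lowest PR of the three, contributes as much as D, because C passes its whole PR to a single page.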
The PageRank score that Google shows for a page (for example, in the Google Toolbar) is always between 0 and 10. While “0” indicates the lowest possible PR value, “10” indicates the highest.
The Damping Factor
For a fairer web democracy
It’s time to remember our random surfer. Well, our random surfer, though a hypothetical character, is human and is likely to get bored of clicking on links at some point. So the damping factor is simply the probability, at any step, that the random surfer continues to click on the next random link.
But how does this fit into a mathematical model? I mean, the random surfer model is just like a metaphor for the sake of understanding the concept. Isn’t there a technical explanation for why we’re having the damping factor in the formula?
There is.
When the system follows the pages and counts the “votes”, it is possible that the system stumbles upon a page that doesn’t link to any other page at all (that is, with zero links in it – let’s call it a “sink” page). In that case, it doesn’t necessarily mean that no single page on the internet deserves to be linked to from the sink page.
If a page gets a bunch of votes from other pages and doesn’t cast a single vote to any other page, it isn’t a fair web democracy, is it? The concept of PageRank stays valid only when the linking continuum is maintained smoothly. To ensure this, when the system encounters a “sink” page, it assumes that the sink page links to all other web pages in the lot. The damping factor applies the same idea to every page: with probability 1-d, the surfer abandons the current page’s links and jumps to a page picked at random from the whole lot.
It is generally believed that the damping factor is set to a value of 0.85.
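To see the damping factor and the sink-page rule working together, here is a minimal sketch that repeats the calculation over a tiny web until the scores settle. The function name `pagerank`, the four-page `toy_web`, and the fixed 50 rounds are assumptions for illustration; this is the textbook iterative approach, not necessarily what Google runs.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iteratively compute scores for a small web.

    links: dict mapping each page to the list of pages it links to
           (every linked page must also appear as a key).
    d:     damping factor (0.85 is the commonly quoted value).
    """
    pages = list(links)
    n = len(pages)
    pr = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Score held by sink pages is treated as if they linked to every page
        sink_rank = sum(pr[page] for page in pages if not links[page])
        new_pr = {page: (1 - d) / n + d * sink_rank / n for page in pages}

        # Every other page splits its score evenly among its outbound links
        for page, outgoing in links.items():
            for target in outgoing:
                new_pr[target] += d * pr[page] / len(outgoing)
        pr = new_pr

    return pr


# Hypothetical web in which C is a sink page (it links to nothing)
toy_web = {"A": ["B", "C"], "B": ["C"], "C": [], "D": ["C"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

Because the sink page's score is redistributed rather than lost, the scores always add up to 1, and C still comes out on top since three pages vote for it.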
There are believed to be fewer than 10 websites on the World Wide Web with a PR value of 10. Currently, the PR value of Google’s home page is 9.
The Key Takeaways for an SEO
That’s what we are here for, aren’t we?
As I said before, an SEO doesn’t need to understand the statistical aspects of the PageRank system in order to deal with it when strategizing an SEO campaign. There are just a few takeaways that we need to understand and remember:
- The PR of a web page, in its simplest sense, is an accumulation of PR it receives through all the other web pages linking to it.
- The PR of a web page is evenly divided and distributed to all the other web pages it is linking to.
- The PR of a web page indicates its relative importance among other web pages.
- The higher the PR value, the more important the web page is considered to be.
- The PR is one of the most important factors which influence the ranking potential of a web page.
- The PR is NOT the only important factor that affects the ranking potential of any web page.
But what do I make of these takeaways? Well, understanding them matters a great deal when making decisions about link acquisition, such as:
- Will it be worth it to acquire a backlink to my website from this page?
- Do I accept a backlink from a web page in a completely irrelevant niche, but which happens to have a high PR?
- How far should I insist that every page linking to mine has a high PR value?
Okay, now I want to know the PR of my home page. Am I supposed to sit and calculate the PageRank all by myself? No way on earth!
Don’t worry. It’s Google’s job. There are a few ways to find out the PageRank of a web page:
- Installing Google Toolbar in your browser
- Installing any SEO browser extension that includes the PR information
- Making use of online PageRank checking tools (there are many available for free)
For a more detailed reading on Google’s PageRank system, there are many useful web resources available.
- Jun 24, 2020