The Full History of Google's Penguin Algorithm
December 6, 2016
As Google's Penguin 4.0 update recently completed its rollout (according to Google), I thought I would take this opportunity to provide the complete history of this Google algorithm and its updates. Until the release of Penguin 4.0, an update to the Penguin algorithm was a very big deal (not that it still isn't), because the only time a site could recover from a Penguin penalty was when the algorithm underwent an update. Now Penguin operates in real time (more on that later), but the question is, how did we get to this point?
Penguin 1.0 - Where It All Began - April 24, 2012
It all began on a spring morning back on April 24th, 2012. It was a dark time for SEO: "black-hatters" were artificially achieving higher rankings with manipulative tactics like link schemes aimed at fooling the search engine into thinking that a site was more significant than it really was. It was on this spring day in April that Google decided to take internet search back from these digital pirates by introducing its Penguin algorithm.
Melodrama aside, the release of Penguin 1.0 was a monumental step in many ways, and the algorithm has become a sort of "staple" within the Googleverse. With spammy link schemes becoming more and more common, Google attempted to put a stop to the practice by issuing a ranking penalty to sites employing such tactics, a penalty that would only be removed upon the release of the algorithm's next update. To an extent, Penguin's release changed the way the "SEO game" was played by ushering in an era of content focused more on quality per se, not link tactics and the like.
Penguin 1.1 - A Data Refresh - May 26, 2012
Just over a month after Penguin initially rolled out, Google pushed the button on the algorithm's first update. On May 26, 2012, famed Googler Matt Cutts announced the update in a Tweet. The interesting thing about Penguin 1.1 is that it represented no actual change to the algorithm. Rather, the updated version was a "data refresh." A data refresh does not mean that an intrinsic change to the algorithm per se has taken place. Rather, it refers to a re-release of the algorithm. In the case of Penguin 1.1, it meant that previously penalized sites that had corrected the cause of their penalty were restored, while some sites that were not caught the first time around may have been hit upon the refresh. In terms of overall impact, the first Penguin update was, as Matt Cutts himself put it, "toward the less-impactful end of the spectrum."
Penguin 1.2 - An International Update - October 5, 2012
Like the previous update, Penguin 1.2, which rolled out on October 5, 2012, was just a data refresh, so once again only a very limited number of queries were impacted. What made this update unique, though, was that it also impacted a small number of queries in languages other than English, as Matt Cutts indicated in a Tweet.
All in all, the update appeared to impact just 0.3% of English language queries, with similar numbers for queries in other languages (e.g. 0.4% of Spanish queries).
Penguin 2.0 - The Next Generation of Penguin Updates - May 22, 2013
Fast forward to May 22, 2013, when the next generation of Penguin updates was born. Moving past the first generation of Penguin updates indicated that Penguin 2.0 was not a mere data refresh. Via a blog post on this next generation of Penguin, Matt Cutts indicated that the update would impact 2.3% of English queries and that languages that tended to have more spam would be impacted proportionally.
Simply put, Penguin 2.0 represented a technological upgrade that made the algorithm better equipped to fight the good fight against spam. Specifically, the new version of Penguin inspected not just a site's home page, but particular landing pages as well. Thus, if a specific page engaged in black hat link building, this new version of Penguin would pick up on it, at least to a greater extent.
Penguin 2.1 - A Deeper Spam Analysis - October 4, 2013
Released on the 4th of October 2013, Penguin 2.1 ushered in a variety of speculative theories as to what the latest version of the algorithm provided that its predecessors did not. Firstly, it would seem that the update was a bit more than just a data refresh, as Penguin 2.1 was said to have impacted 1% of queries (as compared to Penguin 1.2, which only impacted 0.3% of English queries). But what then was the upshot of the update? While Google never released an official narrative, in all likelihood Penguin 2.1 took the technology of version 2.0 to the next level by crawling "deeper web pages" and analyzing whether they contained any spammy links.
Penguin 3.0 - Another Data Refresh - October 17, 2014
It would be just over an entire year before we saw another version of Penguin. Unlike previous updates, where Matt Cutts provided a formal announcement along with a bit of commentary, the rollout of the third generation of Penguin took on a more mysterious tone. On October 17, 2014, some in the SEO community began to see a significant change in Google's rankings. It would take two days before Google officially confirmed to Search Engine Land that the rank fluctuations were indeed due to the release of Penguin 3.0.
With a name like Penguin 3.0, as opposed to Penguin 2.2, you would expect this new version of Penguin to pack a serious and unique punch aimed at spammy link practices. However, not only was this Penguin incarnation not transparent, it was merely a data refresh according to Googler Pierre Far. So essentially, the year-long Penguin update lapse did not result in a major change to the structure of Penguin, but did provide the opportunity for those sites that were penalized under Penguin 2.1 to come out of the penalty box, so to speak.
Penguin 4.0 - A Real Time and Core Algorithm - September 23, 2016
After a nearly two-year wait, which must have been excruciating for legitimate sites hit by Penguin 3.0, Google finally released Penguin 4.0 on September 23, 2016. Unlike its 3.0 predecessor, this update is in reality the next generation of the Penguin algorithm. For starters, the 4.0 release saw Penguin become part of Google's core algorithm. As a result of this, Google said that it would no longer be announcing any further updates to Penguin.
The second piece of news was equally momentous, if not more so. With the rollout of Penguin 4.0, Google announced that henceforth Penguin would be live, evaluating sites in real time as they are being re-indexed.
This is in fact huge, because it means that if your site should be impacted by Penguin, you would not need to wait months or even years for the next Penguin update in order to bounce back. What's more, Google later revealed that the new algorithm does not issue a penalty per se, but rather devalues the spammy links. The devaluation of links is more a lack of a positive than the attribution of a negative. Think of it like this: links are good, and they increase your ranking when done properly. Should they be "spammy links," Google will simply ignore them, and your page will not have the added value of having links on it (or as many). This is in contradistinction to the modus operandi of previous versions of Penguin, which actually demoted the ranking of a page that contained spammy links.
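To make the distinction between a penalty and a devaluation concrete, here is a toy sketch in Python. It is purely illustrative and is not how Google scores pages: the `Link` class, the numeric link values, and the size of the penalty are all hypothetical, chosen only to show how the two behaviors described above differ.

```python
from dataclasses import dataclass


@dataclass
class Link:
    value: float   # hypothetical positive weight a link contributes
    spammy: bool   # whether this toy model treats the link as spam


def score_with_penalty(links, penalty=5.0):
    """Toy model of the pre-4.0 behavior: spammy links actively demote the page."""
    total = sum(link.value for link in links if not link.spammy)
    if any(link.spammy for link in links):
        total -= penalty  # hypothetical demotion on top of losing the link value
    return total


def score_with_devaluation(links):
    """Toy model of the Penguin 4.0 behavior: spammy links are simply ignored."""
    return sum(link.value for link in links if not link.spammy)


links = [Link(3.0, False), Link(2.0, False), Link(4.0, True)]
print(score_with_penalty(links))      # 0.0
print(score_with_devaluation(links))  # 5.0
```

In both models the spammy link adds nothing. The difference is that under devaluation the page keeps the full value of its legitimate links, whereas under the penalty model it loses ground it had otherwise earned.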
The Evolution of Google's Penguin Algorithm
With the release of Penguin 4.0, the algorithm has in a sense completed the evolutionary cycle. It has certainly come a long way from its original construct, skimming for link spam on the homepage. In fact, even the sort of tense relationship between the algorithm and the SEO community has in many ways been healed as Penguin completed its evolution.
No longer do legitimate sites that have been hit with a Penguin penalty have to wait (which in the case of the latest update meant years) to recover. As a result, you can make the case that the most interesting and dynamic aspect of Penguin's progression has not been technological, but sociological, as in its most modern form the algorithm has balanced technological need with communal unanimity. Taken from this perspective, does Penguin serve as a microcosm of the balance that the SEO community seeks between technical need and communal preference and consideration?