Rank Ranger Blog

SMX Advanced 2018 - A Machine Learning Love Story

After I had the honor of speaking at SMX Advanced 2018, I wanted to share my thoughts and insights with those who could not attend the conference. So, I did what everybody else does: I threw the slides up on SlideShare. Great... now you have a bunch of really nice looking slides with an average of about 5 words per slide... enjoy those insights (insert sarcasm).

I was faced with three choices: wish you the best of luck deciphering my message via a bunch of pictures, redo my slides with more text (not happening), or recreate the presentation somehow.

I opted for none of the above. 

Instead, what you have is me practicing my presentation... in my car (for better acoustics, but mainly fewer children to ruin the recording). I figured the recording was good (I was recording it to hear myself so that I can be overly critical of my performance, which I realize is a bit masochistic), so why not just give you a raw look at the presentation, sort of like a demo tape? 

Basically, what you have here is a look behind the scenes in what is a homemade and significantly less cool version of Carpool Karaoke with James Corden (just with no singing, nor any celebrities).


Beyond the Factors, Beyond the Niche, A Machine Learning Love Story [Transcript] 


Welcome to beyond the niche, beyond the factors, a machine learning love story. My name is Mordy Oberstein and I used to be a classroom teacher; however, through a totally irrelevant and boring story, I am now the marketing manager over at Rank Ranger. Rank Ranger is a custom SEO and digital marketing suite. It was built in 2009 by some pretty gifted folk.

Now, to what we are all here for. In any love relationship there are these four sort of dreaded words, and those are: "we need to talk." We need to talk about machine learning, about ranking factor studies, particularly niche ranking factor studies, and how you use the data coming out of these studies.

So I'm going to start with a bit of a story. A story about a boy, a quiet boy living a very uneventful life and his name happens to be "rank" and he's someone who lives a very stable, uneventful life... until he meets a big city girl who sort of brings him out of his shell, turns him from a boy and makes him into a man. And despite your naughty naughty thoughts, I'm actually referring to the increase in rank fluctuations, rank volatility over time.

So this is not really the biggest news item. You know this already: rank is more volatile (relative to the past), but it's still worthwhile to take a look and see what exactly that landscape looks like. Okay, so to get a sense of where we stand now with rank, let's take a look at a dataset that spans five different niches, and particularly the top five results that these keywords produced. We're looking to see how often, from one month to the next, there was a sort of exact match. In other words, top five domains, same domains, same order from one month to the next. And survey says, that used to happen about 27 percent of the time, which is pretty good. Currently, that stands at about 10 percent. So you don't need to be Einstein to realize that's an enormous increase in rank volatility among the top five results. You can only imagine that things get progressively worse for the top 10 and top 20.
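As a rough illustration of the "exact match" measure described above, here is a minimal Python sketch. The data layout (a keyword mapped to a list of monthly top-five snapshots), the function name, and the domains are all hypothetical; the talk does not say how the dataset was stored or processed.

```python
from typing import Dict, List

# Hypothetical layout: keyword -> monthly snapshots, each an ordered list of domains.
Snapshots = Dict[str, List[List[str]]]


def exact_match_rate(snapshots: Snapshots, top_n: int = 5) -> float:
    """Share of consecutive month pairs whose top_n domains are identical, in the same order."""
    matches = 0
    pairs = 0
    for months in snapshots.values():
        for previous, current in zip(months, months[1:]):
            pairs += 1
            if previous[:top_n] == current[:top_n]:
                matches += 1
    return matches / pairs if pairs else 0.0


# Made-up example data, not the dataset from the talk.
sample = {
    "buy car insurance": [
        ["a.com", "b.com", "c.com", "d.com", "e.com"],
        ["a.com", "b.com", "c.com", "d.com", "e.com"],  # same domains, same order
        ["b.com", "a.com", "c.com", "d.com", "e.com"],  # order changed
    ],
    "buy socks": [
        ["x.com", "y.com", "z.com", "v.com", "w.com"],
        ["x.com", "y.com", "z.com", "v.com", "f.com"],  # one domain replaced
    ],
}

rate = exact_match_rate(sample)
print(f"Exact match rate: {rate:.0%}")
```

In this made-up sample only one of the three month-to-month pairs is an exact match.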

Okay, so rank is more volatile than it used to be. Yes, but to what extent? Specifically, how many positions are sites moving? One position? Two positions? So same five niches, but this time looking at the top 20 and the average number of positions that sites tend to move when they are in fact on the move. That number used to stand at about two and a half positions. Of course, you can't move half a position because that'd be crazy, so it's an average. Currently, sites move about four positions. So again, it's not rocket science. That's an enormous increase in the extent to which sites are moving up or down the page, hopefully up.
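One way the "average positions moved" figure could be computed is sketched below. It assumes two hypothetical snapshots mapping each domain to its position, counts only domains that appear in both months and actually changed position, and ignores sites that entered or left the tracked range. The talk does not say how such cases were handled, so that choice is an assumption.

```python
from typing import Dict


def average_move(previous: Dict[str, int], current: Dict[str, int]) -> float:
    """Mean absolute position change among domains that moved between two snapshots."""
    moves = [
        abs(current[domain] - previous[domain])
        for domain in previous
        if domain in current and current[domain] != previous[domain]
    ]
    return sum(moves) / len(moves) if moves else 0.0


# Made-up positions for illustration only.
last_month = {"a.com": 1, "b.com": 2, "c.com": 3, "d.com": 4}
this_month = {"a.com": 1, "b.com": 5, "c.com": 2, "e.com": 4}

# b.com moved 3 positions, c.com moved 1; a.com did not move; d.com and e.com are ignored.
avg = average_move(last_month, this_month)
print(f"Average move: {avg} positions")
```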

Yet there's still one more dynamic that I want to take a look at and that is rank diversity. How diverse is rank? And to understand this, let's take a look at the same five niches, top 20 results, but counting up the number of unique domains produced by the dataset. So back in 2016, the dataset produced about 1,300 or so unique domains. And currently, we're at about 2,200. So yes, once again, an enormous increase.
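Counting unique domains is simple to sketch. The version below assumes the results are available as a list of URLs and treats the hostname, minus a leading "www.", as the domain. Both are assumptions; other subdomains are counted separately here, which may or may not match how the talk's numbers were produced.

```python
from typing import Iterable, Set
from urllib.parse import urlparse


def unique_domains(urls: Iterable[str]) -> Set[str]:
    """Collect distinct hostnames from result URLs, ignoring a leading 'www.'."""
    domains = set()
    for url in urls:
        host = urlparse(url).hostname
        if host:
            domains.add(host[4:] if host.startswith("www.") else host)
    return domains


# Illustrative URLs; the paths are invented.
results = [
    "https://www.geico.com/auto-insurance/",
    "https://geico.com/",
    "https://www.nerdwallet.com/insurance",
    "https://www.progressive.com/",
]

domains = unique_domains(results)
print(len(domains), sorted(domains))
```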

So rank is far more volatile, especially in [sites] moving to a greater extent up or down, and rank is much more diverse than it used to be. Which, of course, begs the question of what is behind this? Who is this sort of mystery woman driving rank to be as prolific as it is now? So, as you may have guessed, I'm referring to Google's machine learning properties, RankBrain in particular. Which makes good sense because, again, a more diverse understanding of intent requires a more diverse set of sites. But I haven't taught you anything novel yet. Where am I going with all of this? What I really want to look at is what new dynamics machine learning has produced in terms of rank. What does rank look like now?

So first we have to realize that machine learning, or RankBrain, looks at far more queries than it used to, and it touches almost everything at this point. As Google's machine learning properties are touching more of the, quote-unquote, Googleverse, what's the impact of that? Or keeping with our theme, what is the offspring of this reality?

And to understand this, we're going to take a look at a few insurance websites and their visibility scores... rather than visibility trends. So, here's progressive.com. You'll notice there's a sharp increase from the end of 2016 and onwards and then a decrease in the summer of 2017. And things are looking back up for the site now. Very, very similar pattern for statefarm.com and very similar pattern for allstate.com. If you were to look at geico.com or nationwide.com, you would notice it’s the same sort of trend. Okay. So we have this set of unique trends or visibility patterns for these various sites, but what exactly does that mean for us?

So let's take a look at these sites as they actually exist on a page. So here's a query for buy car insurance and you'll notice some of these sites, like Geico, rank on this page and they're there, highlighted in blue. But as astute observers you'll also notice that I have a couple of sites in these red boxes here. Now, these are sites where you cannot actually buy a car insurance policy, but rather, you can learn about buying a car insurance policy. They are informative sites and, believe it or not, 4 out of the 10 sites that rank on this page are these sort of informative sites where you can't actually buy a car insurance policy but can only learn about doing so. And, wouldn’t you guess it, these sites have their own ranking pattern, their own visibility score trajectory. So here you have consumerreports.org. Let's compare that to nerdwallet.com and again, very, very similar pattern, but different than the insurance sites.

So you might say, okay great. We have these two different types of sites with the two different unique ranking patterns that show up on the same page, but is that common? Is it a fluke? No! What I did is I ran 150 buy keywords, buy queries. Simple things like buy socks or buy underwear [and] complicated things like buy a boat, buy a house, buy a car. It turns out that of these 1,500 or so results (150 keywords - 10 results per page... 1,500 or so actual results), about 26 percent of them were these informative sites where I could not buy anything but only learn about buying the product in question.
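The arithmetic behind these shares can be sketched as follows. The intent labels are assumed to be assigned by hand to each result (the talk does not describe an automated classifier), and the label names and function name are made up for illustration. The 6-to-4 split is the buy car insurance page described earlier.

```python
from collections import Counter
from typing import Dict, List


def intent_shares(labels: List[str]) -> Dict[str, float]:
    """Fraction of results carrying each hand-assigned intent label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {intent: count / total for intent, count in counts.items()}


# Page one for "buy car insurance" as described in the talk: 6 sites sell, 4 inform.
car_insurance_labels = ["transactional"] * 6 + ["informational"] * 4
shares = intent_shares(car_insurance_labels)
print(shares)

# Across the 150 buy keywords: 150 keywords x 10 results per page.
total_results = 150 * 10
informational_estimate = round(total_results * 0.26)
print(total_results, informational_estimate)
```

At 26 percent, the roughly 1,500 results work out to about 390 informative results.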

Big whoop? Well, it is a big whoop. Why? Well, think about it like this; machine learning creates a context for rank. In other words, before you ever get to a ranking factor, per se, Google's machine learning creates a context for rank.

That's a very abstract concept. I'm going to try to boil it down to some concrete understandings. Okay? Think about it like this. Google's machine learning understands that there are multiple intents for the keyword "buy car insurance" and in fact it also sets up a proportion of sites to meet each intent. In this case, Google's machine learning has set up four sites to meet the intent to learn about buying car insurance and six to actually buy a car insurance policy. Or, think about it this way. Geico.com is the number one ranked site on that particular page. Was it an accident? And I don't mean Geico per se, I mean a site where you could buy car insurance relative to learning about buying car insurance. In other words, no matter what that second site does, no matter how optimized it is, that second site offering the ability to learn about car insurance will never rank in the top spot. Because of a ranking factor? No, because the essential intent of that query is to buy car insurance. So a site that offers car insurance will rank in the top spot.

Or think about it like this. Let's take the search "buy car insurance", which sounds simple enough... but Google says, "You know what, Oberstein? What you really mean 100 percent of the time is to learn about buying car insurance." Now, what happens to those six sites that are currently ranking where I can buy car insurance? Well, presumably they would all fall off page one of the SERP. They wouldn't rank anymore. They'd be irrelevant. Or think about it like this. All of these insurance sites go up together. They all go down together. Why? Because Google said, "Hey, you know what we're going to do? We're going to change the way we weigh the various ranking factors for this niche," and it turns out that all these sites, coincidentally, are optimized the same way, so they all go up together? No, because the common denominator here is not optimization, [rather] it's what these sites do - what they offer. In other words, what’s far more likely is that Google's machine learning properties have understood intent differently for certain keywords and these sites became highly, highly, highly relevant, and their rank/visibility scores went up. And as Google readjusted, their visibility scores went down or their rankings went down.

All of this points to a new ranking paradigm. One that is primarily driven by Google's machine learning properties. Think about it like this. Sixty percent of that buy car insurance page, page one of that SERP, is occupied by sites that offer me car insurance and car insurance policies. And no matter what these sites do, no matter how optimized they are, no matter how in line they are with each niche factor weight, they will never rank for the other 40 percent of the page. Now, this of course, brings up a good practical application. If I was Geico, what would I do? Well, I would write a guide on how to buy car insurance, so they can circumvent this and rank on the other 40 percent of the page.

So what does this mean for niche ranking factor studies? Well, intent or sub-intents, the endless number of intents within intent within intent within intent, throws that data into a bit of a tizzy. And there are multiple ways you can look at this. You can be targeting multiple keywords that are more or less synonyms of each other, yet the latent intent that Google sees within them is vastly different. Or, you can take a look at one page, one SERP with multiple sites, multiple intents on it, like we saw before. So let’s go with the latter because it's a bit easier and a bit less time-consuming to go through.

What I did here was, I ran 100 more buy queries but this time only for buying software. So buy digital marketing software, buy eLearning software, and so forth. It turns out, of these 100 or so keywords that produced 1,000 results, 40 percent of them were these informative sites where I could not buy anything. Which is up from the 26 percent from the keyword set overall, which makes good sense because buying socks is far less complicated than buying software. But that's not where the story really ends, because 70 percent of these 400 or so sites were sites that listed best products. In other words, do a search for buy digital marketing software and you get back top five digital marketing software solutions you need, or your entire marketing strategy will blow up in your face. But 30 percent were these, sort of, buying guides that taught me not which product to buy, but how to go about choosing between the various products. What considerations I might have before I choose a product. In other words, you have a query, you have intent A which is to buy and that's simple, but probably not. Then you have an intent B, and intent B is to learn about buying. Within intent B is sub-intent A. Sub-intent A is which product to buy. Sub-intent B is how to go about choosing these products, what considerations to have before you buy the software product in question. Each of these intents and sub-intents is going to have its own weighing of the various factors that apply to it. Which means that for niche ranking factor studies (and forget general ranking factor studies for this, because they are far too general), with all due respect, and while they should be read and used, and I’m not advocating otherwise, the usability of the data, or how we use it, perhaps needs to be reevaluated.
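To make the nesting concrete, here is the same breakdown as a small worked calculation using the rounded figures quoted above (1,000 results, 40 percent informative, a 70/30 split within those). The variable names are invented for illustration, and the counts are approximations derived from those rounded percentages, not the underlying data.

```python
# 100 software "buy" keywords x 10 results per page.
total_results = 100 * 10

# Intent B: learn about buying (40 percent of all results).
informational = total_results * 40 // 100
# Intent A: actually buy (the remainder).
transactional = total_results - informational

# Sub-intent A: "best products" lists (70 percent of the informative results).
best_of_lists = informational * 70 // 100
# Sub-intent B: guides on how to choose (the remaining 30 percent).
buying_guides = informational - best_of_lists

# Express every bucket as a share of all page-one results.
page_share = {
    "buy": transactional / total_results,
    "learn: which product": best_of_lists / total_results,
    "learn: how to choose": buying_guides / total_results,
}

for intent, share in page_share.items():
    print(f"{intent}: {share:.0%}")
```

Seen this way, the roughly 280 list pages and 120 buying guides account for about 28 percent and 12 percent of all page-one results, alongside the 60 percent of results where you can actually buy.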

Why? Well, as I mentioned, each intent or sub-intent is going to have its own weighing of the various ranking factors. You have to remember, what is a ranking factor? It's a signal to Google that this site will achieve X. It will achieve its purpose. In other words, if a site is intent, no pun intended, on achieving X, well for X, [factors] A-B-C-D-E-F-G are important. But let's say you don't want to achieve X, let's say you are all about achieving Y with your site. Well for Y, [factors] A-B-C-D-E-F-G are irrelevant. [Factors] 1-2-3-4-5-6... those become relevant.

Let me give you an example from the medical niche. Within the medical niche, there are multiple types of sites. For example, my uncle has diabetes. So, he might do a search for how to make this green goo shake if you have diabetes. So it's a recipe site within the health niche and like any other recipe site, and Google's pretty much said as much, having an image is very important because I will bounce if I cannot see what this thing actually looks like. Within the same niche are diagnosis sites, like do I have a pinched nerve. Now here a page could very well show you a generic image used in every Icy Hot commercial with some person rubbing the back of their neck. In other words, two sites within the same niche, with two very, very different things that are important to them. So it's not so much what tips the scale for your niche, but what tips the scale for the pages on your site. Which means, of course, that going forward we have to reevaluate, or reconsider, how we use the data from niche ranking factor studies.

Which of course then begs the question of what do we do? Where is opportunity in the world of intent?

So, I am running low on time, but I'll say this much. It becomes highly, highly important to understand the way Google looks at you through the lens of intent. I’ll be so bold as to say that, of the tools that are around now, the ones still around 10 years from now are going to be the ones that offer you insight on intent. And you don't have to wait for that. There are about a million things you can do. It's very, very particular to what kind of site you have or what you want to do with your site. I mean it's anything as simple as thinking, brainstorming, what are the multiple reasons why a user would search for this term? Why would they go about this query and what are the implications? What are the different paths users will take based upon their intention or the reason why they're searching for this?

Or, it could be as simple as doing what I've been doing this entire time.

Yes, like a good teacher, I've been showing you how to do this without actually telling you. You can simply run a query, break down the multiple intents, or sub-intents, on the page, and then see if that’s pervasive. For example, starting from the case of buy car insurance and looking across the wider set of buy queries, Google understood "buy" as "learn" about 26 percent of the time.
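A minimal sketch of that workflow follows, assuming you have already labeled each page-one result by hand with the intent it serves. The queries other than buy car insurance, their label splits, and the function name are made up for illustration.

```python
from collections import Counter
from typing import Dict, List


def mixed_intent_rate(serps: Dict[str, List[str]]) -> float:
    """Share of queries whose page-one results carry more than one intent label."""
    if not serps:
        return 0.0
    mixed = sum(1 for labels in serps.values() if len(set(labels)) > 1)
    return mixed / len(serps)


# Hand-assigned labels per query; only the first split comes from the talk.
serps = {
    "buy car insurance": ["buy"] * 6 + ["learn"] * 4,
    "buy socks": ["buy"] * 10,
    "buy a boat": ["buy"] * 7 + ["learn"] * 3,
}

# Step 1: break down the intents on each page.
for query, labels in serps.items():
    print(query, dict(Counter(labels)))

# Step 2: check how pervasive the mixed-intent pattern is across queries.
pervasiveness = mixed_intent_rate(serps)
print(f"Queries with more than one intent on page one: {pervasiveness:.0%}")
```

With real labels, a high rate would suggest the split seen on one page is a pattern rather than a fluke.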

So, I'll sort of summarize it or wrap it up with this. And as cliche as it sounds, the future of SEO is in intent analysis. It's understanding how Google views the, to use a Freudian term, the latent and manifest intents that are embedded within the keywords that you're targeting and the manifest and latent intents that are embedded in the content that you're creating on your page. Then understanding how those two align. Like in the case of Geico, it didn't align - so Geico would write (and they have) a guide on how to buy car insurance.

It also means not being afraid of machine learning because there are things you can do, and as I've said, it means going query specific and understanding and experimenting with 'how Google sees me through the lens of intent and which factors are subsequently important or not important for me'. And if you don't know where to start, a niche ranking factor study showing that the most influential factor for this niche is X could be a good place to start. But it also means using your brain a bit and saying, "Well, you know what, the niche ranking factor study says that this is the most influential ranking factor for me and my niche, but it makes no sense for my site." Well, it might not make sense for your site and it's okay to buck the trend, so to speak. And with that, I will end and I thank you for joining me at SMX and hope to see you at the next one.

About The Author
Mordy is the official liaison to the SEO community for Wix. Despite his numerous and far-reaching duties, Mordy still considers himself an SEO educator first and foremost. That's why you’ll find him regularly releasing all sorts of original SEO research and analysis!
