In Search [Episode 76]: Mining Google Patents for SEO Gems
Don't forget, you can keep up with the In Search SEO Podcast by subscribing on iTunes or by following the podcast on SoundCloud!
June 23, 2020
The In Search SEO Podcast
Driving Your SEO Strategy by Unpacking Google's Patents: Summary of Episode 76
SEO legend Bill Slawski joins us today for a sweeping look at what patents tell us about Search!
We’re getting into:
- How to approach a Google patent
- Why the information within patents is incredibly important to doing SEO
- What trends the patents have revealed about what Google is looking for
Plus, Google just released something that's set to change the content landscape... here's why top-level content will die a horrific death by the SERP!
Bill Slawski of Go Fish Digital and SEO By the Sea
Spam, Damn Spam, and Statistics
In Search SEO Podcast [Episode 55]
When Google’s Intent Targeting Goes Too Far
Dan Russell’s Site
Google is Suggesting Searches Based on Users’ Recent Activity
Page Experience Update will Measure Your AMP Page in 2021
Google is Testing Expandable Tabs Under Images
One Million Sites Using REL=Sponsored
Google to Ban Clickbait in July
New Updates to Google Merchant Center
Smart Display to Look at AMP
Follow the podcast on Twitter
How Google is Making It Tougher Than Ever to Reach Broad Audiences [00:05:30 - 00:17:53]
Sometimes it can be hard trying to explain a change to the SERP but this one is easy. Just head to the SERP, type something in, and sometimes you’ll get a set of filters that will filter the SERP accordingly. For example, if you search for ‘top selling vacuums’ you will get filters for Amazon, Walmart, and Dyson, and when you click one of those options you get brought to a whole new SERP with just the results from one of the options.
On episode 55 of this very podcast, Mordy said Google will have to offer a visual element like it does for Image Search to filter results on the SERP and it seems that day has come.
So what does this filter mean for SEO and content? That it will kill a very popular tactic. Take the vacuum example. While it’s jarring because it’s e-commerce, that’s not what this filter is for.
Where this filter shines is with queries like ‘how much do you need to retire,’ where Google gives you a set of age filters, like how much you need to retire if you’re 70, 80, etc. Or, where this filter really shines, with queries like ‘best selling books,’ where you get filters for book genres like fiction, sci-fi, mysteries, etc.
Why is Google doing this? It would seem the user is their main priority with this change. They probably just want to show better results and make it easier for the users. This is sort of true, but in what way? Let’s say you searched for ‘best selling books’ before the advent of these top of the SERP filters. You would see a Featured Snippet listing the best selling books. But these books aren’t sorted by any category and would probably be from any genre. This Featured Snippet, and the webpage behind it, is meant to cast a wide net: grab users from many genres, bring them to your site, and from there they can peruse the genre they love most!
But now with these filters, there’s no more need to click on the Featured Snippet that casts that wide net. You just click the filter for the genre of book you’re looking for. Don’t you see what Google just did? They moved the user down the funnel right from the SERP!
This is not even the first time Google’s done this. The time before this was with the advent of zero-click searches, where Google was cutting the cord on top of the funnel kind of content. Do you see the pattern? Google wants highly targeted content from us. It’s pushing us to forgo this over-generalized content, that top of the funnel content, and instead to give users highly targeted, highly specific content. Google is asking us, and here’s the hard part, to trust them. Trust them to get the user from the general query to our highly targeted content with things like the filters we’re talking about here.
In other words, Google is telling us to create targeted content because the age of casting the wide net with your content is over!
By the way, these are not the only ways Google is sending this message. By using super-authorities for top-level keywords (think health queries) Google has pushed the average site off the SERP and has left the average site to offer niche-specific, micro-topical content. A whole study on this is on the way so be sure to check it out!
How to Analyze Google Patents to Unlock SEO Gems: A Conversation with Bill Slawski [00:17:53 - 00:49:41]
Welcome to another In Search SEO podcast interview session. Today we have an SEO wizard for you. He spins out Google patent patterns with pizzazz. He's the director of research at Go Fish Digital. You can catch his insights on SEO by the Sea. He is a legend. He is Bill Slawski.
Thank you very much. I appreciate being here.
My pleasure. I have to ask you, I noticed you have a lot of sea themes with Go Fish Digital and SEO by the Sea. I’m assuming it’s not accidental.
I grew up on the Jersey Shore and we used to go fishing there. I was working in an SEO agency in Havre de Grace, Maryland and watched sails bouncing up and down the Chesapeake Bay when I came up with the name SEO by the Sea. I wanted to run a free SEO conference where the people who were attending would post things that they wanted to talk about. It was a good idea. It didn't end up getting too many people, so I ended up with a website for promoting it, and I turned the website into a blog where I just wrote about things I found interesting.
Well, it is quite the blog.
Let's talk about Google patents. Bill, as a patent maven, I have to ask you, how did you first start looking at this whole patent thing? How did they come about?
A couple of different ways. One of them was that a lot of people were talking about specific patents in the early 2000s, like "Information retrieval based on historical data." It was written by half a dozen search engineers at Google who were in the business for a long time. People like Matt Cutts, Jeff Dean, Monika Henzinger, and some other people who do that type of stuff. It was filled with ways to identify if a webpage was spammy or stale. They came up with a list of a bunch of things. It was as if they were sitting at a bar somewhere writing on napkins.
It was one of these patents that ended up getting rewritten, revised, and updated about a dozen times. It was broken down into different topics because it was too much information for one patent. For example, the Domain Registration Length patent describes a spam signal based on people registering a domain for only one year, which Google got wrong. A lot of domain registrars will register you for one year, take your credit card information, and then automatically renew you at the end of the year. So you don't register for more than one year, you register for just one year. It doesn't mean you're a spammer.
There's a study by Microsoft called "Spam, Damn Spam, and Statistics" that talked about German spammers who would get domain names and create subdomains on those domains. They would create multiple subdomains because they don't want to pay for one account. The researchers came up with a statistic showing that the more hyphens in the subdomain names, the more likely the site was spam.
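As a rough illustration of the hyphen signal Bill describes, here is a minimal Python sketch that counts hyphens in the subdomain part of a URL. The function name is hypothetical, and treating the last two labels as the registered domain is a simplifying assumption (it breaks for suffixes like .co.uk). It is not the method from the Microsoft study, just a way to see what such a feature could look like.

```python
from urllib.parse import urlsplit


def subdomain_hyphen_count(url: str, registered_labels: int = 2) -> int:
    """Count hyphens in the subdomain labels of a URL.

    Assumption: the registered domain is the last `registered_labels`
    labels of the hostname (e.g. "example.com"). A real tool would
    consult the Public Suffix List instead.
    """
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Everything to the left of the registered domain is the subdomain.
    if len(labels) > registered_labels:
        subdomain_labels = labels[:-registered_labels]
    else:
        subdomain_labels = []
    return sum(label.count("-") for label in subdomain_labels)


if __name__ == "__main__":
    for candidate in (
        "http://cheap-pills-online-now.example.com/page",
        "https://www.example.com/",
    ):
        print(candidate, subdomain_hyphen_count(candidate))
```

A count like this would only ever be one weak signal among many; on its own it says nothing definite about whether a site is spam.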
Google is trying to do the same thing in this patent with the domain registration length, but a one-year registration is not really a good sign that a site is spammy. Still, a year or so after the patent was granted, GoDaddy came out with a commercial that said, "Register your domain name for 10 years. It'll help your SEO." Matt Cutts had to come out with a video where he said that just because we have a patent doesn't mean we're using it.
That's a very good point. This fact that just because there's a patent doesn't mean Google's using it, how do you reconcile that? How do you deal with that reality if you're trying to use this strategically?
I research. Who wrote the patent? What else did they write? Did they have white papers that are related? Do they have other patents on the same subject? Sometimes patents will say what the related patents are. Once, in 2004, a patent came out from Google on phrase-based indexing by Anna Patterson, which has about 20 related patents. You don't automatically assume that if someone published one patent, they're using it. But if someone publishes 20 related patents, then there's a good chance they’re using it.
Right around the Medic Update you wrote about a patent that came out in August of 2018 about Google organizing and categorizing sites, and you can sort of see that happening within the Medic Update itself.
Yes, the Website Representation Vectors patent. It said we're going to classify websites based upon similar websites and features from those websites and use neural networks to classify those pages. We'll take queries and we'll look at query logs and we'll classify those based upon what we see in the query logs. Then if they fit in certain knowledge domains, we’ll try to match the queries with specific websites. So we may have a query like ‘treatment for diabetes,’ and we’ll only look at health-related websites written by doctors in the search results. They are narrowing down how many websites they actually have to look through to find the answer, and that makes Google much more efficient.
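To make the idea of narrowing the candidate set more concrete, here is a toy Python sketch. It assumes each site and each query has already been turned into a small numeric vector (the vectors, site names, and the 0.8 threshold are all made up for illustration) and keeps only the sites whose vector is close to the query's. This is plain cosine similarity, not the neural network approach the patent describes.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sites_in_domain(query_vec, site_vecs, threshold=0.8):
    """Return names of sites whose vector is similar enough to the query.

    `site_vecs` maps a (hypothetical) site name to its vector; the
    threshold is an arbitrary illustrative cutoff.
    """
    return [
        name
        for name, vec in site_vecs.items()
        if cosine(query_vec, vec) >= threshold
    ]


if __name__ == "__main__":
    # Made-up two-dimensional vectors: [health-ness, sports-ness].
    sites = {"health-site": [0.9, 0.1], "sports-site": [0.1, 0.9]}
    print(sites_in_domain([1.0, 0.0], sites))
```

Only the sites that pass the filter would then need to be searched for an answer, which is where the efficiency gain would come from.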
Right and if you do a query you’ll get your top-level sites like WebMD or Mayo Clinic, but if you go longtail like ‘diabetes’ you'll get your niche sites that deal with just diabetes.
Well, it says on the classification patent that Google has some types of sites where they actually want answers from laypeople. Google wants people who may live with diabetes who answer questions like what type of diet do you eat when you have diabetes. You don't necessarily need a doctor or Ph.D. to tell people what you eat every day for breakfast, lunch, and dinner if you're diabetic.
I actually just quoted you in a webinar I did that the intent in that query is experiential knowledge, not academic knowledge. That's not the kind of authority that you want or Google wants to have on the SERP. If I'm undergoing chemotherapy, God forbid, then I want to understand how to live the fullest life possible. Some doctor at Harvard is not going to give me the answer to that but somebody who actually went through that process will.
If we look at the Google Rater Guidelines, it talks about the results that they want to see, but it doesn't necessarily play out with what the algorithms are using.
So is that vindication when you see what you're reading about actually playing itself out in the wild? Is that a complete vindication, that euphoria?
I'm listening to a book on Audible called "Once Upon an Algorithm." The author is a computer scientist, and he said, "Okay, so algorithms are about computing science, not computer science, because you're not studying how computers work. You're actually studying how people compute. You're studying how algorithms are problem-solving and how people address certain problems." You see that when you read patent after patent. They often start out with, "This is the way life is now, this is the way people do things, this is the problem, and here's a solution." They solve problems. So seeing patent after patent talk about how they're trying to solve problems is really interesting and helpful. It gives you an idea of what the search engines are trying to do step-by-step to make searching, indexing, crawling, and returning results better.
Seeing patent after patent over and over again must get you inside Google's head a little bit. That's got to change the way you think about things when you analyze Google itself or think about Google itself.
Well, when people say their favorite author is Stephen King, I say my favorite patent writers are Jeff Dean, Trystan Upstill, and Emily Moxley. These are people who write about specific types of things, and when I'm looking through the patents that get published every week, I look for those names.
Yeah, Jeff Dean is a name that I don't think we talk about as much as John Mueller, but when RankBrain came out, Jeff was all over the place. It’s funny that you don't see him being so mainstream in the SEO community anymore. Do you think that's problematic?
You do see him producing videos where he's talking to a very academic audience. His position isn't a webmaster evangelist, like Gary or John Mueller. Part of their job is to talk to SEOs and the public and help people learn about search. There are other people like Dan Russell who is a specialist in how people search. He has a website where he puts up search problems. He gets people going in the comments with how they would find certain types of things, what they would look for, and what they did when they found those things. He researches things like why do people Google ‘Google’?
That's very good.
I think it'd be great if you had some of those deeper or more substantial elements more integrated into the mainstream SEO community, as opposed to talking about headers and links. Let’s get into the deeper stuff.
This is partially why I like going to the patent so much. Each patent has a discussion where they analyze why they're doing things they're possibly doing. I read one last week, which was actually from three years ago, but it answered a question that was raised in Search Engine Land about how search suggestions were appearing in response to a user’s queries based on things they have seen in the past.
Right. You were tying that into that recent feature that popped up where if you're searching for something and then a couple of days later you're searching for something similar, Google will put up a suggestion asking if you want to search for this again. That's interesting to me because in a lot of these more peripheral places you hear that Google's looking at clicks or how long you're dwelling on a page, but when people like Barry Schwartz ask Google, "Hey, are you guys using this for the algorithm?" the answer is no, and I find that interesting.
It's not being used to rank the results you’re seeing in real-time. It's being used to rank the search suggestions.
Which is interesting because why would you use it there but then not in the results?
Historically, things like dwell time and user click selection are signs of search satisfaction. They show that people want to see the results that they're finding through search. They're spending time with those pages. People from Google and the webmaster evangelists say that those are noisy. It tends to be difficult to say whether or not somebody right-clicked a search result and opened up a new tab and left that open for hours and didn't actually spend all that time there. They may not have even looked at the page, but we can't tell. Or somebody gets your search result, they click through, and they get a phone call as they're looking at it so they get their attention distracted. Again, it's not necessarily a sign that they read the page and enjoyed it.
Yes, but they did click it, but we can go down this wormhole forever.
So the snippet, the caption, the title, the URL, and the small passage of text by itself isn't the webpage, it's a representation of the web page. So if you click the search result that doesn't mean you like the page. That just means you liked what was seen as a small snippet within a bunch of other snippets.
Well, there's no real way to know if someone likes the page other than asking, "Hey, do you like this page?”
To bring it back full circle, how, in a matter of practicality, have your thoughts on Google changed over time by looking at the patents? What's your outlook now versus maybe when you started looking at these things a long time ago?
They've given me lots more questions to ask myself. Lots of avenues to explore that I otherwise wouldn’t have looked for. I've mentioned phrase-based indexing, which brought me to the concept of semantic topic models. The fact that if you get a web page that's about the White House, you can tell it's about the White House by the presence of specific phrases on the page, like President of the United States or Oval Office. These things predict that that page is about that. So those are the types of things that I want to include on my webpage. If I'm choosing certain keywords, I want to do a search on Google for that keyword. I want to look at frequently co-occurring words. There was a computer scientist who said, "You shall know a word by the company it keeps." It's a matter of certain words co-occurring frequently. That's the basis of word vectors.
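The co-occurrence idea can be tried out on a handful of pages with a few lines of Python. The sketch below is a simplified assumption of how one might do it: it takes plain-text documents, finds the ones that mention a single-word keyword, and counts which other words show up alongside it. The function name and the tiny stopword handling are hypothetical, and real phrase-based or word-vector models go far beyond simple counting.

```python
import re
from collections import Counter


def cooccurring_terms(documents, keyword, stopwords=frozenset()):
    """Count words that appear in the same document as `keyword`.

    Each word is counted at most once per document, so the result is
    "in how many keyword-mentioning documents does this word appear".
    Assumption: `keyword` is a single word; phrases would need n-grams.
    """
    keyword = keyword.lower()
    counts = Counter()
    for text in documents:
        tokens = set(re.findall(r"[a-z']+", text.lower()))
        if keyword in tokens:
            counts.update(tokens - {keyword} - stopwords)
    return counts


if __name__ == "__main__":
    pages = [
        "The president spoke from the Oval Office",
        "The president signed a bill in the Oval Office",
        "A recipe for tomato soup",
    ]
    common = cooccurring_terms(pages, "president", stopwords={"the", "a", "in"})
    print(common.most_common(3))
```

Run against the text of the pages that already rank for a keyword, a count like this gives a quick, rough list of the words that tend to keep that keyword company.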
It's fascinating to hear you talk about this because I feel like so often we get very lost in the practical tips instead of really getting into the underpinnings of where Google's going directionally, whether you're looking at patents or trying to understand the algorithms. That sort of directional look at what Google is doing gives you the ability to form an overall approach as opposed to trying these five tactical things. From my experience, it gives you a way to say, "Okay, I think I'm going to approach the whole thing this way now."
Right. Skate to where the puck is going to go.
Hey, a hockey reference. Nice.
So you get an idea of where things are going. You’re doing an SEO strategy for a client and you explain why he's building certain pages. So you design this product and you want to show examples to different audiences of your website of what the products are that you're creating. So you build specific pages for each of those audiences and show them examples. The idea here is you're creating pages that should appeal to specific audiences that answer their questions and that solve their pain points.
Plan this for them, work with your subject matter experts, the site owners in most cases, and get a good sense of what those ideally should be.
So you really use this to say that you see from the patents where Google's heading. Let me now try this with my clients. You’re really putting it into action.
That’s right. So a lot of this stuff recently has been about knowledge bases, knowledge graphs, and stuff like that, and I sort of see where Google is going with that. They were relying upon sources like Wikipedia, but they're moving away from that toward building their own. Google gets a query and will say, "Okay, we'll search for this query, we'll get the top, say, thousand websites, we'll build a little knowledge graph off of those thousand websites, and answer the question in the query based upon the knowledge from those thousand websites." It's amazing. So it's no longer using Wikipedia, which is convenient to use. The reason why they abandoned the Google Directory and using DMOZ was that those were human-edited sources that didn't scale well with the web.
So this is basically one-upping Wikipedia.
Right. So if you can use all the news sources in the world to keep track of entities, keep track of news in politics, news in entertainment, and what they're up to, you can build little knowledge graphs on the fly.
That's an unbelievable advantage.
You’re using the web as a scattered database. You've got ways to collect the information and answer questions related to it quickly.
And that makes Wikipedia, which used to look like a giant ocean, now look like a pond. That’s amazing.
Well, Wikipedia is human edited, but it also uses robots to do a lot but it still takes time. When something happens, a trade takes place in football, or a country gets overthrown, these things aren't necessarily captured really well in Wikipedia but in the news, they tend to be if you’re looking at the right sources.
And if you're a sports fan like I am, Google's very good at picking up when the player is traded. I haven't checked it to the minute, but I have checked it within the hour and it's updated.
They do that with baseball too and in the NFL Draft they were doing pretty well.
We're talking about you implementing some of these things that you're looking at and you're actually doing them. It's not perfect. It’s not that you're going into Google asking them, "Okay, what are you guys doing?” and they’re telling you what they’re doing. You're theorizing. It's a good theory but sometimes you're going to hit a wall or hit the wrong direction and you have to adjust this.
I went to school for English and I went to law school after that and it didn't prepare me to be a computer scientist. It didn’t prepare me to analyze the math and science behind patents. But some aspects of it helped. When I was an English student, we were studying literary criticism. We were deconstructing things people wrote, breaking them down into pieces, and looking at aspects of them. When I went to law school, we were briefing judicial opinions. We were breaking them down into what happened in the case, what were the facts of the case, how did the courts hold the rule of law, how did that change things. Here, we’re breaking webpages down into the same type of thing. We’re doing an analysis to figure out what works well and what doesn't work well. When somebody writes a patent they do the same thing. Here's the problem we intend to solve and here's what existed previously as solutions to that problem and why they don't work now. Here's what we're proposing as a new way to solve that problem and here's the analysis for it. It's not computer science in my mind. It’s a logical analysis of how to solve specific problems.
If you find that there's a patent that Google's not implementing now, is there a chance they'll probably use it or try to use it at some point?
They sometimes come up with things that maybe they won't implement or maybe they will. A lot of times they come up with redundant approaches like here's one way to do something interesting and then a year or so later there’s a different way.
That's very interesting. It'd be cool if you could figure out retroactively which way they went and why they didn't go the other way.
Optimize It or Disavow It
If you could spend all of your time analyzing either Google's patents or analyzing ranking patterns, ranking trends, algorithm updates, that sort of thing, which would you do?
I would definitely go with the patents. I learned so much from the assumptions they make about search, about searchers, and about search engines. With the ranking studies, we often see large correlation studies on the web where the people creating them get to the point where they're saying, "Okay, correlation infers causation." They could try to identify the causation. It's like saying that the sales of diapers increase with sales of beer at convenience stores. So they studied it and it appeared that young fathers who were going out tasked with buying diapers would reward themselves by buying beer.
That's great. That's so good.
So there was definitely causation. So you find a correlation, find the causation. See how they’re related.
Thank you so much for coming on.
I enjoyed it. Thank you.
SEO News [00:51:12 - 00:56:58]
Page Experience Update will Measure Your AMP Page in 2021:
Reminder, if you have an AMP page, Google will look at that page when the Page Experience Update hits in 2021!
Google is Testing Expandable Tabs Under Images:
Google is testing using expandable tabs under images on the Image SERP. The tabs expand to show what is basically a Featured Snippet about the topic or entity shown in the image itself.
One Million Sites Using REL=Sponsored:
Google says that a cool million sites are using the rel=sponsored link attribute.
Google to Ban Clickbait in July:
Come July Google will ban the use of anything clickbait-y in ads.
New Updates to Google Merchant Center:
Important updates to Google’s Merchant Center. Google has released new attributes that let you list product specifications, product highlights, etc. Also, come September, Google will require more specifications for certain products. For example, for clothes, you may have to list gender and size, etc. If not, the product won’t be listed.
Smart Display to Look at AMP:
Google says its smart display devices will start looking at AMP pages for content.
Tune in next Tuesday for a new episode of The In Search SEO Podcast