Rank Ranger Blog

How Voice Search Will Affect SEO


In the first part of my analysis on the possibility of optimizing for voice search I showed that there is a deep and abstract gap between the language used in voice search versus that of traditional search. Here I'll get to the heart of the matter and show you the practical problems in optimizing for voice search. The vast differences in the language of voice search, when compared to written search, truly preclude dual optimization. Simply using long tail keywords is not going to bridge the linguistic divide, despite what many in the SEO world have suggested. Unless Google can adequately bridge the gap for us, creating content that meets the linguistic requirements of both search mediums is going to be rough. Don't believe me? Read on...






The SEO Impact of the Gap Between Verbal and Written Communication 



In continuing to dissect just how far verbal communication is from written communication, and why the current path towards voice search optimization will inevitably hit a major snag, we need to have a sense of how the two forms differ structurally. 

Spoken Language - Beyond Just Less Formal 

 

When we say that to optimize for voice search we're going to have to create less formal content, there is a certain amount of truth to the notion... in a very limited sense. Yes, speech is a less formal form of communication. No one is going to do a voice search for discount flights Miami, whereas cheap plane tickets to Miami seems quite feasible. These sorts of keywords present not only longer queries but less formal ones as well.


Formal vs. Informal


However, this distinction is only skin deep. Famous linguist Wallace Chafe, in his often cited work Discourse, Consciousness, and Time, indicates that speech is not only less formal, but less purposeful as well. Contrary to what I've seen stated numerous times within the SEO community, verbal language is less precise than written language, as the intent is not always as transparent. As opposed to writing, which is often succinct and purposeful, when we talk we often do so without thinking, off the cuff so to speak. The lesson here is that just because a spoken query is longer does not mean it is less ambiguous.

Unless Google is able to pick up on intent, how are you supposed to optimize for a form of communication whose intent is often shrouded within the language itself? Though queries like discount flights Miami and cheap plane tickets to Miami may be equally discernible and return the same sort of results, what about a search for breathing process vs. how do we take in air?


Voice Search Compared to Written Search

Figure 1 - The vagueness of a voice search produces some results that reflect the true intent of the query while simultaneously causing Google to rank irrelevant sites as well


For the latter query, how do we take in air?, Google has trouble with the intention of the query and offers mostly irrelevant results. Notice, Google did include some pertinent results, but included a result for cleaning an air duct as well! The mix is a clear indication that Google had a tough time deciphering the intent here. However, think about the query, how do we take in air... I would venture to say most of us would understand the intent based on prior experience with the phraseology. In other words, here, the longer query with its extended language did not make the query any clearer; in fact, the opposite occurred.

You might be asking something like, "Come on, who in their right mind will search for a query like how do we take in air?" It is an odd query, I'll give you that, but think about it for a second... it's Sunday... you've knocked back a couple of beers while watching the game... your kids are running rampant around the house... your wife is asking you to call in an order for dinner at the local pizzeria... whose number you don't know... just how coherent is your query going to be?

OK, I'll give it to you, the above query is odd. However, even with more refined language, Google still has a hard time with a voice-oriented search. See then how the results for the query how do we breathe differ from those of its more formal written counterpart breathing process (see Figure 1).


Voice Search without Dialect Influence
Figure 2 - Google shows mostly different results and a different video thumbnail when standard language is used in a voice-type search, compared to traditional written search phraseology


While the featured snippet is the same here in Figure 2 as it is for the query breathing process, only three of the results are the same. Notice also that the YouTube video Google offers is different as well. Take a look at the last result in the image above. It's related to breathing techniques, which you'll have to admit have nothing to do with the intent behind the query how do we breathe. It all just goes to show you how hard it is to optimize for a verbal oriented query at the same time as trying to optimize for a more precise written query. 
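If you want to put a number on a comparison like this rather than eyeballing two screenshots, here is a minimal Python sketch. It assumes you have already collected the result URLs for each query by hand or with a rank tracking tool; the function name `serp_overlap` and the URLs are placeholders for illustration, not the actual results from Figures 1 and 2.

```python
def serp_overlap(results_a, results_b):
    """Return (shared URLs, Jaccard similarity) for two lists of result URLs."""
    set_a, set_b = set(results_a), set(results_b)
    shared = set_a & set_b
    union = set_a | set_b
    # Two empty result lists are treated as identical.
    jaccard = len(shared) / len(union) if union else 1.0
    return shared, jaccard


# Placeholder URLs (assumptions), standing in for a written and a voice-type query.
written = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/d",
    "https://example.com/e",
]
voice = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/x",
    "https://example.com/y",
]

shared, score = serp_overlap(written, voice)
print(len(shared), round(score, 2))  # prints: 3 0.43
```

A score near 1 would mean Google treats the two phrasings as the same query, while a low score suggests it is reading a different intent into each.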



The Impact of Culture and Dialects on Spoken Language and Why It Matters for Your Voice Search SEO 



In continuing with this notion that oral language is beyond just less formal than written language, linguist William Bright outlines that oral language, unlike written language, exists outside the boundaries of standardization. This is a huge problem when trying to optimize for both traditional and voice search, and may actually be why it is impossible to do so.

This issue of oral language not following set formulation and not being standardized is a very broad issue (which just makes optimization all the harder). Let's then break it down a bit and deal with two major facets of this problem, culture and dialect. 

1. Culture - Voice Search Optimization's Foe 






My introduction to the disparity between oral and written language was within the context of the impact of culture on oral communication. Day in and day out I saw firsthand how difficult it was for kids, whose oral language had heavy cultural influences, to write in a standard form of English. Not only was the distance wide, but closing the gap between the two was without doubt one of the most challenging tasks I've ever undertaken. According to one teacher training service "oral language is the primary mediator of culture," a sentiment I heard many a time when undertaking my own graduate studies in education.

This is further evidenced by a study done by the National Academies and published by the National Academies Press. The study notes that students may find difficulty in acquiring proficiency in formal language due to the notable structural differences in their oral dialect. So take it from me, optimizing for both a formalized method of communication and one heavily influenced by culture is beyond difficult, if not impossible.

Culture, vis a vis language, takes on a very broad definition (which further complicates the task of voice search optimization). Meaning, culture in the case of language doesn't only refer to ethnic background, it can even refer to aspects of society such as age. Think about the way millennials speak... now think about the way your grandparents might speak... now think about optimizing for both demographics!

So forget voice search vs. traditional search for a minute, even within the framework of just voice search, optimization is an uphill battle if Google can't figure out a way to translate culturally influenced dialect.

Watch what happens as things stand now, where Google often does not understand cultural dialect as well as it should. Let's do a search for what's the cause of the world's global problems and compare it to the way my much younger millennial sister would ask it: why is there so much drama in the world:


Generational Language in Voice Search
Figure 3 - The use of generation specific language results in Google showing superfluous search results


In this case, the lengthier and less formal "voice type" search produced on-target results relating to the world's poverty and environmental problems. Not so in the second search where the word drama replaces the word problems. Here, the first result is a Yahoo Answer related to the entertainment industry, while the second result is a Bob Marley song! 

OK, now this is my favorite... what if we went full millennial and ran a voice-like search using the following: like why is there so much drama in the world:


Voice Search Results Heavily Impacted by Generational Language
Figure 4 - A search employing heavy usage of cultural specific phraseology produces some bizarre results at the top of the Google SERP 


You're reading this right... Snoop Dogg's hit song Gin and Juice is the source of the world's problems. Evidently my grandparents work at Google. 


2. The Devil is In the Geographic Dialects



Geographic dialects are a sort of extension, or offshoot, of the cultural element of oral communication. Like the other problems I've outlined, geographic dialects should not be thought of as a problem only within English, but within almost any language. Be that as it may, regional dialects are in a linguistic position to wreak havoc on voice search (unless, that is, Google finds a way to equalize them behind the scenes).


Regional Dialect


As mentioned before, from a search perspective, the beauty of written language is its standardization and uniformity. Folks in Fargo, North Dakota will enter the same search terms as people in Philadelphia. However, folks in Fargo will not perform a voice search with the same sort of vernacular as people in Philadelphia, generally speaking of course.

Think about how many regional dialects there are just in the US alone! There are actually more than you realize. According to some accounts, over 25 variations of English are spoken in the United States. If we look at English across the globe, you're talking over 150 variations.

Sure, some vernaculars are not that far off from others, but English is just one of the globe's languages. How is Google going to decipher and equalize all of the spoken dialects, of all of the languages humanity speaks? Because if it can't, how are you going to optimize your site for so many variations of the same language?

Does Variation in Spoken Dialect Really Have an Impact on Search Results?

 

It's hard to appreciate the gap between some of the ways different people speak within the United States (and my apologies to the rest of the world, but I simply don't feel informed enough to discuss any languages or dialects other than my mother tongue within my mother nation). So let's run through a few examples to see just what sort of havoc dialectal differences can wreak on search results. Before I do so, I do want to say that Google does do a decent job of deciphering dialectal differences at times, but the examples below go way beyond that!

Region Specific Nouns & Their Effect on Search Results 



I as a New Yorker call sprinkles, you know the colorful little pellets that go on ice cream, sprinkles. My wife, being from Baltimore, calls sprinkles... Jimmies. Google is able to pick up on the insulting term my wife uses for the sugary substance and knows that she really meant "sprinkles." However, the results associated with the two terms are totally different. Google, knowing my wife is wrong (that's OK, she doesn't read "SEO geek stuff" as she calls it, so I can say what I want here), presents a plethora of results related to the term, not to the substance itself.


Voice Search - Regional Speech

Figure 5 - Regional word choice has a deep impact on search results


Moving up my family tree, how would Google handle my father? My father is a typical "Brooklyn-ite" and drops the "a" at the end of a word in favor of an "er" sound. Growing up, my father would say things like, "So I says to him, I got a great idear, let's get some cold soder." I'm actually not joking, I promise you. He's gotten better in his "older" age, but it still slips out. So here's how Google would handle my father's regional dialect:


Local Accent Impact on Voice Search Results

Figure 6 - A Google search is rendered ineffective when unusual, yet commonly used, regional dialects are present in a voice-like search



Not even close, not even a Did you mean: best soda in the world. I can actually imagine the exchange between Google Home and my father going around in circles for hours as Google struggles with my father's "Brooklynese."



Region Specific Pronouns & Their Effect on Search Results 



Region specific nouns have nothing on region specific pronouns simply because of the latter's frequent use. When I moved to Baltimore to teach in the inner city, one of the things that I had a hard time getting used to was hearing "y'all" as the phrase of choice for "you all" or "you're all." I was born and bred in New York, and even though the phrase "are you" was commonly bastardized as "you'ze"... as in, "you'ze gonna finish that cannoli?"... hearing "y'all" constantly was just out of place for me. However, I can't tell you how often the phrase "y'all" was used in Maryland. It's just part of the way people talk there, and if you're doing a voice search from the great state of Maryland all y'all at Google are gonna hear the phrase "y'all" quite often, y'all understand?

Let's see how things go with the word y'all when running some searches. 

I can imagine my brother-in-law sitting around the living room at Thanksgiving dinner debating the finer points of car engineering with other unnamed family members and in frustration turning to Google Home (which I can't imagine he'd actually ever own) and, in his amazingly thick Baltimore accent (and sophisticated manner), asking it something like what y'all need to know about cars:


Voice Search Local Phrase

Figure 7 - A common regional vernacular wreaks havoc on Google's results towards the bottom of the SERP 


 
In this scenario, Google offers up some decent results towards the top of the SERP... but it's all downhill from there. The second half of the SERP includes results for books entitled "Absalom's Daughter," "Ferguson Interview Project," and "Murder by the Collar"... definitely not what my brother-in-law had in mind.

Compare this to the more "standard" query of what you all need to know about cars... while some of the results remained the same, "Absalom's Daughter" is no longer on the SERP. The point: Google had a very hard time processing the very commonly used word, y'all.


Voice Search without Local Phrases

Figure 8 - A more standard English within a query structured to mimic a voice search yields pertinent results 
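To make the idea of "translating" dialect into standard search language concrete, here is a deliberately naive Python sketch. The lookup table `DIALECT_MAP` and the function `normalize_query` are hypothetical, built only from the handful of examples in this post, and say nothing about how Google actually processes a query. The second query assumes my father's search was phrased as best soder in the world.

```python
import re

# Hypothetical lookup table (an assumption), built only from this post's examples.
DIALECT_MAP = {
    "y'all": "you all",
    "soder": "soda",
    "idear": "idea",
    "jimmies": "sprinkles",
}


def normalize_query(query, mapping=DIALECT_MAP):
    """Swap known dialect words for their standard equivalents."""
    def swap(match):
        word = match.group(0)
        # Unknown words pass through unchanged.
        return mapping.get(word.lower(), word)

    # Match runs of letters and straight apostrophes so "y'all" stays one token.
    return re.sub(r"[A-Za-z']+", swap, query)


print(normalize_query("what y'all need to know about cars"))
# prints: what you all need to know about cars
print(normalize_query("best soder in the world"))
# prints: best soda in the world
```

A table like this only handles one-for-one word swaps that someone has thought to list in advance. It does nothing to recover intent, and every new region or generation means another set of entries, which is exactly why hand optimizing for each dialect doesn't scale.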



Regional/Cultural Grammar and Sentence Structure Affect Search



This is where I think Google will have the hardest time unifying voice and traditional search, and again, if it fails to do so, we have a real optimization problem. This is not just where one word replaces another word in a sentence (like "y'all" does for "you all") but where the whole structure of the sentence is altered. One of the things I remember most from teaching English was the unique word constructs many of my students would use when speaking. Often, students would use the word "gots" in place of "got to" when speaking.

So let's try a search using the keywords who's gots the most money in America... and the results are:


Cultural Word Choice in Voice Search

Figure 9 - A voice-like query using culturally influenced language results in the user intent being misunderstood by Google 


Now let's try a search that leaves vernacular to a minimum, who has the most money in America: 


Limited Vernacular Voice Query Results

Figure 10 - A search that is absent of culturally influenced sentence structure shows only those results related to user intent 


The results here are fascinating. First, each query hit the nail on the head with the Forbes 400 list being the top result. What's really interesting though is that Google in this case overcompensated. For the query who's gots the most money in America Google included two results related to music, thereby misunderstanding the intent of the query due to the cultural element of the keywords. Google created an equality between the unique cultural dialect and intent, an equation that is simply not true. This meta-falsity reflects a serious problem Google has in how it determines the intent of voice searches (more on this and RankBrain later).   

Ready for another? According to Wikipedia the term ain't, which is technically not a word, is "among the most pervasive nonstandard terms in English." As you may know, ain't often replaces the phrase is not (among others). The usage of this word often affects the entire structure of the sentence. So let's see how Google does with this common term of spoken English. To start we'll go with a voice-type search that does not make use of the slang term - why isn't there any air in space:


Standard Spoken English Voice Search

Figure 11 - A voice-like search utilizing a more standard form of language produces a highly relevant featured snippet along with authoritative search results



Compare the above to the results we get when we swap isn't for ain't, as in, why ain't there no air in space:


Voice Query with Grammatically Incorrect Structure

Figure 12 - A mock voice query that makes use of regional dialect and sentence structure returns results unrelated to the initial query


Everything looks good when you look at the results for the query why isn't there any air in space (Figure 11). The featured snippet is on-target, the results include authoritative scientific sites, etc. This would be a case where you can say that Google did a good job in what for all intents and purposes is a standard language voice-type search.

Not so with the search that includes the unique vernacular. I'm scratching my head trying to determine what Google was thinking here when it gave me these results. I mean, the top four results (see Figure 12) refer to a quote from the famed cartoon comedy series, The Simpsons! The fifth result relates to some wacky conspiracy theory that we never landed on the moon. The sixth result is good, other than it not answering the query's question, though at least it's a legitimate site offering actual scientific content. The next result asks why Supergirl can't breathe in space... really Google? As the Robot from Lost in Space would say, "does not compute, does not compute."

I want to point out that we still haven't searched for the reason why there is no air in space as we normally would in writing. If we took a more traditional approach to search we might try, reason no air in space, which would yield: 


Different Featured Snippet - Written Search vs. Voice Search

Figure 13 - A query utilizing a typical written search linguistic structure returns a featured snippet from the same site as a voice-like query, only with different content


The results here are just downright amazing to me. Notice that the featured snippet is totally different from the one in the previous "voice search" (why isn't there any air in space - Figure 11), yet comes from the very same site and exactly the same URL. Why did Google offer a featured snippet that comes from the same site, but offers different content? This is very significant in light of the fact that Google Home offers many of its answers by way of the featured snippet. Someone doing a voice search here would get different information than those doing a written search. This again highlights the vastness which separates voice and traditional search as a whole.



How Will Voice Search Impact Local SEO?



If you're a local business trying to optimize for voice search, and as in the cases shown above can't rely on Google to "translate" from spoken English to written English, what are you going to do? Are you actually going to optimize your site for the word "ain't"? How ridiculous would that be? Worse yet, would it even be possible to optimize for all of the ways of speaking even within a certain region? The upshot of a more standardized form of language is that it creates a sort of level playing field that gives you something to work with.

When I was a kid growing up in New York I wasted an enormous amount of time watching Sunday afternoon TV. For some reason Sunday afternoons were prime time for car commercials that I found annoying and irrelevant in the wisdom of my youth. The commercials always ended with, "contact your local tri-state (insert car maker's name) dealer..." In this case, the ad was directed at people living in New Jersey, New York, and Connecticut (i.e. the tri-state area).

Local SEO doesn't necessarily mean you're covering a 10 square mile area. In fact, you could be covering quite a large geographic area, as illustrated with the car commercials of my youth. Such areas could include various regional dialects and ways of speaking. What are you going to do, optimize for them all?




Let's say you could optimize for the local regional dialect. Say your target area is Manhattan and the other four boroughs of New York City (i.e. Brooklyn, Queens, Staten Island, and the Bronx) and you've optimized for that classical New York lingo... fantastic, but did you know that New York's Astoria is one of the most culturally diverse places on the entire planet? How are you going to optimize for all of the various ways each culture approaches the English language? What do you do if you're trying to optimize a site for a plumber in Astoria when considering voice search and Google's possible inability to "translate" language variants?



RankBrain's Role In Voice Search Optimization  

 

It would appear as if Google knows what we as an industry may not, that voice search optimization is not compatible with the optimization we currently work towards with traditional search. RankBrain is not simply a tool to decipher synonymic terms such as best hotel in NY, or what is a good hotel in NY, or greatest NY hotels, and so forth. RankBrain is really about determining user intent when different linguistic constructs are presented.

In other words, RankBrain is Google's attempt to unify and translate between the linguistic compositions of both oral and written language. A written language that is really a unique form of language which I'll call "search language." Google is not investing so much time and money into a mechanism that can determine synonym phraseology. Google is pouring resources into something that can understand the unique qualitative fabric of oral language and match it to content that employs the unique qualitative fabric of written language.

This is the true novelty in determining user intent.  Determining user intent across surface word choice is something that a good thesaurus could do. Only AI (or an actual person) however, can determine intent across linguistic mediums that are built from different communicative substances.




This is precisely why you can't really optimize for RankBrain. The very notion is a contradiction. If RankBrain itself is a mechanism that in a sense can translate between two incompatible linguistic systems, how could you optimize for it? If you could optimize for something like that, you wouldn't need RankBrain! To optimize for something like RankBrain would mean that you've somehow solved the linguistic contradiction between the two communicative mediums. 

Somehow we're ignoring what Google itself is telling us via its admitted need for an AI system to determine user intent across incompatible communication mediums. Saying that voice search can be included within the "search universe" by simply optimizing for long tail keywords and more informal content goes against the subtle signals Google itself is sending out. So if you don't believe me and my notion that the two mediums of language can't co-exist, then take it from Google, which is backhandedly admitting the same thing.



What Needs to Happen In Order to Create Voice Search Harmony


The question of course is can Google be successful in using AI to decode voice search and align it to written content? Not to beat a dead horse, but one thing is certain to me, if we are left to optimize for voice search and Google cannot bridge the gap between the various mediums of language, no amount of long tail keywords or casual content is going to help in a systemic manner. I personally find it interesting that Google has released its voice search product with an AI system that is clearly still learning.


Understanding User Intent - AI


Perhaps this reflects Google's confidence in its ability to synthesize the forms of search via RankBrain. Perhaps, however, Google sees a certain future that is a voice dominated search industry where written content as we know it will be phased out in favor of orally inclined content. Of course, as I've pointed out in this series, there is no way to write content that matches oral discourse. In such a scenario it could be that Google is banking on our ability to move content as close to the structure of oral language as possible, in which case RankBrain would have a much narrower gap to bridge. Such a theory does present an interesting notion, namely that Google sees voice search as its future.

Time will tell, we'll have to wait and see... I'll let you know as soon as I get word of this theory's veracity.  


About The Author
Mordy is the CMO of Rank Ranger as well as the host of The In Search SEO Podcast. Despite his numerous and far-reaching marketing duties, Mordy still considers himself an SEO educator first and foremost. That's why you'll find him regularly releasing all sorts of original SEO research and analysis!



