Google's Ability to Understand Entity Sub-Profiles [Case Study]
July 31, 2019
Google has gotten really good at recognizing entities and how various entities relate to each other. There's no doubt about it. That said, how well can Google truly profile an entity? What happens when an entity has more than one profile? Is Google able to pick up on entity sub-profiles? What if someone is both an actor/actress and a director? What happens when a celebrity goes into politics? How does Google view the secondary profile of these entities?
What is an Entity's Secondary Profile (And Why Do I Care)?
Before I go on and explore Google's ability to understand entities, I think it only makes sense to explain what I mean by a "secondary profile." Most entities, be they corporations or people, have a relatively singular identity (at the public level). Microsoft is a technology company. Abraham Lincoln is a former US President. Jean Claude Van Damme makes tacky yet enjoyable action movies.
However, sometimes it's not as simple as that. Disney is a series of theme parks as well as a producer of movies. Ronald Reagan is a former US President who was also a famous actor. Jay Z is a musical artist but also an entrepreneur.
Disney's Knowledge Panel presents a diverse set of information that deals with the entity from multiple vantage points
The ability to pick up on these secondary profiles becomes a bit of a litmus test of Google's ability to truly understand an entity. That is, can Google pick up on an entity's secondary profile? How dominant does that other profile have to be for Google to be aware of it? Does Google compensate when profiling entities by defaulting to generic classifications? The answers to these questions, in a way, reveal the extent of Google's entity understanding.
Assessing Google's Ability to Understand Entities
How do you go about assessing Google's ability to understand entities and subsequently the search engine's ability to pick up on secondary profiles? You could look at the Knowledge Panel. In fact, many would jump there first. However, I don't think it's really the best place for an advanced look at how Google understands an entity.
For starters, the copy within it is pulled from Wikipedia. So while you may see the panel's summary talk about an entity's multiple profiles, that may just be the result of copy that got pulled in as part of the overall Wikipedia entry. For example, the Knowledge Panel for famed actor and director Clint Eastwood mentions he is an "actor, filmmaker, musician, and politician." Does this mean that Google is aware of each of these profiles? Does Google know Mr. Eastwood (as I call him Clint - we're on a first-name basis) as both an actor and filmmaker or even as a musician? Or, did Google include these elements of the entity's profile accidentally? Is the reason the Knowledge Panel mentions Clint Eastwood 'the politician' because Google knows he's had a political life or is it simply because it was part of the snippet of content that talked about him as an actor?
A Knowledge Panel for 'Clint Eastwood' presents multiple elements of his diverse career
For this reason, and because I do think that Google only includes Mr. Eastwood's political profile accidentally (i.e., Google chose a section of Wikipedia content to use and his political quests just happened to be mentioned in it) the Knowledge Panel is not the best place to get a real look at how Google can or cannot recognize an entity's secondary profile.
Now, you may argue that the 'People Also Search For' element, often found within a Knowledge Panel, would help us see if Google has a holistic understanding of an entity. I think not. First off, the people shown in the carousel come from the number of searches done in relation to the entity featured in the Knowledge Panel. That does not reflect an intrinsic understanding of the entity. Moreover, the results don't really reflect a topical look at the entity. Have a look back at the 'People Also Search For' results in the Knowledge Panel above. Do the results reflect any of the topics indicated in the Wikipedia summary? No, the folks listed are all somehow related to the actor. Thus, the element is not much help for our purposes.
What then can help us get a deeper sense of Google's entity understanding capabilities? Related Search Boxes.
These are boxes, or more likely a series of boxes, that appear at the bottom of the Google SERP. Each box contains a topical title with a series of cards within it that represent specific topical elements.
Here's what the feature looks like for the keyword yankees
Here, Google parses the entity that is the famed baseball team, the New York Yankees, into multiple topical categories. Google knows it is a Major League Baseball team, and therefore shows other "MLB teams." Google also knows it's a New York sports team and subsequently shows a box containing other "New York Sports teams."
These boxes reflect how Google dissects an entity, understands an entity, and parses its profile as a segue to other related profiles far better than any other SERP feature, including the Knowledge Panel.
What then do these Related Search boxes tell us about Google's ability to see an entity as it truly is, including secondary/sub-profiles?
Data on Google's Ability to Profile Entities
Coming up with a list of entities that have a strong secondary profile is not as easy as it sounds (though it was a lot of fun). When it was all said and done I came up with a list of 50 entities, all people (not corporations) that have strong secondary profiles.
For example, Elvis Presley is on my list. Elvis is, of course, famous for his music. At the same time, and perhaps more so at the height of his stardom, Elvis is very much known for the 35+ movies he made. Everyone knows the song Jailhouse Rock but it only became as popular as it was due to the movie, Jailhouse Rock.
In creating this list I made sure to include entities where the secondary profile could easily be considered the dominant or primary profile, as well as those instances where the secondary profile was clearly that, secondary.
In each case, I analyzed the Related Search boxes shown for the entity (in each search only the entity's name was used) to see if the secondary profile was represented and, if not, what was shown instead. (Since a lot of this analysis is a "judgment call," I will be sure to go through as many cases as I reasonably can so that you can see for yourself if you agree with my analysis.)
Overall, out of the 50 entities I looked at, only 11 had their secondary profiles represented in the Related Search boxes. That's just 22%.
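For anyone who wants to run a similar audit, the tally above can be sketched in a few lines of Python. This is only a rough illustration: the `EntityCheck` record and the `coverage_rate` helper are hypothetical names made up for this sketch, the labels are hand-entered judgment calls (nothing here queries Google), and the five sample rows are just a handful of the cases discussed in this post, not the full list of 50.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityCheck:
    """One hand-labeled observation from a manual SERP review (hypothetical structure)."""
    name: str
    secondary_profile: str
    represented: bool  # did a Related Search box reflect the secondary profile?


def coverage_rate(checks):
    """Share of entities whose secondary profile showed up in the Related Search boxes."""
    if not checks:
        raise ValueError("need at least one observation")
    hits = sum(1 for check in checks if check.represented)
    return hits / len(checks)


# A small illustrative subset of cases covered in this post (not the full 50).
sample = [
    EntityCheck("Ronald Reagan", "actor", False),
    EntityCheck("Arnold Schwarzenegger", "politician", False),
    EntityCheck("Sonny Bono", "politician", True),
    EntityCheck("Bo Jackson", "second pro sport", True),
    EntityCheck("Mark Cuban", "NBA owner", False),
]
print(f"Sample coverage: {coverage_rate(sample):.0%}")

# The headline figure, rebuilt from counts alone: 11 hits out of 50 entities.
headline = [EntityCheck(f"entity {i}", "n/a", i < 11) for i in range(50)]
print(f"Headline coverage: {coverage_rate(headline):.0%}")  # 22%
```

Keeping each judgment call as its own record makes it easy for someone else to disagree with a single label and see how much the overall percentage moves.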
However, the numbers don't demonstrate the full extent of the gap Google seems to have in understanding an entity's secondary profile. For that, we need to look at what actually does and does not appear within the Related Search boxes for some of the entities I looked at.
A Case-by-Case Look at Google's Ability to Discern an Entity's Secondary Profile
When thinking about entities with more than one profile, with a strong secondary profile, I automatically leaned on people, famous people. I didn't want to get into corporations as that could be a bit messy depending on the corporation and if it has official subsidiaries. That would make things a bit murky. Can Google really pick up on a secondary profile or does it simply know that "Corporation A" has various official subsidiaries? (Google's parent company, Alphabet, would ironically be a great example of this.) Thus, I stuck to people. When it comes to people, Google can either identify a given entity's multiple profiles or it can't. There's no gray area (or less so when compared to corporations).
When dealing with people, particularly famous people, as entities with multiple profiles, a few categories naturally emerged. You have your entertainers who became politicians, then you have your actors/actresses who became directors, and so forth. Of course, you have some unique entities/people who have a very singular secondary profile. In this case-by-case look at Google's ability to parse out an entity's secondary profile, we'll look at both more "traditional" profiles and some profiles that are a bit more unique.
Entities with Secondary Profiles: Celebrities with a Political Sub-Profile
I want to start off with the entities that have the strongest secondary profile. In other words, let's make it easy for Google and see what happens. By far, the most notable secondary profiles come from celebrities (movie stars, athletes, etc.) who have entered politics.
The most famous of these cases (from a political perspective) is Ronald Reagan, America's 40th President. Prior to becoming the President of the United States, Reagan was a renowned actor with 81 film credits and a star on the Hollywood Walk of Fame. Bottom line... it's very, very, very well known that Reagan was both an actor and a President... unless you're Google:
The first Related Search box, on "US Presidents," is obviously a logical choice. That said, I would have expected something related to Reagan's Hollywood life to appear (something along the lines of a box headed by "Celebrity Politicians"). Instead, we get a very odd "Person Or Being In Fictions." I don't even know what that means exactly, but aside from being an "eclectic" group of people, none of the results are "fictitious."
Perhaps you'll argue that there is not enough to fill a box of Related Searches with "celebrity politicians." Well, for starters, Google regularly shows a Related Search box with just two or three results. That said, let's explore a few more "celebrity politicians" just to dispel any doubt.
If Ronald Reagan is the most famous politician with a Hollywood background, then Arnold Schwarzenegger is the most famous celebrity with a political background. It seems though, that being the Governor of California for multiple terms is not enough for Google:
Here too, aside from the lack of any political reference, the choices shown are a bit peculiar in and of themselves. Arnold was not a childhood star. If you want to argue that the results are there due to his former wife, Maria Shriver (the first celebrity listed in the box), I am not sure being the niece of JFK qualifies as being a childhood celebrity.
Here's the icing on the cake. Have a look at the Related Search boxes for Sonny Bono (you know, Sonny and Cher) who happened to serve in the US Congress:
Here we get a Related Search box for "Republican Celebrities" that includes... Arnold Schwarzenegger! Please note that this, in terms of the entity in question (Sonny Bono), is a case where Google does get it right by picking up on the secondary profile (i.e., Sonny Bono not as a celebrity, but as a politician).
Just to offer a bit of diversity within the category.... What would you say should appear for a Hall of Fame NFL football player who sat on the Minnesota Supreme Court for well over a decade?
I would expect there to be something related to other judicial figures. All the more so when the second box is merely a list of other famous alumni who went to the Honorable Judge Page's alma mater! (As an aside, we shall see Google use an entity's higher education background as a crutch of sorts.)
Just to highlight how strong the profile for Page's legal career is, have a look at the top of the SERP for a query using just his name:
Both Wikipedia and the Knowledge Panel list his profession as Jurist, not football player. The video carousel shows a result of him receiving the Presidential Medal of Freedom... for his work as a jurist!
Entities with Secondary Profiles: The Military as a Sub-Profile
One of the more interesting entity subsets that I looked at dealt with military service being part of the entity's secondary profile. There's a long list of athletes who have served in the military. I don't mean instances where the athlete did a short compulsory stint, as was common during World War II. I mean folks who either went to a military academy (which is followed by compulsory service) or who volunteered to serve.
I'm not expecting Google to pick up on every athlete's service. Pittsburgh Steelers player Alejandro Villanueva was a Captain in the US Army. Not many people outside of Steelers fanatics would know that. But what about a world-famous athlete whose nickname is the Admiral?
David Robinson, one of the best basketball players to walk this earth, is a graduate of the US Naval Academy. Having grown up in his era, I can tell you this was a fact every kid knew. It was such a part of who Robinson was as an athlete that he was dubbed "the Admiral." In fact, the summary of his Knowledge Panel mentions this fact and presents an image of him in uniform:
However, as soon as we move down the SERP to the Related Search boxes, all mention of his naval service appears to have been forgotten:
Obviously, the first and third boxes are highly relevant and I would not expect anything related to military service to appear in their stead. However, the second Related Search box which represents a completely forgettable children's movie is an odd choice considering that military service played such a large role in Robinson's actual basketball career! It's not like Robinson had top billing in the film. In fact, neither he nor any basketball player appears on the initial cast page that represents the actual stars of the film.
The same is true for famed director Oliver Stone. Stone is famous for his take on America's military history. The basis of this was his personal experience in the Vietnam War. Here too, the SERP itself indicates that Stone's military service is a strong part of his overall profile:
Wikipedia's rich results clearly indicate that the director's military service is a significant part of his profile. Like with Alan Page, the video carousel hints at the entity's secondary profile being quite strong as it directly references military service.
In this particular case, I would understand Google leaving out a Related Search box related to the military. However, when we look at what Google does present, you have to ask if "Yale University Notable alumni" is really more relevant to extending a user's search journey vis-a-vis Oliver Stone:
The closest thing I've seen to Google recognizing a military sub-profile is for Pat Tillman. Pat Tillman was an American football player for the Arizona Cardinals. He was good. He was hardly famous at the national level though. That is until he forfeited his football career to serve in the US Army in Afghanistan. Tillman reached national celebrity when unfortunately he was killed in action.
In this case, there should 100% be something related to Tillman's military service. I would argue that his service is his dominant entity profile. That said, here is what Google offers the user:
Again, there is this 'default' move to show "notable alumni." Now, the second box is for the Arthur Ashe award. It's an award given to those who face adversity head-on. This is not a total miss, but I wouldn't call it a hit either as it does not speak to the entity's profile as a member of the US military per se.
Just to highlight the point, here's what Google gives us for former US Senator and famed Vietnam POW John McCain:
The last box is perfect.... Great. The set of "Speech Writers," however, is a bit perplexing. Be that as it may, you would have to think with a guy like Senator McCain that there would be something related to Vietnam or the US Navy.... something. Well, something other than other graduates of the US Naval Academy.
Entities with Secondary Profiles: Other Notable Cases
Before moving on to instances where Google is able to identify an entity's secondary profile I wanted to present a few cases where I believe understanding the entity via a secondary profile should be reflected on the SERP.
I'm not trying to beat a dead horse here. Rather, these instances are such prime examples of Google not seeing an entity's secondary profile that I had to present them here.
First up are Bo Jackson and Deion Sanders. I hate to use too many sports examples, but these two fellas are famous for playing two sports at the professional level and playing them both exceptionally well. Both Jackson and Sanders played baseball and football at the professional level receiving extreme levels of notoriety for both sports.
Yet Google only recognizes Jackson's secondary profile:
This one is not a sports example, but it does touch on basketball ever so slightly. Mark Cuban. Famous for being on Shark Tank. Famous for his business ventures. Famous for being the very outspoken owner of the Dallas Mavericks basketball team. Yet it seems the university he attended is of more relevance than owning an NBA franchise:
This example really stands out to me. In some of the other cases I used, you could argue that it's hard to show a Related Search box about other military sports figures and the like. But how hard is it to show a box with other NBA owners? Super easy!
Or how about Ben Franklin? Everyone knows about his kite and the lightning bolt! Indeed, a Featured Snippet for "what is ben franklin famous for" says he was famous for his dabbling in electricity:
Yet, as with Ronald Reagan, we get this odd Related Search box for "Person Or Being In Fictions":
I don't think it would be hard to have a box for "Famous Inventors."
Despite the examples you've already seen I've managed to save the best for last. Michael R. Bloomberg. Michael R. Bloomberg as in he who has a net worth of over $60 billion. Michael R. Bloomberg as in Bloomberg Magazine. He is one of the top 10 richest people in the world, and he happened to be a three-term mayor of New York City.
If you asked me what this entity's dominant profile is, I would say it's his status as a business mogul. You could make the case for it being his role as mayor of NYC. At a minimum, it's a toss-up. Yet, Google's Related Search boxes show nothing related to Bloomberg's wealth, business, or media corporation:
It's hard to find a stronger secondary profile than the sub-profile that belongs to Michael R. Bloomberg.
Where Google Gets It Right When Analyzing Secondary Profiles
I don't want to give off the impression that Google does not do a good job recognizing entities. It generally does. Though, I will say that the Related Search box could use a bit of TLC as there are often some "surprising results."
That said, Google can and at times does pick up on an entity's secondary profile.
Indeed, when it comes to actors turned directors (with the exception of Clint Eastwood, as mentioned earlier), Google does a good job hitting both profiles:
I had to ask myself why would Google pick up on the fact that both Howard and Favreau are actors turned directors but not Clint Eastwood? This is where being into "pop culture" helps my SEO career. Both Howard and Favreau are more famous as directors. However, they did appear as actors in some pretty iconic movies/TV shows.
Eastwood, on the other hand, is famous for winning Oscars as a director but is iconic for his acting roles. In other words, there is a bit more room to focus just on Eastwood's acting and not his directing. That said, Eastwood not being recognized as a director within the Related Search boxes is a major miss. (By the way, Google still throws in a set of "alumni" for Ron Howard... peculiar.)
Here's another great example where Google gets it right... Dwight D. Eisenhower (who was both a US President and famous WWII general):
If we want to look at a less "ceremonious" entity, have a look at Dwayne Johnson (aka the Rock) as Google nails it:
To me, both Eisenhower and Johnson offer very strong secondary profiles. Both achieved great fame and notoriety in their multiple roles. I would call it extreme notoriety. I very much think it is because of this extreme notoriety that Google is able to pick up on the "secondary profile" (one that could very well be the dominant profile). Although, seeing Google's treatment of Michael Bloomberg would call this into question. Still, it is the best working theory that I have (namely, that for Google to pick up on a secondary profile it must be on par with the dominant profile).
The Practical Implications of Google's Secondary Profile Recognition
With these kinds of things, the implications are as diverse as the sites found on the web. That said, one thing does stand out to me. Considering that sites are also profiled as entities these days, ensuring you have a single core identity seems to be a priority. However, if you do want to give your site or brand a secondary profile, if you do want to venture out into new areas and endeavors that fall a bit outside your core profile, you need to ensure it's beyond substantial. Based on what I've seen here, if your brand or site aims to have a secondary profile, it would need to be nearly as strong as your core.
We've seen plenty of strong profiles ignored by Google in the examples above. Sites being entities means that your secondary profile is subject to similar issues. I would personally be very cautious in creating a secondary facet to my site's or brand's identity, ensuring it would be nearly as dominant as my core profile before venturing out.
A Small Stain On an Otherwise Excellent Record
I want to be clear, Google can do amazing things with recognizing entities. Yes, they have a hard time with sub-profiles. But that does not take away from the leaps Google has made when it comes to understanding entities. In fact, in the coming weeks, I hope to do a similar case study showing some of the advancements Google has made in forming a nuanced understanding of an entity.
In either case, I think it behooves us to start considering Google's focus on entities and its ability to understand them. There is so much that hinges on how Google both understands entities and is able to relate them to other entities. With evidence that Google is profiling sites themselves as entities (and given what we've seen here), giving strong consideration to your site's identity is all the more pertinent.