Leaked Google Search Algorithm Documentation Gives Clues To How Your Content Is Found

Is Google S.E.O. Gaslighting the Internet?

Leaked documents provide a glimpse into the inner workings of Google Search—and contradict the company’s public claims.

Illustration by Ariel Davis

In March, Gisele Navarro watched Google Search traffic to her Web site, HouseFresh, disappear. HouseFresh evaluates and reviews air purifiers. Her husband, Danny Ashton, launched the site in 2020, when the pandemic created a spike in demand for air purification, and at its peak the business had fifteen paid contributors. (Navarro and Ashton also work together at NeoMam, a content studio that Ashton founded.) Google traffic to HouseFresh had been slowly declining since last October, but the recent drop was far more dramatic—from around four thousand daily search referrals, or click-throughs from Google results, to around three hundred. The site makes money from affiliate fees, taking a small cut when a reader follows a link from HouseFresh to purchase an air purifier online; less traffic means less revenue, and the site can now only afford to pay one full-time employee. Navarro told me, “We are living our lives like Google is gone for us.”

The drop in traffic to HouseFresh has coincided with internal changes to Google’s search function. In late 2023, Google rolled out a series of algorithm modifications; with a “core update” in March, it made those changes permanent. HouseFresh reviews previously ranked highly on Google searches for air purifiers, but lately its articles have been buried below recommendations from brand-name publications—Better Homes and Gardens, People, Architectural Digest (which is owned by Condé Nast, the parent company of The New Yorker). Navarro even noticed Rolling Stone, the music magazine owned by Penske Media, recommending anti-mold humidifiers. To her, it seemed as if media companies were making a grab for affiliate revenue without the expertise that her own site had worked hard to cultivate—and it looked as if Google was rewarding them for doing so. HouseFresh followed Google’s guidelines for search-engine optimization, or S.E.O.—the company suggests that Web sites “provide original information” and demonstrate “experience, expertise, authoritativeness, and trustworthiness”—but this no longer seemed to have any effect. “There are people who feel that Google is obfuscating the truth,” Navarro said. “It’s lying to our faces, or gaslighting.” She began publishing articles on HouseFresh about the decline in search traffic, with headlines such as “How Google Is Killing Independent Sites Like Ours.” The articles got more search traffic than the reviews did.

In May, we got a glimpse into the inner workings of Google Search, from a leak of twenty-five hundred pages of the company’s internal documentation. The files seem to have been uploaded to GitHub by an unknown party, in March, but gained attention only when Erfan Azimi, a search-engine-optimization consultant, sent them to Rand Fishkin, a veteran S.E.O. expert and a commentator on the industry. The leak is from Google Search’s A.P.I., or application programming interface, a kind of directory of labels that external developers can refer to in their code in order to call up information from Google’s internal infrastructure. It is a vast list of coding tags incomprehensible to the lay reader. But the documents identify many of the variables that Google’s search algorithm takes into account, without going so far as to specify how those variables are weighted or how a site’s ranking is ultimately determined.

Some of the information revealed appears to contradict claims that the company has made publicly. One variable that Google Search apparently tracks is when and where users click, not just on Google’s core site but on any page that is accessed within Google’s Chrome browser. In the past, Google has repeatedly denied factoring that data into its search algorithm. Fishkin told me that, among S.E.O. experts, this “reinforces an already long-held belief that Google’s public representatives regularly lie, mislead, and omit key information.” The algorithm also denotes personal sites or blogs with the tag “smallPersonalSite,” which some have interpreted as a sign that the company down-ranks them in search results in favor of larger publications. (A Google spokesperson denied that the company is negatively targeting small sites, and said that the leaked documents may contain “out-of-context, outdated, or incomplete information.”) A dominant factor in Google Search rankings appears to be a company or site’s existing name recognition. This represents a shift. Fishkin wrote on his blog, which is hosted by his company, SparkToro, that “Google no longer rewards scrappy, clever, SEO-savvy operators who know all the right tricks. They reward established brands, search-measurable forms of popularity, and established domains that searchers already know and click.” Hence, perhaps, the way that HouseFresh lost out to the likes of Better Homes and Gardens—though, anecdotally, even legacy publications have recently taken a hit on Google search traffic.

The decline of Google Search dates back farther than the recent round of algorithm changes. I wrote in 2022 about the deterioration of search results as authoritative sites were increasingly crowded out by overly optimized, clickbait results and text from Google’s “Quick Answers” feature. The promise of Google’s search engine is that it will answer queries with the most “relevant results”; if Web-site builders offer enough high-quality content on a given subject, readers searching for that subject will find their way to it. Business models have been built on this promise—indeed, much of the Internet is structured around it. But S.E.O., in a way, has turned out to be a failure, in part because its best practices have proved too easily manipulable. Drawing on hacker vocabulary, Navarro made a distinction between “white hat” S.E.O., which tries to follow the rules by creating valuable content that is correctly formatted, and “black hat” S.E.O., which dresses up shoddy content with formatting tricks in order to game search results. An excess of the latter has accelerated S.E.O.’s collapse. Artificial intelligence, meanwhile, is threatening to upend the search model altogether. Google’s recently launched Gemini products aspire to answer queries within the browser, so that a user doesn’t have to visit any external Web sites at all; this model seems highly likely to further decrease search traffic. (The company quickly backpedalled its rollout after the Gemini-powered answers proved unreliable.) Navarro compared the search engine to real-world infrastructure: “Google, they own the roads. They closed our road; they closed loads of roads. No one can get to where we are.”

Google recently reached out to HouseFresh and held a call between members of the search team and Navarro, plus the head of Retro Dodo, an indie gaming site that has also been affected by S.E.O. changes. Navarro told me that she tried to convey to Google the dire impact that its algorithm changes can have on a site like hers. The company asked how HouseFresh researches and writes its articles, Navarro said, presumably to better evaluate how the algorithm should treat the site; they were apologetic, Navarro said, but did not commit to any specific changes. (The Google spokesperson told me, “We take feedback from creators seriously, using their insights to improve our systems.”) HouseFresh is already developing other ways to bring audiences to its content. Ironically, one method they’ve landed on is turning the site’s text reviews into videos and then posting them on YouTube, a platform that is also owned by Google. The videos tend to be ranked highly in search, even when the corresponding articles aren’t. (It’s hard to escape Google, and that’s why the U.S. Department of Justice recently took the company to trial for monopolistic practices.) Navarro is also working with other independent Web sites to create D.I.Y. recommendations that do not rely on search engines, reminiscent of the early Web’s blog rolls and themed directories, which served as Internet yellow pages. “We need to start building something that is human, that has no algorithms involved,” she said.

At the time of its founding, in 1998, Google declared a mission “to organize the world’s information and make it universally accessible and useful.” With the company’s new range of products and updates, however, it seems content to bury the same material that it once was in the business of surfacing. What is most accessible is no longer necessarily what is most relevant, and so a major breakdown might be looming: if proprietors of Web sites don’t trust Google to serve them traffic, and consumers don’t trust Google to deliver them answers, then are its search-engine results really optimal for anyone? “They made all of us believe in their mission,” Navarro said. “Now I don’t even know if they believe in their mission.” ♦
