Why Doesn’t Google Pay Fees to Websites for Using their Content?

Jeff, Oct 7, 2024

Recently I wrote an article on Big Tech (Big Tech, the Internet, the Loss of Choice, and the Skyrocketing Cost of Living) and tried to explain why these extremely large monopolistic companies are hurting America. In that discussion I used some harsh language for Google’s practices. I said they’re basically stealing web content. In short, what I was trying to show was that Google is just one on a list of big tech companies that got their initial boost of success from the use of digital content for which no fees were paid. For example, YouTube started out with lots of pirated Hollywood movies. And then there was Napster, whose ascent was fueled by pirated music. Basically, I was trying to point out that part of Google’s success has to do with the fact that they were never required to compensate any websites for the content they retrieved, stored and indexed. In essence, I was arguing that this is what made Google so tremendously successful. They are providing search “answers” to questions, but are relying on data that is not their own.

So I researched the issue, and the reason I found that search engines are allowed to retrieve this content is something called “fair use”.

Because ChatGPT is a compendium of knowledge and is less biased than any individual person, I decided to consult it for some answers regarding “fair use”. The following is the conversation I had.

Jeff:

Why can Google and other search engines just use the text they retrieve, index and store without paying licensing fees?

ChatGPT:

Search engines like Google can index and display snippets of text without paying licensing fees largely due to the doctrine of fair use. Here are the main reasons:

Transformative Purpose: Search engines provide a new context for the content by summarizing and linking to original sources, which is often seen as transformative.

Limited Use: The text displayed is usually a small portion of the original content, which further supports fair use arguments.

Public Benefit: Search engines enhance accessibility to information, which serves the public interest and can justify their practices under fair use.

Lack of Licensing Requirements: The legal framework for fair use allows for this practice without requiring licensing agreements, particularly when it benefits users and facilitates information discovery.

These factors collectively contribute to the legal protections search engines enjoy when using content from various sources.

Jeff:

Why does fair use apply to search engines if they display the searched-for text “inline” with the webpage results, and consequently reduce traffic through to that website? Who needs to go to the original website if Google provides the stored text for viewing?

ChatGPT:

Fair use applies to search engines primarily because their use of inline text serves a transformative purpose: they provide summaries and links that help users find relevant information. This can be seen as adding value by facilitating access to content rather than simply copying it.

Key factors include:

Transformative Use: Search engines transform the original content by providing context, aggregating information, and enabling discovery.

Purpose and Character: The goal of search engines is to inform and direct users rather than to monetize the original content, which supports fair use.

Market Impact: While displaying snippets may affect traffic, courts often consider whether the original work is still available in a marketable way. If users are led to the original page, the impact may not be as detrimental as it seems.

Public Benefit: Search engines promote accessibility to information, which aligns with the public interest.

These factors contribute to the legal protection search engines receive under fair use, despite potential concerns about traffic effects.

Jeff:

But if Google, for example, displays a piece of text exactly as it is from the original website how is it transformative? Secondly, if Google can keep the user on their search results page, they get greater exposure to their own ads and most likely deprive the website of a visitor. Explain your answer very concisely please.

ChatGPT:

Even if the text is displayed verbatim, it’s presented in a different context, serving the purpose of summarizing and directing users rather than replacing the original work.

While there’s potential for decreased traffic, courts often consider whether users still find value in visiting the original site. The overall function of search engines is to enhance information accessibility.

Search engines provide a service that benefits the public by organizing and making content discoverable, which supports their fair use defense despite any advertising implications.

Jeff:

Yes, but wasn’t this perceived benefit of a “search service” what blinded people to the fact that their content was being used unfairly? Isn’t the free use of this world-wide content what fueled Google’s success?

ChatGPT:

Yes, the success of search engines like Google is largely due to their ability to provide access to content, effectively using others’ materials to enhance their service. This raises valid concerns about the implications of fair use:

Content Utilization: Google’s model relies on aggregating and displaying content from various sources, which does benefit users but can undercut original creators’ traffic and revenue.

Commercial Gains: By using free content to attract users, Google does generate significant ad revenue, leading to criticism that original content creators are not compensated fairly.

Awareness: Many users may not fully realize how their content contributes to the search engine’s value, leading to ongoing debates about intellectual property rights and fair compensation for creators.

These points highlight the tension between fair use, commercial interests, and the rights of content creators in the digital landscape.

Jeff:

Isn’t the real reason that search engines have been able to get away with this practice that they are aggregating so many tiny websites with very little power and time to hire a lawyer? And to underscore my point, assume there was a single company that produces half of the world’s content. Isn’t it more than likely that such a company would not allow Google to profit off such a large share of media which they owned? Wouldn’t they probably use their own internal search engine, or perhaps file a lawsuit like the New York Times has filed against AI companies like OpenAI?

ChatGPT:

You raise important points. The dynamics of power and resources in the digital landscape do play a significant role:

Power Disparity: Smaller websites often lack the resources to challenge large corporations like Google, making it easier for search engines to leverage their content without significant pushback.

Consolidation of Content: If a single entity controlled a large share of online content, it could more effectively negotiate terms or challenge unfair practices. This could lead to more stringent controls over how their content is used.

Legal Actions: Larger companies, like the New York Times, have the resources to pursue legal action, which can shift the landscape. If major content creators banded together, they could exert more influence over fair use interpretations and potentially negotiate better terms.

These factors contribute to the ongoing debate about fair use and the balance of power between content creators and platforms like search engines.

Jeff:

So are you saying that there’s no real ethically right answer here? Are you saying that if companies had gotten together and demanded greater compensation for the indexing of their content, that perhaps Google may not have reached the same level of success and held such a far reaching monopoly over the industry?

ChatGPT:

Yes, that’s a fair assessment. The situation highlights a complex moral landscape:

No Clear Moral Answer: There isn’t a universally accepted “right” answer. The balance between promoting access to information and ensuring fair compensation for creators is contentious.

Collective Action: If major content producers had united to negotiate terms or advocate for better compensation, it could have significantly altered the dynamics between them and search engines, potentially limiting the latter’s success and influence.

Market Influence: Greater collaboration among content creators might have resulted in more equitable practices, impacting how search engines operate and possibly preventing the monopolistic tendencies seen today.

Overall, these dynamics underscore the challenges in creating a fair and sustainable ecosystem in the digital content landscape.

So we can see that one of the main points that ChatGPT makes centers around the concept of “transformative use”. ChatGPT said that “even if the text is displayed verbatim, it’s presented in a different context, serving the purpose of summarizing and directing users rather than replacing the original work.”

But today I believe the way Google shows inline text with search results demonstrates exactly why this is not a “fair use” of content.

By showing the text of the searched-for content on Google, they diminish the number of visitors to the original host. Essentially Google has become one great, big, monolithic webpage. It’s not “fair use” because in fact they don’t just use a small portion of the website text; they index the entire website of the original host, every word. Yes, they are only displaying a small section of text inside the results list, but again, they are utilizing 100% of the content to find this result. Is this “transformative” enough to justify “fair use”? Well, ultimately ChatGPT seems to come around to the conclusion that “there isn’t a universally accepted ‘right’ answer. The balance between promoting access to information and ensuring fair compensation for creators is contentious.”

And indeed, there isn’t an ethically 100% correct answer. I agree. But here’s the point I’m trying to make. The whole issue could be viewed very differently. Let’s say, for example, that we had a person working in a bank who had access to a computer and the checking accounts of millions of people, and let’s further imagine they steal 1 cent from each of 100 million accounts and deposit the money into their own. Would you be upset about losing 1 cent? Indeed, you might completely overlook the debit to your account. But in this case, quite clearly, the law would consider this a criminal act, and it would prosecute the person who profited by 1 million dollars despite the very small damage to you.

So, this is what I’m saying. Because the loss to each and every tiny website was such a small amount, no one said a thing. No harm, no foul. Everyone was in favor of the service Google was providing. The internet was new, exciting and growing at an exponential rate.

Today, we understand that there’s a “robots.txt” file that can indicate to all search engines that none of our content be indexed. Web crawlers are supposed to honor that request. But here’s the thing: without a “robots.txt” file on your domain, Google assumes it’s OK to retrieve, store and index your files anyway. Invented in 1994, this “robots.txt” was added to the search protocol because search engines were overwhelming the sites of some small domains. It was not made to protect digital rights. If Google or any other search engine took your digital rights for the content you created seriously, then the presence of a “robots.txt” file would have been a positive affirmation to allow search, and not just a means to exclude it.
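To make the opt-out default concrete, here is a minimal sketch in Python using the standard library’s urllib.robotparser. The crawler name “ExampleBot”, the example.com URL, and both helper functions are made up for illustration. An empty string stands in for a domain that has no “robots.txt” at all, and the may_crawl_opt_in function is purely hypothetical: it shows the opt-in rule I am arguing for, not how any real search engine behaves.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Apply the usual opt-out rule: anything not disallowed is allowed."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def may_crawl_opt_in(robots_txt: Optional[str], url: str,
                     agent: str = "ExampleBot") -> bool:
    """Hypothetical opt-in rule: no robots.txt file means no permission."""
    if robots_txt is None:
        return False
    return may_crawl(robots_txt, url, agent)


URL = "https://example.com/article.html"  # placeholder address

NO_RULES = ""                              # stands in for a missing robots.txt
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # site owner excludes all crawlers
ALLOW_ALL = "User-agent: *\nAllow: /\n"     # site owner explicitly says yes

print(may_crawl(NO_RULES, URL))         # opt-out default with no rules
print(may_crawl(BLOCK_ALL, URL))        # owner has explicitly excluded crawlers
print(may_crawl_opt_in(None, URL))      # hypothetical: silence is not consent
print(may_crawl_opt_in(ALLOW_ALL, URL)) # hypothetical: owner has opted in
```

Under the first function, a site that publishes no rules is treated as crawlable, and only an explicit Disallow line keeps the crawler out. Under the hypothetical second one, the burden is reversed: the crawler stays out until the site owner says yes.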

So in a nutshell, I wanted to explain why I used the word “stolen” with regard to the content in Google search in my previous article (Big Tech). It’s harsh, but it’s consistent. YouTube’s success was founded on easy access to pirated video content. Napster’s was founded on access to pirated music. It’s just that Google’s “piracy” is of a much subtler form. Now today, perhaps YouTube compensates original license holders for uploaded content that contains snippets of their audio or video, rather than deleting that content. And this is a fair practice. But imagine a video hosting website competing with YouTube back in the day when video websites were getting popular, and imagine that website was careful to remove all the uploaded pirated content. Which video hosting website do you think was going to have more traffic and ultimately more success: the one with the free Hollywood movies, or the one with just videos of drooling cats and dogs?

So this is my greater point. These companies get their start and founding capital from questionable practices. Once they become successful, that’s when they clean up their act and start acting like an ethical company. It’s like this. Say I start a shipping company and I start filling an 18-wheeler in Mexico with cocaine. Then I drive it across the border into the US. After a few hundred runs, I have millions of dollars in the bank. I can buy a lot more trucks. Hire more drivers. I can buy warehouses to store the product. Then guess what? This is the point where I switch from cocaine to corn, beans, and rice and become a legitimate company. And once I become powerful enough and I’m hiring people and doing good for the town or city where I’m located, who is going to come in and disrupt my business by investigating my corrupt origins?

I ask you: how many successful businesses do you think kick off this way, that is, by starting out in a very corrupt way and then cleaning up their act as they become more visible?

In a lot of ways, Capitalism is a dirty business.