Does Big Data Favor Big Companies?

Research Summary 2017-2

Does Big Data give large companies an unfair competitive advantage? New machine learning technologies depend on access to large amounts of data. This means that large companies might be able to use their huge stores of data to provide better products and services than smaller rivals and startups. For example, Google might be able to use its huge store of information about users’ searches to provide better search results than rivals like Yahoo and Bing and thus increase its dominance in the search market. More generally, Big Data might create barriers to entry that reduce competition, in which case it should perhaps be a concern for antitrust regulators. Some people have proposed drastic remedies such as breaking up the large tech companies.

A new empirical research paper, “Search Engines and Data Retention: Implications for Privacy and Antitrust” by Lesley Chiou (Occidental) and Catherine Tucker (MIT), suggests Big Data might not always be as big a concern as one might think. They looked at what happened to search quality when search engines changed the amount of time they retained user search data. They found that the amount of data retained had no measurable effect on the quality of search.

A natural economic experiment occurred after the European Commission recommended in 2008 that search engine companies keep user data for shorter periods of time. Both Yahoo and Microsoft made changes to their data retention; Yahoo later lengthened its data retention. These changes gave the researchers a means of testing the significance of the amount of data retained. They measured search quality by looking at how UK residents searched; if users repeated their searches, it indicated that they were not satisfied with the initial set of results provided. Comparing search behavior before and after the changes in data retention, Chiou and Tucker found no significant difference in the rate of repeated searches.
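The before-and-after comparison described above can be illustrated with a simple test of two proportions. The sketch below is only an illustration of the general idea, not the specification Chiou and Tucker used; the function name and all of the counts are hypothetical and were chosen purely to show the arithmetic.

```python
import math

def two_proportion_z_test(repeats_before, searches_before,
                          repeats_after, searches_after):
    """Compare the share of repeated searches before and after a policy change.

    Returns (rate_before, rate_after, z, two_sided_p_value).
    Assumes large, independent samples (normal approximation).
    """
    rate_before = repeats_before / searches_before
    rate_after = repeats_after / searches_after

    # Pooled repeat rate under the null hypothesis of no change.
    pooled = (repeats_before + repeats_after) / (searches_before + searches_after)
    std_err = math.sqrt(
        pooled * (1 - pooled) * (1 / searches_before + 1 / searches_after)
    )

    z = (rate_after - rate_before) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_before, rate_after, z, p_value

# Hypothetical counts, NOT taken from the paper.
rate_before, rate_after, z, p_value = two_proportion_z_test(
    repeats_before=2500, searches_before=10000,
    repeats_after=2530, searches_after=10000,
)
print(f"repeat rate before: {rate_before:.3f}, after: {rate_after:.3f}")
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

With made-up counts like these, a small change in the repeat rate would not be statistically distinguishable from zero, which is the kind of result the summary describes. A real analysis would also need to account for trends unrelated to the policy change, for example by comparing against a search engine that did not change its retention period.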

In other words, less data did not mean inferior search capability in this case. Perhaps the quality of search depends more on the quality of the software algorithms used than on the raw amount of data available. Hal Varian, Chief Economist at Google, describes in an economics paper how Google improves its search through thousands of experiments. While those experiments surely require large amounts of data, the benefits of additional data might be minor after a certain threshold is reached. Bigger is better, but only up to a point.

On the other hand, more data does provide clear advantages in other situations. In another paper, Catherine Tucker and Avi Goldfarb show that personal data about web users (data that were restricted by European privacy laws) improve the effectiveness of web advertising. The emerging picture is complicated, posing challenges for privacy and antitrust policy.