The National Interest

Big Data, Public and Private

The collection and maintenance of huge files of information on our communications, our movements, our online searching, and much else about our individual lives is, as Laura Bate notes, hardly something that the National Security Agency or any other arm of government originated. By far the greater share of the assembling, and the exploitation, of storehouses of data about the activities of individual Americans occurs in the private sector. So why should there be so much fuss about what a government agency may be doing along this line, while there is equanimity about the much greater amount of such activity by non-government enterprises? Is there something intrinsic to government that ought to make us more worried about such data mining? Let us consider the possible bases for concluding that there might be.

Potentially the strongest such basis has to do with the presence or absence of a free market, and related to that, whether or not the activity of the individuals on which data are being collected is voluntary. When I use a search engine on the Internet I am voluntarily using a free service in return for being exposed to some advertising and allowing the operator of the search engine or my Internet service provider to collect, and exploit, data about my interests. Most interactions with government agencies and especially security agencies do not involve as much voluntarism. So maybe it is logical to be more persnickety for this reason about what government entities are doing.

That makes sense as far as it goes. But in practice the logic quickly runs up against the fallacy of equating the private sector with free markets and free will. If I want land-line telephone service at my home (and I very much do), I'm stuck with Verizon. I am forced to let Verizon collect comprehensive records of my calls, the “metadata” we've heard so much about. And of course, if someone at Verizon wanted to listen in on the substance of my calls that could be done as well, although it is a reputable company and I would be surprised if that were happening. The point is that there is much less free will and free choice in private sector data-generating activity than we might like to think, and in many cases little or no more free choice than when a government agency is involved.

This is true not just of local utility monopolies such as land-line telephone systems but to a large degree of other services in the Internet age. Some such services, including online access itself, have quickly transitioned from being seen as nifty innovations to being regarded as necessities. And again, free choice is often much less than we would like. This fact was recognized with the antitrust action against Microsoft, which was using its commanding position in operating systems to muscle into a bigger share of the market for browsers and other applications.

Even when there is enough market competition for users theoretically to vote with their feet (or with their fingers on the keyboard) if they are worried about what is being done with data collected on them, any market correction mechanism would in practice be very slow and clumsy. Imagine that a rogue employee at Google started using information about embarrassing web searches to ruin the reputations of particular people he was out to get. If that sort of abuse happened enough times, then perhaps significant numbers of users would abandon Google's wonderfully effective search engine in favor of Bing or something else, and Google would become less able to sell as much advertising as it does now. But the corrective process would be slow and awkward, and in the meantime a bunch of people would have their reputations ruined.

Another possible basis for distinguishing the amassing of data in the public and private sectors is to ask what controls or checks apply to each. Here there is indeed a big difference, and the difference is in the direction of there being far more controls and checks applied to government agencies than to private sector enterprises. For the security agencies there is the whole legal structure, dating back to the 1970s and strengthened since then, of restrictions and Congressional oversight. Nothing remotely resembling those sorts of external controls exists for data mining in the private sector. Then there are all the internal checks and controls, which as Bate mentions in the case of NSA are extensive. These include compartmentation of information—second nature to the security agencies, which use compartmentation to protect sensitive national security information even if there is no issue of the personal privacy of U.S. citizens. NSA senior management says publicly that only 22 people at their agency are able to query the telephone metadata that are of concern. How many people at Verizon can do something with the comprehensive record of my telephone calls? I don't have the faintest idea, and probably no one else outside Verizon does either.

Another question to ask is how the public and private sectors may differ regarding the potential for abuse, in terms of not just access and capability but also incentives. For most conceivable types of individual abuse, there is no reason to expect the incentives to appear more in one type of organization than the other. A potential abuser thinking of, say, looking at an ex-spouse's calling record may pop up in either the public or the private sector. Disincentives to this kind of abuse probably are stronger in the security agencies, given the regular reinvestigation regimen that people with security clearances undergo.

As for incentives that are more institutional than individual, there are further differences. As an example of a mistaken and destructive use of data mining, think of an innocent person being put on a no-fly list and, as a result, having his business damaged because of his inability to fly. Government agencies have no conceivable incentive for this to happen. For them, false positives merely add clutter and make it more difficult to accomplish their assigned mission, such as keeping real terrorists off airplanes. And when a mistake of this sort does happen and becomes public, such as putting Ted Kennedy on a no-fly list, it is an embarrassment to the agencies responsible. In the private sector, however, there always are commercial and financial interests in play. Those interests may well provide an incentive—such as for competitors in the same line of business—to damage the business of someone else.

In addition to all of these criteria, one also should ask what benefit or greater good is going to the person about whom data are being collected, as well as perhaps to others. What is being bought, in other words, in return for whatever risks or intrusions are involved in amassing the data? With the sort of data mining that NSA does, the presumed benefit is in the form of greater protection against terrorists, or perhaps other contributions to national security. There has been debate, of course, about just how much of this type of benefit is being obtained, but at least the objective is one that most Americans would consider important. The corresponding answer for private sector use of big data is harder to come up with. It would seem to consist of something like better tailoring of ads that appear on the user's computer screen, which might streamline online shopping. Nice, perhaps, but hardly in the same league as national security.

Two overall conclusions follow. One is that there are substantially stronger reasons to worry about the collection and use of big data in the private sector than in government agencies.

The other is that the prevailing public consternation about this subject is nevertheless focused on government agencies, which indicates that it is not driven by any careful consideration of risks, costs, benefits, incentives, and choices. Instead it is driven by a crude image of government agencies, and especially certain types of government agencies, as Big Brothers worthy of suspicion or even loathing. Sentiments toward private sector enterprises vary, but the biggest contrast to the image of government is enjoyed by the titans of Silicon Valley and the enterprises they run, which have the status of heroes.

The crudeness driving these sentiments is one of the main reasons we should not be surprised if morale at a place such as NSA is low. Inconsistency over time in what the American public expects from the government agencies involved is another big reason.

This article originally appeared on The National Interest.