Discovery Communications Wellness Center Medical Director Liz Sequeira arrives to examine patient Bonnary Lek (L) during a medical appointment at the clinic in the Discovery headquarters in Silver Spring, Maryland December 3, 2009. Sequeira says seeing patients in her office at Discovery Communications Inc's modern headquarters is like stepping back in time and being an old-time small town doctor. As Democratic lawmakers in Washington inch ahead with plans to overhaul the U.S. healthcare system, companies like Discovery are striking out on their own -- whether through on-site doctors or diet plans -- to rein in soaring costs in a nation where employers still pay for the bulk of medical care. Picture taken December 3, 2009. To match Special Report USA-HEALTHCARE/WELLNESS REUTERS/Jim Bourg (UNITED STATES HEALTH BUSINESS SOCIETY) - GM1E5C90O2501

The opportunities and challenges of data analytics in health care

Editor's Note:

This report is part of "A Blueprint for the Future of AI," a series from the Brookings Institution that analyzes the new challenges and potential policy solutions introduced by artificial intelligence and other emerging technologies.

Data analytics tools have the potential to transform health care in many different ways. In the near future, routine doctor’s visits may be replaced by regular monitoring of one’s health status and remote consultations. The inpatient setting will be improved by more sophisticated quality metrics drawn from an ecosystem of interconnected digital health tools. The care patients receive may be decided in consultation with decision support software that is informed not only by expert judgments but also by algorithms that draw on information from patients around the world, some of whom will differ from the “typical” patient. Support may be customized for an individual’s personal genetic information, and doctors and nurses will be skilled interpreters of advanced ways to diagnose, track, and treat illnesses. Policymakers, too, are likely to have new tools that provide valuable insights into complicated health, treatment, and spending trends.


Caitlin Brandt

Assistant Director and Senior Research Analyst - Center for Health Policy, Brookings

Abigail Durak

Center Coordinator - Center for Health Policy, Brookings

However, recent developments in data analytics also suggest barriers to change that might be more substantial in the health care field than in other parts of the economy. Despite the immense promise of health analytics, the industry lags behind other major sectors in taking advantage of cutting-edge tools. Most health care organizations, for example, have yet to devise a clear approach for integrating data analytics into their regular operations. One study even showed that 56 percent of hospitals have no strategies for data governance or analytics.

The slow pace of innovation relative to other industries reflects challenges in implementing and applying “big data” tools that are unique to health care. These barriers include the nature of health care decisions, problematic data conventions, institutionalized practices in care delivery, and the misaligned incentives of various actors in the industry. To address these barriers, federal policy should emphasize interoperability of health data and prioritize payment reforms that will encourage providers to develop data analytics capabilities.

Sensitivity of care decisions

A major barrier to the widespread application of data analytics in health care is the nature of the decisions and the data themselves. Unlike many other industries, health care decisions deal with hugely sensitive information, require timely information and action, and sometimes have life or death consequences. Each of these features creates a barrier to the pervasive use of data analytics.

The immediacy of health care decisions requires regular monitoring of data and extensive staffing and infrastructure to collect and tabulate information. Health care decisions are also more immediate and intrinsic than those made in other settings, which creates hesitancy about overhauling any major aspect of care provision. In addition, health care decisions must take into account patient preferences, which at times differ from expert recommendations.

The importance and complexity of these decisions mean that physicians and patients insist on very high standards for data-analytics tools in health care. That has proven very challenging for designers of these tools, as health providers are more accustomed to dealing with either broad knowledge or narrow choices than with complex predictions that require careful identification of decisions and calibration of predictions. As a result, clinical decision support software has struggled to produce better insights than physicians. Even one of the most advanced systems, IBM’s Watson, made a series of “unsafe and incorrect treatment recommendations” because it was calibrated based on synthetic cases rather than real patient data. There is risk even when software is trained on real patient data, because decision support software may overfit its models and thereby make less useful suggestions, such as prescribing an inappropriate treatment plan. Moreover, the clinically best medical decision is not always what a patient wants to pursue.

The sensitive nature of health care decisions and data furthermore creates major concerns about privacy. Patients are rightfully concerned about the security of their data and about its being used in ways that are detrimental to them, damage their reputations, or disadvantage them in the rating and marketing decisions of insurers. This isn’t limited to medical record data. Recent news coverage of the capture of the Golden State Killer, for example, has raised new questions about the privacy of direct-to-consumer genetic testing. And while the growth of “wearables” such as Fitbit and Nike+ FuelBand has made health status monitoring accessible to patients, these data are not subject to federal patient privacy laws, allowing these companies to design their own internal privacy policies and share information with third parties.

Problematic data conventions

Several data conventions in health care hinder the widespread use of data analytics. Currently, health care data are split among different entities and have different formats such that building an insightful, granular database is next to impossible. These qualities greatly increase the cost of using data to provide value, even when all the relevant information has been recorded in some form. Furthermore, even well-structured data are often not available to researchers or providers who could use them in useful ways.
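To make the cost of split, differently formatted data concrete, here is a minimal Python sketch of reconciling one emergency room visit as recorded by a payer and by a clinic. Every field name, identifier, and code mapping in it is invented for illustration and does not reflect any real claims or EMR schema; it only shows the kind of translation work that must happen before records from two entities can be compared at all.

```python
from datetime import date, datetime

# Hypothetical records describing the same visit, held by two entities.
# All field names and formats are assumptions made for this example.
payer_claim = {"member_id": "A-1001", "svc_dt": "03/15/2018", "place": "ER"}
clinic_visit = {"patient": "a1001", "visit_date": "2018-03-15", "setting": "emergency"}

# Hypothetical mapping from each source's setting labels to a shared vocabulary.
SETTING_CODES = {
    "ER": "emergency",
    "emergency": "emergency",
    "OFFICE": "office",
    "office": "office",
}


def normalize_id(raw: str) -> str:
    """Strip punctuation and unify case so IDs from both sources can match."""
    return "".join(ch for ch in raw if ch.isalnum()).upper()


def from_payer(rec: dict) -> dict:
    """Translate the payer's record into the shared format."""
    return {
        "patient_id": normalize_id(rec["member_id"]),
        "date": datetime.strptime(rec["svc_dt"], "%m/%d/%Y").date(),
        "setting": SETTING_CODES[rec["place"]],
    }


def from_clinic(rec: dict) -> dict:
    """Translate the clinic's record into the shared format."""
    return {
        "patient_id": normalize_id(rec["patient"]),
        "date": date.fromisoformat(rec["visit_date"]),
        "setting": SETTING_CODES[rec["setting"]],
    }


claim = from_payer(payer_claim)
visit = from_clinic(clinic_visit)
same_encounter = claim == visit  # comparable only after both are normalized
```

Even in this toy case, each source needs its own translation function, and every additional entity or format adds another; that per-source effort is one way to picture the cost described above.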

In general, the health care industry has been resistant to making information available as open data commons, which are up-to-date data provided in accessible format and available to all. That resistance comes in part from fear of violating privacy, even though existing strategies for protecting confidentiality greatly mitigate that risk. A larger reason is that data commons are a public good and will naturally be undersupplied by the market. A third data challenge is data quality. For analysis or predictions to have any value, they must be based on good data. One of the most hyped applications of big data in epidemiology, Google Flu Trends, turned out to underperform far more basic models, despite analyzing far more data, because its analysts were extrapolating from the behavior of Google users—an unrepresentative group of people. The experience illustrated that the success of data analytics in health care is dependent upon the availability and utilization of quality data.

Institutional practices

Entrenched practices in the delivery of health care also create several barriers to the full adoption of data analytics. One clear illustration of the challenge is in one of the most promising areas of data analytics: clinical decision support. While data analytics could greatly improve the clinical decision-making process, the development of decision support tools hasn’t paid sufficient attention to how decisions are actually made and the related workflows supporting those decisions. The tools often assume that putting the right information on a single person’s dashboard can induce them to make the right decision, but in reality, most difficult clinical decisions involve many actors and often follow institutional guidelines designed by committees. Data tools that do not fit into existing work and decision-making structures add burdens to physicians and are much less effective than they could be. For example, many attempts to bring data analytics or other information technology into health care have created a large data entry burden for physicians. This has led to high-profile mistakes, physician burnout, and general dissatisfaction with the tools.

As a consequence, most of the major reasons physicians cite for their resistance to adopting new data tools are related to workflow disruption. For data analytics to truly transform care, the designers of tools need to be cognizant of the context their tools will be used in, and health care organizations must be willing to reorganize some elements of their practice to empower patients and providers to use data-driven care.

Misaligned incentives

Arguably the largest barrier to the implementation and application of data analytics in health care is the splintered landscape of the industry, with separate components having their own incentives that diverge from what might be best for the entire system. At the moment, physicians or delivery systems may not know that their patients have visited emergency rooms, for example, unless told by the insurer, because claims data are held by the payer. Meanwhile, care providers may hold clinical data that could help insurers better manage their patients’ costs. The responsibility for managing any given patient is split between their insurer and various providers, each with different incentives and needs and none functioning as an ideal agent for the patient.

Insurers have incentives to invest in better health for their covered population, but these incentives are mitigated by annual contracts with employers or individuals and employee turnover, which moves many enrollees to a different insurer before the payer’s investments in their health pay off. There are also serious concerns with expecting insurers to take the lead on data analytics in health care. First, data tools designed for insurers are likely to center on costs, which may leave some quality-enhancing insights unexplored. Second, insurer data analytics may impose an externality on hospitals and physicians, which have to bear the administrative costs of complying with the data practices of various insurers. Third, insurers may not conduct their data analytics on a clinically useful timetable. Unless they feed data to providers continuously, it may not be timely enough to affect how patients receive care. The limited degree to which insurers provide claims data to providers that they contract with may reflect the expense of doing so, limitations in their legacy IT systems, or a desire to retain more of the care management responsibility.

Health care providers have their own particular incentives. Under the most common payment schemes, providers typically have little incentive to control patient costs. However, they likely do care about quality of care, even if they are hesitant to change their institutional practices and norms. Despite seeming like a more logical locus for data decisions, hospitals are often unwilling to undertake the costs of developing data capabilities or the disruption of implementing their use into regular practice. Hospitals also have an incentive to slow health information exchange standards because the lack of interoperability binds physicians into referral patterns favorable to them. Similarly, vendors of health information technology often don’t want standardization of data tools and practices because differentiation of their products and high costs for providers that switch vendors create substantial monopoly power for vendors. Finally, patients themselves often don’t support data practices that can improve care for all. The fear of data breaches or misuse leads patients to oppose data sharing arrangements that may have widespread positive externalities. In short, no individual actor in the health care space has the incentives or means to fully embrace the most revolutionary data analytics practices.

Policy recommendations

Because of the systemic challenges described above, we need policy changes that diminish the barriers to health analytics. While there is potential for radical overhaul, the initial priority should be making sure all hospitals can record, use, and share patient data in useful ways. One critical component of that agenda is ensuring interoperability of Electronic Medical Records (EMRs). Federal policy has contributed a great deal to the adoption of EMRs and other health IT practices through incentives under the Medicare program, but providers still struggle with sharing that data. As discussed above, neither hospitals nor EMR vendors have a strong incentive to standardize health information exchanges, despite the fact that interoperable EMRs can improve care and save money. The 2009 Health Information Technology for Economic and Clinical Health (HITECH) Act included health information exchange as one of the required capabilities for certified EMR systems. However, this requirement was included at a later implementation stage, allowing EMR systems to be designed and integrated into health systems without these capabilities, making interoperability even more difficult. In 2016, the 21st Century Cures Act increased incentives and penalties specifically promoting EMR interoperability.

These incentives need not aim to establish one universal EMR. Applications that can access and transfer health data from different kinds of EMRs can achieve interoperability, but they are not used as widely or thoroughly as possible, risking a situation where the applications meant to bridge different EMRs themselves fail to adopt uniform data conventions. Federal policy could standardize the way applications access and transfer EMR data, building on standards like Fast Healthcare Interoperability Resources (FHIR) that exist to facilitate interoperability. It could also revise HITECH and the Health Insurance Portability and Accountability Act (HIPAA) to allow fees for data exchange, thus creating incentives to improve data exchange that could potentially counteract the existing disincentives. Federal support for best practices in data management and use would go a long way in helping the industry develop its own capabilities.

The federal government can also indirectly support the development of health data analytics by continuing to encourage payment based on the value of care, typically through the Medicare program, encouraging alternative payment approaches, and by working to align quality measures and payment approaches with private insurers. Under value-based care models, providers are typically paid some amount per beneficiary based on the package of care they are expected to deliver, with payment at least partially tied to quality-of-care metrics. These models aim to create the incentive for providers to provide high-quality care at lower costs, which often involves closer coordination of care and careful revision of many practices. All these features make hospitals operating under value-based care models better loci for data-backed decisions. Kaiser Permanente has demonstrated the power of a well-integrated data strategy aimed at managing costs and quality. Conversely, improved data analytics capabilities may be precisely what health care providers need to better coordinate and improve value of care. Medicare could improve the usability of its data for a wider audience with a varying degree of analytic capabilities to help more of these providers successfully implement these new health care models. Coupling these systemic health care reforms can allow them to complement each other and reduce administrative confusion.

One factor that is holding back progress toward value-based payment is risk adjustment—varying the payment on the basis of how challenging one provider’s patients are in comparison to other providers. Much of the energy in improving risk adjustment has focused on contracts between purchasers and insurers—for example, between the Medicare program and Medicare Advantage plans. But the risk adjustment challenges for contracts between insurers and providers are distinct from these and, if ignored, pose grave challenges to some of the best providers, who inevitably attract patients with the most challenging conditions.
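The arithmetic behind risk adjustment can be sketched in a few lines of Python. This is an illustration only: the base rate, the risk scores, and the simple multiplicative formula are assumptions made for the example, not a description of how Medicare or any insurer actually sets payments.

```python
def unadjusted_payment(base_rate: float, n_patients: int) -> float:
    """Flat per-patient payment that ignores how sick the patients are."""
    return base_rate * n_patients


def risk_adjusted_payment(base_rate: float, risk_scores: list[float]) -> float:
    """Scale the per-patient payment by each patient's risk score.

    A score of 1.0 stands for a patient of average expected cost;
    higher scores stand for more challenging patients (assumed scale).
    """
    return sum(base_rate * score for score in risk_scores)


BASE_RATE = 1000.0  # hypothetical payment per average patient

# Hypothetical panels: provider B attracts patients with harder conditions.
provider_a_scores = [0.5, 1.0, 1.5]
provider_b_scores = [1.5, 2.0, 2.5]

flat_a = unadjusted_payment(BASE_RATE, len(provider_a_scores))
flat_b = unadjusted_payment(BASE_RATE, len(provider_b_scores))
adjusted_a = risk_adjusted_payment(BASE_RATE, provider_a_scores)
adjusted_b = risk_adjusted_payment(BASE_RATE, provider_b_scores)
```

Under the flat scheme both providers receive the same total even though provider B's panel is expected to cost twice as much; with adjustment, B's payment rises to reflect that. If the scores understate how challenging B's patients are, B is underpaid, which is the problem the paragraph above describes for providers who attract the most difficult cases.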

Despite the disruptions to conventional practices, all actors in health care should be excited about the possibilities that new data tools will bring. But realizing this enormous potential is not around the corner, and it will require every relevant component of the health care system to overcome the challenges it faces.

The Brookings Initiative on artificial intelligence and emerging technologies seeks to establish a proper societal framework for the impending “digitalization of everything.”
