Hospital intensive care units (ICUs) are a case in point. They generate an abundance of information about their closely monitored patients: vital signs, medications, lab results, providers’ notes, fluid balance, diagnostic codes, imaging reports, and more. Of course, this information is tracked primarily because it can make a life-or-death difference in steering an individual patient’s care. But through the application of data science and machine learning technology, it has the potential to improve the quality of intensive care as a whole.
Led by Roger Mark ’60, PhD ’66, Distinguished Professor of Health Sciences and Technology and of Electrical Engineering and Computer Science, MIT’s Laboratory of Computational Physiology (LCP) has unlocked the aggregate power of such data by building and maintaining the Medical Information Mart for Intensive Care, or MIMIC. The most accessible database of its kind, MIMIC archives clinical data going back to 2001 from nearly 60,000 patient stays in intensive care units at Boston’s Beth Israel Deaconess Medical Center (BIDMC), where Mark is senior physician. The updated version released last year, MIMIC-III, is the culmination of a decade-long collaboration among that hospital, MIT, and Philips Healthcare, with support from the National Institute of Biomedical Imaging and Bioengineering.
Among the challenges the project faced early on was the hospital’s switch to a new information system. Data elements such as blood pressure or laboratory results had to be mapped across incompatible systems. Leo Anthony Celi SM ’09, an LCP research scientist and BIDMC physician, ruefully labels such clean-up work “the bane of medical informatics.” In fact, the goals of LCP, which is part of MIT’s Institute for Medical Engineering and Science, begin rather than end with the onerous task of curating MIMIC’s data. The group’s interest in the ICU is tied to the acute need for reproducible studies that support best practices in that space. Decisions about treatments or interventions for critically ill patients can vary widely, often based on an individual clinician’s training, knowledge, and habits.
Several of LCP’s researchers have medical backgrounds like Mark and Celi; others have expertise in computer science, electrical engineering, physics, mathematics, or some combination thereof. Pooling their expertise, teams of clinicians, engineers, and scientists design studies using MIMIC that evaluate variations in clinical practice, weigh the effectiveness of diagnostics and therapies, and create predictive models for patient outcomes.
“Our group has engendered a cross-disciplinary research ecosystem around MIMIC with activities such as ‘datathons,’ and an upcoming fall course on secondary analysis of health records that will be streamed live globally free of charge,” says Celi, adding that the group will publish an open-access textbook to coincide with the course.
LCP’s computer scientists have stripped MIMIC’s data of patient-identifying characteristics in accordance with Health Insurance Portability and Accountability Act (HIPAA) standards—the key to providing colleagues around the world with entrée to its treasures. More than 2,500 credentialed researchers from some 32 countries have signed a basic data-use agreement to gain full access to MIMIC since it first became widely available in 2007. A subset of the data—consisting of multichannel waveform recordings of physiologic signals and vital signs sampled hundreds of times each second—is available to the public without restrictions, and heavily used for education as well as research, including in Health Sciences and Technology coursework at MIT itself.
Despite its unprecedented scope, the fact that MIMIC taps into the data of a single hospital does put a caveat on the findings that can be gleaned from it. “The use of a single-center database is always problematic in that the findings may only apply to patients at BIDMC and may not be generalizable to other intensive care unit populations,” Celi notes. Last year, Philips and LCP launched a new, separate database compiling close to 3 million ICU admissions from across the US. Researchers may apply to Philips for use of that full database, but the team plans to publicly release a subset representing 200,000 admissions. Meanwhile, plans to expand MIMIC are international in scope. According to Celi, the researchers are in discussion with colleagues building similar ICU databases in the UK, France, Belgium, Brazil, and Greece.
Even in its current form, MIMIC has produced an average of one publication per month over the past three years. “There have been a number of findings that have changed the way I and my colleagues practice in the intensive care unit,” says Celi. “For example, we learned that in the 10-year period covered by the database, not a single patient with advanced liver disease who required dialysis during his or her hospitalization has survived to discharge. We previously viewed kidney failure as a complication that we can treat with dialysis while the patient recovers from an acute illness. It turns out that among those with advanced liver disease, it is a marker of unsurvivable critical illness. This information is powerful when we discuss goals of care with these patients and their families.
“We have evaluated numerous interventions in the intensive care unit that have been widely implemented with little evidence that they improve outcomes—arterial catheterization for invasive blood pressure monitoring, echocardiography to assess heart function, routine blood testing—and found that their effect varies across patient subsets: some benefit and some are harmed,” he continues. “This type of research is one of the building blocks of precision medicine. We are now in the process of creating decision-support tools based on our findings.”