Mercy's Big Data Project Aims To Boost Operations
Mercy, the St. Louis-based Catholic health system, is plugging into big data to improve the quality and efficiency of its administrative and clinical operations.
Mercy continually collects data, such as lab tests, prescriptions and payments, on patients at its 35 acute-care hospitals and 700 outpatient facilities and physicians’ offices in Arkansas, Kansas, Missouri, and Oklahoma.
But it didn’t have a data-management infrastructure that would allow it to leverage all of that information to the fullest extent possible to improve the quality and efficiency—or overall value—of the healthcare services delivered to patients.
Mercy felt it needed to upgrade to a big data environment.
Mercy was using an enterprise data warehouse from Epic, which also is the vendor of its electronic medical record (EMR) system, EpicCare. Epic designed its enterprise data warehouse, called Clarity (which is now part of the Cogito product family), to be updated in batches from Epic’s operational database, which sits underneath the software company’s EMR.
“It is a day behind what is happening in real-time,” but, in some cases, Mercy needs “to get data sooner than a day old,” says Alda Mizaku, analytics program leader at Mercy.
Mercy also wanted to work with different types of data, such as from social media, medical devices or apps on patients’ smartphones, Mizaku adds.
So Mercy is migrating to a Hadoop infrastructure using Hortonworks Data Platform 2.2, an open source, enterprise-wide data-management software framework that facilitates distributed storage and processing of many types of data in its native format across clusters of commodity hardware.
The health system completed the core Hadoop infrastructure in fall 2014; it began adding real-time data in July 2015.
Four clusters of servers house the Hadoop environment: a production cluster with 25 servers, a testing cluster with 10 servers, a research and development cluster with eight servers, and an engineering cluster, also with eight servers.
Five primary sources feed Mercy’s Hadoop environment, which is about 40 terabytes in size and includes information on between 8 million and 9 million patients. Those sources are:
- Real-time data, which is captured as clinicians click on buttons in the Epic EMR during their interactions with patients, and includes such information as lab test orders and results, vital signs, and medications.
- Batch SQL data, fed into Hadoop nightly from Epic’s Clarity data warehouse, including such information as demographic details, medical history and billing and insurance.
- Batch data from Epic’s log files, which track all of the patient data that Mercy’s users access.
- Batch data from Mercy’s enterprise resource planning system.
- And a separate database with inventory information, such as for medications.
Using the Hadoop infrastructure, Mercy has begun to improve administrative and clinical processes. One example is a project to improve medical documentation. (Health Data Management named Mercy as the 2015 Analytic All Star for the category of revenue-cycle project based on the first phase of this work.)
Creating an accurate claim for a hospital stay is tied to how well physicians document every diagnosis and medical complication. While physicians typically admit patients to a hospital to treat a specific, acute medical problem—such as a heart attack or pneumonia—they may diagnose other medical problems during patients’ hospital stays.
But if a physician does not adequately document the complete picture of a patient’s hospital stay, medical documentation specialists may prepare a claim that doesn’t reflect all of the clinical resources used to treat a patient or the true complexity of a case, leading the health system to bill for less money than it is entitled to receive.
Historically, Mercy’s documentation specialists had prepared claims after patients were discharged from the hospital, but often found it difficult to get additional information from physicians to resolve issues with documentation.
That is why Mercy adopted an automated chart-review process in which medical documentation specialists begin work on some cases while the patient is still in the hospital, and “while that information is still fresh in the physician’s head,” says Paul Boal, who was director of data management and analytics at Mercy until August 2015 when he joined Amitech Solutions, an information management and business analytics consulting firm in St. Louis.
Mercy’s data analytics team worked with documentation specialists to develop a list of more than 18 secondary diagnoses or significant complications—including sepsis, anemia, and acute kidney injury—that are often under-documented. They then developed clinical rules—such as a lab test value that might be indicative of a secondary diagnosis—to flag patient charts for review.
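To make the idea of a clinical rule concrete, here is a minimal sketch in Python of how lab values might flag charts for review. It is not Mercy's implementation: the names `LabResult`, `Rule` and `flag_charts` are hypothetical, and the thresholds are invented for illustration only and are not clinical guidance.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LabResult:
    patient_id: str
    test: str
    value: float


@dataclass
class Rule:
    diagnosis: str                    # secondary diagnosis the rule hints at
    test: str                         # lab test the rule inspects
    matches: Callable[[float], bool]  # True when the value suggests the diagnosis


# Hypothetical thresholds, for illustration only; not clinical guidance.
RULES = [
    Rule("anemia", "hemoglobin_g_dl", lambda v: v < 10.0),
    Rule("acute kidney injury", "creatinine_mg_dl", lambda v: v > 2.0),
]


def flag_charts(results, documented, rules=RULES):
    """Return {patient_id: sorted diagnoses suggested by labs but not yet documented}."""
    flags = {}
    for result in results:
        for rule in rules:
            if result.test != rule.test or not rule.matches(result.value):
                continue
            # Skip diagnoses the physician has already documented.
            if rule.diagnosis in documented.get(result.patient_id, set()):
                continue
            flags.setdefault(result.patient_id, set()).add(rule.diagnosis)
    return {pid: sorted(found) for pid, found in flags.items()}
```

In this sketch, a chart is flagged only when a rule fires and the matching diagnosis is missing from the record, which mirrors the goal of catching under-documented conditions rather than listing every abnormal lab value.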
The analytics and documentation project teams first launched the automated review process in May 2014 by running queries against the data available in Epic’s Clarity data warehouse.
Now they want to migrate the process to Hadoop, so they can take advantage of access to real-time data available in that environment. The team is testing this now, and expects to be in production mode sometime later this fall.
The Hadoop process layers the real-time data (the changes happening to patients’ records throughout the day) on top of the batch data (the contextual information on the patient, such as their name and primary reason for being in the hospital).
The real-time data is stored in HBase, a distributed, non-relational database that runs on Hadoop, while the batch data is stored in Hive, a data warehouse layer that supports SQL-like queries on Hadoop.
“The blending of base batch data and real-time updates happens on demand when a query is run against the system,” Boal says. “We use Hive on top to merge a composite view.”
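The on-demand blending Boal describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Mercy's code: plain dictionaries stand in for the Hive batch tables and the HBase updates, and the function name `composite_view` and the record layout are hypothetical.

```python
def composite_view(batch_rows, realtime_updates):
    """Merge base batch rows with later real-time updates at query time.

    batch_rows: {patient_id: {field: value}} from the nightly batch load.
    realtime_updates: iterable of (timestamp, patient_id, field, value) tuples.
    """
    # Copy the batch data so the stored base rows are never modified.
    view = {pid: dict(fields) for pid, fields in batch_rows.items()}
    # Apply updates oldest first, so the most recent value for a field wins.
    for _ts, pid, field, value in sorted(realtime_updates, key=lambda u: u[0]):
        view.setdefault(pid, {})[field] = value
    return view
```

Because the merge happens when the view is requested, nothing has to be rewritten in the batch store during the day; each query simply sees the nightly baseline plus whatever has changed since.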
With either the SQL-based or the Hadoop-based data process, reports prioritizing patient records for review based on documentation omissions are created using SAP Business Objects, which is Mercy’s standard BI tool. The reports also are integrated directly into Epic’s EMR, including hyperlinks to the pertinent patient records within the EMR.
Documentation specialists log in to Epic’s EMR to review the reports every morning. They then seek out the appropriate physicians in person during morning clinical rounds on the inpatient hospital units.
The documentation specialists expect to bill more than $1 million annually in new revenue through claims that accurately reflect hospital patients’ secondary diagnoses and medical complications.
The documentation specialists already have a second project on the to-do list. They’d like to analyze patterns in physicians’ documentation behavior, such as which “physicians we have to follow up with the most and how responsive they are to queries and questions about their documentation,” Boal says. This type of information would give the team insight into how best to interact with individual physicians or even to notice patterns in physicians’ documentation omissions.
As Hadoop is quite different from a relational database environment, the team members adopted a practical approach that leveraged their experience with SQL databases, including these tactics:
- Using Hive extensively, which lets the team take advantage of its familiarity with SQL. In many cases, team members can run the same queries in Hive that they ran in a SQL environment. But not always. “When you reach a certain level of complexity, there are some times when you have to break a large query into several sub steps,” says Adam Doyle, lead application developer at Mercy.
- Repurposing the information already in Epic’s Clarity warehouse. To do this, they add metadata around the existing data to introduce new functionality not available in a SQL environment.
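Doyle's point about breaking a large query into sub-steps can be illustrated with a small, runnable sketch. Python's built-in `sqlite3` module stands in for Hive here, and the tables, columns and threshold are hypothetical; the staging pattern is the point, not the specific SQL dialect.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical lab and diagnosis tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE labs (patient_id TEXT, test TEXT, value REAL);
        CREATE TABLE diagnoses (patient_id TEXT, diagnosis TEXT);
        INSERT INTO labs VALUES ('p1', 'hemoglobin', 8.5);
        INSERT INTO labs VALUES ('p2', 'hemoglobin', 13.0);
        INSERT INTO labs VALUES ('p3', 'hemoglobin', 9.0);
        INSERT INTO diagnoses VALUES ('p3', 'anemia');
    """)
    return conn


def flag_undocumented_anemia(conn):
    """Run one logical query as three smaller, staged steps."""
    # Step 1: stage patients whose lab value crosses an illustrative threshold.
    conn.execute("""
        CREATE TEMP TABLE low_hgb AS
        SELECT DISTINCT patient_id FROM labs
        WHERE test = 'hemoglobin' AND value < 10.0
    """)
    # Step 2: stage patients whose chart already documents the diagnosis.
    conn.execute("""
        CREATE TEMP TABLE documented AS
        SELECT DISTINCT patient_id FROM diagnoses
        WHERE diagnosis = 'anemia'
    """)
    # Step 3: keep staged candidates that have no matching documentation.
    rows = conn.execute("""
        SELECT l.patient_id FROM low_hgb AS l
        LEFT JOIN documented AS d ON d.patient_id = l.patient_id
        WHERE d.patient_id IS NULL
        ORDER BY l.patient_id
    """).fetchall()
    return [row[0] for row in rows]
```

Each intermediate table can be inspected on its own, which makes a complex query easier to debug than a single deeply nested statement.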
The data management and analytics team members not only have administrative projects on the horizon but clinical ones as well.
For example, they’d like to mine the continuous stream of data coming from electronic monitors that track the vital signs of patients in intensive care units to refine predictive models on the early warning signs of life-threatening medical problems, such as sepsis. Mercy already has such predictive models, but Boal believes they could become more “accurate and more timely” by harnessing the real-time data in the Hadoop infrastructure.
“This is about identifying patient risk sooner,” Boal says. And saving lives.