What began as a limited use of Hadoop at Children’s Healthcare of Atlanta is becoming a full-fledged big data initiative that is helping the organization provide better care for patients and deliver information that could potentially help citizens of Georgia avoid health problems in the future.

Children’s Healthcare provides a variety of healthcare services for children throughout the state, operating three hospitals and more than 20 neighborhood locations including five urgent care centers.

The institution’s foray into Hadoop began in 2013. A clinical research project it was working on with the Georgia Institute of Technology needed bedside vital-monitor data—including heart rate, blood pressure, respiratory rate and oxygen saturation—from Children’s Pediatric Intensive Care Unit (PICU). Georgia Tech wanted to leverage historical, granular data from the monitors to understand what effect, if any, the environment of care—such as noise and light—had on patient vital signs. With that information, caregivers could improve the environment of care by instituting quiet hours, moving noisy machines or redesigning care areas.

“Their timeframe was short, and we needed a solution quickly” to gather and analyze data, says Tod Davis, manager of business intelligence at Children’s Healthcare. “After investigating the data volume, flow and processing needs, it was clear that our current systems, already stretched to their max, could not handle the workload. Enter Hadoop.” Apache Hadoop is an open-source software framework for distributed storage and processing of large data sets on computer clusters.


Davis worked with an Oracle contractor to devise a plan to assemble a workstation-based cluster over a weekend. The first proof of concept, a six-node cluster built out of 20 scavenged PCs from a hardware refresh, was created in October 2013. “We nicknamed it ‘Frankendoop,’” Davis says. “Since then, we’ve migrated to an eight-node HP cluster and are currently implementing a 23-node Cisco cluster.”

The IT staff then began collecting data and sending it to Georgia Tech.

“About three months later I got a call from a nurse who said [the unit needed] to be able to understand what’s happening to babies when they have stressful procedures,” Davis says. “I said, ‘What if I told you we [collected] all this data….’ She couldn’t believe that we had the data.”

With the Hadoop tools already in place, Davis and his team created a new project for gathering and analyzing bedside vitals in real time.

The data analysis conducted as part of the project resulted in improved patient outcomes. It showed that the vital signs deviated from the baseline for much longer than anyone knew, Davis says. “Erratic or elevated vitals is an indicator of patient stress,” he says. “The clinicians are now aware of these extended stressful periods and can stay with the patient to comfort and assist them in recovery from the procedure.”
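The deviation-from-baseline analysis described above can be sketched in a few lines. The function below is a hypothetical illustration only; the rolling-window baseline, the 20 percent threshold and the flagging logic are assumptions for the sake of the example, not Children’s actual method:

```python
from statistics import mean

def stress_periods(samples, window=60, threshold=0.2):
    """Flag sample indices where a vital sign deviates from its rolling
    baseline by more than `threshold` (as a fraction of the baseline).
    `samples` is a list of numeric readings taken at a fixed interval."""
    flagged = []
    for i, value in enumerate(samples):
        history = samples[max(0, i - window):i]
        if not history:
            continue  # no baseline yet for the first sample
        baseline = mean(history)
        if baseline and abs(value - baseline) / baseline > threshold:
            flagged.append(i)
    return flagged

# Example: a heart-rate trace with a stress spike midway through
trace = [120] * 100 + [160] * 10 + [120] * 100
print(stress_periods(trace))  # flags the ten elevated samples
```

A production version would run continuously over the monitor feed rather than a list, but the core idea — compare each reading against a recent baseline and flag sustained deviation — is the same.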

The results led to a retraining of hospital staff to equip them to better assess and understand neonatal pain, agitation and sedation scale (N-PASS) scores, providing information to clinicians so they can improve pain management in premature babies.

“What really struck me the most is we took something that we had no idea what it might be and turned it into this really powerful story about technology and helping people, and specifically helping babies,” Davis says. “Much of this data was previously being thrown away. We thought it wasn’t useful or we didn’t have the storage capacity.”

In many cases the vital sign data wasn’t being saved beyond three days because of the cost of storage. “There is so much data [being gathered] that we only keep a tiny percentage in our electronic medical record and associated data warehouses,” Davis says. “We now have bedside vital data from October 2013 to the present. The word is out among the physician researchers at Children’s and demand for analysis is high and growing.”

The bedside vital data project was fortuitous from an IT standpoint, because Children’s Healthcare technology leaders had been eager to launch data analytics efforts and this proved to be a good starting point.

“We needed to get into Hadoop and a Hadoop-ready project landed in our lap,” Davis says. “What started as a minor technical solution to a temporary need has opened many doors to develop technical solutions to help take better care of our patients.”

The next Hadoop initiative the organization launched, which is ongoing, involves an asthma research study that combines 20 years of air quality data from the Environmental Protection Agency (EPA) with the hospital’s own asthma research. The goal: reduce emergency room visits and inpatient readmissions for asthma-related issues prevalent in pediatric populations.

The EPA has sensors located throughout Georgia that collect a variety of data related to weather conditions such as temperature, humidity, wind direction, particulates and pollutants.

Children’s Healthcare received permission from the EPA to pull all of that data, dating from 1985 to the present, into its Hadoop platform so it could analyze the information and graph the data across patient visits related to asthma or other respiratory conditions.
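The core of such an analysis is a join of pollutant readings against admission counts by date. The sketch below is a minimal pure-Python illustration; the field names (`pm25`, `patient_id`) and record shapes are assumptions, and the real datasets would be processed at scale in Hadoop rather than in memory:

```python
from collections import defaultdict

# Hypothetical daily records; the real EPA and EMR feeds carry many more fields.
air_quality = [
    {"date": "2015-06-01", "pm25": 42.0},
    {"date": "2015-06-02", "pm25": 18.5},
]
asthma_visits = [
    {"date": "2015-06-01", "patient_id": "a1"},
    {"date": "2015-06-01", "patient_id": "a2"},
    {"date": "2015-06-02", "patient_id": "a3"},
]

# Count ER visits per day, then join on date with the pollutant readings.
visits_per_day = defaultdict(int)
for visit in asthma_visits:
    visits_per_day[visit["date"]] += 1

joined = [
    (reading["date"], reading["pm25"], visits_per_day[reading["date"]])
    for reading in air_quality
]
print(joined)
```

Once joined, the per-day pairs of pollutant level and visit count can be graphed or correlated across patient visits, as the study describes.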

The study, which Children’s is also conducting with Georgia Tech, is aimed at helping to find the causes of readmissions for asthma.

“Part of the study is air quality data that can be correlated with patient admissions to the emergency room and the return rate of asthma patients who had been discharged,” Davis says. “Asthma is a big focus of pediatric healthcare.”

Now that the organization has tested the Hadoop waters, big data and analytics has become a more strategic part of its IT operations, Davis says. Among the technology components supporting big data are Cloudera’s CDH5 Hadoop distribution platform; Cloudera Manager, a management application for Apache Hadoop; an eight-node HP cluster; QlikView, an enterprise analytics and visualization tool; Cloudera’s CDH 5.3 Enterprise Data Hub; a six-node Cisco development cluster; and a 17-node Cisco production cluster.

As part of its data analysis initiatives, Children’s Healthcare is managing about 14 terabytes of data in total, and this is growing by some 50 gigabytes per week. “That number represents the projects we had the time and need to perform using Hadoop in 2014,” Davis says. “We have several projects in 2015 which will increase that number tenfold.”

These include streaming vital sign analytics with Apache Spark, which the provider calls “a fast and general processing engine” compatible with Hadoop data; integrating near real-time care events into data analysis; ingesting higher-frequency vital sign data from the cardiac ICU; and analyzing activity data from Fitbit wearable health and fitness products.

Meeting the Challenges

The adoption of Hadoop has come with challenges, including getting up to speed on the technology.

Children’s Healthcare’s IT infrastructure is, by and large, a Windows shop, and the HP and Cisco machines are Linux based. “We had limited numbers of Unix/Linux systems and administrators,” Davis says. “We also had limited non-Windows familiarity on the development team. Lack of familiarity with the tool set and time to learn is always a barrier to adoption of new technology.”

As of today, the organization has sent two Unix/Linux administrators to Hadoop training and has one certified Hadoop developer and administrator (Davis).

“I have begun training other members of the development team to broaden and deepen our commitment to Hadoop,” Davis says. “The learning curve can be steep when trying to stitch together a comprehensive solution.”

Children’s Healthcare has not conducted a formal cost/benefit analysis for its Hadoop deployments, but plans to do so soon.

“I know anecdotally that we can store and process much more data than we can in either of [Children’s Healthcare’s existing] databases,” Davis says. “We don’t need to add a $30,000 server and licensing fees. We also don’t have to pay for backup storage space in our storage-area network.”

With Hadoop, Davis says, fault tolerance is built in at the software layer, “and that’s been important to us. We had two disks go bad and I got a late-night call from the administrator about it. Fortunately, because of Hadoop, the system still worked perfectly.”
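The fault tolerance Davis describes comes from HDFS block replication: every block of a file is stored on multiple DataNodes (three by default), so when a disk fails, the lost blocks are re-replicated from the surviving copies and reads continue uninterrupted. The replication factor is a standard setting in `hdfs-site.xml`:

```xml
<!-- hdfs-site.xml: keep each HDFS block on this many DataNodes -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
```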

The early experience with Hadoop is clearly creating new big data possibilities for the organization. “From a solution and development perspective, the sheer volume of storage, reduced need for preprocessing, and flexible tool set makes projects easier to begin and execute,” Davis says.

Children’s Healthcare is now developing a long-term strategy for big data.

“[But] for the near term, we’ll continue to integrate Hadoop into our technology stack while the tool sets continue to evolve,” Davis says.
