Skepticism is a two-edged sword. Not enough of it, and an IT manager might find himself duped into investing in software "solutions" that go nowhere. Too much of it, and skepticism can leave an IT department behind as it waits for enough proof to show a particular platform will improve outcomes beyond a reasonable doubt.
Big data analytics is at that tipping point right now in the healthcare industry. Several vendors promise better quality of care and reduced expenditures, but evidence to support those claims is somewhat tentative. Similarly, some critics of the big data movement say healthcare providers need to squeeze all the intelligence they can from small data sets before moving on to larger projects.
In a recent post in The Health Care Blog, for instance, consultants David C. Kibbe, M.D., and Vince Kuraitis argue that instead of succumbing to the allure of big data analytics, providers should focus on using small data better. In other words, concentrate on the clinical data already available in digitized form and use only those health IT tools that are directly applicable to care management.
Big data analytics, on the other hand, attempts to parse mounds of data from many disparate sources to discover patterns that could be useful in problem solving. For example, researchers are employing the big data approach to study genetic and environmental factors in multiple sclerosis to search for personalized treatments.
Some of this research might lead to exciting payoffs down the road, but IT companies are not waiting. As Kibbe and Kuraitis point out, technology firms are touting big data analytics as a must-have for healthcare systems and physician groups that aim to become accountable care organizations or make ACO-like arrangements with payers. As these ACOs and healthcare organizations try to profit under shared-savings or financial risk contracts, these proponents claim, big data can help them crunch the data for quality improvement and cost reductions.
Some providers are already using big data in patient care. According to BusinessWeek, "many [providers] are turning to companies such as Microsoft, SAS, Dell, IBM, and Oracle for their data-mining expertise." And healthcare analytics is a growth business. Frost & Sullivan projects that half of hospitals will be using advanced analytics software by 2016, compared to 10% today.
Are healthcare providers ready for big data analytics, or should they be content with the more limited data analytics capabilities built into their EHR systems and relational databases to point the way to new policies and procedures?
When asked to weigh in on the big data/small data debate during a recent interview with InformationWeek Healthcare, David Blumenthal, former head of the Office of the National Coordinator for Health IT, said, "It's not an either/or choice. Big data starts with small data. As we have more information on health and disease and the patterns of care ... that information will provide useful insights into what works, what doesn't. What the natural history of disease is. It will enable us to do studies faster and more efficiently ... But it's going to take a while to figure out how to use the data."
As for the skepticism heard from many big data critics, Blumenthal said, "[We] take on faith that science offers opportunities. And most of the time, our faith is vindicated."
With that perspective in mind, InformationWeek Healthcare looked at seven companies and large medical centers that have already jumped into the water.
Explorys, a Cleveland Clinic spinoff, offers a cloud-based performance management platform that taps into a healthcare provider's clinical, financial and operational data to look for previously undiscovered patterns. Among its clients are St. Joseph Health System, MedStar and Catholic Health Partners.
Rather than relying on relational databases, as old-school analytics does, the company has enlisted the services of Cloudera, a Hadoop-based software and services firm, to help its engineers and informaticians do the heavy lifting.
The Explorys platform allows providers to do three things: search across patient populations and care venues to help identify disease trends; coordinate rules-driven patient registries; and view performance metrics -- a key ingredient if an organization plans to meet ACO requirements.
Of course, all this firepower is meaningless if it doesn't generate the hard data to demonstrate better quality of care and lower costs.
Anil Jain, M.D., chief medical information officer at Explorys, explained that because the company is relatively young, it has yet to generate those kinds of results. Put another way, there's no proof that it can reduce the number of foot amputations in diabetics or reduce the number of myocardial infarctions in patients with pre-existing heart disease.
But some of the data generated by Explorys suggests it is approaching that target. Working with Catholic Health Partners in Cincinnati, for instance, the analytics platform helped increase pneumonia vaccination rates by 14%, breast cancer screenings by 13% and HbA1c testing of diabetics -- a measure of long-term blood glucose control -- by 3%.
A recent report in the Journal of the American Medical Informatics Association (JAMIA) outlined an Explorys project that looked at EHR-generated patient data from nearly 1 million patients from several different healthcare systems. The analysis helped clinicians pinpoint those most at risk for blood clots in the extremities and lungs. The analysis took only 125 hours and required minimal manpower for a project that would typically take years to perform using traditional research methods.
Humedica offers a cloud-based population-wide analytics system. It connects patient information across varied medical settings -- ambulatory and inpatient -- and time periods to generate a longitudinal view of patient care. The company has data on close to 25 million patients in more than 30 states, which allows individual clients to compare their performance against a very large population.
The company's service integrates, normalizes and validates clinical data from across the continuum of care to include not only medications, lab results, vital signs, demographics, hospitalizations and outpatient visits, but also physician notes, taking advantage of both structured and unstructured data. Its client base draws from four categories: integrated delivery networks (IDN), large academic medical centers, multi-hospital health systems and large multi-practice medical groups.
A case in point: Mid Hudson Medical Group's patient-centered medical home has been using Humedica's MinedShare analytics service to measure its patient population and compare its services against industry best practices.
For instance, the 125-physician practice was able to extract data on its diabetic patients to determine which patients had an HbA1c reading above 7% on their last visit -- an indication of less-than-optimal blood glucose control -- and who had not been seen by a physician in 12 months. With that ammunition in hand, the medical home reached out to these at-risk patients and was able to see about one-third of them at least once within the first eight months of the program. In this group, one-third achieved an HbA1c under 8% and 60% of those with an HbA1c over 9% are being intensively managed through frequent visits with their primary care physician.
As further evidence that Mid Hudson is seeing a return on its investment in the clinical metrics provided by Humedica MinedShare, the provider has now achieved level 3 recognition by the National Committee for Quality Assurance (NCQA).
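The registry filter described above can be sketched in a few lines of Python. This is only an illustration of the two criteria named in the text (a last HbA1c reading above 7% and no physician visit in 12 months); the `PatientRecord` fields, the function names and the sample records are assumptions made for the example and do not reflect how Humedica's MinedShare service is actually built.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class PatientRecord:
    # Hypothetical record layout, invented for this sketch
    patient_id: str
    last_hba1c_pct: float  # most recent HbA1c reading, in percent
    last_visit: date       # date of the most recent physician visit


def needs_outreach(rec, today, hba1c_threshold=7.0, max_gap_days=365):
    """True if the last HbA1c is above the threshold and the last visit is older than the gap."""
    overdue = (today - rec.last_visit) > timedelta(days=max_gap_days)
    return rec.last_hba1c_pct > hba1c_threshold and overdue


def outreach_list(records, today):
    """Return the IDs of patients who meet both outreach criteria."""
    return [r.patient_id for r in records if needs_outreach(r, today)]


# Made-up sample records, not real patient data
records = [
    PatientRecord("A", 8.4, date(2011, 3, 1)),   # high reading, not seen in over a year
    PatientRecord("B", 6.5, date(2011, 1, 15)),  # reading at or below 7%, so not flagged
    PatientRecord("C", 9.2, date(2012, 9, 1)),   # high reading, but seen recently
]

print(outreach_list(records, today=date(2012, 12, 1)))
```

In practice the thresholds would be set by clinicians, and the 365-day window is simply one way to read "12 months."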
InterSystems likes to remind healthcare providers that even a large enterprise warehouse might not be enough to provide all the intelligence needed to improve quality of care and generate significant savings. And the emergence of accountable care organizations and similar pay-for-performance models is making the need for such intelligence all the more urgent.
InterSystems offers its HealthShare healthcare informatics platform, with its embedded Active Analytics component, to address the issue. Like many other big data vendors, it collects, aggregates, normalizes and presents patient data from a variety of silos to help decision makers improve their clinical and financial outcomes.
Rhode Island is using HealthShare statewide to facilitate health information exchange, as well as aggregate and analyze patient data. This enables the state's medical practices to exchange clinical summaries to improve coordination of care, a major component of ACOs.
Gary Christensen, CIO of the Rhode Island Quality Institute, praised HealthShare in a testimonial on the InterSystems website, saying "... HealthShare gives RIQI the analytics needed to target cost savings and provide a level of quality of care that physicians can't get by looking at their own records." During a recent interview with InformationWeek Healthcare, Christensen said his team used InterSystems' analytics tools to determine that 8% to 12% of major lab tests done in more than a quarter of the population of Rhode Island were duplicative and medically unnecessary.
The nation of Sweden also has tapped into InterSystems' firepower, using HealthShare to create a national EHR system for 9 million people. The system is a browser-based display of patient demographics, medication lists, lab data, allergies and related information.
Insurance fraud, one of the healthcare industry's most vexing problems, takes up a lot of Pervasive's time and attention. Pervasive's DataRush, an application framework and analytics engine for high-speed parallel data processing on multi-core computers and multiple computer clusters, helps service providers with contracts with state agencies detect Medicaid fraud. In one case study highlighted on its website, Pervasive boasts about helping to recover reimbursement for Medicaid claims that should have been collected from private insurers.
To detect fraud, some service providers match insurance files using SQL Server, a long, tedious process. DataRush's fast-paced fuzzy matching system searches two databases -- one containing Medicaid claims from the state and the other the names of patients enrolled in private plans -- to find overlap. The end result has been lower operational costs and quicker ROI, according to a Pervasive case report.
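To give a sense of what fuzzy matching between two such files involves, here is a minimal Python sketch. It is not DataRush's algorithm, which the case report does not describe; it swaps in the standard library's `difflib.SequenceMatcher` as the similarity measure. The record layouts, the date-of-birth check, the 0.85 threshold and the sample names are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop punctuation and collapse whitespace."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity score between 0 and 1 for two names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_overlap(medicaid_claims, private_enrollees, threshold=0.85):
    """Return (claim_id, claimant, enrollee, score) for likely matches.

    Only pairs with identical dates of birth are compared, an assumed
    shortcut that keeps the number of name comparisons down.
    """
    matches = []
    for claim_id, claimant, dob in medicaid_claims:
        for enrollee, enrollee_dob in private_enrollees:
            if dob != enrollee_dob:
                continue
            score = similarity(claimant, enrollee)
            if score >= threshold:
                matches.append((claim_id, claimant, enrollee, round(score, 2)))
    return matches


# Made-up sample data: (claim ID, name, date of birth) and (name, date of birth)
medicaid_claims = [
    ("C1", "Jonathan Smith", "1950-04-02"),
    ("C2", "Maria Lopez", "1962-11-30"),
]
private_enrollees = [
    ("Jonathon Smith", "1950-04-02"),  # spelling variant of the C1 claimant
    ("Mary Lopes", "1975-01-01"),      # different date of birth, never compared
]

matches = find_overlap(medicaid_claims, private_enrollees)
for match in matches:
    print(match)
```

The nested loop compares every claim against every enrollee, which is fine for a toy example but is exactly the kind of work that becomes slow on large files and that a parallel engine is meant to speed up.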
Clinical Query might not have the same profit motive as commercially available big data companies, but it can certainly hold its own in the race to squeeze intelligence from mountains of untapped medical data.
Clinical Query, a medical informatics platform in use at Beth Israel Deaconess Medical Center, was designed to improve quality while reducing costs. To accomplish these twin goals, clinicians need to focus not only on the care of the patients sitting in front of them, but also the larger population with the same disease or condition -- so-called population health management. That mandate requires data analytics tools that are much more sophisticated than most.
Enter Clinical Query. John Halamka, BIDMC's CIO, refers to it as a clinical trials/clinical research business intelligence system. It's a search engine married to a huge database of patient records that lets hospital employees test hypotheses about what causes a disease, for instance, or test which drug, diet or lifestyle variables might reduce the risk of developing one.
The repository contains 200 million data points on 2.2 million patients, including medications taken, diagnoses and lab values. The query tool is capable of navigating 20,000 medical concepts through the use of Boolean expressions. All the data has been mapped to standard medical language codes. Diagnoses, for instance, have been mapped to ICD-9; medications to RxNorm codes; and lab data to Logical Observation Identifiers Names and Codes (LOINC).
With the help of Clinical Query, a clinician or researcher might, for instance, search the records to find out how many patients with breast cancer also take ACE inhibitors, a class of drugs used to treat high blood pressure. If the results reveal a strong correlation between the drug and the malignancy, the hospital could do a deeper analysis and set up a formal research project to investigate the link.
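A Boolean query over coded concepts, like the breast cancer and ACE inhibitor search above, might look something like the following Python sketch. It is not BIDMC's implementation. The data layout, helper names and sample patients are invented for the example, and the codes shown (ICD-9 174.9 and 401.9, and an RxNorm identifier believed to correspond to lisinopril, one ACE inhibitor) are illustrative values that should be verified against the code systems before any real use.

```python
# Each patient ID maps to the set of coded concepts found in that patient's record.
# Made-up sample data; the codes are illustrative and should be verified.
patients = {
    "p1": {"ICD9:174.9", "RXNORM:29046"},                # breast cancer + ACE inhibitor
    "p2": {"ICD9:174.9"},                                # breast cancer only
    "p3": {"RXNORM:29046", "ICD9:401.9"},                # ACE inhibitor + hypertension
    "p4": {"ICD9:174.9", "RXNORM:29046", "ICD9:401.9"},  # all three concepts
}


def has(code):
    """Predicate: the record contains this coded concept."""
    return lambda concepts: code in concepts


def all_of(*preds):
    """Boolean AND over predicates."""
    return lambda concepts: all(p(concepts) for p in preds)


def any_of(*preds):
    """Boolean OR over predicates."""
    return lambda concepts: any(p(concepts) for p in preds)


def negate(pred):
    """Boolean NOT of a predicate."""
    return lambda concepts: not pred(concepts)


def run_query(pred, population):
    """Return the sorted IDs of patients whose concept set satisfies the predicate."""
    return sorted(pid for pid, concepts in population.items() if pred(concepts))


breast_cancer = has("ICD9:174.9")
ace_inhibitor = has("RXNORM:29046")

both = run_query(all_of(breast_cancer, ace_inhibitor), patients)
cancer_without_drug = run_query(all_of(breast_cancer, negate(ace_inhibitor)), patients)

print(len(both), "patients with both concepts:", both)
print(len(cancer_without_drug), "patients with breast cancer but no ACE inhibitor:", cancer_without_drug)
```

Counts like these only show how often two concepts appear together. As the article notes, a strong correlation would be a starting point for a formal research project, not a finding in itself.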
The ultimate goal would be to discover a new medical intervention that would improve the survival of the entire population of breast cancer patients.
"What's unique about Clinical Query is that it's completely self-service," Halamka said. "I didn't have to go out and hire an analyst. I didn't have to get special permission to get access or approval from our [institutional review board] to use it."
During a recent Digital Health Conference sponsored by the New York eHealth Collaborative, Martin Kohn, M.D., chief medical scientist at IBM, and Pat Skarulis, CIO at Memorial Sloan-Kettering Cancer Center (MSKCC) in New York, outlined a joint venture to use the Watson supercomputer's big data capabilities to help oncologists provide better care for MSKCC patients.
Kohn pointed out that Watson isn't just a "search engine on steroids," or even a massive database. It relies on parallel probabilistic algorithms to analyze millions of pages of unstructured text in patient records and the medical literature to locate the most relevant answers to diagnostic and treatment-related questions.
Ninety percent of the world's data has been created in the last two years, and 80% of that data is unstructured. As any clinician with a pile of unread medical journals knows, that massive collection of information includes far too many papers for any one human to read.
Watson reads it for them at lightning-fast speed.
With the help of natural language processing (NLP), the computer not only pulls out relevant terms to match the search terms in a clinician's query, but it also understands the idioms and other idiosyncratic expressions in the English language. And with the help of temporal, statistical paraphrasing and geospatial algorithms, it finds meaningful relationships between the clinician's question and its massive collection of medical facts and theories.
MSKCC decided to collaborate with IBM to "build an intelligence engine to provide specific diagnostic test and treatment recommendations," Skarulis said. The two organizations now are combining data from MSKCC's massive database, called Darwin, with all of Watson's NLP capabilities. IBM is using all of the medical center's structured patient data and its NLP tools to convert the medical center's free text consultation notes into usable data. Skarulis hopes to launch a pilot shortly that will allow the supercomputer to work on real medical cases.
The University of Pittsburgh Medical Center (UPMC) is taking its big data initiative a step further, investing $100 million to create a comprehensive data warehouse that brings together data from more than 200 sources across UPMC, UPMC Health Plan and other affiliated entities.
To collect, store, manage and analyze the information maintained in the data warehouse, UPMC will use the Oracle Exadata Database Machine, a high-performance database platform; IBM's Cognos software for business intelligence and financial management; Informatica's data integration platform; and dbMotion's SOA-based interoperability platform, which integrates patient records from healthcare organizations and health information exchanges. These tools will manage the 3.2 petabytes of data that flows across UPMC's business divisions.
The goal is to help physicians tap into a more intelligent EHR; flag patients at risk for kidney failure based on subtle changes in lab results; or predict the most effective, least toxic treatment plan for an individual breast cancer patient based on her genetic and clinical information. In the case of breast cancer, much of this work will be done through analyzing groups of patients so that researchers and physicians can follow their reaction to treatments and their health status over time.
Officials at UPMC explained that they will begin using their new analytical tools on data gathered from a group of 140 breast cancer patients who were previously studied. Researchers already have both genomic and EHR data for these patients, which will give researchers a head start in their quest to understand the nuances of individuals and their response to medical treatment.
Neil de Crescenzo, senior VP and general manager of Oracle Health Sciences, said the initiative is important both for Oracle and UPMC because the enterprise healthcare analytics platform they're developing integrates clinical, genomic, financial, administrative and operational data from across the organization. These all are areas that need to drive greater efficiency into their workflows as UPMC tackles the challenges of coping with the exponential growth in data.
To sort through its data challenges, UPMC will use a wide range of Oracle tools, including Oracle Enterprise Healthcare Analytics and Oracle Health Sciences Network. UPMC also will implement Oracle Fusion Analytics, as well as multiple components of Oracle Fusion Middleware such as Oracle Hyperion Profitability and Cost Management to support cost-based accounting and Oracle Identity and Access Management Suite Plus for regulatory compliance and data protection.