How EHRs Feed The Clinical Research Pipeline

Natural language processing programs can now data mine e-records to help locate the best candidates for clinical trials. Several major healthcare organizations have taken notice.
As healthcare moves gradually from a fee-for-service to a pay-for-performance model, it will be judged on how closely clinicians adhere to evidence-based guidelines. The best evidence comes from controlled clinical trials, but since so few treatment protocols are supported by these trials, the movers and shakers in clinical medicine are looking to fill the void by recruiting large groups of subjects willing to enroll in them. Not an easy task.

What has surprised many IT managers and clinicians is how valuable EHRs are proving to be in fueling that research effort.

The Mayo Clinic, for instance, has been mining its EHR data for several years to find subjects for clinical trials. When choosing patients to enroll in such experiments, one of the challenges is to find those who meet predetermined clinical criteria. Sifting through paper files is a nightmare, and even having clinicians manually search e-records for eligible candidates takes far too long.

With that in mind, Mayo Clinic has employed natural language processing (NLP) to speed things along. In one project, it needed to locate patients with heart failure to enroll in a study. The NLP-enabled algorithm was engineered to search through EHRs--including free-text clinician notes--to locate patients with cardiomyopathy, congestive heart failure, pulmonary edema, and a variety of other relevant conditions.

But the system didn't stop there. It automatically searched for synonyms in a database of 16 million problem list entries--all of which were written in unstructured natural language. The same algorithm was capable of weeding out patients with "negation indicators"--patients whose records said something like "patient denies symptoms of heart failure."
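To make the idea concrete, here is a minimal sketch of term matching with negation handling. It is not Mayo Clinic's algorithm, which the article does not describe in detail. The term list, the negation cues, and the function names `mentions_condition` and `find_candidates` are all hypothetical, and the negation rule (a cue appearing earlier in the same sentence) is a deliberately crude assumption.

```python
import re

# Hypothetical term list; a production system would draw on a far larger
# synonym database than these four entries.
HEART_FAILURE_TERMS = [
    "congestive heart failure",
    "heart failure",
    "cardiomyopathy",
    "pulmonary edema",
]

# Hypothetical negation cues. Assumption: a cue negates a term only when it
# appears earlier in the same sentence.
NEGATION_CUES = ["denies", "no evidence of", "negative for", "without"]

# Longest terms first so the alternation prefers the most specific match.
TERM_RE = re.compile(
    "|".join(re.escape(t) for t in sorted(HEART_FAILURE_TERMS, key=len, reverse=True)),
    re.IGNORECASE,
)
NEGATION_RE = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in NEGATION_CUES) + r")\b",
    re.IGNORECASE,
)


def mentions_condition(note: str) -> bool:
    """Return True if some sentence mentions a target term with no negation cue before it."""
    for sentence in re.split(r"[.!?]+", note):
        for match in TERM_RE.finditer(sentence):
            preceding = sentence[: match.start()]
            if not NEGATION_RE.search(preceding):
                return True
    return False


def find_candidates(notes_by_patient: dict) -> list:
    """Return IDs of patients with at least one note that affirms a target term."""
    return [
        patient_id
        for patient_id, notes in notes_by_patient.items()
        if any(mentions_condition(note) for note in notes)
    ]


# Toy records, invented for illustration only.
records = {
    "p1": ["Patient denies symptoms of heart failure."],
    "p2": ["History of dilated cardiomyopathy. No chest pain."],
    "p3": ["Follow-up for knee pain."],
}
print(find_candidates(records))
```

In this toy data, only the second patient is selected: the first note is excluded by the "denies" cue, and the third never mentions a target term. Real clinical NLP has to cope with misspellings, abbreviations, family history, and negation that follows the term, none of which this sketch attempts.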

Of course, critics of such IT projects may complain: There's no need to mine EHRs. Why can't you just look at the medical billing codes to locate cardiac failure patients? Among 3,226 patients in the Mayo Clinic experiment who were identified as having heart failure, 46% were spotted by the NLP system but not with routine medical billing codes.
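The comparison behind that figure amounts to a set difference. The sketch below uses invented patient IDs and assumes the percentage is taken over all patients identified by either method; the function name `nlp_only_share` is hypothetical. The last line simply applies the article's 46% to its 3,226 patients.

```python
def nlp_only_share(nlp_ids: set, billing_ids: set) -> float:
    """Fraction of all identified patients found by NLP but missed by billing codes."""
    all_identified = nlp_ids | billing_ids   # assumption: denominator is the union
    nlp_only = nlp_ids - billing_ids         # found by NLP, absent from billing codes
    return len(nlp_only) / len(all_identified)


# Toy IDs, invented for illustration only.
toy_nlp = {1, 2, 3, 4, 5}
toy_billing = {4, 5, 6}
print(nlp_only_share(toy_nlp, toy_billing))

# Applying the article's figures: 46% of 3,226 patients.
approx_nlp_only_patients = round(0.46 * 3226)
print(approx_nlp_only_patients)
```

On the article's numbers, that works out to roughly 1,484 patients who would have been missed by a search of billing codes alone.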

Here's further evidence to support the value of mining EHR data. When experienced nurses at Mayo Clinic were asked to manually review patients' e-records to locate potential research subjects, it took them between two and eight hours per patient, depending on the complexity of the case. The Clinic's study clearly demonstrated that nurses can't extract relevant data from unstructured clinicians' notes as quickly as an NLP program can.

Even more ambitious than the Mayo Clinic project are recent IT initiatives that are trying to find a less expensive way to do clinical research. Double-blind, randomized clinical trials (RCTs) may be the gold standard in medical science because they keep both doctors and patients in the dark about who's getting the active treatment and who's getting the placebo. But they're incredibly costly and time-consuming.

A team led by David J. Magid, MD, director of research at the Colorado Permanente Group, has been able to search through thousands of the group's EHRs to figure out which anti-hypertensive drugs are most effective when patients don't respond to first-line treatment with diuretics. The team managed to keep its research costs down to $200,000, a small fraction of what an RCT would cost, and still come up with useful results, namely that beta blockers and ACE inhibitors work well.

A consortium of large healthcare systems, including Kaiser Permanente and Mayo Clinic, has taken this innovative approach and kicked it into high gear, joining forces to capitalize on the power of tens of millions of e-records to generate research. For example, they recently launched programs to mine their EHRs to compare treatment protocols for diabetes.

"With these large databases and detailed clinical information, we can conduct comparative, effective research in real world settings, with a full range of patients, not just those selected for clinical trials," Joe V. Selby, director of Kaiser's research division, states in a recent issue of Scientific American.

Using natural language processing to mine EHRs may have sounded like science fiction a few short years ago. But no forward-thinking healthcare CIO can ignore the technique now and still claim to be well-informed.
