Across the globe, researchers are building massive databases of health information on patients and people with neurodegenerative diseases, including Alzheimer’s and amyotrophic lateral sclerosis. What can scientists learn from these initiatives and from the piles of health data now accumulating in electronic medical records? To find out, Amber Dance attended the Meaningful Use of Complex Medical Data Conference in Los Angeles. Read on to learn how data are being used to make medical decisions and predict illness and recovery, and about the challenges of turning health records into useful guidelines.
Meeting Mulls Over Use of Complex Medical Data
The crunching of masses of data has changed how people search the Internet, make friends, and purchase everything from soup to nuts. Now, the “big data” approach is poised to overhaul medicine, according to attendees at the Meaningful Use of Complex Medical Data, or MUCMD, conference, held 10-12 August 2012 at Children’s Hospital Los Angeles in California. The name refers, tongue in cheek, to U.S. government financial incentives for doctors who ditch paper patient records in favor of electronic versions. Only if the Centers for Medicare and Medicaid Services (CMS) deems e-health record use “meaningful” will the agency reward doctors up to $44,000 over five years.
MUCMD brought together all sorts of data aficionados, including statisticians, businesspeople, clinicians, and engineers. While they did not specifically address challenges in Alzheimer’s disease (AD) research or care, they discussed potential applications of medical data and the key obstacles to accomplishing their goals. Attendees would like to see medical recommendations based on hard data, rather than on what presenter Kenneth Mandl of Children’s Hospital Boston called “BOGSAT”—a Bunch Of Guys Sitting Around a Table.
The data are already out there—every doctor’s visit or hospital stay could be considered part of an experiment. What is missing, frequently, is a mechanism to access and analyze the data. And unlike controlled experiments, medical records provide a complex mélange of numerical and observational data, recorded at irregular intervals. The challenge is to sift through masses of records to find useful, actionable information. Computer scientists are still working on the best way to do that sifting. At the meeting’s hack night, programmers got a chance to test their ideas with a handful of datasets.
What might the data analysis yield? For starters, a clearer understanding of the full set of traits associated with a given disease, how much these vary, and how symptoms progress. It might help doctors discover biomarkers or risk factors for a variety of disorders, including neurodegenerative diseases. In the hospital, it could predict when a patient is about to suffer sepsis or another adverse event, allowing doctors to catch and treat the problem earlier. On a personal level, data might help individuals with diabetes understand why their blood sugar dips at certain points in the day. The possibilities are wide open—researchers are still speculating about what kinds of new correlations they might discover once they process the enormous datasets.
In general, however, obtaining access to medical datasets remains a serious challenge, particularly when the records were not originally collected as part of a study, but simply in the course of medical practice. MUCMD presenter Peter Szolovits of the Massachusetts Institute of Technology noted that it once took him the better part of a year to navigate the committees and institutional review boards that had to approve access. MUCMD organizers gave attendees the opportunity to wade through real-life data during the hack night. Over pizza, beer, and soda, approximately 15 participants passed around datasets on computer memory sticks (plus the associated data use agreements). They argued and scribbled their ideas on whiteboards until the lights went off at 11 p.m.
A handful of datasets were available to play with. One included records, scrubbed of anything that could identify the patients, from a diabetes prediction challenge sponsored by Practice Fusion, an electronic health record provider in San Francisco, California. The goal of the competition was to put together an algorithm that classifies people as diabetic or not, “giving doctors a picture of what is the characteristic diabetic patient,” said Jake Marcus, a data scientist at Practice Fusion, in an interview with Alzforum. He hopes this kind of analysis will give doctors a fuller understanding of the traits, risks, and complications associated with diabetes.
Data-based competitions such as Practice Fusion’s are a way to “plant the seed” of ideas and new questions for working with large databases, Marcus said. Neurodegenerative disease also has a place here; Prize4Life (which funds this reporter’s position) is offering $25,000 for an algorithm that predicts the progression of amyotrophic lateral sclerosis.
Making Data Meaningful
Scientists still grapple with how best to mine these databases. Many people put air quotes around the CMS guideline “meaningful” because no one knows precisely what that means, said David Kale, a computer scientist at Children’s Hospital and one of the conference organizers. At MUCMD, “we are talking about the real ‘meaningful’ use,” Kale quipped in an interview with Alzforum—that is, not just putting data into storage, but analyzing them and getting something in return.
To understand the potential of big data, think about the Framingham Heart Study, suggested Marcus. Beginning with 5,208 people in the Massachusetts town and following them and their descendants for six decades thus far, the Framingham study has made major contributions to research on heart disease and many other conditions. “Now,” Marcus told Alzforum, “imagine following 300 million people.” Early signs of Alzheimer’s, for example, might emerge from the right data analyzed in the right way.
Kale told Alzforum that the promise of big data is getting a lot of hype, but “there is still a whole lot of work to do.” In commerce and finance, big data has already made a difference. For instance, Amazon.com uses masses of data on what people buy to recommend books and other products you might want to purchase. The hope is that collecting and crunching data could be just as informative in medicine. Kale imagines doctors using large datasets to come up with hypotheses about puzzling cases. Instead of brainstorming with individual colleagues about a patient—which takes time—a physician could log on to a large database and access the histories of 100 similar cases. “You immediately get a cohort,” Kale explained. A computer program, armed with that dataset, might suggest tests to run or diagnoses to consider.
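Kale did not describe a specific method for assembling such a cohort; as a rough illustration only, one simple way to pull out the most similar past cases is a nearest-neighbor search over numeric patient features. Everything in the sketch below—the synthetic feature matrix, the distance metric, the variable names—is a hypothetical stand-in, not anything presented at MUCMD.

```python
# Rough sketch of the "instant cohort" idea: given a new, puzzling case,
# retrieve the 100 most similar past patients by nearest-neighbor search.
# All data and feature choices here are made up for illustration.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
past_patients = rng.normal(size=(50_000, 12))   # stand-in numeric features per record
new_case = rng.normal(size=(1, 12))             # the patient in front of the doctor

index = NearestNeighbors(n_neighbors=100).fit(past_patients)
distances, cohort_rows = index.kneighbors(new_case)
print("Indices of the 100 most similar records:", cohort_rows[0][:10], "...")
```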
MUCMD attendees are taking it on faith, somewhat, that big data will make a difference for physicians and patients as it has for banks and retailers, Kale said. Many of the hoped-for outcomes—better treatment strategies and the like—remain unproven. Before physicians and patients see the results, medical data miners need to sort out how to obtain, store, access, and analyze the highly complex data stored in health records. For more on the successes and difficulties discussed at the meeting, see Part 2.—Amber Dance.
This is Part 1 of a two-part story. See also Part 2.
What can big data do for medicine? At the Meaningful Use of Complex Medical Data symposium, held 10-12 August in Los Angeles, California, researchers and doctors batted around several ideas. Presenters showed how data analysis helped them pick out patterns in flu infection, predict mortality in the intensive care unit (ICU), and recommend treatment plans for attention deficit/hyperactivity disorder. But the study of big medical data is in its infancy compared to the datasets crunched by the likes of Google and Amazon.com. For starters: where do programmers find the data, and how do they make sense of it?
Electronic medical records are the basic data that many researchers are interested in, but they are clunky to use in the eyes of data hounds. “Most of the very best electronic health information systems are still data repositories, which are like libraries made by throwing books into a dumpster,” commented presenter Warren Sandberg of Vanderbilt University in Nashville, Tennessee. This makes it very hard to retrieve meaningful information.
Another problem is that healthcare providers must choose one record system, from one vendor, and then stick with what that vendor has to offer. “Shouldn’t electronic health records look more like iPhones?” asked Kenneth Mandl, Children’s Hospital Boston (Mandl and Kohane, 2009). Smartphones provide a basic architecture, and anyone can write apps that access and work with them in different ways. Mandl is working to develop just such a health record system.
Even with a better record program in place, the nature of medical data makes it challenging to develop the algorithms and databases that MUCMD participants envision, said presenter Benjamin Marlin of the University of Massachusetts in Amherst. Numerical data, such as heart rate, accumulate over a long time course and are often measured at irregular intervals. Pieces of data are sometimes missing because a person did not undergo a particular test. These factors make it difficult to process the data with standard statistics; Marlin suggested binning continuous data streams into segments covering equal time periods.
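Marlin did not show code, but the binning idea is straightforward to sketch. Assuming pandas and a made-up, irregularly sampled heart-rate series, the following collapses the stream into fixed two-hour bins, leaving bins with no measurements explicitly marked as missing:

```python
# Minimal sketch of equal-width time binning for an irregularly sampled
# vital-sign stream. The timestamps and readings below are hypothetical.
import pandas as pd

# Irregularly spaced heart-rate readings (beats per minute).
readings = pd.Series(
    [72, 80, 76, 90, 88],
    index=pd.to_datetime([
        "2012-08-10 08:05", "2012-08-10 08:40", "2012-08-10 10:15",
        "2012-08-10 10:20", "2012-08-10 13:55",
    ]),
)

# Resample into two-hour bins; average readings that land in the same bin,
# and leave empty bins as NaN so the missingness stays visible.
binned = readings.resample("2H").mean()
print(binned)
```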
Data-Driven Decisions
Despite the ongoing challenges, MUCMD presenters reported several successes in gleaning meaningful information from medical databases. For example, Mandl used data from emergency room visits to model and predict how many people would come in with the flu—and his predictions were better than those from the Centers for Disease Control and Prevention, he said. The results indicated that three- to four-year-olds are the first to fall ill with the virus (Brownstein et al., 2005); these data influenced the government to recommend that all children between two and five receive flu vaccines.
Data can also predict outcomes on an individual basis, said Peter Szolovits of the Massachusetts Institute of Technology, who discussed the role of data in the intensive care unit (ICU). Simply knowing the likelihood a person will die could help doctors and administrators make important decisions, he said. For example, a hospital could plan for nurses to have more time to attend sicker people, and predict how many beds will be open. On the medical side, knowing a person’s risk of mortality could help physicians decide whether risky interventions are worthwhile.
Using data from 7,000 health records—including vital signs, lab reports, treatments, and demographics—Szolovits built a model to help predict mortality in ICU patients. He boiled down all the bits of data that describe a patient’s condition into one easily digested score of how that person is doing. The researchers then used another 3,000 records to test their algorithms. They found the model was most accurate on a person’s second and third days in the ICU; the outcomes for longer stays were harder to predict (Hug and Szolovits, 2009).
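The story does not say which statistical model Szolovits used; purely as a hedged illustration of the build-then-test pattern described—roughly 7,000 records to fit the model and 3,000 held out to check it—the sketch below trains a generic logistic regression on synthetic data and reports its discrimination on the held-out set. The features, labels, and choice of model are all assumptions for illustration.

```python
# Illustrative train/test split in the spirit of the ICU mortality model.
# Synthetic data and logistic regression are stand-ins, not Szolovits' method.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))                          # stand-in vitals, labs, demographics
y = (X[:, 0] + rng.normal(size=10_000) > 1).astype(int)    # stand-in 0/1 mortality labels

X_train, y_train = X[:7_000], y[:7_000]    # records used to build the model
X_test, y_test = X[7_000:], y[7_000:]      # records held out to test it

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
risk = model.predict_proba(X_test)[:, 1]   # one easily digested risk score per patient
print("AUC on held-out records:", roc_auc_score(y_test, risk))
```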
Not all medical decisions are life-and-death; doctors make many choices where hard data might be useful. But it takes a special kind of clinical trial to test decision strategies. Susan Murphy of the University of Michigan described a clinical trial approach that could help doctors develop treatment policies. These policies are akin to a flow chart for treating a person over time. Every junction in the chart requires reassessment of the treatment, and then a decision about whether to try something different. Using a method called Sequential Multiple Assignment Randomized Trials, or SMART (reviewed in Almirall et al., 2012), Murphy randomized each decision point to come up with treatment recommendations. “The idea is to construct a treatment policy that will tell you how to choose the action,” she explained.
For example, there are two main options to help children manage symptoms of attention deficit/hyperactivity disorder (ADHD): medication and behavior modification therapy. Murphy analyzed 138 children with ADHD. At the start of the school year, each was randomly assigned to receive either the stimulant Ritalin or the behavioral therapy. Every month the researchers asked teachers how the children were performing. Those doing well stayed on their current treatment; the others were randomized again—to either increase the intensity of the current therapy, or add the second option.
After analyzing the data, Murphy determined the plan that works best: If a child has been on medication before, start out with medication; otherwise, use behavioral therapy. If the treatment is unsuccessful, doctors must ask whether the child is adhering to the treatment plan. If so, it is best to intensify that therapy; if not, it is better to add the alternate treatment. This last decision point, to add a second therapy, had the strongest evidence behind it. It is important to determine how confident one can be, statistically, about the decision points before implementing a treatment plan, Murphy said.
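To make the flow chart concrete, here is a toy restatement of that policy as a plain decision function. The function name, its inputs, and the wording of the recommendations are hypothetical; they merely encode the rules described above, not code from Murphy's trial.

```python
# Toy encoding of the adaptive ADHD treatment policy described in the text.
# All names are illustrative; this is not code from the SMART study.
from typing import Optional

def recommend_treatment(prior_medication: bool,
                        current: Optional[str] = None,
                        responding: bool = False,
                        adhering: bool = False) -> str:
    """Return a recommendation for one decision point in the flow chart."""
    if current is None:
        # First decision of the school year: start with medication only if
        # the child has been on medication before; otherwise start behavioral therapy.
        return "medication" if prior_medication else "behavioral therapy"
    if responding:
        # Children doing well stay on their current treatment.
        return f"continue {current}"
    if adhering:
        # Not responding but adhering: intensify the current therapy.
        return f"intensify {current}"
    # Not responding and not adhering: add the alternate treatment.
    return "add the alternate treatment"

# Example: a child new to medication who is not responding to behavioral
# therapy but is adhering to it would get an intensified behavioral program.
print(recommend_treatment(prior_medication=False,
                          current="behavioral therapy",
                          responding=False,
                          adhering=True))
```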
Big Data Get Personal
While many big datasets include thousands of patients, there is another variety: oodles of data, but all on one person. This could be useful not only for that person’s medical care, but also in daily life. For example, hackers at the conference explored a dataset of blood glucose measurements from a boy who wore a continuous monitor over three years. One finding data-waders already gleaned from the database was that the boy was not eating right at school.
Another example of personal megadata: In 2003, Microsoft developed a wearable camera that automatically takes photographs, two or three per minute, to help people with memory loss such as Alzheimer’s recall their activities. Regularly reviewing the photos jogs the memory. Looking at the photos was a better memory aid than keeping a written diary, researchers found (Berry et al., 2009; Browne et al., 2011).
Presenter Mary Czerwinski profiled a newer project at Microsoft Research in Redmond, Washington. She is working on a software/sensor package that analyzes users’ moods to help them better understand why they feel the way they do. The first application she envisions for the system is to help people who are emotional eaters identify the trigger signs of a binge and avoid raiding the fridge. Czerwinski uses sensors, such as a heart rate monitor, built into a brassiere (for now, the project is for women only). Ten women tested the system for a week, and the researchers found that the emotions leading to eating differ from person to person. Boredom, stress, or happiness could all precede snacking, so the app will have to be personalized for each eater.
As researchers meet the challenges of medical information processing, they hope that big data could make this kind of customization possible, even commonplace, in daily life, the clinic, and trials research.—Amber Dance.
This is Part 2 of a two-part story. See also Part 1.