Gillian Hayes, Ph.D., Estrellita Principal Investigator, University of California, Irvine
Anyone doing research about records of any kind knows the challenges of handling missing data. We often hear about epidemiologists who struggle to find patterns in incomplete data sets, clinicians who carefully interview patients and family members to fill in the gaps in a medical record, and so on. This problem becomes particularly acute, however, when you are monitoring the data in real time. In some cases, missing data can mean we need to act in some way, but in other cases, it’s nothing to worry about. Two examples tell this story pretty well: developmental activities and appointments.
The Estrellita application asks parents to track the activities they do to bond with their babies, help them develop, and so on. When parents report doing these activities, we give them encouraging messages like “Great job!” and “Your baby loves it when you sing to him.” They also earn badges on their phones for each activity. In this case, because recording the data is actually part of an intervention to encourage these activities, lack of data requires additional intervention. So, if we get no data for a couple of days, then the application reminds parents to do some of the activities and to record them.
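To make the reminder rule concrete, here is a minimal sketch in Python. It is not the Estrellita code: the function name, the parameters, and the exact two-day threshold (our reading of "a couple of days") are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumption: "a couple of days" is read as 2 days; the real threshold is not stated.
REMINDER_AFTER = timedelta(days=2)


def needs_reminder(last_entry: Optional[datetime],
                   enrolled_at: datetime,
                   now: datetime) -> bool:
    """Return True when the gap in activity reports is long enough to prompt the parent."""
    # If nothing has ever been recorded, measure the gap from enrollment instead.
    reference = last_entry if last_entry is not None else enrolled_at
    return now - reference >= REMINDER_AFTER
```

The point of the sketch is that, for activity data, silence is itself a signal: a long enough gap triggers action rather than being ignored.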
On the other hand, sometimes a lack of data doesn’t indicate a problem at all. The Estrellita application asks parents about their experiences with each clinical appointment a few hours after the scheduled appointment time. If they don’t answer in that window, we dismiss the question, because self-reports become less reliable the longer we wait. One of the mothers in our study wanted to keep all her appointment information together, so she entered old appointments from before the study started. Because those appointments were so long ago, they didn’t trigger the follow-up questions about how they went. So, when one of our case managers logged in to check on this parent, it looked like she had missed a lot of appointments. The case manager was moments away from intervening when she realized that these appointments all predated the study enrollment date. If that baby had really missed that many appointments, there could have been something wrong. In this case, however, the missing data were misleading. Luckily, we were able to fix this quickly, and appointment data now fall into three categories: attended, did not attend, and no information.
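The three categories can be sketched as a small classification rule. This is an illustration under stated assumptions, not our production code: the names `AppointmentStatus` and `classify_appointment` are hypothetical, and we assume the follow-up question yields a yes/no attendance answer, or nothing at all if it was never asked or was dismissed.

```python
from datetime import datetime
from enum import Enum
from typing import Optional


class AppointmentStatus(Enum):
    ATTENDED = "attended"
    DID_NOT_ATTEND = "did not attend"
    NO_INFORMATION = "no information"


def classify_appointment(scheduled_at: datetime,
                         enrolled_at: datetime,
                         parent_answer: Optional[bool]) -> AppointmentStatus:
    """Map a follow-up answer (hypothetical yes/no/None) to one of three states."""
    # Appointments before enrollment never triggered a follow-up question,
    # so the absence of an answer says nothing about attendance.
    if scheduled_at < enrolled_at:
        return AppointmentStatus.NO_INFORMATION
    # A question that went unanswered and was dismissed is also "no information",
    # not a missed appointment.
    if parent_answer is None:
        return AppointmentStatus.NO_INFORMATION
    return (AppointmentStatus.ATTENDED if parent_answer
            else AppointmentStatus.DID_NOT_ATTEND)
```

Keeping "no information" separate from "did not attend" is what prevents a case manager's view from showing backfilled appointments as missed ones.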
Getting complete medical records in a setting in which people are trained to create and manage them (like hospitals) is difficult and complex. Doing the same in homes, schools, and other non-clinical settings can be downright impossible. Through our work, we hope to learn a little more about what to do when we have imperfect or missing data, and we are interested to see how the other projects handle this challenge as well.