If only there was no missing data…
Speaker: Matthew Schofield (Applied Statistics Center, Columbia University)
Abstract
Imagine that we knew certain biological features of our population of interest. For example, suppose that the times of birth and death, along with associated covariates, were known without error. In this ideal world, inference would be straightforward, and our focus would be on specifying models and relationships of scientific interest that describe the dynamics of the population. In reality, however, values such as the times of birth and death, as well as the covariates, are unknown. We therefore construct elaborate sampling schemes (e.g., capture-recapture experiments) to obtain partial information about these quantities of interest. This means that the capture-recapture experiment can be thought of as a missing data problem. Classical approaches to this problem focus on accounting for the complex sampling process that generated the missing data. This wrests our focus away from the scientific questions of interest, as evidenced by the proliferation of literature describing how to account for subtle (as well as not so subtle) sampling differences.
Here we will show how the missing data inherent in capture-recapture data can be accounted for using data augmentation, where 'we model the data we wish we had'. This allows our focus to return to modeling relationships that describe the dynamics of the population. We will proceed step-by-step through this modeling framework using several examples.
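As a rough illustration of the data augmentation idea, the sketch below treats the simplest setting we could think of, which is not necessarily one of the examples in the talk: a closed population with a constant detection probability, rather than the birth and death processes discussed above. The observed capture histories are padded with all-zero histories up to an assumed upper bound M, each padded individual gets a latent indicator of whether it is really in the population, and a Gibbs sampler alternates between the parameters and those indicators. The function names (simulate_captures, gibbs_closed_population), the uniform priors, and the value of M are all assumptions made for this example.

```python
import numpy as np


def simulate_captures(N, p, K, seed=1):
    """Simulate capture counts for N animals over K occasions.

    Only animals caught at least once are returned, mimicking the fact
    that never-captured individuals are missing from the data.
    """
    rng = np.random.default_rng(seed)
    y = rng.binomial(K, p, size=N)
    return y[y > 0]


def gibbs_closed_population(y_obs, K, M, n_iter=2000, burn=500, seed=0):
    """Gibbs sampler for population size under data augmentation.

    Assumed model (hypothetical, chosen for illustration):
      z_i ~ Bernoulli(psi)        membership indicator, i = 1..M
      y_i | z_i ~ Binomial(K, z_i * p)
      p, psi ~ Uniform(0, 1)
    Returns posterior draws of N = sum(z) after burn-in.
    """
    rng = np.random.default_rng(seed)
    n = len(y_obs)
    # Augment: the data we wish we had includes the uncaught individuals.
    y = np.concatenate([y_obs, np.zeros(M - n, dtype=int)])
    z = np.ones(M, dtype=int)  # observed animals always keep z = 1
    draws = []
    for it in range(n_iter):
        n_in = z.sum()
        # Conjugate updates given the complete (augmented) data.
        p = rng.beta(1 + y.sum(), 1 + K * n_in - y.sum())
        psi = rng.beta(1 + n_in, 1 + M - n_in)
        # Never-captured individuals: in the population but missed K times,
        # versus not in the population at all.
        missed = psi * (1 - p) ** K
        prob_in = missed / (missed + 1 - psi)
        z[n:] = rng.random(M - n) < prob_in
        if it >= burn:
            draws.append(z.sum())
    return np.array(draws)


if __name__ == "__main__":
    y_obs = simulate_captures(N=100, p=0.3, K=5)
    draws = gibbs_closed_population(y_obs, K=5, M=300)
    print("animals observed:", len(y_obs))
    print("posterior mean of N:", draws.mean())
```

Once the missing individuals are part of the model, each update is a standard conjugate or Bernoulli draw, and the sampling process no longer has to be integrated out by hand. In this sketch M must exceed the number of observed animals and should be large enough that the posterior for N is not truncated by it.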