Computer-coded verbal autopsy (CCVA) algorithms predict cause of death from high-dimensional family questionnaire data (verbal autopsies) about a deceased individual. CCVA algorithms are typically trained on non-local data and then used to generate national and regional estimates of cause-specific mortality fractions. These estimates may be inaccurate if the non-local training data differ from the local population of interest. This problem is a special case of transfer learning, which is now commonly deployed for classifying images, videos, texts, and other complex data. Most transfer learning classification approaches are concerned with classifying individuals (e.g., persons) within a target domain (e.g., a particular population), with training performed on data from a source domain. Social and health scientists such as epidemiologists are often more interested in understanding etiological distributions at the population level than in classifying individuals. The sample sizes of their datasets are typically orders of magnitude smaller than those used for image classification and related tasks. We present a parsimonious hierarchical Bayesian transfer learning framework to directly estimate population-level class probabilities in a target domain, using any baseline classifier trained on source domain data and a relatively small labeled target domain dataset. To address the small sample size issue, we introduce a novel shrinkage prior for the transfer error rates guaranteeing that, in the absence of any labeled target domain data or when the baseline classifier is perfectly accurate, the domain-adapted (calibrated) estimate of class probabilities coincides with the naive estimate from the baseline classifier, thereby subsuming the default practice as a special case. A novel Gibbs sampler using data augmentation enables fast implementation. We then extend our approach to use not one but an ensemble of baseline classifiers.
Theoretical and empirical results demonstrate how the ensemble model favors the most accurate baseline classifier. Simulated and real data analyses reveal dramatic improvement in the estimates of class probabilities from our transfer learning approach. We also present extensions that allow the class probabilities to vary as functions of covariates, and an EM-algorithm-based MAP estimation as an alternative to MCMC. An R package implementing this method for verbal autopsy data is available on GitHub.
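The calibration idea in the abstract can be illustrated with a small point-estimate sketch. This is not the hierarchical Bayesian model or the Gibbs sampler described above, and it is not the interface of the R package. The function names, the `shrink` pseudo-count, and the EM-style update are assumptions made for illustration only. The sketch estimates how often the baseline classifier maps each true class to each predicted class, using the small labeled target set, and then adjusts the naive class fractions accordingly. Diagonal pseudo-counts pull the misclassification matrix toward the identity, as a crude stand-in for the shrinkage prior.

```python
def naive_fractions(predicted, n_classes):
    """Class fractions obtained by counting the baseline classifier's predictions."""
    counts = [0] * n_classes
    for c in predicted:
        counts[c] += 1
    return [k / len(predicted) for k in counts]


def misclassification_matrix(true_labels, predicted, n_classes, shrink=1.0):
    """Row i holds the estimated distribution of predicted class given true class i.

    `shrink` (hypothetical, must be > 0) adds pseudo-counts on the diagonal,
    so with no labeled target data the matrix is exactly the identity.
    """
    m = [[shrink if i == j else 0.0 for j in range(n_classes)]
         for i in range(n_classes)]
    for t, c in zip(true_labels, predicted):
        m[t][c] += 1.0
    return [[v / sum(row) for v in row] for row in m]


def calibrate(q, m, n_iter=500):
    """Find class fractions p on the simplex such that sum_i p[i] * m[i][j] ~ q[j].

    Uses EM-style multiplicative updates, which keep p non-negative and
    summing to one at every iteration.
    """
    n = len(q)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        # Predicted-class distribution implied by the current p.
        mix = [sum(p[i] * m[i][j] for i in range(n)) for j in range(n)]
        ratio = [q[j] / max(mix[j], 1e-12) for j in range(n)]
        p = [p[i] * sum(m[i][j] * ratio[j] for j in range(n)) for i in range(n)]
    total = sum(p)
    return [v / total for v in p]
```

With an identity misclassification matrix (no labeled target data, or a perfectly accurate classifier on the labeled set as `shrink` dominates), `calibrate` returns the naive fractions unchanged, mirroring the special case the abstract describes.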
|Original language|English (US)|
|State|Published - Oct 24 2018|
- Hierarchical modeling
- Transfer learning
- Verbal autopsy