Complexity and bias in cross-sectional data with binary disease outcome in observational studies

Mei Cheng Wang, Yuchen Yang

Research output: Contribution to journal › Article › peer-review


A cross-sectional population is defined as a population of living individuals at the sampling or observational time. Cross-sectionally sampled data with a binary disease outcome are commonly analyzed in observational studies to identify how covariates correlate with disease occurrence. It is generally understood that a cross-sectional binary outcome is not as informative as longitudinally collected time-to-event data, but there is insufficient understanding as to whether bias can exist in cross-sectional data and how that bias is related to the population risk of interest. As the progression of a disease typically involves both time and disease status, we consider how the binary disease outcome from the cross-sectional population is connected to the birth-illness-death process in the target population. We argue that the distribution of the cross-sectional binary outcome is different from the risk distribution in the target population and that bias would typically arise when using cross-sectional data to draw inference about population risk. In general, the cross-sectional risk probability is determined jointly by the population risk probability and the ratio of the duration of the diseased state to the duration of the disease-free state. Through explicit formulas we conclude that bias can almost never be avoided with cross-sectional data. We present the age-specific risk probability (ARP) and argue that models based on ARP offer a compromised but still biased approach to understanding the population risk. An analysis based on Alzheimer's disease data is presented to illustrate the ARP model and possible critiques of the analysis results.
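
The dependence on the duration ratio can be illustrated with a toy calculation. The sketch below is not the paper's formula; it assumes a simplified stationary population with a constant birth rate, in which the share of living individuals who are diseased equals the expected time spent diseased divided by the expected total time alive. The function name and its parameters (`p`, `mean_free`, `mean_diseased`) are hypothetical and chosen only for this illustration.

```python
def cross_sectional_prevalence(p, mean_free, mean_diseased):
    """Share of living individuals who are diseased at a cross-section.

    Toy stationary model (an assumption, not the paper's derivation):
      p             -- population risk: probability of ever developing the disease
      mean_free     -- mean time alive and disease-free, averaged over everyone
      mean_diseased -- mean time alive with disease, among those who develop it
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if mean_free <= 0 or mean_diseased < 0:
        raise ValueError("mean_free must be positive and mean_diseased non-negative")
    # Expected diseased time per individual born into the population.
    expected_diseased_time = p * mean_diseased
    # Under stationarity, the state occupancy at a fixed time is proportional
    # to the expected time spent in each state.
    return expected_diseased_time / (mean_free + expected_diseased_time)


if __name__ == "__main__":
    p = 0.30  # hypothetical population risk
    for mean_diseased in (2.0, 10.0, 40.0):
        prevalence = cross_sectional_prevalence(p, 60.0, mean_diseased)
        ratio = mean_diseased / 60.0
        print(f"duration ratio {ratio:.3f}: population risk {p:.2f}, "
              f"cross-sectional prevalence {prevalence:.3f}")
```

In this toy setting the same population risk produces very different cross-sectional prevalences as the diseased duration changes, which is the kind of discrepancy the abstract describes.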

Original language: English (US)
Pages (from-to): 950-962
Number of pages: 13
Journal: Statistics in Medicine
Issue number: 4
State: Published - Feb 20 2021
Externally published: Yes

Keywords


  • birth-illness-death process
  • current status data
  • logistic model
  • sampling bias
  • stationary disease process

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

