Reporting and handling of missing data in predictive research for prevalent undiagnosed type 2 diabetes mellitus

A systematic review

Katya L. Masconi, Tandi E. Matsha, Justin Echouffo Tcheugui, Rajiv T. Erasmus, Andre P. Kengne

Research output: Contribution to journal › Review article

Abstract

Missing values are common in health research and omitting participants with missing data often leads to loss of statistical power, biased estimates and, consequently, inaccurate inferences. We critically reviewed the challenges posed by missing data in medical research and approaches to address them. To achieve this more efficiently, these issues were analyzed and illustrated through a systematic review on the reporting of missing data and imputation methods (prediction of missing values through relationships within and between variables) undertaken in risk prediction studies of undiagnosed diabetes. Prevalent diabetes risk models were selected based on a recent comprehensive systematic review, supplemented by an updated search of English-language studies published between 1997 and 2014. Reporting of missing data has been limited in studies of prevalent diabetes prediction. Of the 48 articles identified, 62.5% (n = 30) did not report any information on missing data or handling techniques. In 21 (43.8%) studies, researchers opted out of imputation, completing case-wise deletion of participants missing any predictor values. Although imputation methods are encouraged to handle missing data and ensure the accuracy of inferences, this has seldom been the case in studies of diabetes risk prediction. Hence, we elaborated on the various types and patterns of missing data, the limitations of case-wise deletion and state-of-the-art imputation methods and their challenges. This review highlights investigators' inexperience with, or disregard for, the effect of missing data in risk prediction research. Formal guidelines may enhance the reporting and appropriate handling of missing data in scientific journals.
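The contrast the abstract draws between case-wise deletion and imputation can be illustrated with a minimal sketch. The records, variable names and helper functions below are hypothetical and are not taken from the review. Single mean imputation is used only as the simplest possible stand-in; it is not one of the state-of-the-art methods the review discusses, and it ignores relationships between variables.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing predictor value.
records = [
    {"age": 52, "bmi": 31.0, "waist_cm": 102.0},
    {"age": 47, "bmi": None, "waist_cm": 95.0},
    {"age": 63, "bmi": 27.5, "waist_cm": None},
    {"age": 58, "bmi": 29.0, "waist_cm": 99.0},
]


def casewise_delete(rows):
    """Keep only participants with no missing predictor value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def mean_impute(rows):
    """Fill each missing value with the mean of the observed values
    of the same variable (a deliberately naive imputation)."""
    keys = list(rows[0].keys())
    means = {k: mean(r[k] for r in rows if r[k] is not None) for k in keys}
    return [
        {k: (r[k] if r[k] is not None else means[k]) for k in keys}
        for r in rows
    ]


complete = casewise_delete(records)
imputed = mean_impute(records)

# Case-wise deletion shrinks the sample; imputation keeps every participant.
print(len(records), len(complete), len(imputed))
```

In this toy data set, case-wise deletion discards half of the participants because each of them lacks a single value, which is the loss of statistical power the abstract warns about. The imputed data set keeps all participants, at the cost of assumptions about why the values are missing.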

Original language: English (US)
Article number: 7
Journal: EPMA Journal
Volume: 6
Issue number: 1
DOIs: https://doi.org/10.1186/s13167-015-0028-0
State: Published - Mar 11 2015
Externally published: Yes

Fingerprint

Medical problems
Type 2 Diabetes Mellitus
Research
Research Design
Research Personnel
Biomedical Research
Language
Guidelines
Health

Keywords

  • Diabetes mellitus
  • Guidelines
  • Modeling
  • Patient Stratification
  • Patterns
  • Predictive
  • Preventive and Personalized Medicine
  • Risk
  • Screening

ASJC Scopus subject areas

  • Drug Discovery
  • Health Policy
  • Biochemistry, medical

Cite this

Reporting and handling of missing data in predictive research for prevalent undiagnosed type 2 diabetes mellitus: A systematic review. / Masconi, Katya L.; Matsha, Tandi E.; Echouffo Tcheugui, Justin; Erasmus, Rajiv T.; Kengne, Andre P.

In: EPMA Journal, Vol. 6, No. 1, 7, 11.03.2015.


@article{57c98e5490544fe2b2cc45e93e3d00b5,
title = "Reporting and handling of missing data in predictive research for prevalent undiagnosed type 2 diabetes mellitus: A systematic review",
abstract = "Missing values are common in health research and omitting participants with missing data often leads to loss of statistical power, biased estimates and, consequently, inaccurate inferences. We critically reviewed the challenges posed by missing data in medical research and approaches to address them. To achieve this more efficiently, these issues were analyzed and illustrated through a systematic review on the reporting of missing data and imputation methods (prediction of missing values through relationships within and between variables) undertaken in risk prediction studies of undiagnosed diabetes. Prevalent diabetes risk models were selected based on a recent comprehensive systematic review, supplemented by an updated search of English-language studies published between 1997 and 2014. Reporting of missing data has been limited in studies of prevalent diabetes prediction. Of the 48 articles identified, 62.5{\%} (n = 30) did not report any information on missing data or handling techniques. In 21 (43.8{\%}) studies, researchers opted out of imputation, completing case-wise deletion of participants missing any predictor values. Although imputation methods are encouraged to handle missing data and ensure the accuracy of inferences, this has seldom been the case in studies of diabetes risk prediction. Hence, we elaborated on the various types and patterns of missing data, the limitations of case-wise deletion and state-of-the-art imputation methods and their challenges. This review highlights investigators' inexperience with, or disregard for, the effect of missing data in risk prediction research. Formal guidelines may enhance the reporting and appropriate handling of missing data in scientific journals.",
keywords = "Diabetes mellitus, Guidelines, Modeling, Patient Stratification, Patterns, Predictive, Preventive and Personalized Medicine, Risk, Screening",
author = "Masconi, {Katya L.} and Matsha, {Tandi E.} and {Echouffo Tcheugui}, Justin and Erasmus, {Rajiv T.} and Kengne, {Andre P.}",
year = "2015",
month = "3",
day = "11",
doi = "10.1186/s13167-015-0028-0",
language = "English (US)",
volume = "6",
journal = "EPMA Journal",
issn = "1878-5077",
publisher = "BioMed Central",
number = "1",

}

TY  - JOUR
T1  - Reporting and handling of missing data in predictive research for prevalent undiagnosed type 2 diabetes mellitus
T2  - A systematic review
AU  - Masconi, Katya L.
AU  - Matsha, Tandi E.
AU  - Echouffo Tcheugui, Justin
AU  - Erasmus, Rajiv T.
AU  - Kengne, Andre P.
PY  - 2015/3/11
Y1  - 2015/3/11
N2  - Missing values are common in health research and omitting participants with missing data often leads to loss of statistical power, biased estimates and, consequently, inaccurate inferences. We critically reviewed the challenges posed by missing data in medical research and approaches to address them. To achieve this more efficiently, these issues were analyzed and illustrated through a systematic review on the reporting of missing data and imputation methods (prediction of missing values through relationships within and between variables) undertaken in risk prediction studies of undiagnosed diabetes. Prevalent diabetes risk models were selected based on a recent comprehensive systematic review, supplemented by an updated search of English-language studies published between 1997 and 2014. Reporting of missing data has been limited in studies of prevalent diabetes prediction. Of the 48 articles identified, 62.5% (n = 30) did not report any information on missing data or handling techniques. In 21 (43.8%) studies, researchers opted out of imputation, completing case-wise deletion of participants missing any predictor values. Although imputation methods are encouraged to handle missing data and ensure the accuracy of inferences, this has seldom been the case in studies of diabetes risk prediction. Hence, we elaborated on the various types and patterns of missing data, the limitations of case-wise deletion and state-of-the-art imputation methods and their challenges. This review highlights investigators' inexperience with, or disregard for, the effect of missing data in risk prediction research. Formal guidelines may enhance the reporting and appropriate handling of missing data in scientific journals.
AB  - Missing values are common in health research and omitting participants with missing data often leads to loss of statistical power, biased estimates and, consequently, inaccurate inferences. We critically reviewed the challenges posed by missing data in medical research and approaches to address them. To achieve this more efficiently, these issues were analyzed and illustrated through a systematic review on the reporting of missing data and imputation methods (prediction of missing values through relationships within and between variables) undertaken in risk prediction studies of undiagnosed diabetes. Prevalent diabetes risk models were selected based on a recent comprehensive systematic review, supplemented by an updated search of English-language studies published between 1997 and 2014. Reporting of missing data has been limited in studies of prevalent diabetes prediction. Of the 48 articles identified, 62.5% (n = 30) did not report any information on missing data or handling techniques. In 21 (43.8%) studies, researchers opted out of imputation, completing case-wise deletion of participants missing any predictor values. Although imputation methods are encouraged to handle missing data and ensure the accuracy of inferences, this has seldom been the case in studies of diabetes risk prediction. Hence, we elaborated on the various types and patterns of missing data, the limitations of case-wise deletion and state-of-the-art imputation methods and their challenges. This review highlights investigators' inexperience with, or disregard for, the effect of missing data in risk prediction research. Formal guidelines may enhance the reporting and appropriate handling of missing data in scientific journals.
KW  - Diabetes mellitus
KW  - Guidelines
KW  - Modeling
KW  - Patient Stratification
KW  - Patterns
KW  - Predictive
KW  - Preventive and Personalized Medicine
KW  - Risk
KW  - Screening
UR  - http://www.scopus.com/inward/record.url?scp=84927766865&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84927766865&partnerID=8YFLogxK
U2  - 10.1186/s13167-015-0028-0
DO  - 10.1186/s13167-015-0028-0
M3  - Review article
VL  - 6
JO  - EPMA Journal
JF  - EPMA Journal
SN  - 1878-5077
IS  - 1
M1  - 7
ER  -