Data management in substance use disorder treatment research: Implications from data harmonization of National Institute on Drug Abuse-funded randomized controlled trials

Ryoko Susukida, Masoumeh Amin-Esmaeili, Evan R Mayo-Wilson, Ramin Mojtabai

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Secondary analysis of data from completed randomized controlled trials is a critical and efficient way to maximize the potential benefits from past research. De-identified primary data from completed randomized controlled trials have become increasingly available in recent years; however, the lack of standardized data products is a major barrier to further use of these valuable data. Pre-statistical harmonization of data structure, variables, and codebooks across randomized controlled trials would facilitate secondary data analysis, including meta-analyses and comparative effectiveness studies. We describe a pre-statistical data harmonization initiative to standardize de-identified primary data from substance use disorder treatment randomized controlled trials funded by the National Institute on Drug Abuse and available on the National Institute on Drug Abuse Data Share website.

Methods: Standardized datasets and codebooks with consistent data structures, variable names, labels, and definitions were developed for 36 completed randomized controlled trials. Common data domains were identified to bundle data files from individual randomized controlled trials according to relevant concepts. Variables were harmonized if at least two randomized controlled trials used the same instruments. The structures of the harmonized data were determined based on feedback from clinical trialists and substance use disorder research experts.

Results: We have created a harmonized database of variables across 36 randomized controlled trials with a built-in label and a brief definition for each variable. Data files from the randomized controlled trials have been consistently categorized into eight domains (enrollment, demographics, adherence, adverse events, physical health measures, mental-behavioral-cognitive health measures, self-reported substance use measures, and biologic substance use measures). Standardized codebooks and concordance tables have also been developed to help identify instruments and variables of interest more easily.

Conclusion: The harmonized data from randomized controlled trials of substance use disorder treatments can potentially promote future secondary data analysis of completed randomized controlled trials, allow data from multiple randomized controlled trials to be combined, and provide guidance for future randomized controlled trials in substance use disorder treatment research.

Original language: English (US)
Journal: Clinical Trials
State: Accepted/In press - 2020

Keywords

  • data harmonization
  • randomized controlled trials
  • secondary data analysis
  • substance abuse treatment

ASJC Scopus subject areas

  • Pharmacology
