Data management in substance use disorder treatment research: Implications from data harmonization of National Institute on Drug Abuse-funded randomized controlled trials

Ryoko Susukida, Masoumeh Aminesmaeili, Ramin Mojtabai

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Secondary analysis of data from completed randomized controlled trials (RCTs) is a critical and efficient way to maximize the potential benefit from past research. De-identified primary data from completed RCTs have become increasingly available in recent years; however, the lack of standardized data products is a major barrier to further use of these valuable data. Pre-statistical harmonization of data structure, variables, and codebooks across RCTs would facilitate secondary data analysis, including meta-analysis and comparative effectiveness studies. We describe an initiative to harmonize de-identified primary data from substance use disorder (SUD) treatment RCTs funded by the National Institute on Drug Abuse (NIDA) and available on the NIDA Data Share website.
Methods: Harmonized datasets with standardized data structures, variable names, labels, and definitions, along with harmonized codebooks, were developed for 36 completed RCTs. Common data domains were identified to bundle data files from individual RCTs according to relevant subject areas. Variables within the same instrument were harmonized if at least two RCTs used the same instrument. The structures of the harmonized data were determined based on feedback from clinical trialists and SUD research experts.
Results: We have created a harmonized database of variables across 36 RCTs, with a built-in label and a brief definition for each variable. Data files from the RCTs have been consistently categorized into eight domains (enrollment, demographics, adherence, adverse events, physical health measures, mental-behavioral-cognitive health measures, self-reported substance use measures, and biologic substance use measures). Harmonized codebooks and instrument/variable concordance tables have also been developed to help identify instruments and variables of interest more easily.
Conclusions: The harmonized data from RCTs of SUD treatments can potentially promote future secondary analysis of data from completed RCTs, allow data from multiple RCTs to be combined, and provide guidance for future RCTs in SUD treatment research.
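
Illustrative sketch (not part of the published abstract): the inclusion rule stated in the Methods, that an instrument's variables are harmonized only if at least two RCTs used that instrument, can be expressed in a few lines of Python. The trial identifiers, instrument names, and the function `instruments_to_harmonize` are hypothetical and chosen only for illustration; they do not reflect the actual contents of NIDA Data Share or the authors' implementation.

```python
from collections import defaultdict

# Hypothetical inventory of which instruments each trial used.
# Trial IDs and instrument names are illustrative assumptions.
trial_instruments = {
    "trial_A": {"ASI", "TLFB"},
    "trial_B": {"ASI", "SF-36"},
    "trial_C": {"TLFB", "ASI"},
}


def instruments_to_harmonize(inventory, min_trials=2):
    """Map each instrument used by at least `min_trials` trials to its trials."""
    used_by = defaultdict(set)
    for trial, instruments in inventory.items():
        for instrument in instruments:
            used_by[instrument].add(trial)
    # Keep only instruments shared by enough trials; sort for stable output.
    return {
        instrument: sorted(trials)
        for instrument, trials in sorted(used_by.items())
        if len(trials) >= min_trials
    }


concordance = instruments_to_harmonize(trial_instruments)
for instrument, trials in concordance.items():
    print(instrument, trials)
```

In this toy inventory the instrument used by a single trial is left out, while the two shared instruments are listed with the trials that used them, which is roughly the shape of an instrument concordance table.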

Original language: English (US)
Journal: Unknown Journal
State: Published - May 3 2020

Keywords

  • Data harmonization
  • Randomized controlled trials
  • Secondary data analysis
  • Substance abuse treatment

ASJC Scopus subject areas

  • Medicine(all)
