Addressing Inaccurate Nosology in Mental Health: A Multilabel Data Cleansing Approach for Detecting Label Noise From Structural Magnetic Resonance Imaging Data in Mood and Psychosis Disorders

Hooman Rokham, Godfrey Pearlson, Anees Abrol, Haleh Falakshahi, Sergey Plis, Vince D. Calhoun

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Mental health diagnostic approaches are seeking to identify biological markers to work alongside advanced machine learning approaches. It is difficult to identify a biological marker of disease when the traditional diagnostic labels themselves are not necessarily valid.

Methods: We worked with T1 structural magnetic resonance imaging data collected from 1493 individuals comprising healthy control subjects, patients with psychosis, and their unaffected first-degree relatives. Specifically, the dataset included 176 bipolar disorder probands, 134 schizoaffective disorder probands, 240 schizophrenia probands, 362 control subjects, and 581 patient relatives. We assumed that there might be noise in the diagnostic labeling process. We detected label noise by classifying the data multiple times using a support vector machine classifier and then flagging the individuals whom all classifiers unanimously misclassified. Next, we assigned a new diagnostic label to these individuals, based on the biological data (magnetic resonance imaging), using an iterative data cleansing approach.

Results: Simulation results showed that our method was highly accurate in identifying label noise. The diagnostic and biotype categories showed about 65% and 63% noisy labels, respectively, with the largest amount of relabeling occurring between the healthy control subjects and individuals with bipolar disorder and schizophrenia as well as in unaffected close relatives. The extraction of imaging features highlighted regional brain changes associated with each group.

Conclusions: This approach represents an initial step toward developing strategies that need not assume that existing mental health diagnostic categories are always valid but rather allow us to leverage this information while also acknowledging that there are misassignments.
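The flag-and-relabel procedure described in the Methods can be sketched as a consensus filter. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it substitutes a pure-Python nearest-centroid classifier for the support vector machine so that it runs without external libraries, and every name in it (`fit_centroids`, `predict`, `flag_label_noise`, `cleanse`, the labels "A" and "B", the repeat and fold counts) is hypothetical. The toy points are synthetic and unrelated to the study data.

```python
import random
from collections import Counter

def fit_centroids(X, y):
    """Stand-in classifier (assumption): per-class mean vectors, not an SVM."""
    sums, counts = {}, Counter(y)
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for d, v in enumerate(xi):
            acc[d] += v
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def predict(centroids, x):
    """Return the label of the closest class centroid (squared Euclidean)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def flag_label_noise(X, y, n_repeats=10, n_folds=5, seed=0):
    """Return {index: consensus_label} for samples misclassified in every repeat."""
    rng = random.Random(seed)
    n = len(X)
    votes = [[] for _ in range(n)]
    for _ in range(n_repeats):
        order = list(range(n))
        rng.shuffle(order)                      # a fresh random partition per repeat
        for k in range(n_folds):
            held_out = order[k::n_folds]
            held = set(held_out)
            train = [i for i in order if i not in held]
            model = fit_centroids([X[i] for i in train], [y[i] for i in train])
            for i in held_out:                  # each sample is predicted once per repeat
                votes[i].append(predict(model, X[i]))
    flagged = {}
    for i, preds in enumerate(votes):
        if all(p != y[i] for p in preds):       # unanimous disagreement with the given label
            flagged[i] = Counter(preds).most_common(1)[0][0]
    return flagged

def cleanse(X, y, max_rounds=5, **kwargs):
    """Iteratively relabel flagged samples until no sample is flagged."""
    y = list(y)
    for _ in range(max_rounds):
        flagged = flag_label_noise(X, y, **kwargs)
        if not flagged:
            break
        for i, new_label in flagged.items():
            y[i] = new_label
    return y

# Synthetic toy data: two well-separated clusters with two deliberately flipped labels.
X = ([(0.1 * (i % 5), 0.1 * (i // 5)) for i in range(20)]
     + [(10 + 0.1 * (i % 5), 10 + 0.1 * (i // 5)) for i in range(20)])
true_y = ["A"] * 20 + ["B"] * 20
noisy_y = list(true_y)
noisy_y[0], noisy_y[25] = "B", "A"

print(flag_label_noise(X, noisy_y))
print(cleanse(X, noisy_y) == true_y)
```

Requiring unanimity across repeats makes the filter conservative: a sample is relabeled only when no resampling of the training data supports its original label. A support vector machine, as named in the abstract, could replace the stand-in by swapping `fit_centroids` and `predict` for the corresponding fit and predict calls.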

Original language: English (US)
Pages (from-to): 819-832
Number of pages: 14
Journal: Biological Psychiatry: Cognitive Neuroscience and Neuroimaging
Volume: 5
Issue number: 8
State: Published - Aug 2020

Keywords

  • Data cleansing
  • Deep learning
  • Label noise
  • Machine learning
  • Psychosis disorders
  • Structural MRI

ASJC Scopus subject areas

  • Radiology, Nuclear Medicine and Imaging
  • Cognitive Neuroscience
  • Clinical Neurology
  • Biological Psychiatry
