Test-retest reliability of FreeSurfer measurements within and between sites: Effects of visual approval process

Zafer Iscan, Tony B. Jin, Alexandria Kendrick, Bryan Szeglin, Hanzhang Lu, Madhukar Trivedi, Maurizio Fava, Patrick J. McGrath, Myrna Weissman, Benji T. Kurian, Phillip Adams, Sarah Weyandt, Marisa Toups, Thomas Carmody, Melvin McInnis, Cristina Cusin, Crystal Cooper, Maria A. Oquendo, Ramin V. Parsey, Christine DeLorenzo

Research output: Contribution to journal › Article

Abstract

In the last decade, many studies have used automated processes to analyze magnetic resonance imaging (MRI) data such as cortical thickness, which is one indicator of neuronal health. Due to the convenience of image processing software (e.g., FreeSurfer), standard practice is to rely on automated results without performing visual inspection of intermediate processing. In this work, structural MRIs of 40 healthy controls who were scanned twice were used to determine the test-retest reliability of FreeSurfer-derived cortical measures in four groups of subjects: those 25 that passed visual inspection (approved), those 15 that failed visual inspection (disapproved), a combined group, and a subset of 10 subjects (Travel) whose test and retest scans occurred at different sites. Test-retest correlation (TRC), intraclass correlation coefficient (ICC), and percent difference (PD) were used to measure the reliability in the Destrieux and Desikan-Killiany (DK) atlases. In the approved subjects, reliability of cortical thickness/surface area/volume (DK atlas only) was: TRC (0.82/0.88/0.88), ICC (0.81/0.87/0.88), PD (0.86/1.19/1.39), which represents a significant improvement over these measures when disapproved subjects are included. Travel subjects' results show that cortical thickness reliability is more sensitive to site differences than cortical surface area and volume reliability. To determine the effect of visual inspection on the sample size required for studies of MRI-derived cortical thickness, the number of subjects required to show group differences was calculated. Significant differences observed across imaging sites, between visually approved/disapproved subjects, and across regions with different sizes suggest that these measures should be used with caution.

Original language: English (US)
Pages (from-to): 3472-3485
Number of pages: 14
Journal: Human Brain Mapping
Volume: 36
Issue number: 9
DOI: 10.1002/hbm.22856
PMID: 26033168
ISSN: 1065-9471
Publisher: Wiley-Liss Inc.
State: Published - Sep 1 2015
Externally published: Yes

Keywords

  • Cerebral cortical surface area
  • Cerebral cortical thickness
  • Cerebral cortical volume
  • FreeSurfer
  • Multisite MRI
  • Test-retest reliability

ASJC Scopus subject areas

  • Clinical Neurology
  • Anatomy
  • Neurology
  • Radiology, Nuclear Medicine and Imaging
  • Radiological and Ultrasound Technology

