Do COPD subtypes really exist? COPD heterogeneity and clustering in 10 independent cohorts

Peter J. Castaldi, Marta Benet, Hans Petersen, Nicholas Rafaels, James Finigan, Matteo Paoletti, H. Marike Boezen, Judith M. Vonk, Russell Bowler, Massimo Pistolesi, Milo A. Puhan, Josep Anto, Els Wauters, Diether Lambrechts, Wim Janssens, Francesca Bigazzi, Gianna Camiciottoli, Michael H. Cho, Craig P. Hersh, Kathleen Barnes, Stephen Rennard, Meher Preethi Boorgula, Jennifer Dy, Nadia Hansel, James D. Crapo, Yohannes Tesfaigzi, Alvar Agusti, Edwin K. Silverman, Judith Garcia-Aymerich

Research output: Contribution to journal, Article

Abstract

Background: COPD is a heterogeneous disease, but there is little consensus on specific definitions for COPD subtypes. Unsupervised clustering offers the promise of 'unbiased' data-driven assessment of COPD heterogeneity. Multiple groups have identified COPD subtypes using cluster analysis, but there has been no systematic assessment of the reproducibility of these subtypes.

Objective: We performed clustering analyses across 10 cohorts in North America and Europe in order to assess the reproducibility of (1) correlation patterns of key COPD-related clinical characteristics and (2) clustering results.

Methods: We studied 17 146 individuals with COPD using identical methods and common COPD-related characteristics across cohorts (FEV1, FEV1/FVC, FVC, body mass index, Modified Medical Research Council score, asthma and cardiovascular comorbid disease). Correlation patterns between these clinical characteristics were assessed by principal components analysis (PCA). Cluster analysis was performed using k-medoids and hierarchical clustering, and concordance of clustering solutions was quantified with normalised mutual information (NMI), a metric that ranges from 0 to 1 with higher values indicating greater concordance.

Results: The reproducibility of COPD clustering subtypes across studies was modest (median NMI range 0.17-0.43). For methods that excluded individuals that did not clearly belong to any cluster, agreement was better but still suboptimal (median NMI range 0.32-0.60). Continuous representations of COPD clinical characteristics derived from PCA were much more consistent across studies.

Conclusions: Identical clustering analyses across multiple COPD cohorts showed modest reproducibility. COPD heterogeneity is better characterised by continuous disease traits coexisting in varying degrees within the same individual, rather than by mutually exclusive COPD subtypes.
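The NMI concordance measure named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which normalisation was used, so the arithmetic mean of the two entropies is assumed here, and the function names and example labels are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same individuals."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (n_ij / n) * log(n_ij * n / (count_a[i] * count_b[j]))
        for (i, j), n_ij in joint.items()
    )


def nmi(a, b):
    """NMI in [0, 1]; assumes arithmetic-mean normalisation (an assumption)."""
    if len(a) != len(b):
        raise ValueError("labelings must cover the same individuals")
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 and h_b == 0:
        # Both labelings put everyone in one cluster: treat as identical.
        return 1.0
    return mutual_information(a, b) / ((h_a + h_b) / 2)


# Hypothetical cluster assignments for four individuals.
same_partition = nmi([0, 0, 1, 1], [1, 1, 0, 0])  # same grouping, relabelled
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])       # groupings share no information
```

Because NMI depends only on how individuals are grouped, not on the label names, two solutions that differ only in cluster numbering score as fully concordant, which is what makes the metric suitable for comparing clusterings across cohorts.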

Original language: English (US)
Journal: Thorax
DOI: https://doi.org/10.1136/thoraxjnl-2016-209846
State: Accepted/In press - Jun 21 2017

Fingerprint

Chronic Obstructive Pulmonary Disease
Cluster Analysis
Principal Component Analysis
North America
Reproducibility of Results
Biomedical Research
Body Mass Index
Cardiovascular Diseases
Asthma

Keywords

  • COPD epidemiology

ASJC Scopus subject areas

  • Pulmonary and Respiratory Medicine

Cite this

Castaldi, P. J., Benet, M., Petersen, H., Rafaels, N., Finigan, J., Paoletti, M., ... Garcia-Aymerich, J. (Accepted/In press). Do COPD subtypes really exist? COPD heterogeneity and clustering in 10 independent cohorts. Thorax. https://doi.org/10.1136/thoraxjnl-2016-209846

Do COPD subtypes really exist? COPD heterogeneity and clustering in 10 independent cohorts. / Castaldi, Peter J.; Benet, Marta; Petersen, Hans; Rafaels, Nicholas; Finigan, James; Paoletti, Matteo; Boezen, H. Marike; Vonk, Judith M.; Bowler, Russell; Pistolesi, Massimo; Puhan, Milo A.; Anto, Josep; Wauters, Els; Lambrechts, Diether; Janssens, Wim; Bigazzi, Francesca; Camiciottoli, Gianna; Cho, Michael H.; Hersh, Craig P.; Barnes, Kathleen; Rennard, Stephen; Boorgula, Meher Preethi; Dy, Jennifer; Hansel, Nadia; Crapo, James D.; Tesfaigzi, Yohannes; Agusti, Alvar; Silverman, Edwin K.; Garcia-Aymerich, Judith.

In: Thorax, 21.06.2017.

Castaldi, PJ, Benet, M, Petersen, H, Rafaels, N, Finigan, J, Paoletti, M, Boezen, HM, Vonk, JM, Bowler, R, Pistolesi, M, Puhan, MA, Anto, J, Wauters, E, Lambrechts, D, Janssens, W, Bigazzi, F, Camiciottoli, G, Cho, MH, Hersh, CP, Barnes, K, Rennard, S, Boorgula, MP, Dy, J, Hansel, N, Crapo, JD, Tesfaigzi, Y, Agusti, A, Silverman, EK & Garcia-Aymerich, J 2017, 'Do COPD subtypes really exist? COPD heterogeneity and clustering in 10 independent cohorts', Thorax. https://doi.org/10.1136/thoraxjnl-2016-209846
Castaldi, Peter J. ; Benet, Marta ; Petersen, Hans ; Rafaels, Nicholas ; Finigan, James ; Paoletti, Matteo ; Boezen, H. Marike ; Vonk, Judith M. ; Bowler, Russell ; Pistolesi, Massimo ; Puhan, Milo A. ; Anto, Josep ; Wauters, Els ; Lambrechts, Diether ; Janssens, Wim ; Bigazzi, Francesca ; Camiciottoli, Gianna ; Cho, Michael H. ; Hersh, Craig P. ; Barnes, Kathleen ; Rennard, Stephen ; Boorgula, Meher Preethi ; Dy, Jennifer ; Hansel, Nadia ; Crapo, James D. ; Tesfaigzi, Yohannes ; Agusti, Alvar ; Silverman, Edwin K. ; Garcia-Aymerich, Judith. / Do COPD subtypes really exist? COPD heterogeneity and clustering in 10 independent cohorts. In: Thorax. 2017.
@article{6f84fcaea0764e9abdd04fd1e7723fb7,
title = "Do COPD subtypes really exist? COPD heterogeneity and clustering in 10 independent cohorts",
keywords = "COPD epidemiology",
author = "Castaldi, {Peter J.} and Marta Benet and Hans Petersen and Nicholas Rafaels and James Finigan and Matteo Paoletti and Boezen, {H. Marike} and Vonk, {Judith M.} and Russell Bowler and Massimo Pistolesi and Puhan, {Milo A.} and Josep Anto and Els Wauters and Diether Lambrechts and Wim Janssens and Francesca Bigazzi and Gianna Camiciottoli and Cho, {Michael H.} and Hersh, {Craig P.} and Kathleen Barnes and Stephen Rennard and Boorgula, {Meher Preethi} and Jennifer Dy and Nadia Hansel and Crapo, {James D.} and Yohannes Tesfaigzi and Alvar Agusti and Silverman, {Edwin K.} and Judith Garcia-Aymerich",
year = "2017",
month = "6",
day = "21",
doi = "10.1136/thoraxjnl-2016-209846",
language = "English (US)",
journal = "Thorax",
issn = "0040-6376",
publisher = "BMJ Publishing Group",

}

TY - JOUR

T1 - Do COPD subtypes really exist? COPD heterogeneity and clustering in 10 independent cohorts

AU - Castaldi, Peter J.

AU - Benet, Marta

AU - Petersen, Hans

AU - Rafaels, Nicholas

AU - Finigan, James

AU - Paoletti, Matteo

AU - Boezen, H. Marike

AU - Vonk, Judith M.

AU - Bowler, Russell

AU - Pistolesi, Massimo

AU - Puhan, Milo A.

AU - Anto, Josep

AU - Wauters, Els

AU - Lambrechts, Diether

AU - Janssens, Wim

AU - Bigazzi, Francesca

AU - Camiciottoli, Gianna

AU - Cho, Michael H.

AU - Hersh, Craig P.

AU - Barnes, Kathleen

AU - Rennard, Stephen

AU - Boorgula, Meher Preethi

AU - Dy, Jennifer

AU - Hansel, Nadia

AU - Crapo, James D.

AU - Tesfaigzi, Yohannes

AU - Agusti, Alvar

AU - Silverman, Edwin K.

AU - Garcia-Aymerich, Judith

PY - 2017/6/21

Y1 - 2017/6/21

KW - COPD epidemiology

UR - http://www.scopus.com/inward/record.url?scp=85026306577&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85026306577&partnerID=8YFLogxK

U2 - 10.1136/thoraxjnl-2016-209846

DO - 10.1136/thoraxjnl-2016-209846

M3 - Article

C2 - 28637835

AN - SCOPUS:85026306577

JO - Thorax

JF - Thorax

SN - 0040-6376

ER -