The contribution of the vaccine adverse event text mining system to the classification of possible Guillain-Barré syndrome reports

Taxiarchis Botsis, E. J. Woo, R. Ball

Research output: Contribution to journal › Article

Abstract

Background: We previously demonstrated that a general purpose text mining system, the Vaccine adverse event Text Mining (VaeTM) system, could be used to automatically classify reports of anaphylaxis for post-marketing safety surveillance of vaccines.

Objective: To evaluate the ability of VaeTM to classify reports to the Vaccine Adverse Event Reporting System (VAERS) of possible Guillain-Barré Syndrome (GBS).

Methods: We used VaeTM to extract the key diagnostic features from the text of reports in VAERS. Then, we applied the Brighton Collaboration (BC) case definition for GBS, and an information retrieval strategy (i.e., the vector space model) to quantify the specific information that is included in the key features extracted by VaeTM and compared it with the encoded information that is already stored in VAERS as Medical Dictionary for Regulatory Activities (MedDRA) Preferred Terms (PTs). We also evaluated the contribution of the primary (diagnosis and cause of death) and secondary (second level diagnosis and symptoms) diagnostic VaeTM-based features to the total VaeTM-based information.

Results: MedDRA captured more information and better supported the classification of reports for GBS than VaeTM (AUC: 0.904 vs. 0.777); the lower performance of VaeTM is likely due to the lack of extraction by VaeTM of specific laboratory results that are included in the BC criteria for GBS. On the other hand, the VaeTM-based classification exhibited greater specificity than the MedDRA-based approach (94.96% vs. 87.65%). Most of the VaeTM-based information was contained in the secondary diagnostic features.

Conclusion: For GBS, clinical signs and symptoms alone are not sufficient to match MedDRA coding for purposes of case classification, but are preferred if specificity is the priority.
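The abstract names the vector space model and reports specificity, but does not give the computation. The following is a minimal Python sketch of both ideas, not the authors' implementation: it assumes each report and the case definition are reduced to bags of terms and compared by cosine similarity. The function names (`term_vector`, `cosine_similarity`, `specificity`) and the example terms are hypothetical and chosen only for illustration; they are not the actual BC criteria or VaeTM output.

```python
import math
from collections import Counter


def term_vector(terms):
    """Build a term-frequency vector (hypothetical helper) from a list of terms."""
    return Counter(t.lower() for t in terms)


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector shares no information with anything
    return dot / (norm_a * norm_b)


def specificity(labels, predictions):
    """True negatives divided by all actual negatives."""
    tn = sum(1 for y, p in zip(labels, predictions) if not y and not p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    return tn / (tn + fp) if (tn + fp) else float("nan")


# Illustrative terms only (assumption): a toy "case definition" and a toy report.
definition = term_vector(["weakness", "areflexia"])
report = term_vector(["weakness", "fever"])
score = cosine_similarity(definition, report)
```

In this sketch, a report would be flagged as a possible case when `score` exceeds some threshold; sweeping that threshold is what would produce an ROC curve and an AUC like those reported above. The threshold and term weighting used in the study are not stated in this record.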

Original language: English (US)
Pages (from-to): 88-99
Number of pages: 12
Journal: Applied Clinical Informatics
Volume: 4
Issue number: 1
DOIs: 10.4338/ACI-2012-11-RA-0049
State: Published - Sep 26 2013
Externally published: Yes

Fingerprint

Data Mining
Vaccines
Medical Dictionaries
Glossaries
Space Simulation
Information Storage and Retrieval
Anaphylaxis
Vector spaces
Marketing
Information retrieval
Signs and Symptoms
Area Under Curve

Keywords

  • And analysis
  • Biosurveillance and case reporting
  • Data access
  • Data mining
  • Data repositories
  • Integration
  • Natural language processing

ASJC Scopus subject areas

  • Health Informatics
  • Computer Science Applications
  • Health Information Management

Cite this

The contribution of the vaccine adverse event text mining system to the classification of possible Guillain-Barré syndrome reports. / Botsis, Taxiarchis; Woo, E. J.; Ball, R.

In: Applied Clinical Informatics, Vol. 4, No. 1, 26.09.2013, p. 88-99.

@article{836ddb57d68f4a02b7b58e9243e23ee8,
title = "The contribution of the vaccine adverse event text mining system to the classification of possible Guillain-Barr{\'e} syndrome reports",
abstract = "Background: We previously demonstrated that a general purpose text mining system, the Vaccine adverse event Text Mining (VaeTM) system, could be used to automatically classify reports of anaphylaxis for post-marketing safety surveillance of vaccines. Objective: To evaluate the ability of VaeTM to classify reports to the Vaccine Adverse Event Reporting System (VAERS) of possible Guillain-Barr{\'e} Syndrome (GBS). Methods: We used VaeTM to extract the key diagnostic features from the text of reports in VAERS. Then, we applied the Brighton Collaboration (BC) case definition for GBS, and an information retrieval strategy (i.e. the vector space model) to quantify the specific information that is included in the key features extracted by VaeTM and compared it with the encoded information that is already stored in VAERS as Medical Dictionary for Regulatory Activities (MedDRA) Preferred Terms (PTs). We also evaluated the contribution of the primary (diagnosis and cause of death) and secondary (second level diagnosis and symptoms) diagnostic VaeTM-based features to the total VaeTM-based information. Results: MedDRA captured more information and better supported the classification of reports for GBS than VaeTM (AUC: 0.904 vs. 0.777); the lower performance of VaeTM is likely due to the lack of extraction by VaeTM of specific laboratory results that are included in the BC criteria for GBS. On the other hand, the VaeTM-based classification exhibited greater specificity than the MedDRA-based approach (94.96{\%} vs. 87.65{\%}). Most of the VaeTM-based information was contained in the secondary diagnostic features. Conclusion: For GBS, clinical signs and symptoms alone are not sufficient to match MedDRA coding for purposes of case classification, but are preferred if specificity is the priority.",
keywords = "And analysis, Biosurveillance and case reporting, Data access, Data mining, Data repositories, Integration, Natural language processing",
author = "Taxiarchis Botsis and Woo, {E. J.} and R. Ball",
year = "2013",
month = "9",
day = "26",
doi = "10.4338/ACI-2012-11-RA-0049",
language = "English (US)",
volume = "4",
pages = "88--99",
journal = "Applied Clinical Informatics",
issn = "1869-0327",
publisher = "Schattauer GmbH",
number = "1",

}

TY - JOUR

T1 - The contribution of the vaccine adverse event text mining system to the classification of possible Guillain-Barré syndrome reports

AU - Botsis, Taxiarchis

AU - Woo, E. J.

AU - Ball, R.

PY - 2013/9/26

Y1 - 2013/9/26

N2 - Background: We previously demonstrated that a general purpose text mining system, the Vaccine adverse event Text Mining (VaeTM) system, could be used to automatically classify reports of anaphylaxis for post-marketing safety surveillance of vaccines. Objective: To evaluate the ability of VaeTM to classify reports to the Vaccine Adverse Event Reporting System (VAERS) of possible Guillain-Barré Syndrome (GBS). Methods: We used VaeTM to extract the key diagnostic features from the text of reports in VAERS. Then, we applied the Brighton Collaboration (BC) case definition for GBS, and an information retrieval strategy (i.e. the vector space model) to quantify the specific information that is included in the key features extracted by VaeTM and compared it with the encoded information that is already stored in VAERS as Medical Dictionary for Regulatory Activities (MedDRA) Preferred Terms (PTs). We also evaluated the contribution of the primary (diagnosis and cause of death) and secondary (second level diagnosis and symptoms) diagnostic VaeTM-based features to the total VaeTM-based information. Results: MedDRA captured more information and better supported the classification of reports for GBS than VaeTM (AUC: 0.904 vs. 0.777); the lower performance of VaeTM is likely due to the lack of extraction by VaeTM of specific laboratory results that are included in the BC criteria for GBS. On the other hand, the VaeTM-based classification exhibited greater specificity than the MedDRA-based approach (94.96% vs. 87.65%). Most of the VaeTM-based information was contained in the secondary diagnostic features. Conclusion: For GBS, clinical signs and symptoms alone are not sufficient to match MedDRA coding for purposes of case classification, but are preferred if specificity is the priority.

KW - And analysis

KW - Biosurveillance and case reporting

KW - Data access

KW - Data mining

KW - Data repositories

KW - Integration

KW - Natural language processing

UR - http://www.scopus.com/inward/record.url?scp=84884469573&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84884469573&partnerID=8YFLogxK

U2 - 10.4338/ACI-2012-11-RA-0049

DO - 10.4338/ACI-2012-11-RA-0049

M3 - Article

C2 - 23650490

AN - SCOPUS:84884469573

VL - 4

SP - 88

EP - 99

JO - Applied Clinical Informatics

JF - Applied Clinical Informatics

SN - 1869-0327

IS - 1

ER -