Constructing evaluation corpora for automated clinical named entity recognition

Philip V. Ogren, Guergana K. Savova, Christopher Chute

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We report on the construction of a gold-standard dataset consisting of annotated clinical notes suitable for evaluating our biomedical named entity recognition system. The dataset is the result of consensus between four human annotators and contains 1,556 annotations on 160 clinical notes using 658 unique concept codes from SNOMED-CT corresponding to human disorders. Inter-annotator agreement was calculated on annotations from 100 of the documents for span (90.9%), concept code (81.7%), context (84.8%), and status (86.0%) agreement. Complete agreement for span, concept code, context, and status was 74.6%. We found that creating a consensus set based on annotations from two independently-created annotation sets can reduce inter-annotator disagreement by 32.3%. We found little benefit to pre-annotating the corpus with a third-party named entity recognizer.
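
The agreement figures above are reported as simple percentages over four annotation attributes (span, concept code, context, and status). The paper's scoring code is not reproduced here, so the short Python sketch below shows one plausible way such pairwise agreement could be computed; it is not the authors' implementation. It assumes each annotation is keyed by its character span and carries a SNOMED-CT concept code plus context and status values; the function name, data layout, and example concept codes are illustrative assumptions, not details taken from the paper.

# Minimal sketch (not the authors' implementation) of pairwise
# percent agreement between two annotators' annotation sets.

def percent_agreement(ann_a, ann_b, attribute=None):
    """Percent agreement between two annotation sets.

    ann_a and ann_b map (start, end) character spans to attribute dicts,
    e.g. {"code": "22298006", "context": "patient", "status": "confirmed"}.
    With attribute=None, agreement is over spans alone (matched spans
    divided by all spans either annotator produced); otherwise it is
    computed on that attribute over the spans both annotators marked.
    """
    spans_a, spans_b = set(ann_a), set(ann_b)
    shared = spans_a & spans_b
    if attribute is None:
        union = spans_a | spans_b
        return len(shared) / len(union) if union else 1.0
    if not shared:
        return 1.0
    matches = sum(1 for s in shared if ann_a[s][attribute] == ann_b[s][attribute])
    return matches / len(shared)

# Hypothetical annotations on one note (spans and codes are illustrative).
a = {(10, 19): {"code": "22298006", "context": "patient", "status": "confirmed"},
     (40, 48): {"code": "38341003", "context": "patient", "status": "history"}}
b = {(10, 19): {"code": "22298006", "context": "patient", "status": "confirmed"},
     (40, 48): {"code": "38341003", "context": "family", "status": "history"}}

print(percent_agreement(a, b))             # span agreement: 1.0
print(percent_agreement(a, b, "code"))     # concept-code agreement: 1.0
print(percent_agreement(a, b, "context"))  # context agreement: 0.5

The paper additionally reports complete agreement, which requires all four attributes to match at once and therefore sits below any single attribute's figure, as the 74.6% result illustrates.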

Original language: English (US)
Title of host publication: Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008
Publisher: European Language Resources Association (ELRA)
Pages: 3143-3150
Number of pages: 8
ISBN (Electronic): 2951740840, 9782951740846
State: Published - Jan 1 2008
Externally published: Yes
Event: 6th International Conference on Language Resources and Evaluation, LREC 2008 - Marrakech, Morocco
Duration: May 28 2008 - May 30 2008

ASJC Scopus subject areas

  • Library and Information Sciences
  • Linguistics and Language
  • Language and Linguistics
  • Education

Cite this

Ogren, P. V., Savova, G. K., & Chute, C. (2008). Constructing evaluation corpora for automated clinical named entity recognition. In Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008 (pp. 3143-3150). European Language Resources Association (ELRA).
