A randomized controlled trial of automated term composition.

P. L. Elkin, K. R. Bailey, Christopher Chute

Research output: Contribution to journal › Article

Abstract

OBJECTIVE: To compare the ability of an Automated Term Composition (ATC) algorithm with non-compositional mappings to provide coverage (exact mappings to a controlled vocabulary) for a randomly selected set of free-text entries that were entered as headings in the Impression section of the clinical notes system at the Mayo Foundation. We also compare the results of four evaluators to determine the inter-observer variability and the variance between term sets, with respect to the accuracy of the mappings and the reliability of the failure analysis.

METHODS: From a corpus of approximately 1,000,000 unique terms entered into the Impression/Report/Plan section of the clinical notes system in calendar year 1997, we randomly selected 1,000 terms. We then randomized these 1,000 terms into two groups of 500 (Sets A and B). We constructed two copies of the same term-matching interface, one without ATC (alpha) and one with ATC (beta). We assigned each of four expert indexers to one of the following tasks. The first reviewer (R1) compared set A using the alpha program and then set B using the beta program (R1(Aalpha + Bbeta)). The second compared set A using the alpha program and then set B using the alpha program (R2(A + B) alpha). The third compared set B using the beta program and then set A using the beta program (R3(B + A) beta). The fourth compared set A using the beta program and then set B using the alpha program (R4(Abeta + Balpha)).

RESULTS: The program with Automated Term Composition mapped 540 of the 1,000 concepts correctly (54.0%). The same program without ATC mapped only 276 of the 1,000 concepts correctly (27.6%). The program with ATC was therefore significantly more effective at matching concepts in our problem lists than the same search engine without ATC (p < 0.0001; McNemar method). These figures result from the comparison of the alpha program with the beta program by reviewers one and four. Failure analysis showed that with the alpha version, 425 of the 724 mismatches occurred because a base concept was missing from the retrieval set (58.7%) and 299 resulted from missing qualifiers, modifiers, or both (41.3%). With the beta version (with ATC), 340 of the 460 mismatches were due to a missing base concept in the retrieval set (73.9%) and only 120 were due to missing modifiers and/or qualifiers (26.1%).

CONCLUSIONS: Automated term composition provided significantly better coverage of a randomly chosen set of patient problems, diagnosed at the Mayo Clinic during the 1997 calendar year, than the same information retrieval system without ATC. We believe that these results speak further to the excellent content coverage provided by the UMLS Metathesaurus. We also believe that increased structure, normalization of UMLS content and semantics, and better tools for using the currently available content, such as automated term composition, are what is needed to enable the production of commercially viable tools that provide access to controlled vocabularies for medicine.
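The counts reported in the abstract can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code: the dictionary layout and the helper names `pct` and `summarize` are assumptions introduced here for illustration, and only the counts themselves come from the abstract.

```python
TOTAL_TERMS = 1000  # terms sampled for the trial, per the abstract

# Counts as reported in the abstract (alpha = without ATC, beta = with ATC).
# The key names are illustrative, not taken from the paper.
reported = {
    "alpha": {"correct": 276, "missing_base": 425, "missing_modifier": 299},
    "beta": {"correct": 540, "missing_base": 340, "missing_modifier": 120},
}


def pct(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


def summarize(counts, total=TOTAL_TERMS):
    """Derive coverage and failure-analysis percentages from raw counts."""
    mismatches = total - counts["correct"]
    # The two failure categories should account for every mismatch.
    if counts["missing_base"] + counts["missing_modifier"] != mismatches:
        raise ValueError("failure categories do not sum to the mismatch count")
    return {
        "coverage_pct": pct(counts["correct"], total),
        "mismatches": mismatches,
        "missing_base_pct": pct(counts["missing_base"], mismatches),
        "missing_modifier_pct": pct(counts["missing_modifier"], mismatches),
    }


summary = {name: summarize(counts) for name, counts in reported.items()}
for name, row in summary.items():
    print(name, row)
```

The derived mismatch totals (724 and 460) and the percentages agree with the figures quoted in the abstract. The McNemar p-value cannot be recomputed this way, because the abstract does not report the paired discordant counts (terms mapped correctly by one program but not the other).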

Original language: English (US)
Pages (from-to): 765-769
Number of pages: 5
Journal: Proceedings / AMIA ... Annual Symposium. AMIA Symposium
State: Published - 1998
Externally published: Yes

Fingerprint

Unified Medical Language System
Controlled Vocabulary
Randomized Controlled Trials
Search Engine
Observer Variation
Semantics
Information Systems
Medicine

Cite this

A randomized controlled trial of automated term composition. / Elkin, P. L.; Bailey, K. R.; Chute, Christopher.

In: Proceedings / AMIA ... Annual Symposium. AMIA Symposium, 1998, p. 765-769.

@article{d01d3805a27841e9811d96374e27a9e2,
title = "A randomized controlled trial of automated term composition.",
author = "Elkin, {P. L.} and Bailey, {K. R.} and Christopher Chute",
year = "1998",
language = "English (US)",
pages = "765--769",
journal = "Proceedings / AMIA . Annual Symposium. AMIA Symposium",
issn = "1531-605X",
publisher = "Hanley \& Belfus",

}

TY - JOUR

T1 - A randomized controlled trial of automated term composition.

AU - Elkin, P. L.

AU - Bailey, K. R.

AU - Chute, Christopher

PY - 1998

Y1 - 1998

UR - http://www.scopus.com/inward/record.url?scp=0032248681&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032248681&partnerID=8YFLogxK

M3 - Article

SP - 765

EP - 769

JO - Proceedings / AMIA . Annual Symposium. AMIA Symposium

JF - Proceedings / AMIA . Annual Symposium. AMIA Symposium

SN - 1531-605X

ER -