Text mining for hypotheses and results in translational medicine studies

Terry H. Tsai, Niels Kasch, Craig Pfeifer, Tim Oates

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Most common and complex diseases, such as diabetes and cancer, are influenced at some level by variation in the genome. To meet the goals of translational research, genetic variation must be taken into account. Research in public health genetics, specifically on single nucleotide polymorphisms (SNPs), is a first step toward understanding human genetic variation. In addition, novel methods are needed to represent textual genotypic data sources and to conduct text mining over them. In this paper, we describe the development and evaluation, in the context of a genetic study, of a translational-informatics method that supports both machine-learning text mining (e.g., conditional random fields) and automated inference for identifying key concepts (e.g., hypotheses and results). After scaling for inter-annotator agreement, our adjusted overall precision was 64%, with a range of 48% to 80%. While other biological text mining systems have focused on named-entity recognition, tools for genetic studies that focus on hypotheses and results have been relatively rare.

Original language: English (US)
Title of host publication: Proceedings - 14th IEEE International Conference on Data Mining Workshops, ICDMW 2014
Editors: Zhi-Hua Zhou, Wei Wang, Ravi Kumar, Hannu Toivonen, Jian Pei, Joshua Zhexue Huang, Xindong Wu
Publisher: IEEE Computer Society
Pages: 127-132
Number of pages: 6
Edition: January
ISBN (Electronic): 9781479942749
State: Published - Jan 26 2015
Event: 14th IEEE International Conference on Data Mining Workshops, ICDMW 2014 - Shenzhen, China
Duration: Dec 14 2014 → …

Publication series

Name: IEEE International Conference on Data Mining Workshops, ICDMW
Number: January
Volume: 2015-January
ISSN (Print): 2375-9232
ISSN (Electronic): 2375-9259

Conference

Conference: 14th IEEE International Conference on Data Mining Workshops, ICDMW 2014
Country/Territory: China
City: Shenzhen
Period: 12/14/14 → …

Keywords

  • Biomedical informatics
  • Gene-environment interaction studies
  • Natural language processing
  • Text mining
  • Translational informatics

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
