Trainable high resolution melt curve machine learning classifier for large-scale reliable genotyping of sequence variants

Pornpat Athamanolap, Vishwa Parekh, Stephanie I. Fraley, Vatsal Agarwal, Dong J. Shin, Michael A. Jacobs, Tza Huei Wang, Samuel Yang

Research output: Contribution to journal, Article

Abstract

High resolution melt (HRM) is gaining considerable popularity as a simple and robust method for genotyping sequence variants. However, accurate genotyping of an unknown sample for which a large number of possible variants may exist requires an automated HRM curve identification method capable of comparing unknowns against a large cohort of known sequence variants. Herein, we describe a new method for automated HRM curve classification based on machine learning methods and a learned tolerance for deviations in reaction conditions. We tested this method in silico through multiple cross-validations, using curves generated from 9 different simulated experimental conditions to classify 92 known serotypes of Streptococcus pneumoniae, and demonstrated over 99% accuracy with 8 training curves per serotype. The algorithm was verified in vitro using sequence variants of a cancer-related gene and demonstrated 100% accuracy with 3 training curves per sequence variant. The machine learning algorithm enabled reliable, scalable, and automated HRM genotyping analysis with broad potential clinical and epidemiological applications.
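
The train-and-cross-validate workflow summarized in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the sigmoid melt-curve model, the three variant names and their melting temperatures, the noise levels, and the nearest-centroid classifier, which is swapped in as a simple stand-in for whichever machine learning model the authors used. A random shift in melting temperature stands in for reaction-condition deviation, and 8 simulated training curves per variant mirror the count quoted above.

```python
import math
import random

def melt_curve(tm, temps, width=1.0):
    """Idealised melt curve: fraction of duplex DNA remaining at each temperature."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]

def simulate(tm, temps, rng, shift_sd=0.3, noise_sd=0.01):
    """One noisy replicate; the random Tm shift mimics reaction-condition deviation."""
    shifted_tm = tm + rng.gauss(0.0, shift_sd)
    return [y + rng.gauss(0.0, noise_sd) for y in melt_curve(shifted_tm, temps)]

def centroid(curves):
    """Point-wise mean of a list of equally sampled curves."""
    n = len(curves)
    return [sum(column) / n for column in zip(*curves)]

def classify(curve, centroids):
    """Return the label whose centroid has the smallest squared distance to the curve."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(curve, centroids[label]))

def leave_one_out_accuracy(data):
    """Hold out each curve in turn, train on the rest, and score the prediction."""
    correct = 0
    total = 0
    for label, curves in data.items():
        for i, held_out in enumerate(curves):
            centroids = {}
            for other, other_curves in data.items():
                # Exclude only the held-out curve from its own class.
                train = [c for j, c in enumerate(other_curves)
                         if other != label or j != i]
                centroids[other] = centroid(train)
            correct += classify(held_out, centroids) == label
            total += 1
    return correct / total

rng = random.Random(0)                       # fixed seed for reproducibility
temps = [70 + 0.1 * k for k in range(201)]   # 70 to 90 degrees C in 0.1 degree steps
variant_tm = {"variant_A": 78.0, "variant_B": 81.0, "variant_C": 84.0}  # hypothetical
data = {label: [simulate(tm, temps, rng) for _ in range(8)]
        for label, tm in variant_tm.items()}
accuracy = leave_one_out_accuracy(data)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

Because the hypothetical variants here are separated by 3 °C while the simulated shifts are a fraction of a degree, this toy problem is far easier than distinguishing 92 serotypes; it only shows the shape of the evaluation, not the difficulty of the real task.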

Original language: English (US)
Article number: 0109094
Journal: PLoS ONE
Volume: 9
Issue number: 10
State: Published - Oct 2 2014

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Agricultural and Biological Sciences (all)
  • General
