Splice Expression Variation Analysis (SEVA) for inter-tumor heterogeneity of gene isoform usage in cancer

Bahman Afsari; Theresa Guo; Michael Considine; Liliana Florea; Luciane T. Kagohara; Genevieve L. Stein-O'Brien; Dylan Kelley; Emily Flam; Kristina D. Zambo; Patrick K. Ha; Donald Geman; Michael F. Ochs; Joseph A. Califano; Daria A. Gaykalova; Alexander V. Favorov; Elana J. Fertig

doi:10.1093/bioinformatics/bty004


School of Medicine

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Motivation: Current bioinformatics methods to detect changes in gene isoform usage in distinct phenotypes compare the relative expected isoform usage in phenotypes. These statistics model differences in isoform usage in normal tissues, which have stable regulation of gene splicing. Pathological conditions, such as cancer, can have broken regulation of splicing that increases the heterogeneity of the expression of splice variants. Inferring events with such differential heterogeneity in gene isoform usage requires new statistical approaches.

Results: We introduce Splice Expression Variability Analysis (SEVA) to model increased heterogeneity of splice variant usage between conditions (e.g. tumor and normal samples). SEVA uses a rank-based multivariate statistic that compares the variability of junction expression profiles within one condition to the variability within another. Simulated data show that SEVA is unique in modeling heterogeneity of gene isoform usage, and benchmark SEVA's performance against EBSeq, DiffSplice and rMATS that model differential isoform usage instead of heterogeneity. We confirm the accuracy of SEVA in identifying known splice variants in head and neck cancer and perform cross-study validation of novel splice variants. A novel comparison of splice variant heterogeneity between subtypes of head and neck cancer demonstrated unanticipated similarity between the heterogeneity of gene isoform usage in HPV-positive and HPV-negative subtypes and anticipated increased heterogeneity among HPV-negative samples with mutations in genes that regulate the splice variant machinery. These results show that SEVA accurately models differential heterogeneity of gene isoform usage from RNA-seq data.

Original language	English (US)
Pages (from-to)	1859-1867
Number of pages	9
Journal	Bioinformatics
Volume	34
Issue number	11
DOIs	https://doi.org/10.1093/bioinformatics/bty004
State	Published - Jun 1 2018

ASJC Scopus subject areas

Statistics and Probability
Biochemistry
Molecular Biology
Computer Science Applications
Computational Theory and Mathematics
Computational Mathematics


Cite this

Afsari, B., Guo, T., Considine, M., Florea, L., Kagohara, L. T., Stein-O'Brien, G. L., Kelley, D., Flam, E., Zambo, K. D., Ha, P. K., Geman, D., Ochs, M. F., Califano, J. A., Gaykalova, D. A., Favorov, A. V., & Fertig, E. J. (2018). Splice Expression Variation Analysis (SEVA) for inter-tumor heterogeneity of gene isoform usage in cancer. Bioinformatics, 34(11), 1859-1867. https://doi.org/10.1093/bioinformatics/bty004

Afsari, B, Guo, T, Considine, M, Florea, L, Kagohara, LT, Stein-O'Brien, GL, Kelley, D, Flam, E, Zambo, KD, Ha, PK, Geman, D, Ochs, MF, Califano, JA, Gaykalova, DA, Favorov, AV & Fertig, EJ 2018, 'Splice Expression Variation Analysis (SEVA) for inter-tumor heterogeneity of gene isoform usage in cancer', Bioinformatics, vol. 34, no. 11, pp. 1859-1867. https://doi.org/10.1093/bioinformatics/bty004

@article{a767b5c378a249fc9002e0e34e5765b0,

title = "Splice Expression Variation Analysis (SEVA) for inter-tumor heterogeneity of gene isoform usage in cancer",

abstract = "Motivation Current bioinformatics methods to detect changes in gene isoform usage in distinct phenotypes compare the relative expected isoform usage in phenotypes. These statistics model differences in isoform usage in normal tissues, which have stable regulation of gene splicing. Pathological conditions, such as cancer, can have broken regulation of splicing that increases the heterogeneity of the expression of splice variants. Inferring events with such differential heterogeneity in gene isoform usage requires new statistical approaches. Results We introduce Splice Expression Variability Analysis (SEVA) to model increased heterogeneity of splice variant usage between conditions (e.g. tumor and normal samples). SEVA uses a rank-based multivariate statistic that compares the variability of junction expression profiles within one condition to the variability within another. Simulated data show that SEVA is unique in modeling heterogeneity of gene isoform usage, and benchmark SEVA's performance against EBSeq, DiffSplice and rMATS that model differential isoform usage instead of heterogeneity. We confirm the accuracy of SEVA in identifying known splice variants in head and neck cancer and perform cross-study validation of novel splice variants. A novel comparison of splice variant heterogeneity between subtypes of head and neck cancer demonstrated unanticipated similarity between the heterogeneity of gene isoform usage in HPV-positive and HPV-negative subtypes and anticipated increased heterogeneity among HPV-negative samples with mutations in genes that regulate the splice variant machinery. These results show that SEVA accurately models differential heterogeneity of gene isoform usage from RNA-seq data.",

author = "Bahman Afsari and Theresa Guo and Michael Considine and Liliana Florea and Kagohara, {Luciane T.} and Stein-O'Brien, {Genevieve L.} and Dylan Kelley and Emily Flam and Zambo, {Kristina D.} and Ha, {Patrick K.} and Donald Geman and Ochs, {Michael F.} and Califano, {Joseph A.} and Gaykalova, {Daria A.} and Favorov, {Alexander V.} and Fertig, {Elana J.}",

note = "Publisher Copyright: {\textcopyright} The Author(s) 2018. Published by Oxford University Press.",

year = "2018",

month = jun,

day = "1",

doi = "10.1093/bioinformatics/bty004",

language = "English (US)",

volume = "34",

pages = "1859--1867",

journal = "Bioinformatics",

issn = "1367-4803",

publisher = "Oxford University Press",

number = "11",

}

TY - JOUR

T1 - Splice Expression Variation Analysis (SEVA) for inter-tumor heterogeneity of gene isoform usage in cancer

AU - Afsari, Bahman

AU - Guo, Theresa

AU - Considine, Michael

AU - Florea, Liliana

AU - Kagohara, Luciane T.

AU - Stein-O'Brien, Genevieve L.

AU - Kelley, Dylan

AU - Flam, Emily

AU - Zambo, Kristina D.

AU - Ha, Patrick K.

AU - Geman, Donald

AU - Ochs, Michael F.

AU - Califano, Joseph A.

AU - Gaykalova, Daria A.

AU - Favorov, Alexander V.

AU - Fertig, Elana J.

PY - 2018/6/1

Y1 - 2018/6/1

N2 - Motivation Current bioinformatics methods to detect changes in gene isoform usage in distinct phenotypes compare the relative expected isoform usage in phenotypes. These statistics model differences in isoform usage in normal tissues, which have stable regulation of gene splicing. Pathological conditions, such as cancer, can have broken regulation of splicing that increases the heterogeneity of the expression of splice variants. Inferring events with such differential heterogeneity in gene isoform usage requires new statistical approaches. Results We introduce Splice Expression Variability Analysis (SEVA) to model increased heterogeneity of splice variant usage between conditions (e.g. tumor and normal samples). SEVA uses a rank-based multivariate statistic that compares the variability of junction expression profiles within one condition to the variability within another. Simulated data show that SEVA is unique in modeling heterogeneity of gene isoform usage, and benchmark SEVA's performance against EBSeq, DiffSplice and rMATS that model differential isoform usage instead of heterogeneity. We confirm the accuracy of SEVA in identifying known splice variants in head and neck cancer and perform cross-study validation of novel splice variants. A novel comparison of splice variant heterogeneity between subtypes of head and neck cancer demonstrated unanticipated similarity between the heterogeneity of gene isoform usage in HPV-positive and HPV-negative subtypes and anticipated increased heterogeneity among HPV-negative samples with mutations in genes that regulate the splice variant machinery. These results show that SEVA accurately models differential heterogeneity of gene isoform usage from RNA-seq data.

AB - Motivation Current bioinformatics methods to detect changes in gene isoform usage in distinct phenotypes compare the relative expected isoform usage in phenotypes. These statistics model differences in isoform usage in normal tissues, which have stable regulation of gene splicing. Pathological conditions, such as cancer, can have broken regulation of splicing that increases the heterogeneity of the expression of splice variants. Inferring events with such differential heterogeneity in gene isoform usage requires new statistical approaches. Results We introduce Splice Expression Variability Analysis (SEVA) to model increased heterogeneity of splice variant usage between conditions (e.g. tumor and normal samples). SEVA uses a rank-based multivariate statistic that compares the variability of junction expression profiles within one condition to the variability within another. Simulated data show that SEVA is unique in modeling heterogeneity of gene isoform usage, and benchmark SEVA's performance against EBSeq, DiffSplice and rMATS that model differential isoform usage instead of heterogeneity. We confirm the accuracy of SEVA in identifying known splice variants in head and neck cancer and perform cross-study validation of novel splice variants. A novel comparison of splice variant heterogeneity between subtypes of head and neck cancer demonstrated unanticipated similarity between the heterogeneity of gene isoform usage in HPV-positive and HPV-negative subtypes and anticipated increased heterogeneity among HPV-negative samples with mutations in genes that regulate the splice variant machinery. These results show that SEVA accurately models differential heterogeneity of gene isoform usage from RNA-seq data.

UR - http://www.scopus.com/inward/record.url?scp=85048048465&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048048465&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/bty004

DO - 10.1093/bioinformatics/bty004

M3 - Article

C2 - 29342249

AN - SCOPUS:85048048465

SN - 1367-4803

VL - 34

SP - 1859

EP - 1867

JO - Bioinformatics

JF - Bioinformatics

IS - 11

ER -
