From ambiguities to insights: Query-based comparisons of high-dimensional data

Jeanne Kowalski; Conover Talbot; Hua L. Tsai; Nijaguna Prasad; Christopher Umbricht; Martha A. Zeiger

doi:10.1063/1.2816635

School of Medicine

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Genomic technologies will revolutionize drug discovery and development; that much is universally agreed upon. The high dimension of data from such technologies has challenged available data analytic methods; that much is apparent. To date, large-scale data repositories have not been utilized in ways that permit their wealth of information to be efficiently processed for knowledge, presumably due in large part to inadequate analytical tools to address numerous comparisons of high-dimensional data. In candidate gene discovery, expression comparisons are often made between two features (e.g., cancerous versus normal), such that the enumeration of outcomes is manageable. With multiple features, the setting becomes more complex, in terms of comparing expression levels of tens of thousands of transcripts across hundreds of features. In this case, the number of outcomes, while enumerable, rapidly becomes large and unmanageable, and scientific inquiries become more abstract, such as "which one of these (compounds, stimuli, etc.) is not like the others?" We develop analytical tools that promote more extensive, efficient, and rigorous utilization of the public data resources generated by the massive support of genomic studies. Our work innovates by enabling access to such metadata with logically formulated scientific inquiries that define, compare and integrate query-comparison pair relations for analysis. We demonstrate our computational tool's potential to address an outstanding biomedical informatics issue of identifying reliable molecular markers in thyroid cancer. Our proposed query-based comparison (QBC) facilitates access to and efficient utilization of metadata through logically formed inquiries expressed as query-based comparisons by organizing and comparing results from biotechnologies to address applications in biomedicine.
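As a rough illustration of the two points made in the abstract, the sketch below counts how quickly pairwise outcomes grow with the number of features and answers a toy "which one of these is not like the others?" query for a single transcript. It is not the authors' QBC method, which the abstract does not specify. The function names, the three-label outcome scheme (higher, lower, similar), the tolerance `tol`, and the expression values are all assumptions made for this example.

```python
from itertools import combinations
from math import comb


def pairwise_outcome_bound(n_features: int) -> int:
    """Upper bound on joint outcomes when every pair of features is labeled
    'higher', 'lower', or 'similar' (an assumed 3-label scheme)."""
    return 3 ** comb(n_features, 2)


def odd_one_out(levels, tol=1.0):
    """Return the feature whose level differs from every other feature by
    more than tol while the remaining features agree within tol.
    Returns None when no single feature stands out."""
    names = list(levels)
    for candidate in names:
        rest = [n for n in names if n != candidate]
        differs = all(abs(levels[candidate] - levels[r]) > tol for r in rest)
        agree = all(abs(levels[a] - levels[b]) <= tol
                    for a, b in combinations(rest, 2))
        if differs and agree:
            return candidate
    return None


# Hypothetical log2 expression levels for one transcript (illustrative only).
transcript = {"normal": 5.1, "adenoma": 5.3, "hyperplasia": 4.9, "carcinoma": 8.2}

print(pairwise_outcome_bound(2))  # 3
print(pairwise_outcome_bound(4))  # 729
print(odd_one_out(transcript))    # carcinoma
```

With two features there is a single comparison and three possible outcomes; with four features the bound under this scheme is already 729 per transcript, which is the kind of growth that motivates phrasing the analysis as a query rather than enumerating every outcome.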

Original language	English (US)
Title of host publication	Computational Models For Life Sciences (CMLS '07) - 2007 International Symposium
Pages	305-314
Number of pages	10
DOIs	https://doi.org/10.1063/1.2816635
State	Published - 2007
Event	2007 International Symposium on Computational Models for Life Sciences, CMLS '07 - Gold Coast, QLD, Australia
Duration	Dec 17 2007 → Dec 19 2007

Publication series

Name	AIP Conference Proceedings
Volume	952
ISSN (Print)	0094-243X
ISSN (Electronic)	1551-7616

Keywords

Gene expression
Query
Thyroid cancer

ASJC Scopus subject areas

General Physics and Astronomy

Cite this

Kowalski, J, Talbot, C, Tsai, HL, Prasad, N, Umbricht, C & Zeiger, MA 2007, From ambiguities to insights: Query-based comparisons of high-dimensional data. in Computational Models For Life Sciences (CMLS '07) - 2007 International Symposium. AIP Conference Proceedings, vol. 952, pp. 305-314, 2007 International Symposium on Computational Models for Life Sciences, CMLS '07, Gold Coast, QLD, Australia, 12/17/07. https://doi.org/10.1063/1.2816635

@inproceedings{aa620a31671247d4b93af5eb2b9e1b3f,

title = "From ambiguities to insights: Query-based comparisons of high-dimensional data",

keywords = "Gene expression, Query, Thyroid cancer",

author = "Jeanne Kowalski and Conover Talbot and Tsai, {Hua L.} and Nijaguna Prasad and Christopher Umbricht and Zeiger, {Martha A.}",

year = "2007",

doi = "10.1063/1.2816635",

language = "English (US)",

isbn = "9780735404663",

series = "AIP Conference Proceedings",

pages = "305--314",

booktitle = "Computational Models For Life Sciences (CMLS '07) - 2007 International Symposium",

note = "2007 International Symposium on Computational Models for Life Sciences, CMLS '07 ; Conference date: 17-12-2007 Through 19-12-2007",

}

TY - GEN

T1 - From ambiguities to insights

T2 - 2007 International Symposium on Computational Models for Life Sciences, CMLS '07

AU - Kowalski, Jeanne

AU - Talbot, Conover

AU - Tsai, Hua L.

AU - Prasad, Nijaguna

AU - Umbricht, Christopher

AU - Zeiger, Martha A.

PY - 2007

Y1 - 2007

KW - Gene expression

KW - Query

KW - Thyroid cancer

UR - http://www.scopus.com/inward/record.url?scp=71449121238&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=71449121238&partnerID=8YFLogxK

U2 - 10.1063/1.2816635

DO - 10.1063/1.2816635

M3 - Conference contribution

AN - SCOPUS:71449121238

SN - 9780735404663

T3 - AIP Conference Proceedings

SP - 305

EP - 314

BT - Computational Models For Life Sciences (CMLS '07) - 2007 International Symposium

Y2 - 17 December 2007 through 19 December 2007

ER -
