Integrated Transcriptomic and Proteomic Analysis of Primary Human Umbilical Vein Endothelial Cells

Anil K. Madugundu; Chan Hyun Na; Raja Sekhar Nirujogi; Santosh Renuse; Kwang Pyo Kim; Kathleen H. Burns; Christopher Wilks; Ben Langmead; Shannon E. Ellis; Leonardo Collado-Torres; Marc K. Halushka; Min Sik Kim; Akhilesh Pandey

doi:10.1002/pmic.201800315

School of Medicine

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Understanding the molecular profile of every human cell type is essential for understanding its role in normal physiology and disease. Technological advancements in DNA sequencing, mass spectrometry, and computational methods allow us to carry out multiomics analyses, although such approaches are not yet routine. Human umbilical vein endothelial cells (HUVECs) are a widely used model system to study pathological and physiological processes associated with the cardiovascular system. In this study, next-generation sequencing and high-resolution mass spectrometry are employed to profile the transcriptome and proteome of primary HUVECs. Analysis of 145 million paired-end reads from next-generation sequencing confirmed expression of 12 186 protein-coding genes (FPKM ≥0.1) and 439 novel long non-coding RNAs, and revealed 6089 novel isoforms that were not annotated in GENCODE. Proteomics analysis identifies 6477 proteins, including confirmation of N-termini for 1091 proteins, isoforms for 149 proteins, and 1034 phosphosites. A database search to specifically identify other post-translational modifications provides evidence for a number of modification sites on 117 proteins, which include ubiquitylation, lysine acetylation, and mono-, di-, and tri-methylation events. Evidence is provided for 11 “missing proteins,” which are proteins for which there was insufficient or no protein-level evidence. Peptides supporting missing proteins and novel events are validated by comparison of MS/MS fragmentation patterns with synthetic peptides. Finally, 245 variant peptides derived from 207 expressed proteins are identified, in addition to alternate translational start sites for seven proteins and evidence for novel proteoforms for five proteins resulting from alternative splicing. Overall, it is believed that the integrated approach employed in this study is widely applicable to any primary cell type for deeper molecular characterization.

Original language	English (US)
Article number	1800315
Journal	Proteomics
Volume	19
Issue number	15
DOIs	https://doi.org/10.1002/pmic.201800315
State	Published - Aug 2019

Keywords

RNA-seq
allelic expression
coding SNP
mass-spectrometry
proteoform
proteogenomics
splice variants
transcriptome

ASJC Scopus subject areas

Biochemistry
Molecular Biology

Access to Document

10.1002/pmic.201800315

Cite this

@article{d1a5a66fd4ea4421b98b5cd71fcef013,

title = "Integrated Transcriptomic and Proteomic Analysis of Primary Human Umbilical Vein Endothelial Cells",

abstract = "Understanding the molecular profile of every human cell type is essential for understanding its role in normal physiology and disease. Technological advancements in DNA sequencing, mass spectrometry, and computational methods allow us to carry out multiomics analyses, although such approaches are not yet routine. Human umbilical vein endothelial cells (HUVECs) are a widely used model system to study pathological and physiological processes associated with the cardiovascular system. In this study, next-generation sequencing and high-resolution mass spectrometry are employed to profile the transcriptome and proteome of primary HUVECs. Analysis of 145 million paired-end reads from next-generation sequencing confirmed expression of 12 186 protein-coding genes (FPKM ≥0.1) and 439 novel long non-coding RNAs, and revealed 6089 novel isoforms that were not annotated in GENCODE. Proteomics analysis identifies 6477 proteins, including confirmation of N-termini for 1091 proteins, isoforms for 149 proteins, and 1034 phosphosites. A database search to specifically identify other post-translational modifications provides evidence for a number of modification sites on 117 proteins, which include ubiquitylation, lysine acetylation, and mono-, di-, and tri-methylation events. Evidence is provided for 11 “missing proteins,” which are proteins for which there was insufficient or no protein-level evidence. Peptides supporting missing proteins and novel events are validated by comparison of MS/MS fragmentation patterns with synthetic peptides. Finally, 245 variant peptides derived from 207 expressed proteins are identified, in addition to alternate translational start sites for seven proteins and evidence for novel proteoforms for five proteins resulting from alternative splicing. Overall, it is believed that the integrated approach employed in this study is widely applicable to any primary cell type for deeper molecular characterization.",

keywords = "RNA-seq, allelic expression, coding SNP, mass-spectrometry, proteoform, proteogenomics, splice variants, transcriptome",

author = "Madugundu, {Anil K.} and Na, {Chan Hyun} and Nirujogi, {Raja Sekhar} and Renuse, Santosh and Kim, {Kwang Pyo} and Burns, {Kathleen H.} and Wilks, Christopher and Langmead, Ben and Ellis, {Shannon E.} and Collado-Torres, Leonardo and Halushka, {Marc K.} and Kim, {Min Sik} and Pandey, Akhilesh",

note = "Funding Information: This work was supported by the Wellcome Trust/DBT India Alliance Margdarshi Fellowship (Grant number IA/M/15/1/502023) awarded to A.P. This study was supported by NIH grants (R01GM124531 to K.H.B.; R01CA184165 and P50NS038377 to A.P.), NCI{\textquoteright}s Clinical Proteomic Tumor Analysis Consortium Initiative (U24CA210985), and a shared instrumentation grant (S10OD021844) to A.P. This study was supported by the Brain Research Program (Grant number: NRF-2017M3C7A1027472) and the Collaborative Genome Program for Fostering New Post-Genome Industry (NRF-2017M3C9A5031597) through the National Research Foundation (NRF) funded by the Ministry of Science and ICT (MSIT) of the Republic of Korea to M.S.K. The authors thank Snehal Raskar for assistance with cell culture. Figure 5. Schematic representation of alternative transcript expression. A) HNRNPA0 protein identified with an upstream alternate N-terminus in-frame with the annotated start site (bent arrow) is shown. B) Alternative splice donor in BIRC6 is supported by RNA-seq and a novel junctional peptide. The annotated MS/MS spectra supporting these findings are also shown. Known and novel transcript models are shown in brown and black, respectively. The track in red shows the sashimi plot with thick curves connecting the exon–exon boundaries. Amino acids that span the splicing junction are marked in red. Publisher Copyright: {\textcopyright} 2019 WILEY-VCH Verlag GmbH \& Co. KGaA, Weinheim",

year = "2019",

month = aug,

doi = "10.1002/pmic.201800315",

language = "English (US)",

volume = "19",

journal = "Proteomics",

issn = "1615-9853",

publisher = "Wiley-VCH Verlag",

number = "15",

}

TY - JOUR

T1 - Integrated Transcriptomic and Proteomic Analysis of Primary Human Umbilical Vein Endothelial Cells

AU - Madugundu, Anil K.

AU - Na, Chan Hyun

AU - Nirujogi, Raja Sekhar

AU - Renuse, Santosh

AU - Kim, Kwang Pyo

AU - Burns, Kathleen H.

AU - Wilks, Christopher

AU - Langmead, Ben

AU - Ellis, Shannon E.

AU - Collado-Torres, Leonardo

AU - Halushka, Marc K.

AU - Kim, Min Sik

AU - Pandey, Akhilesh

N1 - Funding Information: This work was supported by the Wellcome Trust/DBT India Alliance Margdarshi Fellowship (Grant number IA/M/15/1/502023) awarded to A.P. This study was supported by NIH grants (R01GM124531 to K.H.B.; R01CA184165 and P50NS038377 to A.P.), NCI’s Clinical Proteomic Tumor Analysis Consortium Initiative (U24CA210985), and a shared instrumentation grant (S10OD021844) to A.P. This study was supported by the Brain Research Program (Grant number: NRF-2017M3C7A1027472) and the Collaborative Genome Program for Fostering New Post-Genome Industry (NRF-2017M3C9A5031597) through the National Research Foundation (NRF) funded by the Ministry of Science and ICT (MSIT) of the Republic of Korea to M.S.K. The authors thank Snehal Raskar for assistance with cell culture. Figure 5. Schematic representation of alternative transcript expression. A) HNRNPA0 protein identified with an upstream alternate N-terminus in-frame with the annotated start site (bent arrow) is shown. B) Alternative splice donor in BIRC6 is supported by RNA-seq and a novel junctional peptide. The annotated MS/MS spectra supporting these findings are also shown. Known and novel transcript models are shown in brown and black, respectively. The track in red shows the sashimi plot with thick curves connecting the exon–exon boundaries. Amino acids that span the splicing junction are marked in red. Publisher Copyright: © 2019 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

PY - 2019/8

Y1 - 2019/8

N2 - Understanding the molecular profile of every human cell type is essential for understanding its role in normal physiology and disease. Technological advancements in DNA sequencing, mass spectrometry, and computational methods allow us to carry out multiomics analyses, although such approaches are not yet routine. Human umbilical vein endothelial cells (HUVECs) are a widely used model system to study pathological and physiological processes associated with the cardiovascular system. In this study, next-generation sequencing and high-resolution mass spectrometry are employed to profile the transcriptome and proteome of primary HUVECs. Analysis of 145 million paired-end reads from next-generation sequencing confirmed expression of 12 186 protein-coding genes (FPKM ≥0.1) and 439 novel long non-coding RNAs, and revealed 6089 novel isoforms that were not annotated in GENCODE. Proteomics analysis identifies 6477 proteins, including confirmation of N-termini for 1091 proteins, isoforms for 149 proteins, and 1034 phosphosites. A database search to specifically identify other post-translational modifications provides evidence for a number of modification sites on 117 proteins, which include ubiquitylation, lysine acetylation, and mono-, di-, and tri-methylation events. Evidence is provided for 11 “missing proteins,” which are proteins for which there was insufficient or no protein-level evidence. Peptides supporting missing proteins and novel events are validated by comparison of MS/MS fragmentation patterns with synthetic peptides. Finally, 245 variant peptides derived from 207 expressed proteins are identified, in addition to alternate translational start sites for seven proteins and evidence for novel proteoforms for five proteins resulting from alternative splicing. Overall, it is believed that the integrated approach employed in this study is widely applicable to any primary cell type for deeper molecular characterization.

AB - Understanding the molecular profile of every human cell type is essential for understanding its role in normal physiology and disease. Technological advancements in DNA sequencing, mass spectrometry, and computational methods allow us to carry out multiomics analyses, although such approaches are not yet routine. Human umbilical vein endothelial cells (HUVECs) are a widely used model system to study pathological and physiological processes associated with the cardiovascular system. In this study, next-generation sequencing and high-resolution mass spectrometry are employed to profile the transcriptome and proteome of primary HUVECs. Analysis of 145 million paired-end reads from next-generation sequencing confirmed expression of 12 186 protein-coding genes (FPKM ≥0.1) and 439 novel long non-coding RNAs, and revealed 6089 novel isoforms that were not annotated in GENCODE. Proteomics analysis identifies 6477 proteins, including confirmation of N-termini for 1091 proteins, isoforms for 149 proteins, and 1034 phosphosites. A database search to specifically identify other post-translational modifications provides evidence for a number of modification sites on 117 proteins, which include ubiquitylation, lysine acetylation, and mono-, di-, and tri-methylation events. Evidence is provided for 11 “missing proteins,” which are proteins for which there was insufficient or no protein-level evidence. Peptides supporting missing proteins and novel events are validated by comparison of MS/MS fragmentation patterns with synthetic peptides. Finally, 245 variant peptides derived from 207 expressed proteins are identified, in addition to alternate translational start sites for seven proteins and evidence for novel proteoforms for five proteins resulting from alternative splicing. Overall, it is believed that the integrated approach employed in this study is widely applicable to any primary cell type for deeper molecular characterization.

KW - RNA-seq

KW - allelic expression

KW - coding SNP

KW - mass-spectrometry

KW - proteoform

KW - proteogenomics

KW - splice variants

KW - transcriptome

UR - http://www.scopus.com/inward/record.url?scp=85068105583&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85068105583&partnerID=8YFLogxK

U2 - 10.1002/pmic.201800315

DO - 10.1002/pmic.201800315

M3 - Article

C2 - 30983154

AN - SCOPUS:85068105583

SN - 1615-9853

VL - 19

JO - Proteomics

JF - Proteomics

IS - 15

M1 - 1800315

ER -
