TY - JOUR
T1 - A proteogenomic analysis of Anopheles gambiae using high-resolution Fourier transform mass spectrometry
AU - Chaerkady, Raghothama
AU - Kelkar, Dhanashree S.
AU - Muthusamy, Babylakshmi
AU - Kandasamy, Kumaran
AU - Dwivedi, Sutopa B.
AU - Sahasrabuddhe, Nandini A.
AU - Kim, Min Sik
AU - Renuse, Santosh
AU - Pinto, Sneha M.
AU - Sharma, Rakesh
AU - Pawar, Harsh
AU - Sekhar, Nirujogi Raja
AU - Mohanty, Ajeet Kumar
AU - Getnet, Derese
AU - Yang, Yi
AU - Zhong, Jun
AU - Dash, Aditya P.
AU - MacCallum, Robert M.
AU - Delanghe, Bernard
AU - Mlambo, Godfree
AU - Kumar, Ashwani
AU - Prasad, T. S. Keshava
AU - Okulate, Mobolaji
AU - Kumar, Nirbhay
AU - Pandey, Akhilesh
PY - 2011/11
Y1 - 2011/11
AB - Anopheles gambiae is a major mosquito vector responsible for malaria transmission, whose genome sequence was reported in 2002. Genome annotation is a continuing effort, and many of the approximately 13,000 genes listed in VectorBase for Anopheles gambiae are predictions that have still not been validated by any other method. To identify protein-coding genes of An. gambiae based on its genomic sequence, we carried out a deep proteomic analysis using high-resolution Fourier transform mass spectrometry for both precursor and fragment ions. Based on peptide evidence, we were able to support or correct more than 6000 gene annotations including 80 novel gene structures and about 500 translational start sites. An additional validation by RT-PCR and cDNA sequencing was successfully performed for 105 selected genes. Our proteogenomic analysis led to the identification of 2682 genome search-specific peptides. Numerous cases of encoded proteins were documented in regions annotated as intergenic, introns, or untranslated regions. Using a database created to contain potential splice sites, we also identified 35 novel splice junctions. This is a first report to annotate the An. gambiae genome using high-accuracy mass spectrometry data as a complementary technology for genome annotation.
UR - http://www.scopus.com/inward/record.url?scp=80555142954&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80555142954&partnerID=8YFLogxK
U2 - 10.1101/gr.127951.111
DO - 10.1101/gr.127951.111
M3 - Article
C2 - 21795387
AN - SCOPUS:80555142954
SN - 1088-9051
VL - 21
SP - 1872
EP - 1881
JO - Genome Research
JF - Genome Research
IS - 11
ER -