Serial Analysis of Gene Expression (SAGE) and Massively Parallel Signature Sequencing (MPSS) are powerful techniques for gene expression analysis. A crucial step in analyzing SAGE and MPSS data is the assignment of each experimentally obtained tag to a known transcript. However, tag-to-transcript assignment is not straightforward, because alternative tags for a given transcript can also be obtained experimentally. Here, we evaluated the impact of Single Nucleotide Polymorphisms (SNPs) on the generation of alternative SAGE and MPSS tags. To do so, we constructed a reference database of SNP-associated alternative tags, which has been integrated with SAGE Genie. A total of 2020 SNP-associated alternative tags were catalogued in the reference database, and at least one SNP-associated alternative tag was observed for ∼8.6% of all known human genes. A substantial fraction (61.9%) of these alternative tags matched a list of experimentally obtained tags, supporting their existence. In addition, the origin of four out of five SNP-associated alternative MPSS tags was experimentally confirmed using the GLGI-MPSS protocol (GLGI: Generation of Long cDNA fragments for Gene Identification). The availability of our SNP-associated alternative tag database should improve the interpretation of SAGE and MPSS experiments.