TY - JOUR
T1 - Detecting patterns of protein distribution and gene expression in silico
AU - Geraghty, Michael T.
AU - Bassett, Doug
AU - Morrell, James C.
AU - Gatto, Gregory J.
AU - Bai, Jianwu
AU - Geisbrecht, Brian V.
AU - Hieter, Phil
AU - Gould, Stephen J.
PY - 1999/3/16
Y1 - 1999/3/16
N2 - Most biological information is contained within gene and genome sequences. However, current methods for analyzing these data are limited primarily to the prediction of coding regions and identification of sequence similarities. We have developed a computer algorithm, CoSMoS (for context sensitive motif searches), which adds context sensitivity to sequence motif searches. CoSMoS was challenged to identify genes encoding peroxisome-associated and oleate-induced genes in the yeast Saccharomyces cerevisiae. Specifically, we searched for genes capable of encoding proteins with a type 1 or type 2 peroxisomal targeting signal and for genes containing the oleate-response element, a cis-acting element common to fatty acid-regulated genes. CoSMoS successfully identified 7 of 8 known PTS-containing peroxisomal proteins and 13 of 14 known oleate-regulated genes. More importantly, CoSMoS identified an additional 18 candidate peroxisomal proteins and 300 candidate oleate-regulated genes. Preliminary localization studies suggest that these include at least 10 previously unknown peroxisomal proteins. Phenotypic studies of selected gene disruption mutants suggest that several of these new peroxisomal proteins play roles in growth on fatty acids, one is involved in peroxisome biogenesis and at least two are required for synthesis of lysine, a heretofore unrecognized role for peroxisomes. These results expand our understanding of peroxisome content and function, demonstrate the utility of CoSMoS for context-sensitive motif scanning, and point to the benefits of improved in silico genome analysis.
AB - Most biological information is contained within gene and genome sequences. However, current methods for analyzing these data are limited primarily to the prediction of coding regions and identification of sequence similarities. We have developed a computer algorithm, CoSMoS (for context sensitive motif searches), which adds context sensitivity to sequence motif searches. CoSMoS was challenged to identify genes encoding peroxisome-associated and oleate-induced genes in the yeast Saccharomyces cerevisiae. Specifically, we searched for genes capable of encoding proteins with a type 1 or type 2 peroxisomal targeting signal and for genes containing the oleate-response element, a cis-acting element common to fatty acid-regulated genes. CoSMoS successfully identified 7 of 8 known PTS-containing peroxisomal proteins and 13 of 14 known oleate-regulated genes. More importantly, CoSMoS identified an additional 18 candidate peroxisomal proteins and 300 candidate oleate-regulated genes. Preliminary localization studies suggest that these include at least 10 previously unknown peroxisomal proteins. Phenotypic studies of selected gene disruption mutants suggest that several of these new peroxisomal proteins play roles in growth on fatty acids, one is involved in peroxisome biogenesis and at least two are required for synthesis of lysine, a heretofore unrecognized role for peroxisomes. These results expand our understanding of peroxisome content and function, demonstrate the utility of CoSMoS for context-sensitive motif scanning, and point to the benefits of improved in silico genome analysis.
KW - Green fluorescent protein
KW - Lysine synthesis
KW - Membrane proteins
KW - Peroxisome biogenesis
KW - Peroxisomes
UR - http://www.scopus.com/inward/record.url?scp=0033020728&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0033020728&partnerID=8YFLogxK
U2 - 10.1073/pnas.96.6.2937
DO - 10.1073/pnas.96.6.2937
M3 - Article
C2 - 10077615
AN - SCOPUS:0033020728
SN - 0027-8424
VL - 96
SP - 2937
EP - 2942
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 6
ER -