PI: Steven E. Brenner
Gene regulation by alternative splicing and nonsense-mediated mRNA decay. Nonsense-mediated mRNA decay (NMD) is a cellular RNA surveillance system that recognizes transcripts with premature termination codons and degrades them. We discovered large numbers of natural alternative splice forms that appear to be targets for NMD, and we speculated that this might be a mode of gene regulation, which we termed RUST (regulated unproductive splicing and translation). All conserved members of the SR family of splice regulators have an unproductive alternative mRNA isoform targeted for NMD [1]. Strikingly, the splice pattern for each is conserved in mouse and is always associated with an ultraconserved or highly conserved region of perfect identity between human and mouse. Remarkably, this seems to have evolved independently in every one of the genes, suggesting that this is a natural mode of regulation. We are using RNA-Seq to explore the pervasiveness of NMD in numerous species [2] and to understand its behavior. As part of the modENCODE consortium, we discovered the repertoire of targets for alternative splicing in the fly, as well as unexpected relationships between the development of fly and worm [3, 4]. We are now detailing the regulators in the SR family and exploring the evolution of this gene-expression regulation mechanism.
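As a rough illustration of what "premature termination codon" means computationally, the sketch below flags a spliced isoform as a candidate NMD target when its first in-frame stop codon lies more than 50 nucleotides upstream of the last exon-exon junction. The 50-nucleotide threshold is a commonly used heuristic for mammalian transcripts and is an assumption here, not something stated above; the function names, the coordinate convention, and the toy sequence are all hypothetical. It is not the pipeline used in the cited studies.

```python
from typing import List, Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(mrna: str, cds_start: int) -> Optional[int]:
    """Return the index of the first base of the first in-frame stop codon, or None."""
    for i in range(cds_start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i
    return None


def is_candidate_nmd_target(
    mrna: str,
    cds_start: int,
    junctions: List[int],
    min_distance: int = 50,  # assumed heuristic threshold, in nucleotides
) -> bool:
    """Flag an isoform whose stop codon sits far upstream of the last junction.

    `junctions` holds, for each exon-exon junction, the index (in the spliced
    mRNA) of the first base of the downstream exon. The distance is measured
    from the end of the stop codon to the last junction, a simplification.
    """
    stop = first_stop(mrna.upper(), cds_start)
    if stop is None or not junctions:
        return False
    last_junction = max(junctions)
    return last_junction - (stop + 3) > min_distance


# Toy isoform: start codon, three codons, a stop codon, then 75 more bases.
toy_mrna = "ATG" + "GCC" * 3 + "TGA" + "A" * 60 + "GGG" * 5

# Last junction 60 nt downstream of the stop codon: flagged as a candidate.
flagged = is_candidate_nmd_target(toy_mrna, cds_start=0, junctions=[75])
# Last junction only 15 nt downstream of the stop codon: not flagged.
not_flagged = is_candidate_nmd_target(toy_mrna, cds_start=0, junctions=[30])
print(flagged, not_flagged)
```

A stop codon in the final exon has every junction upstream of it, so the computed distance is negative and the isoform is not flagged, which matches the behavior expected of a normal productive transcript under this heuristic.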
Prediction of protein function using Bayesian phylogenomics. We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Unfortunately, these predictions have littered the databases with erroneous information, for a variety of reasons including the propagation of errors and systematic flaws in BLAST and related methods. In collaboration with Michael Jordan's group, we have developed a statistical approach to predicting protein function that uses a protein family's phylogenetic tree as the natural structure for representing protein relationships. We overlay on this tree all known protein functions in the family, and then use a model of function evolution to infer the functions of all other proteins in the family. Even our initial implementations of this method, called SIFTER (statistical inference of function through evolutionary relationships), have performed better than other methods in widespread use [5]. We are presently making numerous improvements to the underlying SIFTER algorithm and enhancing its ability to work on a wide range of data and to incorporate more experimental association data. SIFTER was honored as a top-performing method in the Critical Assessment of Function Annotation [9]. We are collaborating with the ENIGMA project at LBNL to improve annotation on a large scale. In collaboration with Jack Kirsch, we are also experimentally validating the function predictions, with a focus on the Nudix family. We are also involved in maintaining SCOP (Structural Classification of Proteins), a key resource for understanding protein structure data. We therefore analyze structural genomics efforts and guide their future directions [8]. Using kernel methods and selected features, we are building systems to recognize ancient protein evolutionary relationships.
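The tree-based inference described above for SIFTER can be illustrated with a toy sketch: known functions are placed on the leaves of a small phylogeny, a simple model of function change along branches is assumed, and the posterior over functions for an unannotated protein is computed by Felsenstein-style pruning. This is not the SIFTER model or code. The symmetric switching model with a single rate `mu`, the uniform root prior, the tree, and the protein and function names are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (branch lengths ignored).
Tree = Union[str, Tuple["Tree", "Tree"]]


def leaves(tree: Tree) -> List[str]:
    """List the leaf names of a tree."""
    if isinstance(tree, str):
        return [tree]
    return [name for subtree in tree for name in leaves(subtree)]


def transition(parent: int, child: int, k: int, mu: float) -> float:
    """P(child function | parent function) on one branch (assumed toy model)."""
    return 1.0 - mu if parent == child else mu / (k - 1)


def partial_likelihood(tree: Tree, evidence: Dict[str, List[float]],
                       k: int, mu: float) -> List[float]:
    """Pruning step: P(leaf evidence below this node | node's function)."""
    if isinstance(tree, str):
        return evidence[tree]
    result = [1.0] * k
    for subtree in tree:
        below = partial_likelihood(subtree, evidence, k, mu)
        for s in range(k):
            result[s] *= sum(transition(s, c, k, mu) * below[c] for c in range(k))
    return result


def predict(tree: Tree, known: Dict[str, str], target: str,
            functions: List[str], mu: float = 0.1) -> Dict[str, float]:
    """Posterior over functions for `target`, given annotated leaves."""
    k = len(functions)

    def one_hot(i: int) -> List[float]:
        return [1.0 if j == i else 0.0 for j in range(k)]

    scores = []
    for candidate in range(k):
        evidence = {}
        for leaf in leaves(tree):
            if leaf == target:
                evidence[leaf] = one_hot(candidate)   # hypothesize a function
            elif leaf in known:
                evidence[leaf] = one_hot(functions.index(known[leaf]))
            else:
                evidence[leaf] = [1.0] * k            # unannotated: no evidence
        root = partial_likelihood(tree, evidence, k, mu)
        scores.append(sum(root) / k)                  # uniform prior at the root
    total = sum(scores)
    return {f: s / total for f, s in zip(functions, scores)}


# Hypothetical family: two annotated hydrolases, one kinase, one query protein.
family_tree = (("protA", "protB"), ("protC", "query"))
annotations = {"protA": "hydrolase", "protB": "hydrolase", "protC": "kinase"}
posterior = predict(family_tree, annotations, "query", ["hydrolase", "kinase"])
print(posterior)
```

In this toy family the query's closest relative is the kinase, so the posterior favors "kinase" even though hydrolases are the majority annotation. That is the kind of distinction a tree-aware method can make and a simple vote over database hits cannot.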
Personal genomics. We have a longstanding interest in personal genome interpretation, including developing a genome commons [6], understanding the basis of Mendelian diseases from sequenced genomes [10], and organizing the Critical Assessment of Genome Interpretation (CAGI) project [7].
Recent selected publications
1. Lareau LF, Inada M, Green RE, Wengrod JC, Brenner SE. 2007. Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements. Nature 446:926-929.
2. Hansen KD, Lareau LF, Blanchette M, Green RE, Meng Q, Rehwinkel J, Gallusser FL, Izaurralde E, Rio DC, Dudoit S, Brenner SE. 2009. Genome-wide identification of alternative splice forms down-regulated by nonsense-mediated mRNA decay in Drosophila. PLoS Genetics 5:e1000525. doi:10.1371/journal.pgen.1000525
3. Graveley BR, Brooks AN, Carlson JW, Duff MO, Landolin J, Yang L, et al. 2011. The developmental transcriptome of Drosophila melanogaster. Nature. published online. doi:10.1038/nature09715
4. Brooks AN, Yang L, Duff MO, Hansen KD, Dudoit S, Brenner SE, Graveley BR. 2011. Conservation of an RNA regulatory map between Drosophila and mammals. Genome Research 21:193-202. doi:10.1101/gr.108662.110
5. Engelhardt BE, Jordan MI, Muratore KE, Brenner SE. 2005. Protein molecular function prediction by Bayesian phylogenomics. PLoS Comput Biol 1:432-445. doi:10.1371/journal.pcbi.0010045
6. Brenner SE. 2007. Common sense for our genomes. Nature 449:783-784. doi:10.1038/449783a
7. Callaway E. 2010. Mutation-prediction software rewarded. Nature. published online. doi:10.1038/news.2010.679
8. Chandonia JM, Brenner SE. 2006. The impact of structural genomics: expectations and outcomes. Science 311:347-351. doi:10.1126/science.1121018
9. Radivojac P, et al. 2013. A large-scale evaluation of computational protein function prediction. Nature Methods. published online. doi:10.1038/nmeth.2340
10. Mallott J, et al. 2012. Newborn screening for SCID identifies patients with ataxia telangiectasia. Journal of Clinical Immunology. published online. doi:10.1007/s10875-012-9846-1