Archive for January, 2010

A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea

January 2nd, 2010

There are now nearly 1,000 completed bacterial and archaeal genomes available (see my recent editorial for New Phytologist). However, we have only scratched the surface of the genomic diversity among these microbes, and most species were chosen for sequencing on the basis of their physiology and ability to grow in the lab. The current genome portfolio is therefore limited by a highly biased phylogenetic distribution. This will change quickly thanks to coordinated massive sequencing efforts. Sequencing of cultured micro-organisms is an appropriate place to begin, since not only is their DNA available, but they are also accompanied by (meta)data on their environment and physiology that can be used to understand the resulting genomic data. As single-cell isolation methods improve, there should be a shift toward incorporating uncultured organisms and communities into these coordinated sequencing efforts. Within the framework of the JGI Genomic Encyclopedia of Bacteria and Archaea (GEBA) programme, Jonathan Eisen’s group and his JGI and DSMZ collaborators are sequencing and analysing the genomes of more than a hundred species of cultured Bacteria and Archaea selected throughout the Tree of Life (TOL) to maximize phylogenetic coverage, i.e. they identified clades in the bacterial and archaeal portions of the TOL where no genome sequences were available. From hundreds of candidates, 200 culturable type strains were selected both to obtain broad coverage across Bacteria and Archaea and to perform in-depth sampling of a single phylum.

The genomes of 159 species are being sequenced, assembled, annotated and finished, and relevant data are being released through a dedicated Integrated Microbial Genomes database portal. In the December 24th issue of Nature, Jonathan and his colleagues discuss the results obtained from the first 56 genomes for which the shotgun phase of sequencing was completed.

Analysis of these phylogenetically selected genomes showed clear benefits in diverse areas, such as the reconstruction of phylogenetic history, the discovery of new protein families and biological properties, and the prediction of functions for known genes from other organisms. These results highlight the utility of tree-based genome selection as a means to maximize the identification of novel protein families and argue against lateral gene transfer significantly redistributing genetic novelty between distantly related lineages. The GEBA genomes also show significant phylogenetic expansions within known protein families (e.g. glycoside hydrolases), transposable elements and non-coding RNAs. Unexpectedly, a bacterial actin-related protein (BARP) gene was found within the genome of the marine Deltaproteobacterium Haliangium ochraceum. Based on the phylogenetic diversity uncovered by the 56 GEBA genomes, the authors estimated that sequencing only 1,520 phylogenetically selected isolates would encompass half of the phylogenetic diversity represented by known cultured bacteria and archaea; an additional 9,218 genome sequences from currently uncultured species would be required to capture 50% of the diversity of uncultured bacteria and archaea.

After reading this first account of the ongoing large-scale sequencing of Bacteria and Archaea, we may dare to ask: are there any benefits to this “phylogeny driven” approach to sequencing genomes, compared with sequencing just any random genome as has been done so far? The cornucopia of novel results discussed above makes it clear that there are in fact many benefits to sequencing genomes from branches of the TOL for which no genomes are available. A similar approach is now being applied to fungal genomes within the Genome Encyclopedia of Fungi programme, a project spearheaded by the US Department of Energy’s Joint Genome Institute (JGI) in Walnut Creek, California, which aims to sequence the genomes of 500 or so fungi.

In a recent blog post, Jonathan Eisen tells the story behind this fascinating and challenging GEBA project.

There is also an excellent report on this work in the New York Times: Scientists Start a Genomic Catalog of Earth’s Abundant Microbes

Wu, D., Hugenholtz, P., Mavromatis, K., Pukall, R., Dalin, E., Ivanova, N., Kunin, V., Goodwin, L., Wu, M., Tindall, B., Hooper, S., Pati, A., Lykidis, A., Spring, S., Anderson, I., D’haeseleer, P., Zemla, A., Singer, M., Lapidus, A., Nolan, M., Copeland, A., Han, C., Chen, F., Cheng, J., Lucas, S., Kerfeld, C., Lang, E., Gronow, S., Chain, P., Bruce, D., Rubin, E., Kyrpides, N., Klenk, H., & Eisen, J. (2009). A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea. Nature, 462 (7276), 1056-1060. DOI: 10.1038/nature08656

A New Bench Top Sequencer

January 2nd, 2010

To address the growing demand for next-generation sequencing data in most current biological research projects, 454 Life Sciences has unveiled its new ‘low cost’ sequencer, the GS Junior System, an affordable bench top sequencing platform slated for release in 2010. The platform will launch with long-read GS Junior Titanium chemistry, offering 100,000 reads of 400-500 bp (35 megabases) in 10 hours. The GS Junior will cost about a fourth to a fifth of the price of the GS FLX, which has a list price on the order of $500,000. Let’s hope that this new ‘bench top’ sequencer will bring genome resequencing, transcriptomics and metagenomics to small teams.

Winter Melancholy

January 2nd, 2010

Time will heal all ills
