A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea

January 2nd, 2010 by Francis Martin

There are now nearly 1,000 completed bacterial and archaeal genomes available (see my recent editorial for New Phytologist). However, we have only scratched the surface of the genomic diversity among these microbes, and most species were chosen for sequencing on the basis of their physiology and ability to grow in the lab. The current genome portfolio is therefore limited by a highly biased phylogenetic distribution. This will change quickly thanks to coordinated massive sequencing efforts. Sequencing of cultured micro-organisms is an appropriate place to begin, since not only is their DNA available, but they are also accompanied by (meta)data on their environment and physiology that can be used to understand the resulting genomic data. As single-cell isolation methods improve, there should be a shift toward incorporating uncultured organisms and communities into these coordinated sequencing efforts. Within the framework of the JGI Genomic Encyclopedia of Bacteria and Archaea (GEBA) programme, Jonathan Eisen’s group and his JGI and DSMZ collaborators are sequencing and analysing the genomes of a hundred species of cultured Bacteria and Archaea selected throughout the Tree of Life (TOL) to maximize phylogenetic coverage, i.e. they identified clades in the bacterial and archaeal portions of the TOL for which no genome sequences were available. From hundreds of candidates, 200 culturable type strains were selected both to obtain broad coverage across Bacteria and Archaea and to perform in-depth sampling of a single phylum.

The genomes of 159 species are being sequenced, assembled, annotated and finished, and relevant data are being released through a dedicated Integrated Microbial Genomes database portal. In the December 24th issue of Nature, Jonathan and his colleagues discuss the results obtained from the first 56 genomes for which the shotgun phase of sequencing was completed.

Analysis of these phylogenetically selected genomes showed clear benefits in diverse areas, such as the reconstruction of phylogenetic history, the discovery of new protein families and biological properties, and the prediction of functions for known genes from other organisms. These results highlight the utility of tree-based genome selection as a means to maximize the identification of novel protein families and argue against lateral gene transfer significantly redistributing genetic novelty between distantly related lineages. The GEBA genomes also show significant phylogenetic expansions within known protein families (e.g. glycoside hydrolases), transposable elements and non-coding RNAs. Unexpectedly, a bacterial actin-related protein (BARP) gene was found within the genome of the marine deltaproteobacterium Haliangium ochraceum. Based on the phylogenetic diversity revealed by the 56 GEBA genomes, the authors estimated that sequencing only 1,520 phylogenetically selected isolates would encompass half of the phylogenetic diversity represented by known cultured bacteria and archaea; an additional 9,218 genome sequences from currently uncultured species would be required to capture 50% of the diversity of uncultured bacteria and archaea.
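To make the idea of tree-based genome selection concrete, here is a minimal Python sketch of one common way to formalise it: greedily picking the candidates that add the most new branch length (phylogenetic diversity) to the set already sequenced. This is an illustration only, not the procedure used by the GEBA authors; the toy tree, its branch lengths and all function names are invented for the example.

```python
# Toy rooted tree: node -> (parent, length of the branch leading to the parent).
# Names and branch lengths are hypothetical, chosen only for illustration.
TREE = {
    "root": (None, 0.0),
    "A": ("root", 1.0),
    "B": ("root", 1.0),
    "a1": ("A", 0.1),
    "a2": ("A", 0.2),
    "a3": ("A", 0.1),
    "b1": ("B", 0.9),
    "b2": ("B", 0.8),
}
LEAVES = ["a1", "a2", "a3", "b1", "b2"]


def path_to_root(leaf):
    """Return the nodes whose parent branches lie between the leaf and the root."""
    nodes = set()
    node = leaf
    while TREE[node][0] is not None:
        nodes.add(node)
        node = TREE[node][0]
    return nodes


def phylogenetic_diversity(leaves):
    """Total branch length covered by the union of the leaf-to-root paths."""
    covered = set()
    for leaf in leaves:
        covered |= path_to_root(leaf)
    return sum(TREE[node][1] for node in covered)


def greedy_select(candidates, already_sequenced, k):
    """Pick up to k candidates, each time taking the one that adds most diversity."""
    current = list(already_sequenced)
    chosen = []
    for _ in range(k):
        remaining = [c for c in candidates if c not in current]
        if not remaining:
            break
        best = max(remaining, key=lambda c: phylogenetic_diversity(current + [c]))
        chosen.append(best)
        current.append(best)
    return chosen


if __name__ == "__main__":
    # One genome from clade A is already available; choose two more.
    picks = greedy_select(LEAVES, ["a1"], 2)
    print(picks, phylogenetic_diversity(["a1"] + picks))
```

In this toy tree the greedy choice goes straight to the unsampled clade B, whereas picking more close relatives of a1 would add very little branch length. That is the intuition behind selecting genomes from poorly sampled branches.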

After reading this first account of the ongoing large-scale sequencing of Bacteria and Archaea, we may dare to ask: are there any benefits to this “phylogeny-driven” approach to sequencing genomes compared with sequencing just any random genome, as has been done so far? It is clear from the cornucopia of novel results discussed above that there are in fact many benefits to sequencing genomes from branches of the TOL for which no genomes are available. A similar approach is now being applied to fungal genomes within the Genome Encyclopedia of Fungi programme, a project spearheaded by the US Department of Energy’s Joint Genome Institute (JGI) in Walnut Creek, California, which aims to sequence the genomes of 500 or so fungi.

In a recent blog post, Jonathan Eisen tells the story behind this fascinating and challenging GEBA project.

There is also an excellent report in the New York Times on this work: Scientists Start a Genomic Catalog of Earth’s Abundant Microbes

Wu, D., Hugenholtz, P., Mavromatis, K., Pukall, R., Dalin, E., Ivanova, N., Kunin, V., Goodwin, L., Wu, M., Tindall, B., Hooper, S., Pati, A., Lykidis, A., Spring, S., Anderson, I., D’haeseleer, P., Zemla, A., Singer, M., Lapidus, A., Nolan, M., Copeland, A., Han, C., Chen, F., Cheng, J., Lucas, S., Kerfeld, C., Lang, E., Gronow, S., Chain, P., Bruce, D., Rubin, E., Kyrpides, N., Klenk, H., & Eisen, J. (2009). A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea. Nature, 462 (7276), 1056-1060. DOI: 10.1038/nature08656

A New Bench Top Sequencer

January 2nd, 2010 by Francis Martin

To address the growing demand for next-generation sequencing data in most current biological research projects, 454 Life Sciences has unveiled its new ‘low cost’ sequencer, the GS Junior System, an affordable bench top sequencing platform slated for release in 2010. The platform will launch with long-read GS Junior Titanium chemistry, offering 100,000 reads of 400 to 500 bp (35 megabases) in 10 hours. The GS Junior will cost about a fourth to a fifth of the price of the GS FLX, which has a list price on the order of $500,000. Let’s hope that this new ‘bench top’ sequencer will bring genome resequencing, transcriptomics and metagenomics to small teams.
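As a rough check on these figures, the short Python sketch below works through the arithmetic using only the numbers quoted above. The variable names are illustrative, and the price range is simply a fourth to a fifth of the quoted GS FLX list price, not an official quote.

```python
# Figures quoted in the post.
reads_per_run = 100_000
read_length_bp = (400, 500)      # quoted read length range
quoted_yield_mb = 35             # quoted yield per run, in megabases
run_hours = 10
flx_list_price_usd = 500_000     # "on the order of" this value

# Upper bound on yield if every read reached the quoted lengths.
raw_yield_mb = tuple(reads_per_run * length / 1e6 for length in read_length_bp)

# Mean read length implied by the quoted 35 Mb yield.
implied_mean_read_bp = quoted_yield_mb * 1e6 / reads_per_run

# Throughput per hour and the implied price range (a fifth to a fourth).
yield_per_hour_mb = quoted_yield_mb / run_hours
junior_price_usd = (flx_list_price_usd / 5, flx_list_price_usd / 4)

print(f"Read count x read length: {raw_yield_mb[0]:.0f} to {raw_yield_mb[1]:.0f} Mb")
print(f"Implied mean read length for 35 Mb: {implied_mean_read_bp:.0f} bp")
print(f"Throughput: {yield_per_hour_mb:.1f} Mb per hour")
print(f"Implied price: ${junior_price_usd[0]:,.0f} to ${junior_price_usd[1]:,.0f}")
```

Read count times read length gives 40 to 50 Mb, which is higher than the quoted 35 Mb. The announcement as summarised here does not explain the difference, so the sketch simply reports both figures.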

Winter Melancholy

January 2nd, 2010 by Francis Martin

Time will heal all ills

In Vino Veritas

December 25th, 2009 by Francis Martin

Christmas is a great time to try new wines with some of your favorite recipes. For Christmas Eve dinner, I enjoyed an excellent Saint Emilion Grand Cru, a Pontet Fumet 2004, a concentrated, hearty wine consisting of Merlot Noir, Cabernet Franc and Cabernet Sauvignon. Philippe Bardet, the winegrower of this family estate, has done a great job in producing a wine with a dense, garnet colour and a powerful, complex nose revealing very ripe dark and red berry aromas and an elegant oakiness. But this delicacy is also the outcome of the entangled metabolic activities of a cortege of yeasts, including the well-known Saccharomyces cerevisiae. This yeast has been used for millennia in winemaking, but did you know that natural genetic engineering plays a key role in the evolution of this microbial winemaker? In a recent study published in PNAS, Novo et al. analyzed the selective forces acting on the wine yeast genome.

They sequenced the complete genome of the diploid commercial wine yeast EC1118 using a Sanger/454 pyrosequencing hybrid approach, resulting in an assembly covering 97% of the S. cerevisiae S288c reference genome. The wine yeast EC1118 differed strikingly from the other S. cerevisiae isolates in possessing several unique large regions encompassing 34 genes involved in key wine fermentation functions. Phylogeny and synteny analyses suggest that one of these genomic regions originated from a species closely related to the Saccharomyces genus, whereas another region was acquired by a eukaryote-to-eukaryote transfer event from Zygosaccharomyces bailii, a major contaminant of wine fermentations. These data suggest that constant remodeling of fungal genomes, through the contribution of exogenous genes, may be favored by ecological proximity. These processes led to the molecular adaptation of wine yeasts to the conditions of high sugar, low nitrogen, and high ethanol concentrations found in the must.

This study and others published earlier this year (Liti et al., 2009; Schacherer et al., 2009) shed light on the population genomics of domestic and wild yeasts. In their survey of seventy domestic and wild yeasts (S. cerevisiae, S. paradoxus), Liti et al. revealed extensive differences in genomic variation (gene content, SNPs, indels, copy numbers and transposable elements) and in phenotypic variation despite ecological similarities. Their results could be interpreted in two ways. One is the domestication of one or two groups of yeasts, the Wine/European and Sake strains, with selection for improved fermentation properties. These domesticated groups then gave rise to feral and clinical derivatives and were involved in the generation of out-crossed derivatives found in all sources. The alternative interpretation is that human activity may simply have used existing strains from populations that had appropriate fermentation properties, providing the opportunity to out-breed through the movement of strains and supplying a novel, disturbed environment.

Liti et al. (2009) Population genomics of domestic and wild yeasts. Nature 458: 337-341.

Novo M, Bigey F, Beyne E, Galeote V, Gavory F, Mallet S, Cambon B, Legras J-L, Wincker P, Casaregola S, Dequin S. (2009) Eukaryote-to-eukaryote gene transfer events revealed by the genome sequence of the wine yeast Saccharomyces cerevisiae EC1118. Proc Natl Acad Sci U S A 106: 16333-16338.

Postdoctoral Position in Microbial Metagenomics

December 24th, 2009 by Francis Martin

POST DOC POSITION

Microbial Metagenomics in Forest Soils

The Martin lab (INRA Nancy) is hiring one motivated postdoctoral scientist to work on the computational analysis of metagenomics data, with a particular emphasis on forest-associated microbial communities.

Bacteria and fungi play fundamental roles in the ecology of forest ecosystems. A powerful new tool in microbial ecology is metagenomics, in which next-generation sequencing methods are applied to DNA or RNA isolated directly from environmental samples. Metagenomics involves sampling and sequencing the genomes of a community of organisms that inhabit a common environment, such as the ocean, the soil or the human gut. Metagenomics provides an unbiased picture of the community structure (species richness and distribution) and its functional potential. It is rapidly moving from being a descriptive tool to an experimental one, as comparisons are now being made between metagenomes subjected to environmental perturbations.

We are seeking a post-doctoral bioinformatician to work on methodology for analysis of metagenomic data as part of a new series of projects aiming to describe the spatiotemporal dynamics of bacteria and fungi in forest soils. The proposed research projects focus on bioinformatics tool development for comparative metagenomics of next-generation sequencing data [Buée et al. (2009) New Phytologist, Uroz et al. (2009) Environmental Microbiology, Martin & Martin (2010) New Phytologist] as well as further development and/or application of multivariate analysis methods towards forest ecosystems.

The Tree-Microbe Interactions Department is an exciting, interdisciplinary research group with excellent facilities and research groups in fungal genomics, bacteriology, microbial ecology and molecular biology. The Martin lab is located on the campus of INRA in Nancy, a vibrant, international city in Northeastern France.

Qualifications

We are looking for people with a demonstrated interest in working at the interface between bioinformatics, microbial ecology, and fungal biology.

Applicants should have a PhD in microbial genomics or a computational field. Applicants should have substantial experience with database programming (e.g. SQL), scripting (e.g. Perl or Python), and bioinformatics tools.

Term: Appointments will last 1 year beginning in February-March 2010.

Interested candidates are encouraged to send their CV, along with a letter stating their interest and contact details of two references to Francis Martin (fmartin@nancy.inra.fr). Informal enquiries can also be addressed to the same email address.

Applications close: 31 January 2010

IMC9: The Biology of Fungi

December 24th, 2009 by Francis Martin

With the surge in fungal genome releases, studying the biology of the Mycota has never been as exciting as it is today. The International Mycological Congress is one of the largest scientific forums providing an up-to-date perspective on mycology in all its guises. The 9th International Mycological Congress (IMC9: the Biology of Fungi) will be held in August 2010 in Edinburgh, Scotland. The conference themes will include:

  • Cell biology, biochemistry and physiology
  • Environment, ecology and interactions
  • Evolution, biodiversity and systematics
  • Fungal pathogenesis and disease control
  • Genomics, genetics and molecular biology

Register at: http://www.imc9.info/index.htm.

Metagenomics of windshield splatters

November 30th, 2009 by Francis Martin

Metagenomics involves sampling and sequencing the genomes of a community of organisms inhabiting a common environment, such as the ocean, the soil, or the human gut. In their paper “Windshield splatter analysis with the Galaxy metagenomic pipeline”, Kosakovsky Pond et al. applied metagenomic methodologies to directly determine the taxonomic composition of bugs collected by the front bumper and windshield of a 2006 Dodge Caravan. As every driver knows, the windshield of a moving vehicle is subjected to numerous insect strikes, and this study shows that it can be used as a collection device for representative sampling of eukaryotes from our surrounding environment. The 454 reads were analyzed using the Galaxy platform (http://galaxyproject.org), a comprehensive pipeline for phylogenetic profiling of metagenomic samples that includes all steps from processing and quality control of data generated by next-generation sequencing technologies to statistical analyses and data visualization. The number of sequencing reads was used as a proxy for the relative abundance of taxa, and therefore to contrast biodiversity estimates between geographic locations. Most reads from two trips in the northeastern United States map to bacterial species. Aside from green plants and a few animal roadkills, two insect groups, Diptera and Hemiptera, represented the majority of eukaryotic reads in both samples. This study demonstrates that 454 sequencing technology can identify eukaryotic taxa from random reads generated from environmental samples and that it is possible to compare species abundance between geographic locations.
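The idea of using read counts as a proxy for relative abundance can be sketched in a few lines of Python. This is a minimal illustration and not the Galaxy pipeline itself; the function name and the read assignments for the two samples are made up for the example.

```python
from collections import Counter


def relative_abundance(read_assignments):
    """Turn a list of per-read taxon labels into fractions of the total reads."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical taxon assignments, one label per read, for two sampling trips.
trip_a = ["Bacteria"] * 6 + ["Diptera"] * 3 + ["Hemiptera"] * 1
trip_b = ["Bacteria"] * 5 + ["Diptera"] * 1 + ["Hemiptera"] * 4

abundance_a = relative_abundance(trip_a)
abundance_b = relative_abundance(trip_b)

# Contrast the two trips taxon by taxon.
for taxon in sorted(set(abundance_a) | set(abundance_b)):
    a = abundance_a.get(taxon, 0.0)
    b = abundance_b.get(taxon, 0.0)
    print(f"{taxon:10s} trip A: {a:.2f}  trip B: {b:.2f}  difference: {b - a:+.2f}")
```

Normalising by the total number of reads per sample is what makes the two trips comparable even if they yielded different numbers of reads. It remains a proxy: read counts also depend on genome size, DNA extraction and how well each taxon is represented in the reference databases.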

Kosakovsky Pond S, Wadhawan S, Chiaromonte F, Ananda G, Wen-Yu Chung, Taylor J, Nekrutenko A and The Galaxy Team (2009) Windshield splatter analysis with the Galaxy metagenomic pipeline. Genome Research, http://www.genome.org/cgi/doi/10.1101/gr.094508.109.

Rose de Pré

November 8th, 2009 by Francis Martin

The JGI has announced the pre-release of the Agaricus bisporus var. bisporus (H97) v1.0 assembly. The genome size is 30.2 Mbp with an average coverage of 8.5x. Based on ab initio prediction, protein alignment and ESTs, ~10,000 genes have been predicted by automated annotation. The annotation may be browsed, searched, downloaded and expanded with your curations at the Agaricus bisporus var. bisporus (H97) portal: genome.jgi-psf.org/Agabi1. Because this is a pre-release, only approved users will be able to access this portal.
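For readers less familiar with these assembly statistics, the small Python sketch below shows what the quoted figures imply, using the usual definition of average coverage as total sequenced bases divided by genome size. The variable names are illustrative, and the gene count is the approximate ~10,000 quoted above.

```python
genome_size_mbp = 30.2      # assembly size quoted above
average_coverage = 8.5      # average sequencing depth (8.5x)
predicted_genes = 10_000    # approximate count from automated annotation

# coverage = total sequenced bases / genome size, so:
total_sequenced_mbp = genome_size_mbp * average_coverage

# Rough gene density and average spacing between gene starts.
genes_per_mbp = predicted_genes / genome_size_mbp
kbp_per_gene = genome_size_mbp * 1000 / predicted_genes

print(f"Total sequence generated: about {total_sequenced_mbp:.0f} Mbp")
print(f"Gene density: about {genes_per_mbp:.0f} genes per Mbp")
print(f"One predicted gene every {kbp_per_gene:.1f} kbp on average")
```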

Cartoon: © www.mssf.org

Future Microbiome Projects

October 25th, 2009 by Francis Martin

A significant part of JGI FY 2010 funding will be directed to projects of interest to our lab. These include:

• A Great Prairie soil metagenome project that Jim Tiedje of Michigan State University is leading with Janet Jansson of Lawrence Berkeley National Laboratory.
• An Arabidopsis rhizosphere project led by JGI microbial ecology group leader Phil Hugenholtz and Jeff Dangl of the University of North Carolina at Chapel Hill.

The Smell of Autumn

October 25th, 2009 by Francis Martin

Fall was so dry that mushrooms remained rare in the woods around the village. I missed the long walks through the trees hunting the fruits of the underground fungal webs. Mid-October finally brought rain and mushrooms popped up, but most of them were saprotrophs efficiently eating wood logs and dead stumps. Seeing this silent but frenetic activity, I realized why so many saprotrophs are in the Genome Encyclopedia of Fungi portfolio.