Posts Tagged ‘Illumina’

Metatranscriptomics of Forest Soil Ecosystems

October 29th, 2011


Forest soils (including litter, humus and coarse woody debris) host diverse microbial communities that affect tree health and productivity and play pivotal roles in terrestrial carbon sequestration and biogeochemical cycles. Among these microbial communities, fungi are undoubtedly major players. Traditionally, they have been divided into discrete ecological guilds, such as leaf litter decomposers, humus saprobes, white- and brown-rot wood decayers, parasites and mycorrhizal symbionts. However, the actual functional properties of individual species, and the synergistic effects among them, are often obscure. Moreover, the basic biodiversity of the vast majority of soil systems (e.g., boreal forests and subarctic taiga) remains unexplored using high-throughput DNA barcoding approaches.

We hypothesize that firm distinctions between fungi commonly labeled mycorrhizal, wood decomposer, humus and litter saprobes are, in some instances, unwarranted, and that crucial ecosystem processes, such as carbon sequestration, wood and litter decay and trophic mutualism, can only be understood in the context of interactions among multiple species representing a functional continuum. The number of available fungal genomes has expanded dramatically in recent months, and this provides unprecedented opportunities to study the functional (and taxonomic) diversity of soil communities.

Within the framework of the DOE Joint Genome Institute Community Sequencing Program, we have therefore embarked on a challenging large-scale metatranscriptomics project to explore the interaction of forest trees with communities of soil fungi, including ectomycorrhizal symbionts that dramatically affect tree growth, and saprotrophic soil fungi that affect carbon sequestration in forests. We are going to sequence the metatranscriptome of soil fungi (i.e., wood decayers, litter and humus saprotrophs, and ectomycorrhizal symbionts) in woody debris, litter/humus, rhizosphere and ectomycorrhizal roots of ecosystems representative of three major Earth biomes: boreal, temperate and Mediterranean forests.

Metatranscriptome samples. A range of forest ecosystems has been selected on the basis of their ecological importance and the availability of metadata linked to these forest sites. In contrast to agricultural soils, forest soils, in particular those of boreal forests with low pH values, are characterised by strong vertical stratification because they lack the soil fauna that would otherwise mix the horizons. This provides a spatial structure for evaluating hypotheses about the functional attributes of taxa occupying spatially distinct horizons.

Sampling will be conducted on selected stands in long-term observatories (LTOs) or national survey sites:

  • Boreal forests: Bonanza Creek (Alaska) and Siljansfors (Sweden).
  • Temperate forests: DOE long-term studies at Duke Forest, the post-fire stands at the Bitterroot National Forest and Michigan maple N-deposition sites (USA), a forest-woodland-grassland transect in Rollainville (France), and the Breuil-Chenue plantation (France).
  • Mediterranean forests: Puéchabon near Montpellier (France) and Aspurz in the south-western Pyrenees (Spain).

For these soil samples, we will run: (1) Tag-encoded FLX-titanium amplicon pyrosequencing (TEFAP) of the fungal rDNA ITS to survey the existing communities and (2) RNA-Seq of soil samples. For this cDNA profiling, we will sequence ~110 Gbp per site for a total of 1 terabase using Illumina HiSeq PE chemistry. Reads produced by RNA-Seq will be used to reconstruct the different fungal metatranscriptomes de novo (the best-case scenario). In addition, we will use Illumina fragment recruitment, a process of aligning sequencing reads to reference genomes. Metatranscriptomic reads will thus be aligned to the >100 genomes of soil fungi available in the JGI MycoCosm. To improve this crucial step, we also propose the gDNA sequencing and RNA-Seq of the 25 most abundant fungal species harvested on the studied sites to serve as the foundation for a reference database for metagenomics of fungi and for a comprehensive survey of the potential soil fungal metabolome. We will annotate the fungal genomes/transcriptomes and soil fungal metagenomes with all these characteristics and will compare the different metagenomes in terms of these characteristics.
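To make the idea of fragment recruitment concrete, here is a deliberately tiny Python sketch. It is not the pipeline we will run: real recruitment relies on a proper read aligner that tolerates mismatches and handles both strands, whereas this toy only counts exact shared k-mers. The sequences, the names `fungus_A` and `fungus_B`, and the `min_hits` threshold are all made up for illustration.

```python
from collections import Counter, defaultdict


def build_kmer_index(references, k):
    """Map every k-mer to the set of reference names that contain it."""
    index = defaultdict(set)
    for name, seq in references.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def recruit_read(read, index, k, min_hits=2):
    """Return the reference sharing the most exact k-mers with the read.

    Returns None when no reference reaches min_hits shared k-mers.
    """
    hits = Counter()
    for i in range(len(read) - k + 1):
        for name in index.get(read[i:i + k], ()):
            hits[name] += 1
    if not hits:
        return None
    name, count = hits.most_common(1)[0]
    return name if count >= min_hits else None


# Hypothetical toy references standing in for sequenced fungal genomes.
references = {
    "fungus_A": "ATGGCGTACGTTAGCCTAGGATCCGATTACA",
    "fungus_B": "TTGACCGGTAAACGCTTGGCATCGGAAGTCC",
}
index = build_kmer_index(references, k=8)

# Two reads drawn from the references and one unrelated read.
reads = ["GTACGTTAGCCTAGG", "AACGCTTGGCATCGG", "CCCCCCCCCCCCCCC"]
assignments = [recruit_read(read, index, k=8) for read in reads]
print(assignments)
```

Reads that recruit to no reference (the third one here) are exactly the ones that motivate sequencing additional reference species from the sites.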

These experimental datasets will provide a mechanistic insight into the fungal communities’ structural organization and functioning in forests. In addition, the present metagenomic data will give a comprehensive picture of the organization of the tree-associated microbiome in terms of metabolic pathways, subsystems, molecular functions and biological processes.

Sequencing of new fungal species will be performed in concert with existing large-scale genome studies (e.g., the 1000 Fungal Genomes project), so as to minimize unnecessary redundancies. As such we recognize that this project represents a large effort and great challenge in defining the microbiome of important forest ecosystems and a group of micro-organisms, the soil fungi.


Image (top): Denali Ntl Park, Alaska (© F Martin)

Evolution of Obligate Biotrophy

July 22nd, 2011

Eukaryotic microbes colonize a wide range of environments, including living plants and animals. Parasites, endophytes and symbionts have evolved complex mechanisms to interact with their hosts. Plant pathologists have attempted to classify pathogens into groups called necrotrophs, biotrophs and, more recently, hemibiotrophs. Biotrophs derive energy from living cells. Necrotrophs derive energy from killed cells. Hemibiotrophs have an initial period of biotrophy followed by a necrotrophic phase. But this division into groups based on nutritional mode is poorly supported by recent genomics studies (see Oliver & Ipcho, 2004).

Evolution to obligate biotrophy occurred independently in fungal and oomycete pathogens (rusts, downy and powdery mildews), and the molecular features driving this adaptation have been studied in a series of recent papers (e.g., Baxter et al., 2010; Spanu et al., 2010; Duplessis et al., 2011).

In their recent PLoS Biology paper, Kemen et al. elegantly investigated the mechanisms leading to obligate biotrophy in the white rust pathogen Albugo laibachii (Oomycota). Their comparison of the Albugo genome to the Hyaloperonospora arabidopsidis genome (Baxter et al., 2010) sheds light on the evolution of biotrophy in Oomycetes, and they nicely summarized the current knowledge of gain and loss of genes and pathways in the figure below.

Figure. Gain and loss of genes and pathways for selected Chromalveolata in comparison to A. laibachii. © PLoS Biology.

Kemen et al. (2011) Gene Gain and Loss during Evolution of Obligate Parasitism in the White Rust Pathogen of Arabidopsis thaliana. PLoS Biology 9, e1001094.


Date Palm Genome

July 17th, 2011

The date palm, Phoenix dactylifera L., is a tree of the palm family (Arecaceae, or Palmae), native to desert regions of the Persian Gulf. Mentioned in the Qur’an and the Bible, its fruits have been a staple food in the Middle East since the Neolithic. Today, dates are among the most important crops of many countries in the Arabian Gulf and North Africa.

A draft version of the genome of the Khalas variety has been published by a team of the Weill Cornell Medical College in Qatar in the June 2011 issue of Nature Biotechnology.

The Khalas genome was sequenced using the Illumina GA IIx, and the whole-genome shotgun (WGS) assembly covers only 380 Mbp of the total ~658 Mbp genome size. As expected, large repeated regions were not included in the assembly, but most of the gene space has been assembled. Although a detailed analysis of the metabolic and developmental pathways is missing from this paper, the draft genome has been used to generate very useful genetic tools. The genome sequences of eight additional cultivars were used for an in-depth SNP/CNV analysis. A set of 32 SNPs has been identified for discriminating varieties, a long-awaited tool for breeders. A region linked to gender determination was also characterized.

This genome and these genetic resources should be very useful for improving traits such as fruit quality.

Image: Palmier dattier by Martiros Sarian (1880-1972).

454 vs. Illumina for microbiota amplicon profiling

July 14th, 2011

To survey the soil bacterial and fungal communities in forest ecosystems, we have extensively used 454 Titanium pyrosequencing (Buée et al., 2009; Uroz et al., 2010), but we are currently comparing this approach to paired-end Illumina sequencing of the rRNA internal transcribed spacer (ITS). We are expecting the first datasets in a few weeks. The latter approach sounds very promising, but in their recent analysis of the intestinal microbiota Claesson et al. (2010) showed that it still has important limitations. A very large proportion of the Illumina 16S rRNA reads could not be classified down to genus level as a result of their shorter length and higher error rates beyond 60 nt. Let’s see what we get with the ITS. See abstract below.

[Abstract. High-throughput molecular technologies can profile microbial communities at high resolution even in complex environments like the intestinal microbiota. Recent improvements in next-generation sequencing technologies allow for even finer resolution. We compared phylogenetic profiling of both longer (454 Titanium) sequence reads with shorter, but more numerous, paired-end reads (Illumina). For both approaches, we targeted six tandem combinations of 16S rRNA gene variable regions, in microbial DNA extracted from a human faecal sample, in order to investigate their limitations and potentials. In silico evaluations predicted that the V3/V4 and V4/V5 regions would provide the highest classification accuracies for both technologies. However, experimental sequencing of the V3/V4 region revealed significant amplification bias compared to the other regions, emphasising the necessity for experimental validation of primer pairs. The latest developments of 454 and Illumina technologies offered higher resolution compared to their previous versions, and showed relative consistency with each other. However, the majority of the Illumina reads could not be classified down to genus level due to their shorter length and higher error rates beyond 60 nt. Nonetheless, with improved quality and longer reads, the far greater coverage of Illumina promises unparalleled insights into highly diverse and complex environments such as the human gut].

Claesson et al. (2010) Comparison of two next-generation sequencing technologies for resolving highly complex microbiota composition using tandem variable 16S rRNA gene regions. Nucleic Acids Res. 38:e200. Epub 2010 Sep 29.

Buée et al. (2009) 454 pyrosequencing analysis of soil fungal diversity as affected by forest management. New Phytologist 184: 452–459.

Uroz et al. (2010) Pyrosequencing reveals a contrasted bacterial diversity between oak rhizosphere and surrounding soil. Environmental Microbiology Reports 2: 281–288.

The Potato Genome … j’aime les patates !!!

July 12th, 2011

Several plant or fungal genomes are published every week, but the paper describing the potato (Solanum tuberosum) genome and its analysis in this week’s issue of Nature is worth a post.

Getting a draft of the potato genome was indeed a real challenge as S. tuberosum is a heterozygous autotetraploid with four highly variable copies of each chromosome. This is also the first sequenced genome of an asterid, a major clade within eudicots … and most importantly, that’s the needed ingredient for making my favorite crispy French fries!!!

The Potato Genome Sequencing Consortium (PGSC) was initiated in early 2006 by the Plant Breeding Department of Wageningen University & Research in the Netherlands and has developed into a global consortium of 26 research groups, including the BGI.

The potato genome has 12 chromosomes and is estimated to be 840 million base pairs. To facilitate its WGS sequencing, the PGSC sequenced the doubled monoploid DM1-3 516R44 (DM) potato derived from a diploid landrace of potato in order to simplify and complement the sequencing of the diploid line RH89-039-16 (RH).

A total of 96.6 Gb of raw sequence was generated using the Illumina Genome Analyser and 454 pyrosequencer platforms, as well as conventional Sanger sequencing technologies. The genome was then assembled using SOAPdenovo, resulting in a final assembly of 727 Mb.
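For readers who like to see what these figures imply, a quick back-of-the-envelope calculation in Python follows. It only uses numbers quoted in this post (96.6 Gb of raw sequence, an estimated 840 Mb genome, a 727 Mb assembly); the variable and function names are mine, and the depth is a rough upper bound since raw data always include reads that are filtered out before assembly.

```python
def fold_coverage(total_bases, genome_size):
    """Average sequencing depth: total sequenced bases per genome base."""
    return total_bases / genome_size


raw_bases = 96.6e9       # raw sequence generated, in bases
genome_size = 840e6      # estimated potato genome size, in bases
assembly_size = 727e6    # final assembly size, in bases

depth = fold_coverage(raw_bases, genome_size)
assembled_fraction = assembly_size / genome_size

print(f"raw depth: about {depth:.0f}x")
print(f"assembled fraction of the estimated genome: {assembled_fraction:.1%}")
```

In other words, each base of the genome was sequenced more than a hundred times on average, and the assembly accounts for a bit under nine tenths of the estimated genome size.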

The draft assembly contains ~62% repeated elements with long terminal repeat retrotransposons (LTRs) comprising the majority of the transposable element classes, representing 29.4% of the genome. The consortium also generated 31.5 Gb of RNA-Seq data from 32 DM and 16 RH libraries representing all major tissue types, developmental stages and responses to abiotic and biotic stresses. Reads were mapped against the DM genome sequence and in combination with ab initio gene prediction, protein and EST alignments, 39,031 protein-coding genes were predicted, including 2,642 asterid-specific and 3,372 potato-lineage-specific genes. Genes encoding transcription factors, self-incompatibility, and defence-related proteins were prominent in the asterid-specific genes and likely contribute to the unique characteristics of asterids.

Potato is susceptible to a wide range of pests and pathogens (e.g. Phytophthora infestans, the culprit of the Irish Potato Famine), and the identification of R genes conferring disease resistance is a main focus of plant pathologists. The DM assembly contains 408 NBS-LRR-encoding genes, of which 57 carry a Toll/interleukin-1 receptor/plant R gene homology (TIR) domain and 351 are of the non-TIR type. A high rate of pseudogenization has been observed in these R genes: more than one third of the NBS-LRR genes are pseudogenes owing to indels, frameshift mutations, or premature stop codons. It is tempting to speculate that this rapid pseudogenization parallels the rapid evolution of effector-coding genes observed in Phytophthora infestans.

Transcript profiling was used to study the molecular mechanisms controlling the stolon-to-tuber transition. This developmental event coincides with strong upregulation of genes associated with protein storage (patatin), starch biosynthesis, and defense against pests and pathogens (Kunitz protease inhibitor genes).

This draft genome will undoubtedly provide a unique resource for genetic improvement of the most important vegetable crop. It will also facilitate the genomic analysis of other tasty members of the Solanaceae family: tomato, pepper, and eggplant.


Nature News: All eyes on the potato genome.

Potato Genome Sequencing Consortium (2011) Genome sequence and analysis of the tuber crop potato. Nature doi:10.1038/nature10158





Mycorrhiza 25 Genomes project approved

September 22nd, 2010

I am glad to report that our proposals ‘Exploring the Genome Diversity of Mycorrhizal Fungi to Understand the Evolution and Functioning of Symbiosis in Woody Shrubs and Trees’ and ‘Community proposal to sequence a diverse assemblage of saprotrophic Basidiomycota (Agaricomycotina)’ to JGI’s Community Sequencing Program were approved for sequencing this cycle. This is extremely exciting, because it means that sometime between this Fall and next Summer we will have a large set of new mycorrhizal and saprotrophic Agaricomycotina genomes, followed later in 2011-12 by another set of genomes. By the end of 2011, we should be able to mine and compare 50 novel symbiotic and saprotrophic genomes.

As of this writing, JGI 454 and Illumina machines are busily churning out DNA from Hebeloma cylindrosporum, Piloderma croceum, Cenococcum geophilum, Pisolithus tinctorius and P. microcarpus.  Amanita muscaria, Boletus edulis, Laccaria amethystina, Lactarius quietus, Paxillus rubicundulus, Suillus luteus, and Sebacina vermifera will soon be queuing for sequencing.

Photo: The Fly Agaric, Amanita muscaria © F Martin

Living Stones

August 29th, 2010

Villaron - stone colored by lichens

Together with the mycorrhizal symbiosis, the lichen symbiosis has fascinated biologists for decades.

Lichens are composite organisms consisting of a fungus (the mycobiont) interacting with a photosynthetic partner (the photobiont), an alga or a cyanobacterium, to form a mutualistic interaction. These chimeric organisms are found in most ecosystems, including some of the most extreme environments: deserts, rocky coasts, mountains and arctic tundra. They are also abundant as epiphytes on trees and on bare rock and stone surfaces, such as walls, benches and tombstones. The morphology and physiology of lichens are very different from those of the free-living fungi and algae, but very little is known about the signals, genes and proteins involved in the complex symbiotic interactions.

The genomes of two ectomycorrhizal symbionts, the basidiomycete Laccaria bicolor and the ascomycete Tuber melanosporum, are now available to study the evolution of fungal symbiosis. Until now, no lichen genome was available to investigate how this other major fungal symbiosis evolved. It is thus great to see that the 36-Mb genome of the lichen Cladonia grayi has been sequenced (by 454 and Illumina). C. grayi is a member of the Cladoniaceae, a well-studied world-wide family of stalked-cup lichens classified within the Lecanoromycetes, a class that includes more than 70% of the lichen-forming fungal diversity. The 60-Mb genome of the Asterochloris photobiont associated with C. grayi has also been sequenced. Asterochloris sp. is a single-celled member of the largest family of lichen-forming green algae, the Trebouxiaceae.

We are excited to be part of the Cladonia Genome Consortium led by Daniele Armaleo and François Lutzoni (Duke University).

Photo: A lichen community on a stone wall in the Villaron hamlet (Haute-Maurienne, France). © F Martin

Methylome Phylogeny

May 18th, 2010

Laccaria methylome


DNA cytosine methylation is a crucial process for the regulation of many cellular events in mammalian, plant and fungal development, although other eukaryotic species live well without this mechanism. Zilberman’s team at the University of California, Berkeley, is reporting the DNA methylation patterns of 17 organisms — five plants, seven animals and five fungi — in Science online. They have picked these species throughout the tree of life to reconstruct how methylation might have evolved. My pet fungus Laccaria bicolor, but also Coprinopsis cinerea and Postia placenta, are among the scrutinized fungi. The methylome (genome-wide methylation maps) for each species was generated by using high-throughput bisulphite Illumina sequencing. Zilberman suggests that the last common ancestor of plants, animals and fungi carried enzymes, DNA methyltransferases, that methylated both transposons and gene bodies.
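For those unfamiliar with bisulphite sequencing, the principle is simple: bisulphite treatment converts unmethylated cytosines to uracil (read as T after amplification), while methylated cytosines are protected and still read as C. The Python toy below illustrates only that logic for a single read already aligned to the same strand of a reference. It is my own sketch, not the analysis used by Zemach et al., and the two short sequences are invented.

```python
def call_methylation(reference, bisulfite_read):
    """Classify each reference cytosine covered by an aligned, same-strand read.

    C read as C -> protected from conversion -> 'methylated'
    C read as T -> converted by bisulphite   -> 'unmethylated'
    anything else (e.g. a sequencing error)  -> 'ambiguous'
    """
    if len(reference) != len(bisulfite_read):
        raise ValueError("sequences must be aligned and of equal length")
    calls = {}
    for pos, (ref_base, read_base) in enumerate(zip(reference, bisulfite_read)):
        if ref_base != "C":
            continue  # only cytosines carry the information
        if read_base == "C":
            calls[pos] = "methylated"
        elif read_base == "T":
            calls[pos] = "unmethylated"
        else:
            calls[pos] = "ambiguous"
    return calls


# Hypothetical toy sequences (0-based positions).
reference = "ACGTCCGA"
read = "ACGTTCGA"
calls = call_methylation(reference, read)
print(calls)
```

A real methylome is built by aggregating such calls over many reads per position and over both strands, which is where the high throughput of Illumina sequencing comes in.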

In fungi, DNA methylation is concentrated in transposable elements (TEs), as observed in plants and vertebrates, and no methylation was observed in the middle of active genes. However, it remains to be shown whether cytosine methylation takes place in dormant or active TEs. This nice methylome study was on my to-do list for the Laccaria and Tuber genomes. Snif, snif !!!

Zemach, A., McDaniel, I. E., Silva, P. & Zilberman, D. (2010) Genome-Wide Evolutionary Analysis of Eukaryotic DNA Methylation. Science 328, 916–919.

Further readings:

Jeltsch A (2010) Phylogeny of Methylomes. Science 328, 837.

Katsnelson A (2010) Mapping methylation’s mysterious background. Nature | doi:10.1038/news.2010.185.

Next-Generation Sequencing of Sordaria Genome

May 1st, 2010

As I reported earlier, the genome of the filamentous pathogenic fungus Grosmannia clavigera has been assembled from a combination of Sanger, 454, and Solexa sequence data. The genome of the giant panda (Ailuropoda melanoleuca), specifically of the female Beijing Olympics mascot Jingjing, has also been determined using short-read Illumina sequencing technology, “un tour de force” for such a complex genome. This was the first reported de novo assembly of a large mammalian genome achieved using next-generation sequencing methods. These two studies demonstrated the feasibility of using next-generation sequencing technologies (454 & Illumina) for accurate, cost-effective and rapid de novo assembly of eukaryotic genomes. In a recent issue of PLoS Genetics, Nowrousian et al. published the de novo assembly of the 40 Mb genome of Sordaria macrospora from short sequence reads. They generated 3.4 Gb of sequence data from four lanes from a 300 bp library and three lanes from a 500 bp library using the Illumina GA. In addition, 415 Mb of sequence data were produced by 454 sequencing. Both the Solexa reads alone and the combined Solexa and 454 reads were assembled with the Velvet assembler.

Diego Martinez and Mary Anne Nelson discuss the technical merits of this work in their Perspective paper in PLoS Genetics.

The natural habitat of the saprobic S. macrospora is herbivore dung. With 10,789 predicted genes, the gene repertoire of S. macrospora is similar to that of N. crassa. S. macrospora harbors duplications of several genes involved in self/nonself-recognition and contains more polyketide biosynthesis genes than N. crassa. One putative polyketide biosynthesis (PKS) cluster might have been acquired through horizontal gene transfer (HGT) from a distantly related ascomycete group. This finding supports recent suggestions that HGT is widespread in fungi, whether for the transfer of single genes, clustered genes like PKS genes, or even larger stretches of DNA up to whole chromosomes, as was found in the phytopathogenic fungus Nectria haematococca. The S. macrospora genome contains even fewer transposable elements than its closest relative, Neurospora crassa, despite the absence of active RIP.

Nowrousian M, Stajich JE, Chu M, Engh I, Espagne E, et al. (2010) De novo Assembly of a 40 Mb Eukaryotic Genome from Short Sequence Reads: Sordaria macrospora, a Model Organism for Fungal Morphogenesis. PLoS Genet 6(4): e1000891. doi:10.1371/journal.pgen.1000891.

Li et al. (2010) The sequence and de novo assembly of the giant panda genome. Nature 463, 311-317.

Martinez DA, Nelson MA (2010) The Next Generation Becomes the Now Generation. PLoS Genet 6(4): e1000906.

Photo: S. macrospora protoperithecium (Nowrousian et al. 2010, PLoS Genetics).

One Bacterial Cell, One Complete Genome

May 1st, 2010

90% of microbial bugs elude current culturing attempts. Sequencing of single cells is a novel culture-independent approach that gives access to the genetic material of an individual cell of unculturable bacteria. In PLoS One this week, Jan-Fang Cheng’s and Nancy Moran’s groups at JGI and the University of Arizona report the completed sequence of Candidatus Sulcia muelleri, obtained from an uncultured single cell. The Bacteroidetes Sulcia is one of two obligate bacterial symbionts inhabiting sharpshooters. A single Sulcia cell was sampled from the host bacteriome using an inverted microscope (Zeiss) and a micromanipulator; its genome was amplified via multiple displacement amplification and sequenced using a combination of Sanger sequencing and pyrosequencing, generating a total of 57 Mb of sequence. This approach can now be used to generate the complete reference genomes urgently needed for metagenomics of bacterial communities.

Concombre à la Crème & Cucumber Genome

April 25th, 2010

When the hot weather hits, nothing is more cooling than a cucumber salad. Unlike the somewhat seedy American cucumbers with thick, bitter skins, cucumbers from my garden are thin-skinned and practically seedless, so you can just slice them and eat them, without peeling. You can also gently toss the sliced cucumbers in a bowl with a little bit of fresh cream (or yogurt if you’re on a diet), salt and pepper to taste. Right before serving, sprinkle on crumbled bits of feta cheese.

Why talk about my garden cucumbers? Because in a paper appearing online today in PLoS ONE, researchers from China and the US reported an integrated genetic and cytogenetic map of cucumber (Cucumis sativus). Researchers from the Chinese Academy of Agricultural Sciences, the China Agricultural University, and the US Department of Agriculture’s Agricultural Research Service used whole genome shotgun sequencing to identify nearly 1,000 polymorphic simple sequence repeat markers in cucumber. Using these markers, along with cytogenetic data, they then created a high-density linkage map that will be used for future genetic and genomic studies in cucumber and the related pumpkins, squash, melon and watermelon.

The Cucurbitaceae family comprises about 120 genera and 800 species, including many economically important vegetable and fruit crops such as cucumber (Cucumis sativus L.), melon (C. melo L.), watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai), squash and pumpkin (Cucurbita spp.).

The genome of the cucumber (cultivar Chinese Long 9930) was published a few months ago in Nature Genetics. The genome sequencing was done by the Beijing Genomics Institute-Shenzhen and the Cucumber Genome Initiative (CuGI). It was coordinated by Sanwen Huang of the Chinese Academy of Agricultural Sciences and included the Genome Center at BGI, UC Davis as well as several laboratories in China and others in the U.S., Denmark, the Netherlands, Australia and South Korea.

BGI applied a hybrid whole-genome sequencing strategy that combines the read length and paired-end information of conventional Sanger sequencing with the very high throughput of next-generation Illumina GA sequencing (~72X coverage). They completed 4X Sanger sequencing of the genome, and a preliminary assembly showed that 90% of the genome was covered. The total length of the genome assembly was 243.5 Mb, whereas the genome size estimated by flow cytometry was 367 Mb. The ~30% of the genome that was not assembled consists of transposable elements and rRNA sequences. In addition, ~410K ESTs were generated from cDNA samples using Roche 454 sequencing to facilitate protein-coding gene annotation. The gene-prediction methods predicted 26,682 protein-coding genes in 15,669 gene families. The cucumber gene repertoire contains the smallest number of tandem duplications (479), far fewer than grapevine (5,382). This low number of genes and tandem duplications likely results from the absence of a whole-genome duplication.

The genome analysis showed that five of the cucumber’s seven chromosomes arose from fusions of ten ancestral chromosomes after divergence from melon (C. melo), and that gene-coding stretches of DNA share about 95 percent similarity with melon. The cucumber genome will also provide insights into traits such as disease and pest resistance, the “fresh green” odor of the fruit, bitter flavors and sex expression.

The cucumber genome is bursting with transposons and repetitive sequences, many of which have not been detected in previously sequenced genomes. Note also that this study identified 800 phloem proteins in this genome, but only 61 NBS-containing resistance genes (against 398 in poplar, as we have shown). Lipoxygenase (LOX) enzymes might be a complementary system to cope with biotic stress.

The cucumber is the seventh plant to have its genome sequence published, following the well-studied model plant Arabidopsis thaliana, the poplar tree, grapevine, papaya, and the crops rice and sorghum.

Additional information available at: Cucurbit Genomics Database.

Ren et al. (2009) An Integrated Genetic and Cytogenetic Map of the Cucumber Genome. PLoS ONE 4(6): e5795. doi:10.1371/journal.pone.0005795.

Huang et al. (2009) The genome of the cucumber, Cucumis sativus L. Nature Genetics 41, 1275–1281.