Mimulus guttatus v1.1 (Monkey flower)
About the genome:
Overview (from JGI)
One of the challenges of 21st-century biology is to determine, at the DNA sequence level, the basis of adaptive evolution in nature. The flowering plant genus Mimulus (monkey flowers) has become a leading model system for studying ecological and evolutionary genetics in nature.
Since Darwin, Mimulus species have been used to investigate a wide range of topics of interest to ecologists and evolutionary biologists, including plant adaptation to soils contaminated with heavy metals, mating system evolution, the genetic basis of inbreeding depression, plant/herbivore interactions, adaptive radiation of floral form, life history evolution, and the origin of new species. Compared to other plants whose genomes have been sequenced, Mimulus is uniquely suited for ecological and evolutionary studies because of its tremendous range of floral morphology (and associated pollinators), mating systems (selfing to outcrossing), growth forms (annual herbs to perennial woody shrubs), and habitat preference (desert to riparian to aquatic). A well-resolved phylogeny of the roughly 160 Mimulus species reveals that the genus has undergone two large adaptive radiations, one in western North America and an independent radiation in Australia. More information can be found at mimulusevolution.org.
Like all plant genetic model systems, Mimulus species have a small genome (about 430 Mb), a short generation time (6 to 12 weeks), high fecundity (100 to 2000 seeds per pollination), self-compatibility, and ease of greenhouse propagation. Unlike most plant genetic model systems, the ecology of Mimulus is known in great detail, and nearly all studies of Mimulus have a prominent field-based component. Recognizing the need to develop a basic set of genomics tools for an ecologically important model system, the National Science Foundation Frontiers in Integrative Biological Research program has funded a $5 million, 5-year integrated ecological and genomic analysis of the genomes of M. guttatus, M. nasutus, M. lewisii, and M. cardinalis.
Whole-genome shotgun sequencing of the IM62 inbred line of Mimulus guttatus is in progress at the DOE Joint Genome Institute, in collaboration with John H. Willis and Fred Dietrich (Duke Univ.), Todd Vision (Univ. of North Carolina), Toby Bradshaw (Univ. of Washington), Lila Fishman (Univ. of Montana), Douglas W. Schemske (Michigan State Univ.), Jeffery P. Tomkins (Clemson Univ.), Paul Beardsley (Idaho State Univ.), and Jeremy Schmutz, Shengquiang Shu, Therese Mitros, Uffe Hellsten, David Goodstein, and Daniel Rokhsar (DOE JGI). The present, highly fragmented genome assembly is an early result of the genome project. It is made available here under the Fort Lauderdale data release guidelines. Please note that it is only a preliminary release, and the sequence, coordinates, gene set, etc. are expected to change, possibly significantly, in the near future. A more stable, publication-quality release, mapped to chromosomes, is expected by the end of 2010, as more datasets are produced and incorporated.
Annotation
Transcript assemblies were constructed using PASA from Mimulus guttatus ESTs (~555K, from IM62 and DUN, including some 454 EST reads) and ESTs of related species (~256K, other asterids). Loci were determined by BLAT alignments of these transcript assemblies and/or BLASTX alignments of proteins from the Arabidopsis thaliana, rice, and grape genomes, plus a few tomato, potato, and tobacco proteins, to the repeat-soft-masked M. guttatus genome. Gene models were predicted by homology-based predictors, mainly FGENESH+, with GenomeScan used where FGENESH+ produced no model at a locus. Predicted genes were UTR-extended and/or improved by PASA. A filtered gene set was built from the PASA-improved gene models on the basis of EST support or protein homology support, after filtering out repeats/transposable elements (TEs). Protein domains (Pfam and Panther) were assigned to the filtered gene models, and gene models whose protein contains >= 30% TE domains were further filtered out to yield 26,718 genes/loci.
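The final filtering step described above can be illustrated with a minimal Python sketch. It is not the JGI pipeline: it assumes that ">= 30% TE domains" means the fraction of protein residues covered by TE-related domain hits, which the text does not spell out. The function names, the tuple layout of a domain hit, and the example set of TE-related identifiers are all hypothetical.

```python
# Illustrative sketch only; the real filter may define "TE fraction" differently.

def te_fraction(protein_len, domains, te_ids):
    """Fraction of residues covered by TE-related domain hits.

    domains: iterable of (domain_id, start, end), 1-based inclusive coordinates.
    te_ids:  set of domain identifiers treated as TE-related (an assumption).
    """
    spans = sorted((s, e) for dom_id, s, e in domains if dom_id in te_ids)
    covered = 0
    last_counted = 0  # last residue already counted, so overlaps count once
    for start, end in spans:
        start = max(start, last_counted + 1)
        if end >= start:
            covered += end - start + 1
            last_counted = end
    return covered / protein_len


def keep_gene(protein_len, domains, te_ids, threshold=0.30):
    """Keep a gene model only if its TE fraction is below the threshold."""
    return te_fraction(protein_len, domains, te_ids) < threshold


# Hypothetical example: two overlapping TE-related hits and one unrelated hit.
TE_IDS = {"PF00078", "PF00665"}  # example identifiers, chosen for illustration
hits = [("PF00078", 10, 90), ("PF00665", 80, 120), ("PF00069", 130, 190)]
print(te_fraction(200, hits, TE_IDS))  # residues 10-120 covered: 111 / 200
print(keep_gene(200, hits, TE_IDS))    # False: above the 30% threshold
```

Merging overlapping hits before counting avoids counting a residue twice when two TE-related domains overlap. A gene model at exactly 30% is removed, matching the ">=" in the description.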
Statistics
This release of Phytozome includes the JGI gene annotation v1.1 of assembly v1.0:
- Approximately 321.7Mb arranged in 2216 scaffolds
- Approximately 300.7Mb arranged in 17831 contigs (~ 6.5% gap)
- Scaffold N50 (L50) = 81 (1.1 Mbp)
- Contig N50 (L50) = 1770 (45.5 Kbp)
- 512 scaffolds larger than 50 Kbp, with 95.7% of the genome in scaffolds larger than 50 Kbp
- 26718 loci containing protein-coding genes
- 28282 protein-coding transcripts
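For readers unfamiliar with these statistics, the following Python sketch shows how the N50/L50 pair and the gap percentage can be computed. It follows the convention used in the list above, where N50 is a count of sequences and L50 is a length (many other sources swap the two names). The function names and the toy scaffold lengths are hypothetical; only the 321.7 Mb and 300.7 Mb totals come from this page.

```python
def n50_l50(lengths):
    """Return (N50, L50) in the convention used on this page:
    N50 = number of longest sequences needed to reach half the total length,
    L50 = length of the shortest sequence in that set."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return count, length
    raise ValueError("lengths must be non-empty")


def gap_fraction(scaffold_total, contig_total):
    """Fraction of scaffold sequence that is gap (not covered by contigs)."""
    return (scaffold_total - contig_total) / scaffold_total


# Toy lengths (hypothetical, not the real Mimulus scaffolds).
print(n50_l50([100, 80, 60, 40, 20]))  # (2, 80)

# Totals from the list above, in Mb: reproduces the "~6.5% gap" figure.
print(round(100 * gap_fraction(321.7, 300.7), 1))  # 6.5
```

Applied to the real scaffold lengths, the same function would be expected to return the 81 and roughly 1.1 Mbp reported above, but that has not been checked here.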
What can I do with the Mimulus dataset?
- I would like to use this data to help clone a gene, analyse a gene family, etc.
- Wonderful! Please feel free to use this data to advance your studies of plant biology. Please reference "Mimulus Genome Project, DoE Joint Genome Institute" as your citation. Also please recognize that this is a preliminary release, which may include various errors. Caveat emptor.
- I think I found an error. What should I do?
- If you would like to bring any items to our attention, please email us.
- I would like to do a large-scale comparison of Mimulus to other genomes, and/or a global analysis of its gene content.
- Our plans for publication of the Mimulus genome are described below, and are focused on the large-scale analysis of the gene and repetitive content of the genome and its evolutionary dynamics (including synteny and chromosomal/segmental duplication, gene family evolution, gene structure evolution) relative to other angiosperms. The Fort Lauderdale guidelines for large-scale sequencing projects aim to balance the value of rapid data release for the user community with respect for the scientific interests of the generators of the data. We ask that you respect these scientific goals as discussed in the Fort Lauderdale guidelines. A plan for the coordinated submission of companion manuscripts to Genome Research will be developed.
What are the publication plans?
We expect that the initial manuscript describing the assembly, annotation, and first analysis of the Mimulus genome will be based on an improved, chromosome-scale assembly to be developed by the end of 2010. This assembly and its annotation will be distributed prior to publication. Our target is submission of a manuscript no later than nine months after completion of chromosome-scale assembly and annotation.