Chlamydomonas reinhardtii is a single-celled green alga belonging to the chlorophytes, a group of highly adaptable species that live in many different environments throughout the world. Normally C. reinhardtii derives energy from photosynthesis, but it also thrives in total darkness when provided with an alternative carbon source. Its impressive adaptability and quick generation time have made Chlamydomonas an important model for biological research. Over the years, studies of Chlamydomonas have provided major research contributions in the areas of photosynthesis, ciliary diseases and molecular biology.
(from The Joint Genome Institute Chlamydomonas v4 portal).
This release of Chlamydomonas genomic information features v5.3.1 of the Chlamydomonas annotations, which are based on the Augustus update u11.6 annotation of JGI assembly v5. For the first time, this annotation includes predictions of alternative transcripts.
The assembly comprises approximately 111 Mb arranged in 17 linkage groups (chromosomes), plus 37 additional unmapped scaffolds, with an L50 of 7.8 Mb.
This set of gene predictions incorporates new JGI gene expression data: 1M reads generated from nitrogen-starved cells on the 454 Titanium platform, 239M 2x150 stranded Illumina reads, and other smaller RNA-Seq datasets. These reads are in the process of being deposited in the GenBank Sequence Read Archive (SRA).
The underlying gene prediction was performed by Mario Stanke using Augustus (annotation version u11.6). We removed 5 transcripts that were shorter than 25 aa, 59 with internal stop codons, and 716 with homology (over at least 30% of their length) to transposable elements deposited in Repbase. Manual examination merged or split loci where updated gene predictions were incompatible with older annotations. This left 17,737 loci (of which 1,789 encode alternate transcripts), making a total of 19,526 protein-coding transcripts.
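The three removal criteria above can be sketched as a small filter. This is only an illustration, not the pipeline that was actually run: the function name `rejection_reason`, the use of `*` as the stop symbol, the order in which the checks are applied, and the idea that the fraction of a transcript's length with transposable-element homology is computed elsewhere and passed in as `te_fraction` are all assumptions.

```python
from typing import Optional

MIN_LENGTH_AA = 25       # transcripts shorter than this were removed
MIN_TE_FRACTION = 0.30   # homology over >=30% of the length triggers removal


def has_internal_stop(protein: str) -> bool:
    """Return True if a stop symbol ('*') occurs before the final position."""
    return "*" in protein[:-1]


def rejection_reason(protein: str, te_fraction: float) -> Optional[str]:
    """Return why a predicted protein would be removed, or None to keep it.

    protein      translated sequence; a trailing '*' is treated as the
                 normal terminal stop (assumed convention).
    te_fraction  fraction of the sequence length with homology to a
                 transposable element (hypothetical precomputed input).
    """
    length_aa = len(protein.rstrip("*"))
    if length_aa < MIN_LENGTH_AA:
        return "too_short"
    if has_internal_stop(protein):
        return "internal_stop"
    if te_fraction >= MIN_TE_FRACTION:
        return "te_homology"
    return None


if __name__ == "__main__":
    # Toy sequences, not real Chlamydomonas proteins.
    examples = {
        "short": ("M" * 10 + "*", 0.0),
        "broken": ("M" * 30 + "*" + "A" * 5, 0.0),
        "te_like": ("M" * 40 + "*", 0.45),
        "kept": ("M" * 40 + "*", 0.05),
    }
    for name, (seq, frac) in examples.items():
        print(name, rejection_reason(seq, frac))
```

A transcript is kept only when the function returns `None`. Because the checks run in a fixed order here, a sequence that fails more than one criterion is reported under the first one it hits; the counts given above may have been tallied differently.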
12,264 stable gene identifiers (and 13,450 stable transcript identifiers) were mapped forward from v4; 9,424 deflines, descriptions and gene symbols (e.g. IFT27) were also mapped. (Stable gene identifiers are of the form Cre01.g123450 for loci and Cre01.g123450.t2.1 for transcripts. Cre01 means the locus is on chromosome 1, and gene numbers (g123450 in this example) generally increment by 50, starting at the beginning of chromosome 1. The t2.1 at the end of the transcript identifier indicates that this is version 1 of transcript 2 at this locus. Augustus gene identifiers of the form g12345 remain for the transcripts that are still being mapped to a stable locus. All loci will have a stable identifier in the next release.)
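The identifier scheme described above lends itself to a simple parser. The sketch below assumes that every stable identifier starts with `Cre` followed by a chromosome number, exactly as in the examples; identifiers on unmapped scaffolds may use a different prefix that this pattern would not match. The names `StableId` and `parse_stable_id` are illustrative only.

```python
import re
from typing import NamedTuple, Optional

# Locus:      Cre01.g123450
# Transcript: Cre01.g123450.t2.1  (transcript 2, version 1)
STABLE_ID = re.compile(
    r"Cre(?P<chrom>\d+)\.g(?P<gene>\d+)"
    r"(?:\.t(?P<transcript>\d+)\.(?P<version>\d+))?"
)


class StableId(NamedTuple):
    chromosome: int
    gene_number: int
    transcript: Optional[int]  # None for a locus-level identifier
    version: Optional[int]     # None for a locus-level identifier


def parse_stable_id(identifier: str) -> Optional[StableId]:
    """Parse a stable locus or transcript identifier.

    Returns None for anything that does not match the assumed pattern,
    including provisional Augustus identifiers such as 'g12345'.
    """
    match = STABLE_ID.fullmatch(identifier)
    if match is None:
        return None
    transcript = match.group("transcript")
    version = match.group("version")
    return StableId(
        chromosome=int(match.group("chrom")),
        gene_number=int(match.group("gene")),
        transcript=int(transcript) if transcript is not None else None,
        version=int(version) if version is not None else None,
    )


if __name__ == "__main__":
    for text in ("Cre01.g123450", "Cre01.g123450.t2.1", "g12345"):
        print(text, parse_stable_id(text))
```

Returning `None` for provisional identifiers, rather than raising an error, makes it easy to separate transcripts that already have a stable locus from those still awaiting one.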
The next release (Chlamydomonas v5.4) is scheduled for summer 2013. We are obtaining Illumina sequences from transcript termini (RNA-PET) and expect to use them to improve the annotation of gene ends.