Selaginella Genomics
All the data in this project are being made publicly available.
Please contact us for information on other assemblies or work in progress.
EST Sequences
- EST sequences
For a description of the EST cDNA cloning, sequencing, and assembly, see the readme file.
The sequencing and assembly were both done by the DOE/Joint Genome Institute as part of the
community sequencing project.
The JGI group clustered the ESTs to produce unigene sets. These are probably what you are most interested in.
- cluster 339
- Selaginella genomic cluster, clustered with the malign program: 96% identity, with end-pair merging.
- cluster 338
- Selaginella genomic cluster, clustered with the blastn program: 96% identity, 150 bp overlap.
- EST cDNA sequences were cloned into two vectors
- To obtain cDNA clones:
- For each clone requested, please indicate the TEMPLATE (CAOP, CAOS, etc.) and the 5-digit number unique to that clone,
for example CAOP10001. This identifier is found in the BLAST search against the unassembled EST sequences.
- Do not request the assembled clone number, which begins with a 7-digit number (for example, 1569091:1).
- All clones are ampicillin (Amp) resistant.
- Send requests by email to Jody Banks at banksj@purdue.edu. Clones will be sent to you as soon as possible.
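The naming rules above can be checked mechanically before a request is sent. The following is a minimal Python sketch, not project tooling. It assumes that a template is always four uppercase letters (as in CAOP and CAOS) and that assembled identifiers always look like a 7-digit number, a colon, and a further number (as in 1569091:1). The function name `classify_clone_id` is hypothetical.

```python
import re

# Assumption: a requestable clone ID is a four-letter uppercase template
# (CAOP, CAOS, ...) followed by the 5-digit clone number, e.g. CAOP10001.
REQUESTABLE = re.compile(r"[A-Z]{4}\d{5}")

# Assumption: an assembled identifier is a 7-digit number, a colon, and a
# further number, e.g. 1569091:1. These should not be requested.
ASSEMBLED = re.compile(r"\d{7}:\d+")


def classify_clone_id(clone_id: str) -> str:
    """Return 'requestable', 'assembled', or 'unrecognized' for an identifier."""
    clone_id = clone_id.strip()
    if REQUESTABLE.fullmatch(clone_id):
        return "requestable"
    if ASSEMBLED.fullmatch(clone_id):
        return "assembled"
    return "unrecognized"
```

Running a list of identifiers through `classify_clone_id` and keeping only those classified as "requestable" would catch the common mistake of copying the assembled number from a search result.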
Shotgun DNA sequences
- Raw shotgun sequence
There are two datasets. Note that these have not been cleaned of vector or repeated sequences, and contain both
organellar and genomic sequence. Also keep in mind that Selaginella is an outbred organism and therefore
two haplotypes are represented in the sequence.
- 050519 was downloaded on 19 May 2005 and comprises 797871 raw sequence reads.
- 050930 was downloaded on 30 September 2005 from the NCBI trace archive. This is the
most current dataset as of late November 2005. This dataset comprises 1814715 sequences. Note that one
of the mate-pair files is missing at the NCBI site and therefore does not appear here.
- A new dataset has been deposited at NCBI as of December 2006. The cleaned data should be posted here shortly.
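After downloading one of the raw datasets, it can be useful to confirm that the number of reads matches the count stated above. The sketch below assumes the reads are stored as FASTA text (one header line starting with `>` per read), which this page does not state. The function names and the `EXPECTED_READS` table are hypothetical helpers for illustration.

```python
from typing import Iterable

# Read counts stated on this page for the two raw datasets.
EXPECTED_READS = {"050519": 797871, "050930": 1814715}


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records by counting header lines, which start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def matches_expected(dataset: str, lines: Iterable[str]) -> bool:
    """Return True if the number of records equals the stated read count."""
    return count_fasta_records(lines) == EXPECTED_READS[dataset]
```

Because both functions accept any iterable of lines, an open file handle can be passed directly, so a large dataset is streamed rather than loaded into memory.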
Cleaned and assembled sequences
- Cleaned shotgun sequence - we have used LUCY to clean the sequences of vector and low-quality reads.
This information will be posted soon (really).
- Purdue Assembly - several assemblies are available on the
BLAST search page.
- The newest assembly was completed 10 November 2006 (061110) based on 2395957 reads. Have a look at the longest contigs (15) in
the assembly: 061110_long10.fa
or all contigs 061110.fa
- DOE Assembly - 6 May 2005. This assembly is available on the
BLAST search page.
It was assembled from the first 798K reads using unknown methods at JGI.
- Vector sequence
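For readers who download a full assembly such as 061110.fa and want their own list of the longest contigs, the following Python sketch shows one way to do it. It assumes the file is plain FASTA, as the .fa extension suggests. It is not the procedure used to produce 061110_long10.fa, and the function names are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longest_contigs(lines: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n longest contigs as (header, length), longest first."""
    lengths = [(header, len(seq)) for header, seq in read_fasta(lines)]
    return sorted(lengths, key=lambda item: item[1], reverse=True)[:n]
```

Passing an open file handle for the assembly to `longest_contigs` returns the contig names with their lengths. Only the lengths are kept in the sorted list, which keeps the final step small, although each sequence is still read in full to measure it.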
BAC sequences from Banks lab (Purdue)
Contact Jody Banks for details
- Two Selaginella BAC clones
Updated 2007/09/26 14:07:43 by gribskov