<html>
<body>
<h1>Welcome</h1>
<p>
The data files listed below are formatted for visualization in the
Integrated Genome Browser, available from <a
href="https://bioviz.org">BioViz.org</a>.
</p>
<p>
Data files are from <a href="http://www.thellungiella.org">thellungiella.org</a>. Files were downloaded in December 2012 and reformatted for visualization in IGB.
</p>
<p>
For information about the genome sequence and annotations, please
contact Maheshi Dassanayake and Dong-Ha Oh. For information about this QuickLoad site or Integrated Genome Browser, contact Ann Loraine.
</p>
<p>
The genome sequence and annotations for Thellungiella parvula (common
name Anatolian cress) were published in the journal Nature Genetics
in 2011. See: <a href="http://www.ncbi.nlm.nih.gov/pubmed/21822265">The genome of the extremophile crucifer Thellungiella parvula</a>.
</p>
<h2>About the files</h2>
<p>
Files with the extension .gz were compressed with bgzip and indexed
with tabix, tools from Heng Li.
</p>
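<p>
Because bgzip output is gzip-compatible, the compressed files can also be
read outside IGB with ordinary gzip tools. The following is a minimal
sketch using only the Python standard library. The function name
read_bed_lines is made up for illustration, and the sketch reads the file
from start to finish; it does not use the tabix index, which requires a
tabix-aware library.
</p>
<pre>
```python
import gzip


def read_bed_lines(path):
    """Yield data lines from a bgzip- or gzip-compressed BED file.

    Blank lines, comment lines, and track or browser header lines
    are skipped. This is an illustrative helper, not part of IGB.
    """
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line and not line.startswith(("#", "track", "browser")):
                yield line
```
</pre>
<p>
For example, passing the path of a downloaded copy of
T_parvula_May_2012.bed.gz to read_bed_lines would yield one
tab-separated string per gene model.
</p>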
<p>
The BED file T_parvula_May_2012.bed.gz is in BED14 (BED detail) format,
originally developed by the UCSC Genome Bioinformatics group and described on their site.
</p>
<p>
In IGB, BED14 field 13 is designated
the "gene name" field and field 14 is designated the description
field. Both can be searched using the coordinates box or the Search
tab once the file is loaded into IGB.
</p>
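<p>
To make the field numbering concrete, here is a small sketch of how one
BED14 line could be split into named values. It assumes tab-separated
fields numbered from 1 as in the text above, so field 13 is at index 12
and field 14 at index 13. The names Bed14Record and parse_bed14_line are
hypothetical and are not part of IGB.
</p>
<pre>
```python
from typing import NamedTuple


class Bed14Record(NamedTuple):
    """A subset of the 14 BED detail fields (illustrative only)."""
    chrom: str        # field 1
    start: int        # field 2, 0-based start
    end: int          # field 3, end (exclusive)
    name: str         # field 4
    gene_name: str    # field 13, shown as "gene name" in IGB
    description: str  # field 14, shown as the description in IGB


def parse_bed14_line(line):
    """Split one tab-separated BED14 line into a Bed14Record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 14:
        raise ValueError(
            f"expected 14 tab-separated fields, found {len(fields)}"
        )
    return Bed14Record(
        chrom=fields[0],
        start=int(fields[1]),
        end=int(fields[2]),
        name=fields[3],
        gene_name=fields[12],
        description=fields[13],
    )
```
</pre>
<p>
A line with fewer or more than 14 fields raises a ValueError rather than
being silently truncated, which makes malformed rows easier to spot.
</p>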
<p>
T_parvula_May_2012.bed.gz was created from TpV84_ORFs.gff from
thellungiella.org. In this case, BED14 fields 13 and 4 are identical
and field 14 contains the Note attribute from the GFF extra feature
field. The time stamp of the GFF file we used was Dec. 28, 2011, and
the file was downloaded in Dec. 2012. The original file contained
transcription start sites and polyadenylation signals that were not
included in the reformatted BED file because the format does not
support these feature types except as field 2, which represents the start of
transcription for a gene model. Boundaries of CDS, five_prime_UTR, and
three_prime_UTR features were used to delimit exon boundaries in the
BED output. For information about the software used to perform the format
conversion, please see the log messages associated with this file.
</p>
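<p>
The exon step described above can be sketched as follows. This is not the
conversion software that was actually used; it is an illustration that
assumes GFF coordinates are 1-based and inclusive, BED coordinates are
0-based with an exclusive end, and that CDS and UTR features which touch
or overlap belong to the same exon. The function names merge_into_exons
and bed_blocks are made up for this example.
</p>
<pre>
```python
def merge_into_exons(features):
    """Merge CDS and UTR intervals of one gene model into exons.

    features: list of (start, end) pairs in GFF coordinates
    (1-based, inclusive). Returns a sorted list of (start, end)
    pairs in BED coordinates (0-based, end exclusive).
    """
    # Convert to BED coordinates and sort by start position.
    intervals = sorted((start - 1, end) for start, end in features)
    exons = []
    for start, end in intervals:
        # min(...) == start holds when this interval begins at or
        # before the previous exon's end, i.e. it touches or overlaps.
        if exons and min(start, exons[-1][1]) == start:
            exons[-1] = (exons[-1][0], max(exons[-1][1], end))
        else:
            exons.append((start, end))
    return exons


def bed_blocks(exons):
    """Return (blockCount, blockSizes, blockStarts) for BED fields 10-12."""
    chrom_start = exons[0][0]
    sizes = [end - start for start, end in exons]
    starts = [start - chrom_start for start, _ in exons]
    return len(exons), sizes, starts
```
</pre>
<p>
In this sketch a five_prime_UTR at 101-150 followed directly by a CDS at
151-300 collapses into a single exon, because the two features are
adjacent once converted to BED coordinates.
</p>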
<p>
The .2bit file contains the genome sequence for T. parvula assembly
version 2, released in May 2012. The FASTA file was converted to .2bit
format using Jim Kent's faToTwoBit tool.
</p>
</body>
</html>