<html>
<body>
<h1>Welcome</h1>
<p>
The data files listed below are formatted for visualization in the
Integrated Genome Browser, available from <a
href="https://bioviz.org">BioViz.org</a>.
</p>
<p>
Data files are from <a href="http://www.thellungiella.org">thellungiella.org</a>. Files were downloaded in December 2012 and reformatted for visualization in IGB.
</p>
<p>
For information about the genome sequence and annotations, please
contact Maheshi Dassanayake and Dong-Ha Oh. For information about this QuickLoad site or Integrated Genome Browser, contact Ann Loraine.
</p>
<p>
The genome sequence and annotations for Thellungiella parvula (common
name Anatolian cress) were published in the journal Nature Genetics
in 2011. See: <a href="http://www.ncbi.nlm.nih.gov/pubmed/21822265">The genome of the extremophile crucifer Thellungiella parvula</a>.
</p>
<h2>About the files</h2>
<p>
Files with extension .gz were compressed with bgzip and indexed with
tabix, both tools from Heng Li.
</p>
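<p>
Because bgzip output is also a valid gzip stream, the compressed files can
be read sequentially without tabix. The following is a minimal Python
sketch, not part of the tooling used to build this site; the function name
iter_bed_records is an illustrative assumption.
</p>

```python
import gzip


def iter_bed_records(path):
    """Yield each data line of a gzip- or bgzip-compressed BED file as a list of fields.

    Hypothetical helper for illustration only. Blank lines, comments, and
    "track"/"browser" header lines are skipped.
    """
    # gzip.open handles bgzip files because BGZF is a series of gzip members.
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line or line.startswith(("#", "track", "browser")):
                continue
            yield line.split("\t")
```

<p>
For example, list(iter_bed_records("T_parvula_May_2012.bed.gz")) would load
every record into memory, assuming the file is in the current directory.
Random access by region still requires the tabix index.
</p>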
<p>
The BED file T_parvula_May_2012.bed.gz is BED14 (bed detail) format,
developed originally in the UCSC Genome Bioinformatics group and described on their site.
</p>
<p>
In IGB, BED14 field 13 is designated
the "gene name" field and field 14 is designated the description
field. Both can be searched using the coordinates box or the Search
tab once the file is loaded into IGB.
</p>
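<p>
As an illustration of the field layout, the sketch below pulls fields 13
and 14 out of a single BED14 line. It is an assumption-laden example, not
IGB code: the function name and the sample record are made up.
</p>

```python
def bed14_name_and_description(line):
    """Return (gene name, description) from one tab-delimited BED14 line.

    Hypothetical helper: fields are numbered from 1 in the BED documentation,
    so field 13 is index 12 and field 14 is index 13.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 14:
        raise ValueError("expected 14 tab-delimited fields, got %d" % len(fields))
    return fields[12], fields[13]


# Made-up record for illustration; not taken from T_parvula_May_2012.bed.gz.
example = "\t".join([
    "ch1", "100", "500", "example_gene_1", "0", "+",
    "150", "450", "0", "2", "100,200", "0,200",
    "example_gene_1", "hypothetical description text",
])
name, description = bed14_name_and_description(example)
```

<p>
In this made-up record, fields 4 and 13 hold the same identifier, matching
the convention used for the T. parvula file.
</p>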
<p>
T_parvula_May_2012.bed.gz was created from TpV84_ORFs.gff from
thellungiella.org. In this case, BED14 fields 13 and 4 are identical
and field 14 contains the Note attribute from the GFF extra feature
field. The time stamp of the GFF file we used was Dec. 28, 2011 and
the file was downloaded in Dec. 2012. The original file contained
transcription start site and polyadenylation signal features that were not
included in the reformatted BED file because the format does not
support these feature types except as field 2, which represents the start of
transcription for a gene model. Boundaries of CDS, five_prime_UTR, and
three_prime_UTR features were used to delimit exon boundaries in the
BED output. For information about software used to perform the format
conversion, please see log messages associated with this file.
</p>
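<p>
The idea of deriving exon blocks from CDS and UTR boundaries can be
sketched as follows. This is not the software that produced the BED file
(see the log messages for that); it is a minimal illustration that assumes
1-based inclusive GFF coordinates, 0-based half-open BED coordinates, and a
hypothetical function name and input layout.
</p>

```python
def features_to_bed_blocks(features):
    """Merge CDS/UTR features of one gene model into BED block fields.

    features: list of (feature_type, start, end) tuples using GFF
    coordinates (1-based, inclusive). Hypothetical input layout.
    Returns a dict of BED coordinate fields (0-based, half-open).
    """
    if not features:
        raise ValueError("at least one feature is required")
    # Convert to 0-based half-open intervals and sort by start position.
    intervals = sorted((start - 1, end) for _ftype, start, end in features)
    exons = []
    for start, end in intervals:
        # A UTR that abuts or overlaps a CDS belongs to the same exon.
        if exons and exons[-1][1] >= start:
            exons[-1][1] = max(exons[-1][1], end)
        else:
            exons.append([start, end])
    chrom_start = exons[0][0]
    chrom_end = exons[-1][1]
    cds = [(s - 1, e) for ftype, s, e in features if ftype == "CDS"]
    # Assumed convention: with no CDS, thickStart equals thickEnd.
    thick_start = min(s for s, _e in cds) if cds else chrom_start
    thick_end = max(e for _s, e in cds) if cds else chrom_start
    return {
        "chromStart": chrom_start,
        "chromEnd": chrom_end,
        "thickStart": thick_start,
        "thickEnd": thick_end,
        "blockCount": len(exons),
        "blockSizes": ",".join(str(e - s) for s, e in exons),
        "blockStarts": ",".join(str(s - chrom_start) for s, _e in exons),
    }
```

<p>
Given a made-up gene model with a five_prime_UTR at 101-150, CDS features
at 151-200 and 301-400, and a three_prime_UTR at 401-450, the sketch merges
the adjacent features into two exon blocks.
</p>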
<p>
The .2bit file contains the genome sequence for T. parvula assembly
version 2, released in May 2012. The FASTA file was converted to .2bit
format using Jim Kent's faToTwoBit tool.
</p>
</body>
</html>