Contents of /quickload/T_parvula_May_2012/HEADER.md

Revision 60
Thu Sep 27 16:44:53 2018 UTC by aloraine
File size: 2406 byte(s)
Add Thellungiella parvula (salt cress) v2, May 2012 genome assembly
<html>
<body>
<h1>Welcome</h1>
<p>
The data files listed below are formatted for visualization in the
Integrated Genome Browser (IGB), available from <a
href="https://bioviz.org">BioViz.org</a>.
</p>
<p>
Data files are from <a href="http://www.thellungiella.org">thellungiella.org</a>. Files were downloaded in December 2012 and reformatted for visualization in IGB.
</p>
<p>
For information about the genome sequence and annotations, please
contact Maheshi Dassanayake and Dong-Ha Oh. For information about this QuickLoad site or Integrated Genome Browser, contact Ann Loraine.
</p>
<p>
The genome sequence and annotations for Thellungiella parvula (common
name Anatolian cress) were published in the journal Nature Genetics
in 2011. See: <a href="http://www.ncbi.nlm.nih.gov/pubmed/21822265">The genome of the extremophile crucifer Thellungiella parvula</a>.
</p>
<h2>About the files</h2>
<p>
Files with extension .gz were compressed and indexed using the bgzip
and tabix tools from Heng Li.
</p>
<p>
The BED file T_parvula_May_2012.bed.gz is in BED14 (bed detail) format,
originally developed by the UCSC Genome Bioinformatics group and described on their site.
</p>
<p>
In IGB, BED14 field 13 is designated
the "gene name" field and field 14 is designated the description
field. Both can be searched using the coordinates box or the Search
tab once the file is loaded into IGB.
</p>
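<p>
To illustrate the field layout described above, here is a minimal Python
sketch that reads one BED14 line and searches fields 13 and 14. The names
Bed14Record, parse_bed14_line, and matches are hypothetical, as are the
example values; this is not how IGB itself parses or searches the file.
</p>

```python
from typing import NamedTuple


class Bed14Record(NamedTuple):
    """Subset of a BED14 (bed detail) line; the class name is hypothetical."""
    chrom: str
    start: int        # field 2, 0-based start
    end: int          # field 3, end (exclusive)
    name: str         # field 4
    gene_name: str    # field 13, shown by IGB as the "gene name"
    description: str  # field 14, shown by IGB as the description


def parse_bed14_line(line: str) -> Bed14Record:
    """Split one tab-delimited BED14 line and pick out the fields of interest."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 14:
        raise ValueError(f"expected 14 tab-separated fields, got {len(fields)}")
    return Bed14Record(
        chrom=fields[0],
        start=int(fields[1]),
        end=int(fields[2]),
        name=fields[3],
        gene_name=fields[12],
        description=fields[13],
    )


def matches(record: Bed14Record, query: str) -> bool:
    """Case-insensitive substring search over fields 13 and 14."""
    q = query.lower()
    return q in record.gene_name.lower() or q in record.description.lower()


# Made-up example line; the identifiers and values are illustrative only.
example = "\t".join([
    "chr1", "100", "560", "gene0001", "0", "+",
    "150", "500", "0", "2", "200,160,", "0,300,",
    "gene0001", "hypothetical protein",
])
record = parse_bed14_line(example)
```

<p>
With this example line, record.gene_name equals record.name, mirroring the
statement that fields 13 and 4 are identical in T_parvula_May_2012.bed.gz.
</p>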
<p>
T_parvula_May_2012.bed.gz was created from TpV84_ORFs.gff from
thellungiella.org. In this case, BED14 fields 13 and 4 are identical
and field 14 contains the Note attribute from the GFF attributes
field. The time stamp of the GFF file we used was Dec. 28, 2011, and
the file was downloaded in Dec. 2012. The original file contained
transcription start site and polyadenylation signal features that were
not included in the reformatted BED file because the format does not
support these feature types except as field 2, which represents the
start of transcription for a gene model. Boundaries of CDS,
five_prime_UTR, and three_prime_UTR features were used to delimit exon
boundaries in the BED output. For information about the software used
to perform the format conversion, please see the log messages
associated with this file.
</p>
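<p>
As a rough illustration of how CDS and UTR boundaries can delimit exons,
the Python sketch below merges the CDS, five_prime_UTR, and
three_prime_UTR intervals of one gene model into BED blocks. It is not the
software actually used for this conversion (see the log messages for
that). The function name exon_blocks is hypothetical, and the sketch
assumes GFF coordinates are 1-based and inclusive while BED coordinates
are 0-based with an exclusive end.
</p>

```python
from typing import List, Tuple


def exon_blocks(
    features: List[Tuple[int, int]]
) -> Tuple[int, int, List[int], List[int]]:
    """Merge CDS/UTR intervals of one gene model into BED exon blocks.

    features: (start, end) pairs in GFF coordinates (1-based, inclusive).
    Returns (chromStart, chromEnd, blockSizes, blockStarts) in BED
    coordinates, with blockStarts relative to chromStart.
    """
    if not features:
        raise ValueError("at least one CDS or UTR feature is required")

    # Convert to 0-based, half-open intervals and sort by start position.
    intervals = sorted((start - 1, end) for start, end in features)

    # Merge intervals that overlap or touch, so that a UTR directly
    # adjacent to a CDS becomes part of the same exon.
    merged: List[List[int]] = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    chrom_start = merged[0][0]
    chrom_end = merged[-1][1]
    block_sizes = [end - start for start, end in merged]
    block_starts = [start - chrom_start for start, _ in merged]
    return chrom_start, chrom_end, block_sizes, block_starts


# Made-up gene model: 5' UTR, two CDS segments, 3' UTR.
blocks = exon_blocks([(101, 150), (151, 300), (401, 500), (501, 560)])
```

<p>
In this made-up example the four features collapse into two exons, because
each UTR directly abuts a CDS segment.
</p>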
<p>
The .2bit file contains the genome sequence for T. parvula assembly
version 2, released May 2012. The FASTA file was converted to .2bit
format using Jim Kent's faToTwoBit tool.
</p>
</body>
</html>
