Dataset Papers in Biology
Volume 2013 (2013), Article ID 670926, 7 pages
http://dx.doi.org/10.7167/2013/670926
Dataset Paper

Transcriptome Assembly and Expression Data from Normal and Mantled Oil Palm Fruit

1Genome Institute, National Center for Genetic Engineering and Biotechnology, 113 Phahonyothin Road, Klong 1, Klong Luang, Pathumthani 12120, Thailand
2Department of Genetics, Kasetsart University, 50 Phahonyothin Road, Chatuchak, Bangkok 10900, Thailand
3Department of Agronomy, Kasetsart University, Kamphaeng Saen, Nakhon Pathom 73140, Thailand

Received 27 March 2012; Accepted 18 April 2012

Academic Editors: F. Coppedè and T. Yin

This dataset has been dedicated to the public domain using the CC0 waiver.

Dataset http://dx.doi.org/10.7167/2013/670926/dataset

Dataset

Dataset Item 1 (Table). A list of isotigs (which represent mRNAs), their Blast2GO results, expression results, and number of differences identified. The column Sequence Name presents the isotig number designated by the assembly program; Length, the length of the isotig sequence; Normal Read Count, the sum of the number of sequences that mapped to each contig in the isotig, derived from the readstatus.txt output file; Normal Expression Corrected, the number of reads per 1 kb of sequence; Sequence Description, the Blast2GO sequence description; At, the highest-matching Arabidopsis thaliana protein from a blastx search against the TAIR Arabidopsis database with an E-value cut-off of 1E-30 (E-value not shown); Minimum E-value, the E-value of the highest-matching sequence in Blast2GO; Mean Similarity, the mean similarity of the highest-matching sequence from Blast2GO; #GOs, the number of gene ontology terms of the highest-matching sequence from Blast2GO; GOs, the gene ontology terms assigned to the highest-matching sequence from Blast2GO; and Enzyme Codes, the enzyme code of the highest-matching sequence from Blast2GO.

  • Column 1: Sequence Name
  • Column 2: Length
  • Column 3: Normal Read Count
  • Column 4: Normal Expression Corrected
  • Column 5: Sequence Description
  • Column 6: At
  • Column 7: Minimum E-value
  • Column 8: Mean Similarity
  • Column 9: #GOs
  • Column 10: GOs
  • Column 11: Enzyme Codes
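The expression column of Dataset Item 1 is described as reads per 1 kb of sequence. A minimal sketch of that normalization follows, assuming that Length is given in base pairs and that no further correction is applied (the table's "Corrected" label may imply an additional step that is not described here). The function name `reads_per_kb` and the example rows are hypothetical and are not taken from the dataset.

```python
def reads_per_kb(read_count: int, length_bp: int) -> float:
    """Return reads per 1 kb of sequence: read_count / (length_bp / 1000)."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return read_count * 1000.0 / length_bp


# Hypothetical rows: the values are invented for illustration only.
rows = [
    {"Sequence Name": "isotig_example_1", "Length": 2000, "Normal Read Count": 50},
    {"Sequence Name": "isotig_example_2", "Length": 500, "Normal Read Count": 10},
]

for row in rows:
    # Normalize the raw read count by the isotig length.
    row["reads_per_kb"] = reads_per_kb(row["Normal Read Count"], row["Length"])
    print(row["Sequence Name"], row["reads_per_kb"])
```

Normalizing by length in this way makes read counts comparable between long and short isotigs, since a longer transcript collects more reads at the same abundance.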

Dataset Item 2 (Table). A list of isogroups (which represent genes), the most common Blast2GO description, expression results, and number of differences identified. The column Gene presents the isogroup number designated by the assembly program; Normal Read Count, the sum of the number of sequences that mapped to each contig in the isogroup, derived from the readstatus.txt output file; and Description(s), the most common sequence description (multiple descriptions are listed for isogroups without a consensus description).

  • Column 1: Gene
  • Column 2: Normal Read Count
  • Column 3: Description(s)
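The Description(s) column of Dataset Item 2 holds the most common isotig description per isogroup, with several descriptions listed when there is no consensus. One way to sketch that rule is shown below. It assumes that "no consensus" means a tie for the highest count, which is an interpretation rather than a documented criterion. The function name `consensus_descriptions` and the example strings are hypothetical.

```python
from collections import Counter


def consensus_descriptions(descriptions):
    """Return the most frequent description(s) among an isogroup's isotigs.

    A single-element list means one description is the most common;
    several elements mean a tie (assumed here to be "no consensus").
    """
    counts = Counter(descriptions)
    if not counts:
        return []
    top = max(counts.values())
    # Sort so the output order is deterministic.
    return sorted(desc for desc, n in counts.items() if n == top)


# Hypothetical isotig descriptions for two isogroups.
print(consensus_descriptions(["protein a", "protein a", "protein b"]))
print(consensus_descriptions(["protein a", "protein b"]))
```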

Dataset Item 3 (Table). A list of the isotigs that were identified as transcription factors. The column Isotig presents the isotig identified as a transcription factor; Normal Read Count, the sum of the number of sequences that mapped to each contig in the isogroup, derived from the readstatus.txt output file; Sequence Description, the description of the isotig from Blast2GO; #GOs, the number of GO terms; GOs, the GO terms; and Enzyme Codes, the enzyme code for the closest matched protein.

  • Column 1: Isotig
  • Column 2: Isogroup
  • Column 3: Identification Method
  • Column 4: Normal Read Count
  • Column 5: Sequence Description
  • Column 6: #GOs
  • Column 7: GOs
  • Column 8: Enzyme Codes

Dataset Item 4 (Table). The isotigs that were identified by blasting the protein sequences from the plant transcription factor database against the isotig sequences using tblastx. The column Isotig presents the isotig identified as a transcription factor; Species, the species with the highest match for that isotig; Database Match, the protein ID, with the species name appended, of the highest-matching transcription factor; E-value, the E-value of the highest-matching transcription factor; Percentage Identity, the percentage identity of the highest-matching transcription factor; Alignment Length, the alignment length of the highest-matching transcription factor; Mismatches, the number of mismatches in the alignment of the highest-matching transcription factor; Gap Openings, the number of gap openings in the alignment of the highest-matching transcription factor; Description, the description of the isotig from Blast2GO; and Normal Read Count, the sum of the number of sequences that mapped to each contig in the isogroup, derived from the readstatus.txt output file.

  • Column 1: Isotig
  • Column 2: Identification Method
  • Column 3: Species
  • Column 4: Database Match
  • Column 5: E-value
  • Column 6: Percentage Identity (%)
  • Column 7: Alignment Length
  • Column 8: Mismatches
  • Column 9: Gap Openings
  • Column 10: Description
  • Column 11: Normal Read Count
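Dataset Item 4 reports one highest match per isotig. As a rough sketch of how such a table could be reduced from raw BLAST hits, the code below keeps the hit with the lowest E-value for each isotig. The selection criterion (lowest E-value), the optional `max_evalue` threshold, the function name `best_hit_per_isotig`, and the example hits are all assumptions made for illustration; the source does not state how the highest match was chosen.

```python
def best_hit_per_isotig(hits, max_evalue=None):
    """Keep, for each isotig, the hit with the lowest E-value.

    hits: iterable of dicts with keys "isotig", "match", and "evalue".
    max_evalue: optional threshold; hits above it are discarded.
    """
    best = {}
    for hit in hits:
        if max_evalue is not None and hit["evalue"] > max_evalue:
            continue
        current = best.get(hit["isotig"])
        # A smaller E-value indicates a more significant match.
        if current is None or hit["evalue"] < current["evalue"]:
            best[hit["isotig"]] = hit
    return best


# Hypothetical hits: identifiers and values are invented for illustration.
hits = [
    {"isotig": "isotig_A", "match": "TF1_species1", "evalue": 1e-20},
    {"isotig": "isotig_A", "match": "TF2_species2", "evalue": 1e-45},
    {"isotig": "isotig_B", "match": "TF3_species1", "evalue": 1e-5},
]

for isotig, hit in best_hit_per_isotig(hits, max_evalue=1e-10).items():
    print(isotig, hit["match"], hit["evalue"])
```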

Dataset Item 5 (Table). The isotig matches to the sequences described by Tranbarger et al. (2011) as transcription factors/regulators. The column Tranbarger presents the sequence from the Tranbarger et al. (2011) dataset; Isotig Match, the closest isotig match identified by Blast; Normal Read Count, the sum of the number of sequences that mapped to each contig in the isogroup, derived from the readstatus.txt output file; and Description, the description of the isotig from Blast2GO.

  • Column 1: Tranbarger
  • Column 2: Isotig Match
  • Column 3: Normal Read Count
  • Column 4: Description

Dataset Item 6 (Table). The accession for each sequence uploaded to GenBank.

  • Column 1: Sequence
  • Column 2: Accession

Dataset Item 7 (Table). The sequence information for 34 sequences that were too short (<200 bp) for GenBank to accept.

  • Column 1: Isotig
  • Column 2: Isogroup
  • Column 3: Length
  • Column 4: Sequence