dc.contributor.author | Papanicolaou, A | |
dc.contributor.author | Stierli, R | |
dc.contributor.author | Ffrench-Constant, RH | |
dc.contributor.author | Heckel, DG | |
dc.date.accessioned | 2016-06-13T08:47:51Z | |
dc.date.issued | 2009-12-24 | |
dc.description.abstract | BACKGROUND: The decreasing costs of capillary-based Sanger sequencing and next generation technologies, such as 454 pyrosequencing, have prompted an explosion of transcriptome projects in non-model species, where even shallow sequencing of transcriptomes can now be used to examine a range of research questions. This rapid growth in data has outstripped the ability of researchers working on non-model species to analyze and mine transcriptome data efficiently. RESULTS: Here we present a semi-automated platform 'est2assembly' that processes raw sequence data from Sanger or 454 sequencing into a hybrid de-novo assembly, annotates it and produces GMOD compatible output, including a SeqFeature database suitable for GBrowse. Users are able to parameterize assembler variables, judge assembly quality and determine the optimal assembly for their specific needs. We used est2assembly to process Drosophila and Bicyclus public Sanger EST data and then compared them to published 454 data as well as eight new insect transcriptome collections. CONCLUSIONS: Analysis of such a wide variety of data allows us to understand how these new technologies can assist EST project design. We determine that assembler parameterization is as essential as standardized methods to judge the output of EST projects. Further, even shallow sequencing using 454 produces sufficient data to be of wide use to the community. est2assembly is an important tool to assist manual curation for gene models, an important resource in their own right but especially for species which are due to acquire a genome project using Next Generation Sequencing. | en_GB |
dc.description.sponsorship | We would like to thank Karl Gordon (CSIRO) for helping with end-user testing, two anonymous referees for improving the manuscript and the following for making pre-publication data available: Chris Jiggins and his laboratory (Univ. of Cambridge), Owen McMillan and his laboratory (State Univ. of N. Carolina), Yannick Pauchet and Iva Fuková (Univ. of Exeter). Further, Bastien Chevreux provided development versions of MIRA and excellent support, Jose Blanca provided sff_extract, James Wasmuth provided support for prot4EST, Ralf Schmid for annot8r, Derek Huntley for SEAN and Steffi Gebauer-Jung for TrimbyWindow. David Clements and Scott Cain helped with Chado and GBrowse. We also thank the TU-Dresden Deimos PC-Farm for computational support. The authors report no conflicting interests. AP was supported by the Max Planck Gesellschaft and the European Union Research Network GAMEXP; DGH was supported by the Max Planck Gesellschaft; RHfC was supported by the European Union Research Network EMBEK1. | en_GB |
dc.identifier.citation | Vol. 10, article 447 | en_GB |
dc.identifier.doi | 10.1186/1471-2105-10-447 | |
dc.identifier.uri | http://hdl.handle.net/10871/22012 | |
dc.language.iso | en | en_GB |
dc.publisher | BioMed Central | en_GB |
dc.relation.url | http://www.ncbi.nlm.nih.gov/pubmed/20034392 | en_GB |
dc.rights | © 2009 Papanicolaou et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. | en_GB |
dc.title | Next generation transcriptomes for next generation genomes using est2assembly | en_GB |
dc.type | Article | en_GB |
dc.date.available | 2016-06-13T08:47:51Z | |
dc.identifier.issn | 1471-2105 | |
exeter.place-of-publication | England | |
dc.description | This is the final version. Available on open access from BMC via the DOI in this record. | en_GB |
dc.identifier.journal | BMC Bioinformatics | en_GB |