Bioinformatics of next generation sequencing approaches: using 454 and Illumina data to look at insect genomes and transcriptomes
Thesis or dissertation
University of Exeter
Reason for embargo
I need to publish two research papers (thesis - chapter 4 and chapter 5)
By providing a rapid and cost-effective means of generating sequencing resources for almost any organism, next generation sequencing (NGS) technologies have great potential to help address numerous gene- and genome-level questions in molecular biology. Progress in NGS is exponentially increasing sequence throughput, and large-scale studies in the genomics and transcriptomics of non-model organisms are becoming a reality. The main focus of the work presented in this thesis is therefore the analysis of large-scale non-model insect datasets generated by NGS technologies and their potential to develop functional genomics tools for these species. Four different NGS datasets from four very different insects, the Greenhouse whitefly (Trialeurodes vaporariorum), the Passionvine butterfly (Heliconius melpomene), the blowfly (Lucilia sericata) and the Green Dock beetle (Gastrophysa viridula), were analysed and annotated. Molecular research in these insects has been hindered in the past by limited nucleotide sequence information. Transcriptome data generated by 454 pyrosequencing were used as a starting point to study the genomics of these ecologically and economically important non-model insect species. The resulting transcriptomes were annotated for gene families involved in xenobiotic metabolism, namely the glutathione-S-transferases (GSTs), cytochrome P450s (P450s) and carboxylesterases (CCEs). In each case the number and diversity of gene family members are discussed in relation to those documented in other insects. In the case of H. melpomene, the transcriptome data were also used to complement the genomic research by identifying and validating cytochrome P450 gene models in the recently sequenced genome. Furthermore, Illumina-generated RNA-seq data were used for SNP characterisation in L. sericata.
Transcriptome sequencing is shown to be a useful and cost-effective technique for enhancing the resources available for non-model organisms, as well as for gene discovery in the absence of reference genomic resources. By focusing on genes involved in xenobiotic metabolism, this thesis has identified numerous candidate genes potentially involved in important processes such as insecticide resistance (Lucilia and Trialeurodes) and host plant exploitation (Gastrophysa and Heliconius). NGS technologies and bioinformatics can thus open up avenues to develop functional genomics resources for diverse species of interest to ecologists and evolutionary biologists.
PhD in Biological Sciences