Heuristic discovery and design of promoters for the fine-control of metabolism in industrially relevant microbes
Gilman, James
Date: 25 April 2018
Publisher
University of Exeter
Degree Title
PhD in Biological Sciences
Abstract
Predictable, robust genetic parts including constitutive promoters are one
of the defining attributes of synthetic biology. Ideally, candidate promoters
should cover a broad range of expression strengths and yield homogeneous
output, whilst also being orthogonal to endogenous regulatory pathways.
However, such libraries are not always readily available in non-model
organisms, such as the industrially relevant genus Geobacillus.
A multitude of different approaches are available for the identification and
de novo design of prokaryotic promoters, although it may be unclear which
methodology is most practical in an industrial context. Endogenous promoters
may be individually isolated from upstream of well-understood genes, or
bioinformatically identified en masse. Alternatively, pre-existing promoters may
be mutagenised, or mathematical abstraction can be used to model promoter
strength and design de novo synthetic regulatory sequences.
In this investigation, bioinformatic, mathematical and mutagenic
approaches to promoter discovery were directly compared. Hundreds of
previously uncharacterised putative promoters were bioinformatically identified
from the core genome of four Geobacillus species, and a rational sampling
method was used to select sequences for in vivo characterisation. A library of
95 promoters covered a 2-log range of expression strengths when
characterised in vivo using fluorescent reporter proteins. Data derived from this
experimental characterisation were used to train Artificial Neural Network,
Partial Least Squares and Random Forest statistical models, which quantitatively
inferred the relationship between DNA sequence and function. The resulting
models showed limited predictive power but good descriptive power. In particular, the
models highlighted the importance of sequences upstream of the canonical -35
and -10 motifs for determining promoter function in Geobacillus.
Additionally, two commonly used mutagenic techniques for promoter
production, Saturation Mutagenesis of Flanking Regions and error-prone PCR,
were applied. The resulting sequence libraries showed limited promoter activity,
underlining the difficulty of deriving synthetic promoters in species where
understanding of transcription regulation is limited. As such, bioinformatic
identification and deep characterisation of endogenous promoter elements was
posited as the most practical approach for the derivation of promoter libraries in
non-model organisms of industrial interest.
Doctoral Theses
Doctoral College