dc.contributor.author Słowiński, P
dc.contributor.author Li, M
dc.contributor.author Restrepo, P
dc.contributor.author Alomran, N
dc.contributor.author Spurr, LF
dc.contributor.author Miller, C
dc.contributor.author Tsaneva-Atanasova, K
dc.contributor.author Horvath, A
dc.date.accessioned 2020-09-16T10:08:04Z
dc.date.issued 2020-09-16
dc.description.abstract Variant allele frequencies (VAF) are an important measure of genetic variation that can be estimated at single-nucleotide variant (SNV) sites. RNA and DNA VAFs are used as indicators of a wide range of biological traits, including tumor purity and ploidy changes, allele-specific expression, and gene-dosage transcriptional response. Here we present a novel methodology to assess gene and chromosomal allele asymmetries and to aid in identifying genomic alterations in RNA and DNA datasets. Our approach is based on analysis of the VAF distributions in chromosomal segments (continuous multi-SNV genomic regions). In each segment we estimate variant probability, a parameter of a random process that can generate synthetic VAF samples that closely resemble the observed data. We show that variant probability is a biologically interpretable quantitative descriptor of the VAF distribution in chromosomal segments and that it is consistent with other approaches. To this end, we apply the proposed methodology to data from 72 samples obtained from patients with breast invasive carcinoma (BRCA) from The Cancer Genome Atlas (TCGA). We compare DNA and RNA VAF distributions from matched RNA and whole exome sequencing (WES) datasets and find that both genomic signals give very similar segmentation and estimated variant probability profiles. We also find a correlation between variant probability and copy number alterations (CNA). Finally, to demonstrate a practical application of variant probabilities, we use them to estimate tumor purity. Tumor purity estimates based on variant probabilities demonstrate good concordance with other approaches (Pearson's correlation between 0.44 and 0.76). Our evaluation suggests that variant probabilities can serve as a dependable descriptor of VAF distribution, further enabling the statistical comparison of matched DNA and RNA datasets. They also provide conceptual and mechanistic insights into relations between the structure of VAF distributions and genetic events. The methodology is implemented in a MATLAB toolbox that provides a suite of functions for analysis, statistical assessment and visualization of Genome and Transcriptome allele frequency distributions. GeTallele is available at: https://github.com/SlowinskiPiotr/GeTallele en_GB
dc.description.sponsorship McCormick Genomic and Proteomic Center (MGPC) en_GB
dc.description.sponsorship George Washington University en_GB
dc.description.sponsorship Wellcome Trust en_GB
dc.description.sponsorship Engineering and Physical Sciences Research Council (EPSRC) en_GB
dc.identifier.citation Vol. 8, article 1021 en_GB
dc.identifier.doi 10.3389/fbioe.2020.01021
dc.identifier.grantnumber MGPC_PG2018 en_GB
dc.identifier.grantnumber 204909/Z/16/Z en_GB
dc.identifier.grantnumber EP/N014391/1 en_GB
dc.identifier.uri http://hdl.handle.net/10871/122888
dc.language.iso en en_GB
dc.publisher Frontiers Media en_GB
dc.rights © 2020 Słowiński, Li, Restrepo, Alomran, Spurr, Miller, Tsaneva-Atanasova and Horvath. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. en_GB
dc.subject variant allele fraction (VAF) en_GB
dc.subject RNA—DNA en_GB
dc.subject earth mover's distance (EMD) en_GB
dc.subject circos plot en_GB
dc.subject farey sequence en_GB
dc.title GeTallele: A Method for Analysis of DNA and RNA Allele Frequency Distributions en_GB
dc.type Article en_GB
dc.date.available 2020-09-16T10:08:04Z
dc.identifier.issn 2296-4185
dc.description This is the final version. Available on open access from Frontiers Media via the DOI in this record en_GB
dc.description Data Availability Statement: The data analyzed in this study is subject to the following licenses/restrictions: The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request. Requests to access these datasets should be directed to p.m.slowinski@exeter.ac.uk. en_GB
dc.identifier.journal Frontiers in Bioengineering and Biotechnology en_GB
dc.rights.uri https://creativecommons.org/licenses/by/4.0/ en_GB
dcterms.dateAccepted 2020-08-04
rioxxterms.version VoR en_GB
rioxxterms.licenseref.startdate 2020-09-16
rioxxterms.type Journal Article/Review en_GB
refterms.dateFCD 2020-09-16T10:05:15Z
refterms.versionFCD VoR
refterms.dateFOA 2020-09-16T10:08:08Z
refterms.panel B en_GB

