Show simple item record

dc.contributor.author: David, G
dc.contributor.author: Bertolotti, A
dc.contributor.author: Layer, R
dc.contributor.author: Scofield, D
dc.contributor.author: Hayward, A
dc.contributor.author: Baril, T
dc.contributor.author: Burnett, HA
dc.contributor.author: Gudmunds, E
dc.contributor.author: Jensen, H
dc.contributor.author: Husby, A
dc.date.accessioned: 2024-03-21T11:22:41Z
dc.date.issued: 2024-03-15
dc.date.updated: 2024-03-21T09:52:28Z
dc.description.abstract: Comprehensive characterisation of structural variation in natural populations has only become feasible in the last decade. To investigate the population genomic nature of structural variation (SV), reproducible and high-confidence SV callsets are first required. We created a population-scale reference of the genome-wide landscape of structural variation across 33 Nordic house sparrow (Passer domesticus) individuals. To produce a consensus callset across all samples using short-read data, we compare heuristic-based quality-filtering and visual curation (Samplot/PlotCritic and Samplot-ML) approaches. We demonstrate that curation of SVs is important for reducing putative false positives and that the time invested in this step outweighs the potential costs of analysing short-read discovered SV datasets that include many potential false positives. We find that even a lenient manual curation strategy (e.g. applied by a single curator) can reduce the proportion of putative false positives by up to 80%, thus enriching the proportion of high-confidence variants. Crucially, in applying a lenient manual curation strategy with a single curator, nearly all (>99%) variants rejected as putative false positives were also classified as such by a more stringent curation strategy using three additional curators. Furthermore, variants rejected by manual curation failed to reflect expected population structure from SNPs, whereas variants passing curation did. Combining heuristic-based quality-filtering with rapid manual curation of structural variants in short-read data can therefore become a time- and cost-effective first step for functional and population genomic studies requiring high-confidence SV callsets. [en_GB]
dc.description.sponsorship: Swedish Research Council [en_GB]
dc.description.sponsorship: Research Council of Norway [en_GB]
dc.description.sponsorship: Department of Ecology and Genetics, Uppsala University [en_GB]
dc.description.sponsorship: Biotechnology and Biological Sciences Research Council (BBSRC) [en_GB]
dc.identifier.citation: Article evae049 [en_GB]
dc.identifier.doi: https://doi.org/10.1093/gbe/evae049
dc.identifier.grantnumber: 2018-05973 [en_GB]
dc.identifier.grantnumber: 23997 [en_GB]
dc.identifier.grantnumber: 223257 [en_GB]
dc.identifier.grantnumber: 302619 [en_GB]
dc.identifier.grantnumber: BB/N020146/1 [en_GB]
dc.identifier.grantnumber: BB/M009122/1 [en_GB]
dc.identifier.uri: http://hdl.handle.net/10871/135598
dc.identifier: ORCID: 0000-0001-7413-718X (Hayward, Alexander)
dc.identifier: ScopusID: 35264146100 (Hayward, Alexander)
dc.language.iso: en [en_GB]
dc.publisher: Oxford University Press / Society for Molecular Biology and Evolution [en_GB]
dc.relation.url: https://doi.org/10.5061/dryad.6q573n647 [en_GB]
dc.relation.url: https://doi.org/10.5281/zenodo.8287680 [en_GB]
dc.rights: © The Author(s) 2024. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. [en_GB]
dc.subject: structural variation [en_GB]
dc.subject: short-read [en_GB]
dc.subject: high-confidence variants [en_GB]
dc.subject: rapid manual curation [en_GB]
dc.subject: curation strategies [en_GB]
dc.subject: putative false positives [en_GB]
dc.title: Calling structural variants with confidence from short-read data in wild bird populations [en_GB]
dc.type: Article [en_GB]
dc.date.available: 2024-03-21T11:22:41Z
dc.identifier.issn: 1759-6653
dc.description: This is the author accepted manuscript. The final version is available on open access from Oxford University Press via the DOI in this record [en_GB]
dc.description: Data Availability: The Illumina reads and assembled reference genome from this article are available at NCBI, Bioproject number PRJNA255814 (P. domesticus reference accession number SAMN02929199). Additional data and script are available at the Dryad database: https://doi.org/10.5061/dryad.6q573n647 [en_GB]
dc.identifier.eissn: 1759-6653
dc.identifier.journal: Genome Biology and Evolution [en_GB]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_GB]
dcterms.dateAccepted: 2024-03-07
dcterms.dateSubmitted: 2022-11-29
rioxxterms.version: AM [en_GB]
rioxxterms.licenseref.startdate: 2024-03-07
rioxxterms.type: Journal Article/Review [en_GB]
refterms.dateFCD: 2024-03-21T11:18:37Z
refterms.versionFCD: AM
refterms.dateFOA: 2024-03-21T11:22:54Z
refterms.panel: A [en_GB]

