Assessing performance of pathogenicity predictors using clinically relevant variant datasets

Gunning, AC; Fryer, V; Fasham, J; Crosby, AH; Ellard, S; Baple, EL; Wright, CF

dc.contributor.author	Gunning, AC
dc.contributor.author	Fryer, V
dc.contributor.author	Fasham, J
dc.contributor.author	Crosby, AH
dc.contributor.author	Ellard, S
dc.contributor.author	Baple, EL
dc.contributor.author	Wright, CF
dc.date.accessioned	2020-09-02T14:22:05Z
dc.date.issued	2020-08-25
dc.description.abstract	Background: Pathogenicity predictors are integral to genomic variant interpretation but, despite their widespread usage, an independent validation of performance using a clinically relevant dataset has not been undertaken. Methods: We derive two validation datasets: an ‘open’ dataset containing variants extracted from publicly available databases, similar to those commonly applied in previous benchmarking exercises, and a ‘clinically representative’ dataset containing variants identified through research/diagnostic exome and panel sequencing. Using these datasets, we evaluate the performance of three recent meta-predictors, REVEL, GAVIN and ClinPred, and compare their performance against two commonly used in silico tools, SIFT and PolyPhen-2. Results: Although the newer meta-predictors outperform the older tools, the performance of all pathogenicity predictors is substantially lower in the clinically representative dataset. Using our clinically relevant dataset, REVEL performed best, with an area under the receiver operating characteristic curve of 0.82. A concordance-based approach that relies on a consensus of multiple tools reduces performance, owing both to discordance between tools and to false concordance, where tools make the same misclassifications. Analysis of tool feature usage may give insight into tool performance and misclassification. Conclusion: Our results support the adoption of meta-predictors over traditional in silico tools, but do not support a consensus-based approach as in current practice.	en_GB
dc.description.sponsorship	Wellcome Trust	en_GB
dc.description.sponsorship	Department of Health	en_GB
dc.description.sponsorship	Health Innovation Challenge Fund	en_GB
dc.identifier.citation	Published online 25 August 2020	en_GB
dc.identifier.doi	10.1136/jmedgenet-2020-107003
dc.identifier.grantnumber	WT200990/Z/16/Z	en_GB
dc.identifier.grantnumber	WT200990/A/16/Z	en_GB
dc.identifier.grantnumber	HICF-1009-003	en_GB
dc.identifier.grantnumber	WT098051	en_GB
dc.identifier.uri	http://hdl.handle.net/10871/122685
dc.language.iso	en	en_GB
dc.publisher	BMJ Publishing Group	en_GB
dc.rights	© Author(s) (or their employer(s)) 2020. Re-use permitted under CC BY. Published by BMJ. This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.	en_GB
dc.title	Assessing performance of pathogenicity predictors using clinically relevant variant datasets	en_GB
dc.type	Article	en_GB
dc.date.available	2020-09-02T14:22:05Z
dc.identifier.issn	0022-2593
dc.description	This is the final version. Available on open access from BMJ Publishing Group via the DOI in this record	en_GB
dc.identifier.journal	Journal of Medical Genetics	en_GB
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	en_GB
dcterms.dateAccepted	2020-06-20
exeter.funder	Wellcome Trust	en_GB
rioxxterms.version	VoR	en_GB
rioxxterms.licenseref.startdate	2020-06-20
rioxxterms.type	Journal Article/Review	en_GB
refterms.dateFCD	2020-09-02T14:17:35Z
refterms.versionFCD	VoR
refterms.dateFOA	2020-09-02T14:22:10Z
refterms.panel	A	en_GB

Files in this item

Name: jmedgenet-2020-107003.full.pdf
Size: 2.289 MB
Format: PDF


