Identifying clinical features in primary care electronic health record studies: methods for codelist development

Watson, J; Nicholson, BD; Hamilton, W; Price, S

dc.contributor.author	Watson, J
dc.contributor.author	Nicholson, BD
dc.contributor.author	Hamilton, W
dc.contributor.author	Price, S
dc.date.accessioned	2017-12-05T14:53:02Z
dc.date.issued	2017-11-22
dc.description.abstract	OBJECTIVE: Analysis of routinely collected electronic health record (EHR) data from primary care is reliant on the creation of codelists to define clinical features of interest. To improve scientific rigour, transparency and replicability, we describe and demonstrate a standardised reproducible methodology for clinical codelist development. DESIGN: We describe a three-stage process for developing clinical codelists. First, the clear definition a priori of the clinical feature of interest using reliable clinical resources. Second, development of a list of potential codes using statistical software to comprehensively search all available codes. Third, a modified Delphi process to reach consensus between primary care practitioners on the most relevant codes, including the generation of an 'uncertainty' variable to allow sensitivity analysis. SETTING: These methods are illustrated by developing a codelist for shortness of breath in a primary care EHR sample, including modifiable syntax for commonly used statistical software. PARTICIPANTS: The codelist was used to estimate the frequency of shortness of breath in a cohort of 28 216 patients aged over 18 years who received an incident diagnosis of lung cancer between 1 January 2000 and 30 November 2016 in the Clinical Practice Research Datalink (CPRD). RESULTS: Of 78 candidate codes, 29 were excluded as inappropriate. Complete agreement was reached for 44 (90%) of the remaining codes, with partial disagreement over 5 (10%). 13 091 episodes of shortness of breath were identified in the cohort of 28 216 patients. Sensitivity analysis demonstrates that codes with the greatest uncertainty tend to be rarely used in clinical practice. CONCLUSIONS: Although initially time consuming, using a rigorous and reproducible method for codelist generation 'future-proofs' findings and an auditable, modifiable syntax for codelist generation enables sharing and replication of EHR studies. Published codelists should be badged by quality and report the methods of codelist generation including: definitions and justifications associated with each codelist; the syntax or search method; the number of candidate codes identified; and the categorisation of codes after Delphi review.	en_GB
dc.description.sponsorship	JW (DRF-2016-09-034) and BDN (DRF-2015-08-18) are both funded by Doctoral Research Fellowships from the National Institute for Health Research Trainees Coordinating Centre. WH is part-funded by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care South West Peninsula at the Royal Devon and Exeter NHS Foundation Trust.	en_GB
dc.identifier.citation	Vol. 7, article e019637	en_GB
dc.identifier.doi	10.1136/bmjopen-2017-019637
dc.identifier.uri	http://hdl.handle.net/10871/30584
dc.language.iso	en	en_GB
dc.publisher	BMJ Publishing Group	en_GB
dc.relation.source	CPRD data on which the sensitivity analysis was based is held securely by University of Exeter Medical School under the CPRD data access licence (https://www.cprd.com/dataAccess/).	en_GB
dc.relation.url	https://www.ncbi.nlm.nih.gov/pubmed/29170293	en_GB
dc.rights	© Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted. This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: http://creativecommons.org/licenses/by/4.0/	en_GB
dc.subject	clinical coding	en_GB
dc.subject	electronic health records	en_GB
dc.subject	epidemiology	en_GB
dc.subject	primary care	en_GB
dc.title	Identifying clinical features in primary care electronic health record studies: methods for codelist development	en_GB
dc.type	Article	en_GB
dc.date.available	2017-12-05T14:53:02Z
exeter.place-of-publication	England	en_GB
dc.description	This is the final version of the article. Available from BMJ Publishing Group via the DOI in this record.	en_GB
dc.identifier.journal	BMJ Open	en_GB
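
Illustrative sketch (not part of the deposited record): the second and third stages described in the abstract, a keyword search over all available codes followed by reviewer consensus with an 'uncertainty' flag, could look roughly like the Python below. This is not the authors' published syntax, which targets statistical software. The code dictionary, search terms, reviewer votes and the rule that a code is dropped only when every reviewer excludes it are all assumptions made for the example.

```python
# Illustrative sketch only: not the authors' published syntax.
# Every code, description, search term and vote below is invented.

# Hypothetical extract of a clinical code dictionary: (code, description).
CODE_DICTIONARY = [
    ("X001", "Shortness of breath"),
    ("X002", "Breathlessness on exertion"),
    ("X003", "Dyspnoea"),
    ("X004", "Breath test for H. pylori"),
    ("X005", "Cough"),
]

# Hypothetical search terms for the clinical feature of interest.
SEARCH_TERMS = ["breath", "dyspn"]


def search_codes(dictionary, terms):
    """Stage 2: return every entry whose description contains any term."""
    lowered = [term.lower() for term in terms]
    return [
        (code, description)
        for code, description in dictionary
        if any(term in description.lower() for term in lowered)
    ]


def classify(votes):
    """Stage 3: summarise reviewers' include (True) / exclude (False) votes.

    Returns None if no reviewer includes the code (assumed rule: drop it),
    otherwise a bool that is True when the reviewers disagreed.
    """
    n_include = sum(votes)
    if n_include == 0:
        return None
    return n_include < len(votes)


def build_codelist(candidates, votes_by_code):
    """Map each retained code to its uncertainty flag."""
    codelist = {}
    for code, _description in candidates:
        uncertain = classify(votes_by_code[code])
        if uncertain is not None:
            codelist[code] = uncertain
    return codelist


candidates = search_codes(CODE_DICTIONARY, SEARCH_TERMS)

# Hypothetical votes from three reviewers for each candidate code.
VOTES = {
    "X001": [True, True, True],
    "X002": [True, True, False],
    "X003": [True, True, True],
    "X004": [False, False, False],
}

codelist = build_codelist(candidates, VOTES)

# Sensitivity analysis: repeat the study with the uncertain codes removed.
strict_codelist = [code for code, uncertain in codelist.items() if not uncertain]
```

Keeping the uncertainty flag alongside each code, rather than discarding disputed codes outright, is what allows an analysis to be run on both the full and the strict list and the results compared.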

Files in this item

Name: Watson et al. 2017.pdf
Size: 644.0Kb
Format: PDF

This item appears in the following Collection(s)

Institute of Health Research