Show simple item record

dc.contributor.author: Gibbons, C
dc.contributor.author: Richards, S
dc.contributor.author: Valderas, JM
dc.contributor.author: Campbell, J
dc.date.accessioned: 2018-05-15T07:10:20Z
dc.date.issued: 2017-03-15
dc.description.abstract: BACKGROUND: Machine learning techniques may be an effective and efficient way to classify open-text reports on doctors' activity for the purposes of quality assurance, safety, and continuing professional development. OBJECTIVE: The objective of the study was to evaluate the accuracy of machine learning algorithms trained to classify open-text reports of doctor performance and to assess the potential for classifications to identify significant differences in doctors' professional performance in the United Kingdom. METHODS: We used 1636 open-text comments (34,283 words) relating to the performance of 548 doctors, collected from a survey of clinicians' colleagues using the General Medical Council Colleague Questionnaire (GMC-CQ). We coded 77.75% (1272/1636) of the comments into 5 global themes (innovation, interpersonal skills, popularity, professionalism, and respect) using a qualitative framework. We trained 8 machine learning algorithms to classify comments and assessed their performance using several training samples. We evaluated doctor performance using the GMC-CQ and compared scores between doctors with different classifications using t tests. RESULTS: Individual algorithm performance was high (range F score=.68 to .83). Interrater agreement between the algorithms and the human coder was highest for the "popular" (recall=.97), "innovator" (recall=.98), and "respected" (recall=.87) codes and was lower for the "interpersonal" (recall=.80) and "professional" (recall=.82) codes. A 10-fold cross-validation demonstrated similar performance in each analysis. When combined into an ensemble of multiple algorithms, mean human-computer interrater agreement was .88. Comments classified as "respected," "professional," and "interpersonal" related to higher doctor scores on the GMC-CQ compared with comments that were not classified (P<.05). Scores did not vary between doctors who were rated as popular or innovative and those who were not rated at all (P>.05). CONCLUSIONS: Machine learning algorithms can classify open-text feedback of doctor performance into multiple themes derived by human raters with high accuracy. Colleague open-text comments that signal respect, professionalism, and strong interpersonal skills may be key indicators of doctors' performance. [en_GB]
dc.description.sponsorship: We thank Karen Alexander, the National Institute for Health Research (NIHR) Adaptive Tests for Long-Term Conditions (ATLanTiC) patient and public involvement partner, for providing critical insight and comments and for editing the manuscript. Data collection and qualitative coding were funded by the UK General Medical Council as an unrestricted research award. Support for the novel work presented in this paper was given by a postdoctoral fellowship award for CG (NIHR-PDF-2014-07-028). [en_GB]
dc.identifier.citation: Vol. 19(3), e65 [en_GB]
dc.identifier.doi: 10.2196/jmir.6533
dc.identifier.uri: http://hdl.handle.net/10871/32852
dc.language.iso: en [en_GB]
dc.publisher: JMIR Publications [en_GB]
dc.relation.url: https://www.ncbi.nlm.nih.gov/pubmed/28298265 [en_GB]
dc.rights: ©Chris Gibbons, Suzanne Richards, Jose Maria Valderas, John Campbell. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 15.03.2017. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included. [en_GB]
dc.subject: data mining [en_GB]
dc.subject: feedback [en_GB]
dc.subject: machine learning [en_GB]
dc.subject: surveys and questionnaires [en_GB]
dc.subject: work performance [en_GB]
dc.subject: Algorithms [en_GB]
dc.subject: Clinical Competence [en_GB]
dc.subject: Feedback [en_GB]
dc.subject: Humans [en_GB]
dc.subject: Physicians [en_GB]
dc.subject: Supervised Machine Learning [en_GB]
dc.subject: Surveys and Questionnaires [en_GB]
dc.title: Supervised machine learning algorithms can classify open-text feedback of doctor performance with human-level accuracy [en_GB]
dc.type: Article [en_GB]
dc.date.available: 2018-05-15T07:10:20Z
dc.identifier.issn: 1439-4456
exeter.place-of-publication: Canada [en_GB]
dc.description: This is the final version of the article. Available from the publisher via the DOI in this record. [en_GB]
dc.identifier.journal: Journal of Medical Internet Research [en_GB]
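
The abstract above describes a supervised text-classification workflow: comments coded by a human rater are used to train classifiers, which are then evaluated with 10-fold cross-validation, F scores, and recall. As a rough illustration of that workflow (not the authors' implementation), here is a minimal Python sketch using scikit-learn; the comment texts, labels, and choice of classifier are hypothetical placeholders.

# Illustrative sketch only, not the study's code or data: train a text
# classifier on human-coded comments and evaluate it with 10-fold
# cross-validation, mirroring the evaluation described in the abstract.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical open-text colleague comments, binary-coded for one theme
# ("respect"); the study coded 1272 real comments into 5 themes.
comments = (["Shows great respect to colleagues and patients."] * 10
            + ["Introduced an innovative rota system for the ward."] * 10)
labels = [1] * 10 + [0] * 10  # 1 = coded as "respect", 0 = not

# TF-IDF features feeding a logistic regression classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

# 10-fold cross-validated F score for the "respect" code.
f_scores = cross_val_score(model, comments, labels, cv=10, scoring="f1")
print("Mean F score:", f_scores.mean())

The study trained 8 such algorithms and combined them into an ensemble; this sketch uses a single logistic regression classifier for brevity.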

