
dc.contributor.author: Cardoso, P
dc.contributor.author: Dennis, JM
dc.contributor.author: Bowden, J
dc.contributor.author: Shields, BM
dc.contributor.author: McKinley, TJ
dc.contributor.author: MASTERMIND Consortium
dc.date.accessioned: 2023-12-11T11:04:06Z
dc.date.issued: 2024-01-08
dc.date.updated: 2023-12-11T09:57:44Z
dc.description.abstract: Background: The handling of missing data is a challenge for inference and regression modelling. A particular challenge is dealing with missing predictor information, particularly when trying to build and make predictions from models for use in clinical practice. Methods: We utilise a flexible Bayesian approach for handling missing predictor information in regression models. This provides practitioners with full posterior predictive distributions for both the missing predictor information (conditional on the observed predictors) and the outcome-of-interest. We apply this approach to a previously proposed counterfactual treatment selection model for type 2 diabetes second-line therapies. Our approach combines a regression model and a Dirichlet process mixture model (DPMM), where the former defines the treatment selection model, and the latter provides a flexible way to model the joint distribution of the predictors. Results: We show that DPMMs can model complex relationships between predictor variables and can provide powerful means of fitting models to incomplete data (under missing-completely-at-random and missing-at-random assumptions). This framework ensures that the posterior distribution for the parameters and the conditional average treatment effect estimates automatically reflect the additional uncertainties associated with missing data due to the hierarchical model structure. We also demonstrate that in the presence of multiple missing predictors, the DPMM model can be used to explore which variable(s), if collected, could provide the most additional information about the likely outcome. Conclusions: When developing clinical prediction models, DPMMs offer a flexible way to model complex covariate structures and handle missing predictor information. DPMM-based counterfactual prediction models can also provide additional information to support clinical decision-making, including allowing predictions with appropriate uncertainty to be made for individuals with incomplete predictor data. [en_GB]
dc.description.sponsorship: Medical Research Council (MRC) [en_GB]
dc.description.sponsorship: Research England [en_GB]
dc.identifier.citation: Vol. 24, article 12 [en_GB]
dc.identifier.doi: 10.1186/s12911-023-02400-3
dc.identifier.grantnumber: MR/N00633X/1 [en_GB]
dc.identifier.uri: http://hdl.handle.net/10871/134768
dc.identifier: ORCID: 0000-0002-9485-3236 (McKinley, Trevelyan)
dc.language.iso: en [en_GB]
dc.publisher: BMC [en_GB]
dc.relation.url: https://cprd.com/research-applications [en_GB]
dc.rights: © The Author(s) 2023. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
dc.subject: Dirichlet process mixture model [en_GB]
dc.subject: treatment selection model [en_GB]
dc.subject: precision medicine [en_GB]
dc.subject: type 2 diabetes [en_GB]
dc.subject: Bayesian modelling [en_GB]
dc.title: Dirichlet process mixture models to impute missing predictor data in counterfactual prediction models: an application to predict optimal type 2 diabetes therapy [en_GB]
dc.type: Article [en_GB]
dc.date.available: 2023-12-11T11:04:06Z
dc.identifier.issn: 1472-6947
dc.description: This is the final version. Available on open access from BMC via the DOI in this record. [en_GB]
dc.description: Availability of data and materials: The routine clinical data analysed during the current study are available in the CPRD repository (CPRD; https://cprd.com/research-applications), but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. For re-using these data, an application must be made directly to CPRD. A synthetic sample data is available on GitHub within the repository “PM-Cardoso/DPMM-tsm”. [en_GB]
dc.identifier.eissn: 1472-6947
dc.identifier.journal: BMC Medical Informatics and Decision Making [en_GB]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_GB]
dcterms.dateAccepted: 2023-12-11
dcterms.dateSubmitted: 2023-01-25
rioxxterms.version: VoR [en_GB]
rioxxterms.licenseref.startdate: 2023-12-11
rioxxterms.type: Journal Article/Review [en_GB]
refterms.dateFCD: 2023-12-11T09:57:50Z
refterms.versionFCD: AM
refterms.dateFOA: 2024-02-02T16:27:05Z
refterms.panel: A [en_GB]


