dc.contributor.author: Kolbeinsson, A
dc.contributor.author: O'Brien, K
dc.contributor.author: Huang, T
dc.contributor.author: Gao, S
dc.contributor.author: Liu, S
dc.contributor.author: Schwarz, JR
dc.contributor.author: Vaidya, A
dc.contributor.author: Mahmood, F
dc.contributor.author: Zitnik, M
dc.contributor.author: Chen, T
dc.contributor.author: Hartvigsen, T
dc.date.accessioned: 2025-03-11T15:23:36Z
dc.date.issued: 2025
dc.date.updated: 2025-02-28T20:41:31Z
dc.description.abstract: Test-time interventions for language models can enhance factual accuracy, mitigate harmful outputs, and improve model efficiency without costly retraining. But despite a flood of new methods, different types of interventions are largely developing independently. In practice, multiple interventions must be applied sequentially to the same model, yet we lack standardized ways to study how interventions interact. We fill this gap by introducing composable interventions, a framework to study the effects of using multiple interventions on the same language models, featuring new metrics and a unified codebase. Using our framework, we conduct extensive experiments and compose popular methods from three emerging intervention categories -- Knowledge Editing, Model Compression, and Machine Unlearning. Our results from 310 different compositions uncover meaningful interactions: compression hinders editing and unlearning, composing interventions hinges on their order of application, and popular general-purpose metrics are inadequate for assessing composability. Taken together, our findings showcase clear gaps in composability, suggesting a need for new multi-objective interventions. All of our code is public: https://github.com/hartvigsen-group/composable-interventions. [en_GB]
dc.identifier.citation: ICLR 2025 - The Thirteenth International Conference on Learning Representations, 24 - 28 April 2025, Singapore. Awaiting full citation and link [en_GB]
dc.identifier.uri: http://hdl.handle.net/10871/140593
dc.identifier: ORCID: 0000-0002-7740-8843 (Huang, Tianjin)
dc.language.iso: en [en_GB]
dc.publisher: International Conference on Learning Representations [en_GB]
dc.relation.url: https://iclr.cc/Conferences/2025 [en_GB]
dc.relation.url: https://iclr.cc/virtual/2025/papers.html [en_GB]
dc.relation.url: https://iclr.cc/virtual/2025/poster/28014 [en_GB]
dc.rights.embargoreason: Under embargo until close of conference [en_GB]
dc.rights: © 2025 The author(s) [en_GB]
dc.title: Composable Interventions for Language Models [en_GB]
dc.type: Conference paper [en_GB]
dc.date.available: 2025-03-11T15:23:36Z
dc.description: This is the final version. [en_GB]
dc.rights.uri: http://www.rioxx.net/licenses/all-rights-reserved [en_GB]
rioxxterms.version: VoR [en_GB]
rioxxterms.licenseref.startdate: 2025-03-11
rioxxterms.type: Conference Paper/Proceeding/Abstract [en_GB]
refterms.dateFCD: 2025-03-11T15:20:59Z
refterms.versionFCD: VoR
refterms.panel: B [en_GB]

