Composable Interventions for Language Models

Kolbeinsson, A; O'Brien, K; Huang, T; Gao, S; Liu, S; Schwarz, JR; Vaidya, A; Mahmood, F; Zitnik, M; Chen, T; Hartvigsen, T

dc.contributor.author	Kolbeinsson, A
dc.contributor.author	O'Brien, K
dc.contributor.author	Huang, T
dc.contributor.author	Gao, S
dc.contributor.author	Liu, S
dc.contributor.author	Schwarz, JR
dc.contributor.author	Vaidya, A
dc.contributor.author	Mahmood, F
dc.contributor.author	Zitnik, M
dc.contributor.author	Chen, T
dc.contributor.author	Hartvigsen, T
dc.date.accessioned	2025-03-11T15:23:36Z
dc.date.issued	2025
dc.date.updated	2025-02-28T20:41:31Z
dc.description.abstract	Test-time interventions for language models can enhance factual accuracy, mitigate harmful outputs, and improve model efficiency without costly retraining. But despite a flood of new methods, different types of interventions are largely developing independently. In practice, multiple interventions must be applied sequentially to the same model, yet we lack standardized ways to study how interventions interact. We fill this gap by introducing composable interventions, a framework to study the effects of using multiple interventions on the same language models, featuring new metrics and a unified codebase. Using our framework, we conduct extensive experiments and compose popular methods from three emerging intervention categories -- Knowledge Editing, Model Compression, and Machine Unlearning. Our results from 310 different compositions uncover meaningful interactions: compression hinders editing and unlearning, composing interventions hinges on their order of application, and popular general-purpose metrics are inadequate for assessing composability. Taken together, our findings showcase clear gaps in composability, suggesting a need for new multi-objective interventions. All of our code is public: https://github.com/hartvigsen-group/composable-interventions.	en_GB
dc.identifier.citation	ICLR 2025 - The Thirteenth International Conference on Learning Representations, 24 - 28 April 2025, Singapore. Awaiting full citation and link	en_GB
dc.identifier.uri	http://hdl.handle.net/10871/140593
dc.identifier	ORCID: 0000-0002-7740-8843 (Huang, Tianjin)
dc.language.iso	en	en_GB
dc.publisher	International Conference on Learning Representations	en_GB
dc.relation.url	https://iclr.cc/Conferences/2025	en_GB
dc.relation.url	https://iclr.cc/virtual/2025/papers.html	en_GB
dc.relation.url	https://iclr.cc/virtual/2025/poster/28014	en_GB
dc.rights.embargoreason	Under embargo until close of conference	en_GB
dc.rights	© 2025 The author(s)	en_GB
dc.title	Composable Interventions for Language Models	en_GB
dc.type	Conference paper	en_GB
dc.date.available	2025-03-11T15:23:36Z
dc.description	This is the final version.	en_GB
dc.rights.uri	http://www.rioxx.net/licenses/all-rights-reserved	en_GB
rioxxterms.version	VoR	en_GB
rioxxterms.licenseref.startdate	2025-03-11
rioxxterms.type	Conference Paper/Proceeding/Abstract	en_GB
refterms.dateFCD	2025-03-11T15:20:59Z
refterms.versionFCD	VoR
refterms.panel	B	en_GB

Files in this item

Name: 7753_Composable_Interventions_.pdf
Size: 1.452Mb
Format: PDF
Description: Composable interventions for ...

This item appears in the following Collection(s)

Computer Science
