A Language Model based Framework for New Concept Placement in Ontologies

Dong, H; Chen, J; He, Y; Gao, Y; Horrocks, I

dc.contributor.author	Dong, H
dc.contributor.author	Chen, J
dc.contributor.author	He, Y
dc.contributor.author	Gao, Y
dc.contributor.author	Horrocks, I
dc.date.accessioned	2024-03-04T15:29:02Z
dc.date.issued	2024-05-19
dc.date.updated	2024-03-04T14:32:36Z
dc.description.abstract	We investigate the task of inserting new concepts extracted from texts into an ontology using language models. We explore an approach with three steps: edge search, which finds a set of candidate locations to insert (i.e., subsumptions between concepts); edge formation and enrichment, which leverages the ontological structure to produce and enhance the edge candidates; and edge selection, which finally locates the edge into which the new concept is placed. In all steps, we propose to leverage neural methods, where we apply embedding-based methods and contrastive learning with Pre-trained Language Models (PLMs) such as BERT for edge search, and adapt a BERT fine-tuning-based multi-label Edge-Cross-encoder, and Large Language Models (LLMs) such as the GPT series, FLAN-T5, and Llama 2, for edge selection. We evaluate the methods on recent datasets created using the SNOMED CT ontology and the MedMentions entity linking benchmark. The best settings in our framework use a fine-tuned PLM for search and a multi-label Cross-encoder for selection. Zero-shot prompting of LLMs is still not adequate for the task, and we propose explainable instruction tuning of LLMs for improved performance. Our study shows the advantages of PLMs and highlights the encouraging performance of LLMs that motivates future studies.	en_GB
dc.description.sponsorship	Engineering and Physical Sciences Research Council (EPSRC)	en_GB
dc.description.sponsorship	Samsung Research UK (SRUK)	en_GB
dc.identifier.citation	In: The Semantic Web: 21st International Conference, ESWC 2024, Hersonissos, Crete, Greece, 26 - 30 May 2024. Proceedings, Part I, edited by Albert Meroño Peñuela, Anastasia Dimou, Raphaël Troncy, Olaf Hartig, Maribel Acosta, Mehwish Alam, Heiko Paulheim, and Pasquale Lisena, pp. 79 - 99. Lecture Notes in Computer Science, vol. 14664	en_GB
dc.identifier.doi	10.1007/978-3-031-60626-7_5
dc.identifier.grantnumber	EP/V050869/1	en_GB
dc.identifier.grantnumber	EP/S032347/1	en_GB
dc.identifier.grantnumber	EP/S019111/1	en_GB
dc.identifier.uri	http://hdl.handle.net/10871/135471
dc.identifier	ORCID: 0000-0001-6828-6891 (Dong, Hang)
dc.language.iso	en	en_GB
dc.publisher	Springer	en_GB
dc.rights.embargoreason	Under embargo until 19 May 2025 in compliance with publisher policy	en_GB
dc.rights	© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG.
dc.subject	Ontology Enrichment	en_GB
dc.subject	Concept Placement	en_GB
dc.subject	Pre-trained Language Models	en_GB
dc.subject	Large Language Models	en_GB
dc.subject	SNOMED CT	en_GB
dc.title	A Language Model based Framework for New Concept Placement in Ontologies	en_GB
dc.type	Conference paper	en_GB
dc.date.available	2024-03-04T15:29:02Z
dc.description	This is the author accepted manuscript. The final version is available from Springer via the DOI in this record	en_GB
dc.rights.uri	http://www.rioxx.net/licenses/all-rights-reserved	en_GB
rioxxterms.version	AM	en_GB
rioxxterms.licenseref.startdate	2024-03-04
rioxxterms.type	Conference Paper/Proceeding/Abstract	en_GB
refterms.dateFCD	2024-03-04T14:32:39Z
refterms.versionFCD	AM
refterms.panel	B	en_GB
pubs.name-of-conference	Extended Semantic Web Conference

Files in this item

Name: new-concept-insertion-final.pdf
Size: 662.0Kb
Format: PDF
Description: A Language Model based Framework ...

This item appears in the following Collection(s)

Computer Science