
Macilenti, G., Stellato, A., Fiorelli, M. (2024). Prompting is not all you need: evaluating GPT-4 performance on a real-world ontology alignment use case. PROCEDIA COMPUTER SCIENCE, 246, 1289-1298 [10.1016/j.procs.2024.09.557].

Prompting is not all you need: evaluating GPT-4 performance on a real-world ontology alignment use case

Macilenti, Giulio; Stellato, Armando; Fiorelli, Manuel
2024-01-01

Abstract

Ontology Alignment (OA) is a complex, demanding and error-prone task, requiring the intervention of domain and Semantic Web experts. Automating the alignment process thus becomes a must-do, especially when involving large datasets, to at least produce a first input for human experts. Automated ontology alignment could benefit from the outstanding language ability of Large Language Models (LLMs), which could implicitly provide the background knowledge that has been the Achilles' heel of traditional alignment systems. However, this requires a correct evaluation of the performance of LLMs and an understanding of the best way to incorporate them into more specific tools. In this paper, we show that a naive prompting approach on the popular GPT-4 model could face several problems when transferred to real-world use cases. To this end, we replicated the methods of Norouzi et al. (2023), applied to the OAEI 2022 conference track, on a reference alignment between a pair of datasets (reduced versions of two popular thesauri: the European Commission's EuroVoc and TESEO, from the Italian Senate of the Republic), which has never been tested in OAEI evaluation campaigns. This reference alignment has several features common to real-world use cases: it has a larger size than those considered in the study we replicated; it is not published online and is therefore not subject to data contamination; and it involves relations between concepts that are more complex than simple equivalence. The replicated methods achieved a significantly lower performance on our reference alignment than on the OAEI 2022 conference track, suggesting that size, data contamination, and semantic complexity need to be considered when using LLMs for the alignment task.
2024
Published
International relevance
Article
Anonymous referees
Sector INF/01
Sector ING-INF/05
Sector INFO-01/A - Computer Science
Sector IINF-05/A - Information processing systems
English
Large language models
Ontology alignment
Semantic technologies
Macilenti, G; Stellato, A; Fiorelli, M
Journal article
Files in this product:
File: 1-s2.0-S1877050924026036-main.pdf
Open access
Type: Editorial version (PDF)
License: Creative Commons
Size: 894.61 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/426343
Citations
  • PMC: n/a
  • Scopus: 1
  • Web of Science: n/a