Macilenti, G., Fiorelli, M., Stellato, A. (2025). SISMA: sentence embedding–based ontology matching with SBERT. In B. Spahiu, S. Vahdati, A. Salatino, T. Pellegrini, G. Havur (Eds.), Linking meaning: semantic technologies shaping the future of AI: proceedings of the 21st International Conference on Semantic Systems, 3-5 September 2025, Vienna, Austria (pp. 140-157). Amsterdam: IOS Press [10.3233/ssw250016].
SISMA: sentence embedding–based ontology matching with SBERT
Macilenti, Giulio; Fiorelli, Manuel; Stellato, Armando
2025-01-01
Abstract
Purpose: Ontology Matching (OM) has been studied for decades, yet fully automatic solutions remain elusive because ontologies differ in structure, granularity and vocabulary. Nevertheless, the abundant textual content attached to ontology entities suggests that the task could benefit from modern language-representation models. We therefore present the Semantically-Informed Similarity Matching Algorithm (SISMA), a novel system that matches concepts by leveraging the similarity of SBERT embeddings computed over pseudo-sentences extracted from the ontologies.

Methodology: We focus on class and property equivalence. We represent each ontology concept as a set of SBERT embeddings, one associated with each predicate. For every pair of concepts, a similarity matrix is computed and reduced to a score via linear operations with two learnable matrices. These matrices are trained on a dedicated dataset. We evaluated our system on the OAEI benchmark alignments, training on the Conference track and testing on the Circular Economy (CE) and Material Sciences and Engineering (MSE) tracks.

Findings: Our experiments show that SISMA achieves performance comparable to the state of the art. On the CE track our system achieves a higher F1-score than the participating systems, while on the MSE track it scores slightly lower. We also compared our results with a baseline across the parameter space, confirming that the training step is key to overall performance.

Value: We have designed, implemented, and evaluated a novel system for ontology matching that achieves performance comparable to state-of-the-art methods. Our approach is readily extensible, primarily by training and testing on additional datasets, and the underlying idea can be realized in alternative ways, for example by replacing the current linear-operator scoring and threshold-filtering approach with a classifier that operates directly on the similarity matrix space.

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| SSW-62-SSW250016.pdf | Open access | Publisher's version (PDF) | Creative Commons | 925.17 kB | Adobe PDF |
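The scoring step described in the Methodology part of the abstract can be illustrated with a small sketch. The abstract does not give the exact form of the reduction, so the code below assumes a bilinear form, score = A · S · B, where S is the predicate-by-predicate cosine similarity matrix, A is a 1×n matrix and B is an n×1 matrix. The predicate list, the three-dimensional toy vectors (standing in for real SBERT embeddings), the hand-picked values of `LEFT` and `RIGHT` (standing in for trained matrices), the threshold, and all function names are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical predicates whose text would be turned into pseudo-sentences.
PREDICATES = ["rdfs:label", "rdfs:comment"]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(concept_a, concept_b):
    """S[i][j] = cosine between concept_a's embedding for predicate i
    and concept_b's embedding for predicate j."""
    return [[cosine(concept_a[p], concept_b[q]) for q in PREDICATES]
            for p in PREDICATES]


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def score(sim, left, right):
    """Assumed reduction: left (1 x n) @ sim (n x n) @ right (n x 1) -> scalar."""
    return matmul(matmul(left, sim), right)[0][0]


def align(source, target, left, right, threshold):
    """Keep every (source, target) pair whose score reaches the threshold."""
    return [(s, t)
            for s, s_emb in source.items()
            for t, t_emb in target.items()
            if score(similarity_matrix(s_emb, t_emb), left, right) >= threshold]


# Toy vectors standing in for SBERT embeddings of each predicate's text.
source = {
    "Paper": {"rdfs:label": [1.0, 0.0, 0.0], "rdfs:comment": [0.0, 1.0, 0.0]},
}
target = {
    "Article": {"rdfs:label": [1.0, 0.0, 0.0], "rdfs:comment": [0.0, 1.0, 0.0]},
    "Reviewer": {"rdfs:label": [0.0, 0.0, 1.0], "rdfs:comment": [0.0, 0.0, 1.0]},
}

# Hand-picked values standing in for the two trained matrices (assumption).
LEFT = [[1.0, 1.0]]
RIGHT = [[0.5], [0.5]]

alignment = align(source, target, LEFT, RIGHT, threshold=0.8)
print(alignment)
```

In this toy setup the matrices simply average the matching-predicate similarities. In a trained system their entries would presumably weight some predicate pairs more heavily than others, which is consistent with the abstract's finding that the training step is key to performance.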


