Towards the Semi-Automated Population of the Ancient Greek WordNet

Claudia Roberta Combei
2025-01-01

Abstract

This paper explores the use of LLMs, specifically Mistral-Nemo, in the semi-automatic population of the Ancient Greek WordNet synsets. Several approaches are investigated: zero-shot, few-shot, and fine-tuning. The results are compared against an English baseline. The zero-shot approach yields the highest accuracy, while fine-tuning leads to the highest number of potential synonyms. Our analysis also reveals that polysemy and PoS play a role in the model’s performance, as the highest scores are registered for polysemous words and for verbs and nouns. The results are encouraging for the application of such approaches in a human-in-the-loop scenario, since human validation still proves crucial in ensuring the accuracy of the results.
2025
Sector L-LIN/01
Sector GLOT-01/A - Glottology and linguistics
English
International relevance
Scientific article in conference proceedings
Lexical semantics; synonym generation; LLMs; Ancient Greek; WordNet
https://aclanthology.org/2025.clicit-1.62/
Marchesi, B., Clementelli, A., Maurizio Mammarella, A., Zampetta, S., Biagetti, E., Brigada Villa, L., et al. (2025). Towards the Semi-Automated Population of the Ancient Greek WordNet. In E.J. Cristina Bosco (Ed.), Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025) (pp. 647-658). Aachen: ACL Anthology / CEUR Workshop Proceedings.
Marchesi, B; Clementelli, A; Maurizio Mammarella, A; Zampetta, S; Biagetti, E; Brigada Villa, L; Mastellari, V; Ginevra, R; Combei, C; Zanchi, C...
Book contribution
Files in this item:
File Size Format
Combei_etal_2025_Greek_WordNet.pdf

open access

Description: Copyright ©2025 for the individual papers by the papers’ authors. Copyright ©2025 for the volume as a collection by its editors. This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0). ACL materials are Copyright © 1963–2025 ACL; other materials are copyrighted by their respective copyright holders. Materials prior to 2016 here are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License.
Type: Publisher's version (PDF)
License: Authors' copyright
Size: 708.41 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/444243