Scaling large language models to the extreme: neural semantic processing of multiple tasks in Italian

Hromei C. D.; Croce D.; Basile V.; Basili R.
2023-01-01

Abstract

This paper explores the potential of a unified neural model to tackle multiple and complex semantic processing tasks in the Italian language. We applied a state-of-the-art instruction-tuned Decoder-only Large Language Model to the recent EVALITA 2023 [17] challenge, which encompassed 13 different tasks and 22 subtasks across diverse semantic dimensions, such as Affect Detection, Authorship Analysis, Computational Ethics, Named Entity Recognition, Information Extraction, and Discourse Coherence. Our approach represents tasks through natural language instructions: prompts to the model are designed to define both the task and the desired form of the responses. Notably, this single neural model achieved first place in 41% of the subtasks and demonstrated top-three performance in 64% of them. A dedicated experiment was also conducted to investigate the degree of linguistic generalization achieved by the LLM, specifically by instruction-tuning it with limited amounts of training data. Results suggest that instruction-tuning is still required for such LLMs to capture the dependencies between inputs and outputs.
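As a rough illustration of the instruction-based formulation described in the abstract, the Python sketch below shows how a single task instance might be verbalized as a prompt for a decoder-only LLM. The helper name, prompt layout, and label set are illustrative assumptions, not details taken from the paper.

    # Hypothetical sketch (the function name, prompt layout, and labels are
    # illustrative assumptions): verbalizing one task instance as a
    # natural-language instruction for a decoder-only LLM.
    def build_prompt(task_description: str, text: str) -> str:
        # The instruction spells out the task and the expected response
        # format, so a single model can serve many tasks via prompts alone.
        return (
            f"Instruction: {task_description}\n"
            f"Input: {text}\n"
            f"Output:"
        )

    # Example for a hypothetical affect-detection subtask:
    print(build_prompt(
        "Classify the emotion expressed in the following Italian sentence "
        "as one of: joy, anger, sadness, fear.",
        "Oggi è una giornata splendida!",
    ))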
22nd International Conference of the Italian Association for Artificial Intelligence, AIxIA 2023
Rome, Italy
2023
22
International relevance
2023
Sector INF/01
Sector ING-INF/05
English
Affect Detection and Discourse Coherence; Authorship Analysis; Computational Ethics; Information Extraction; Large Language Models; Multi-task Learning; Named Entity Recognition; Semantic Processing Task
Conference contribution
Hromei, C.D., Croce, D., Basile, V., Basili, R. (2023). Scaling large language models to the extreme: neural semantic processing of multiple tasks in Italian. In AIxIA 2023 – Advances in Artificial Intelligence (pp. 172-186). Springer Science and Business Media Deutschland GmbH [10.1007/978-3-031-47546-7_12].
Hromei, C.D.; Croce, D.; Basile, V.; Basili, R.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/359283