Hromei, C. D., Croce, D., Basile, V., Basili, R. (2023). Scaling large language models to the extreme: neural semantic processing of multiple tasks in Italian. In AIxIA 2023 – Advances in Artificial Intelligence (pp. 172–186). Springer Science and Business Media Deutschland GmbH [10.1007/978-3-031-47546-7_12].
Scaling large language models to the extreme: neural semantic processing of multiple tasks in Italian
Hromei C. D.; Croce D.; Basile V.; Basili R.
2023-01-01
Abstract
This paper explores the potential of a unified neural model to tackle multiple complex semantic processing tasks in the Italian language. We applied a state-of-the-art instruction-tuned Decoder-only Large Language Model to the recent EVALITA 2023 [17] challenge, which encompassed 13 different tasks and 22 subtasks across diverse semantic dimensions, such as Affect Detection, Authorship Analysis, Computational Ethics, Named Entity Recognition, Information Extraction, and Discourse Coherence. Our approach represents tasks as natural language instructions: prompts to the model are designed to define both the process and the desired responses. Notably, this single neural model achieved first place in 41% of the subtasks and demonstrated top-three performance in 64% of them. A dedicated experiment was also conducted to investigate the degree of linguistic generalization achieved by the LLM, specifically by instruction-tuning it with limited sets of training data. Results suggest that instruction-tuning is still required to capture the dependencies between input and output, even in such LLMs.
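The abstract describes casting each task as a natural language instruction, where the prompt defines both the process and the expected response. A minimal sketch of that idea is shown below; the prompt wording, field labels, and the example task are illustrative assumptions, not the paper's actual templates.

```python
# Hypothetical sketch of instruction-style prompting as described in the
# abstract: a prompt names the task (the process) and cues the model for
# the desired response. All wording here is illustrative, not the paper's.

def build_prompt(task_instruction: str, input_text: str) -> str:
    """Compose an instruction prompt for a decoder-only LLM."""
    return (
        f"Istruzione: {task_instruction}\n"   # defines the process
        f"Testo: {input_text}\n"              # the input to process
        f"Risposta:"                          # cues the desired response
    )

# Example with a made-up Italian sentiment-classification instruction.
prompt = build_prompt(
    "Classifica il sentimento del testo come positivo o negativo.",
    "Che bella giornata!",
)
print(prompt)
```

A single model fine-tuned on many such (prompt, response) pairs can then be queried uniformly across all subtasks, since only the instruction text changes from task to task.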