Hromei, C. D., Croce, D., Basile, V., Basili, R. (2023). Scaling large language models to the extreme: neural semantic processing of multiple tasks in Italian. In AIxIA 2023 – Advances in Artificial Intelligence (pp. 172–186). Springer Science and Business Media Deutschland GmbH [10.1007/978-3-031-47546-7_12].
Scaling large language models to the extreme: neural semantic processing of multiple tasks in Italian
Hromei C. D.; Croce D.; Basile V.; Basili R.
2023-01-01
Abstract
This paper explores the potential of a unified neural model to tackle multiple complex semantic processing tasks in the Italian language. We applied a state-of-the-art instruction-tuned Decoder-only Large Language Model to the recent EVALITA 2023 [17] challenge, which encompassed 13 different tasks and 22 subtasks across diverse semantic dimensions, such as Affect Detection, Authorship Analysis, Computational Ethics, Named Entity Recognition, Information Extraction, and Discourse Coherence. Our approach represents tasks as natural language instructions: prompts to the model are designed to define both the process and the desired responses. Notably, this single neural model achieved first place in 41% of the subtasks and demonstrated top-three performance in 64% of them. A dedicated experiment was also conducted to investigate the degree of linguistic generalization achieved by the LLM, specifically by instruction-tuning it with limited sets of training data. Results suggest that instruction-tuning is still required to capture the dependencies between input and output, even in such LLMs.
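The abstract describes casting each task as a natural language instruction, where the prompt defines both the process and the expected response. A minimal sketch of that idea is shown below; the prompt wording, field labels, and the example task are illustrative assumptions, not the paper's actual templates.

```python
# Hypothetical sketch of instruction-style prompting as described in the
# abstract: a prompt names the task (the process) and cues the model for
# the desired response. All wording here is illustrative, not the paper's.

def build_prompt(task_instruction: str, input_text: str) -> str:
    """Compose an instruction prompt for a decoder-only LLM."""
    return (
        f"Istruzione: {task_instruction}\n"   # defines the process
        f"Testo: {input_text}\n"              # the input to process
        f"Risposta:"                          # cues the desired response
    )

# Example with a made-up Italian sentiment-classification instruction.
prompt = build_prompt(
    "Classifica il sentimento del testo come positivo o negativo.",
    "Che bella giornata!",
)
print(prompt)
```

A single model fine-tuned on many such (prompt, response) pairs can then be queried uniformly across all subtasks, since only the instruction text changes from task to task.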