GQA-it: Italian question answering on image scene graphs

Recent breakthroughs in deep learning have led to state-of-the-art results in several Computer Vision and Natural Language Processing tasks, such as Visual Question Answering (VQA). Nevertheless, training requirements in cross-linguistic settings are not yet adequately met. Datasets suitable for training VQA systems in non-English languages are still unavailable, which is a significant barrier for most neural methods. This paper explores the semi-automatic acquisition of a large-scale dataset for VQA in Italian. The dataset consists of more than 1M question-answer pairs over 80k images, with a manually validated test set of 3,000 question-answer pairs. To the best of our knowledge, the models trained on this dataset represent the first attempt to approach VQA in Italian, with experimental results comparable to those obtained on the original English material.

Croce, D., Passaro, L.C., Lenci, A., Basili, R. (2021). GQA-it: Italian question answering on image scene graphs. In CLiC-it 2021: Italian Conference on Computational Linguistics 2021: Proceedings of the Eighth Italian Conference on Computational Linguistics. CEUR-WS.

Croce D.; Passaro L. C.; Lenci A.; Basili R.

2021-01-01

	Conference name
	
				8th Italian Conference on Computational Linguistics, CLiC-it 2021
			
	Conference venue
	
				Università degli Studi di Milano-Bicocca, Italy
			
	Conference year
	
				2022
			
	Conference number
	
				8
			
	Conference organizer(s)
	
				B13
			
	Conference relevance
	
				International relevance
			
	Publication date
	
				2021
			
	Disciplinary sector of the contribution (valid until 24/06/2024)
	
				Sector INF/01
				Sector ING-INF/05
			
	Content language
	
				English
			
	Type
	
				Conference contribution
			
	Citation
	
				Croce, D., Passaro, L.C., Lenci, A., Basili, R. (2021). GQA-it: Italian question answering on image scene graphs. In CLiC-it 2021: Italian Conference on Computational Linguistics 2021: Proceedings of the Eighth Italian Conference on Computational Linguistics. CEUR-WS.
			
	All authors
	
						Croce, D.; Passaro, L.C.; Lenci, A.; Basili, R.
					
	Appears in types:
	
				02 - Conference contribution

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/359268
