CorAIt – A non-native speech database for Italian


CorAIt is a non-native speech database for Italian that is freely accessible online for academic research purposes. It was designed specifically to meet the requirements of a larger research project on foreign-accented Italian speech. The corpus aims to provide a uniform collection of speech samples produced by non-native speakers of Italian. To date, 105 non-native speakers, whose mother tongues are French, Romanian, Spanish, English, German, or Russian, have been recorded. The corpus also includes a control group of 16 Italian speakers. It contains almost 8 hours of audio material, comprising both read speech (first and second reading) and spontaneous speech. This paper argues for the necessity of this type of database, describes the steps involved in its construction, and presents the features of CorAIt.

Combei, C.R. (2017). CorAIt – A non-native speech database for Italian. In M.N. Roberto Basili (ed.), Proceedings of the Fourth Italian Conference on Computational Linguistics CLiC-it 2017 (pp. 113-118). Torino: Accademia University Press [10.4000/books.aaccademia.2386].

Claudia Roberta Combei

2017-01-01

Publication date: 2017

DOI of the contribution: https://dx.doi.org/10.4000/books.aaccademia.2386

Disciplinary sector of the contribution (valid until 24/06/2024): Sector L-LIN/01

Disciplinary sector of the contribution (valid from 09/05/2024): Sector GLOT-01/A - Glottology and linguistics

Language of the content: English

Relevance: International relevance

Type: Scientific article in conference proceedings

Keywords: non-native speech database; learner corpus; L2 Italian; audio corpus

Alternative URL: http://dx.doi.org/10.4000/books.aaccademia.2386

All authors: Combei, C.R.

Typology: Contribution in book

Appears in typologies: 03 - Contribution in book

Files in this record:

File	Size	Format
Combei_2017_Corait_Corpus.pdf (not available; license: publisher's copyright)	769.74 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/410696
