Bastianelli, E., Iocchi, L., Nardi, D., Castellucci, G., Croce, D., Basili, R. (2015). RoboCup@Home spoken corpus: Using robotic competitions for gathering datasets. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (pp. 19-30). Springer Verlag [10.1007/978-3-319-18615-3_2].
RoboCup@Home spoken corpus: Using robotic competitions for gathering datasets
CROCE, DANILO; BASILI, ROBERTO
2015-01-01
Abstract
The definition of high-quality datasets for benchmarking single components and entire systems in intelligent robots is fundamental to developing, testing and comparing different technical solutions. In this paper, we describe the methodology adopted for acquiring and creating a spoken corpus for domestic and service robots. The corpus was inspired by and acquired in the RoboCup@Home setting, with the involvement of RoboCup@Home participants. The annotated dataset is publicly available for developing, testing and comparing the speech understanding functionalities of domestic and service robots, not only for teams involved in RoboCup@Home or other competitions, but also for research groups active in the field. We regard the construction of the dataset as a first step towards a full benchmarking methodology for spoken language interaction in service robotics.