Falessi, D., Cantone, G., Canfora, G. (2010). A comprehensive characterization of NLP techniques for identifying equivalent requirements. In Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement (pp. 100-110). IEEE CS [10.1145/1852786.1852810].
A comprehensive characterization of NLP techniques for identifying equivalent requirements
FALESSI, DAVIDE; CANTONE, GIOVANNI
2010-09-10
Abstract
Though very important in software engineering, linking artifacts of the same type (clone detection) or of different types (traceability recovery) is extremely tedious and error-prone, and it requires significant effort. Past research focused on supporting analysts with mechanisms based on Natural Language Processing (NLP) to identify candidate links. Because a plethora of NLP techniques exists, and their performance varies across contexts, it is important to characterize them according to the level of support they provide. The aim of this paper is to characterize a comprehensive set of NLP techniques according to the level of support they provide to human analysts in detecting equivalent requirements. The characterization consists of a case study, featuring real requirements, in the context of an Italian company in the defense and aerospace domain. The major result from the case study is that simple NLP techniques are more precise than complex ones.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.