Falessi, D., Huang, J., Narayana, L., Thai, J., Turhan, B. (2020). On the need of preserving order of data when validating within-project defect classifiers. EMPIRICAL SOFTWARE ENGINEERING, 25(6), 4805-4830 [10.1007/s10664-020-09868-x].

On the need of preserving order of data when validating within-project defect classifiers

Falessi, D; Huang, J; Narayana, L; Thai, J; Turhan, B
2020-01-01

Abstract

We are in the shoes of a practitioner who uses data from previous releases of a project to predict which classes of the current release are defect-prone. In this scenario, the practitioner would like to use the most accurate classifier among the many available ones. A validation technique, hereinafter "technique", defines how to measure the prediction accuracy of a classifier. Several previous studies have analyzed and compared validation techniques. However, no previous study has compared techniques in the within-project across-release class-level context or considered techniques that preserve the order of data. In this paper, we investigate which technique recommends the most accurate classifier. We use the last release of a project as the ground truth to evaluate a classifier's accuracy and hence the ability of a technique to recommend an accurate classifier. We consider nine classifiers, two industrial and 13 open-source projects, and three validation techniques: 10-fold cross-validation (the most used technique), bootstrap (the recommended technique), and walk-forward (a technique that preserves the order of data). Our results show that: 1) classifiers differ in accuracy in all datasets, regardless of their events per variable (EPV); 2) walk-forward statistically outperforms both 10-fold cross-validation and bootstrap on all three accuracy metrics: the AUC of the selected classifier, bias, and absolute bias; 3) surprisingly, all techniques proved more prone to overestimating than to underestimating classifier performance; and 4) the defect rate changed between the first and second half of the data in both industrial projects and in 83% of the open-source datasets. Given these empirical results, and given that walk-forward is by nature simpler, less expensive, and more stable than the other two techniques, this study recommends using techniques that preserve the order of data, such as walk-forward, over 10-fold cross-validation and bootstrap in the within-project across-release class-level context.
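To make the walk-forward technique mentioned in the abstract concrete, below is a minimal sketch of order-preserving validation; it is an illustration, not the authors' implementation. It assumes a hypothetical "releases" list of chronologically ordered (features, labels) pairs, one per project release, and uses scikit-learn for the classifier and the AUC metric.

    # Minimal sketch of walk-forward validation as described in the abstract;
    # not the authors' implementation. "releases" is assumed to be a
    # chronologically ordered list of (X, y) pairs, one per release.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    def walk_forward_auc(releases, make_classifier=LogisticRegression):
        """Train on releases 1..i, test on release i+1, average the AUCs.

        The order of data is preserved: no class from a future release
        ever enters the training set, unlike 10-fold cross-validation
        and bootstrap, which shuffle or resample across releases.
        """
        aucs = []
        for i in range(1, len(releases)):
            X_train = np.vstack([X for X, _ in releases[:i]])
            y_train = np.concatenate([y for _, y in releases[:i]])
            X_test, y_test = releases[i]
            clf = make_classifier().fit(X_train, y_train)
            scores = clf.predict_proba(X_test)[:, 1]  # defect-proneness
            aucs.append(roc_auc_score(y_test, scores))
        return float(np.mean(aucs))

A practitioner would run this for each candidate classifier and pick the one with the highest estimated AUC; in the paper, the ground truth is the accuracy that the chosen classifier actually achieves on the last, held-out release.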
2020
Published
International relevance
Article
Scientific committee
Settore ING-INF/05 - Information Processing Systems
English
Defect classifiers
Classifiers
Model validation techniques
https://arxiv.org/abs/1809.01510
Falessi, D; Huang, J; Narayana, L; Thai, J; Turhan, B
Journal article
Files in this record:
File: 1809.01510.pdf (1.05 MB, Adobe PDF)
Type: Pre-print document
License: Publisher's copyright
Access: authorized users only (request a copy)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/256963
Citations
  • PMC: ND
  • Scopus: 23
  • Web of Science: 19