Evaluating Explainable Machine Learning Models for Clinicians

IRIS

Gaining clinicians' trust will unleash the full potential of artificial intelligence (AI) in medicine, and explaining AI decisions is seen as the way to build trustworthy systems. However, explainable artificial intelligence (XAI) methods in medicine often lack a proper evaluation. In this paper, we present our evaluation methodology for XAI methods using forward simulatability. We define the Forward Simulatability Score (FSS) and analyze its limitations in the context of clinical predictors. We then apply FSS to our XAI approach defined over ML-RO, a machine learning clinical predictor based on random optimization over a multiple kernel support vector machine (SVM) algorithm. To compare FSS values before and after the explanation phase, we test our evaluation methodology for XAI methods on three clinical datasets, namely breast cancer, venous thromboembolism (VTE), and migraine. The ML-RO system is a good model on which to test our XAI evaluation strategy based on the FSS. Indeed, ML-RO outperforms two other base models, a decision tree (DT) and a plain SVM, on the three datasets and makes it possible to define different XAI models: TOPK, MIGF, and F4G. The FSS evaluation suggests that the F4G explanation method for ML-RO is the most effective on two of the three datasets tested, and it exposes the limits of the learned model on the remaining dataset. Our study aims to introduce a standard practice for evaluating XAI methods in medicine. By establishing a rigorous evaluation framework, we seek to provide healthcare professionals with reliable tools for assessing the performance of XAI methods and thereby to enhance the adoption of AI systems in clinical practice.
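As a rough illustration of the before/after comparison the abstract describes, forward simulatability can be sketched as the fraction of cases in which a person correctly anticipates the model's output. The abstract does not give the exact FSS formula, so the definition below is an assumption, the function names `simulatability` and `simulatability_gain` are hypothetical, and the labels are toy values rather than results from the paper.

```python
from typing import Sequence


def simulatability(human_guesses: Sequence[int],
                   model_predictions: Sequence[int]) -> float:
    """Share of cases where the human's guess equals the model's output.

    Assumed definition for illustration; the paper's FSS may differ.
    """
    if len(human_guesses) != len(model_predictions):
        raise ValueError("guesses and predictions must have the same length")
    if len(model_predictions) == 0:
        raise ValueError("at least one case is required")
    matches = sum(g == p for g, p in zip(human_guesses, model_predictions))
    return matches / len(model_predictions)


def simulatability_gain(guesses_before: Sequence[int],
                        guesses_after: Sequence[int],
                        model_predictions: Sequence[int]) -> float:
    """Change in simulatability once explanations have been shown."""
    return (simulatability(guesses_after, model_predictions)
            - simulatability(guesses_before, model_predictions))


# Toy data (hypothetical): binary model outputs for eight cases, plus one
# reader's guesses of those outputs before and after seeing explanations.
model_predictions = [1, 0, 1, 1, 0, 0, 1, 0]
guesses_before = [1, 1, 0, 1, 0, 1, 1, 1]
guesses_after = [1, 0, 1, 1, 0, 1, 1, 0]

score_before = simulatability(guesses_before, model_predictions)
score_after = simulatability(guesses_after, model_predictions)
gain = simulatability_gain(guesses_before, guesses_after, model_predictions)
print(score_before, score_after, gain)
```

Under this assumed definition, a positive gain would indicate that the explanation helped the reader anticipate the model, while a gain near zero or below would indicate that it did not.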

Scarpato, N., Nourbakhsh, A., Ferroni, P., Riondino, S., Roselli, M., Fallucchi, F., et al. (2024). Evaluating Explainable Machine Learning Models for Clinicians. COGNITIVE COMPUTATION, 16(4), 1436-1446 [10.1007/s12559-024-10297-x].

Scarpato N.; Nourbakhsh A.; Ferroni P.; Riondino S.; Roselli M.; Fallucchi F.; Barbanti P.; Guadagni F.; Zanzotto F. M.

2024-05-31

Publication date: 31 May 2024
Publication status: Published
Article DOI: https://dx.doi.org/10.1007/s12559-024-10297-x
Relevance: International relevance
Type: Article
Peer review: Anonymous experts
Disciplinary sector of the article (valid from 9 May 2024): Sector IINF-05/A - Information processing systems
Content language: English
Keywords: Explainable artificial intelligence; Machine learning; Precision medicine; Artificial intelligence; Feature importance
All authors: Scarpato, N; Nourbakhsh, A; Ferroni, P; Riondino, S; Roselli, M; Fallucchi, F; Barbanti, P; Guadagni, F; Zanzotto, FM
Typology: Journal article
Appears in types: 01 - Journal article

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/389006
