Cardellini, V., Fanfarillo, A., Filippone, S. (2014). Sparse matrix computations on clusters with GPGPUs. In Proceedings of the 2014 International Conference on High Performance Computing and Simulation (HPCS 2014) (pp. 23-30). IEEE.
Sparse matrix computations on clusters with GPGPUs
Cardellini, Valeria; Filippone, Salvatore
2014-01-01
Abstract
Hybrid nodes containing GPUs are rapidly becoming the norm in parallel machines. We have conducted some experiments regarding how to plug GPU-enabled computational kernels into PSBLAS, an MPI-based library specifically geared towards sparse matrix computations. In this paper, we present our findings on which strategies are more promising in the quest for the optimal compromise among raw performance, speedup, software maintainability, and extensibility. We consider several solutions for implementing the data exchange with the GPU, focusing on data access and transfer, and present an experimental evaluation on a cluster system with up to two GPUs per node. In particular, we compare the pinned memory and the Open MPI approaches, which are the two most widely used alternatives for multi-GPU communication in a cluster environment. We find that Open MPI turns out to be the best solution for large data transfers, while the pinned memory approach is still a good solution for small transfers between GPUs.