An analytical model for a parallel fault-tolerant computing system

We present an analytical model of a parallel computing system. Since the probability of fault occurrence is non-negligible, the model accounts for fault tolerance by combining results obtained from a performance model with a fault/repair model. To this end, the system performance must be evaluated under several different configurations, caused by the occurrence of faults and repairs. This requires efficient solution techniques for the performance model. The model we adopt is based on an extended queueing network. The queueing network includes a fork/join subnetwork with finite capacity, and three different blocking models to manage saturation conditions: Blocking Before Service (BBS), Repetitive Service, and Blocking After Service. We prove that the underlying Markov process has a particular structure suitable for efficient solution. To show a possible use of such a model, we present numerical results for a particular maintenance policy, looking for the optimal trade-off between the frequency of service interruptions due to repair operations and the need to avoid excessive performance degradation. © 1999 Elsevier Science B.V. All rights reserved.
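As an illustration of how results from a performance model can be combined with a fault/repair model, the following minimal Python sketch weights a per-configuration performance index by the steady-state probability of each configuration. It is not the authors' model: the birth-death fault/repair chain with a single repair facility, the function names, and the rates and throughput values are all assumptions chosen for illustration. In the setting the abstract describes, the per-configuration performance would instead come from solving the queueing network.

```python
def steady_state_working(n, fail_rate, repair_rate):
    """Steady-state distribution of the number of working processors.

    Assumed (hypothetical) fault/repair model: each of the k working
    processors fails independently at rate fail_rate, and a single repair
    facility restores one processor at a time at rate repair_rate.
    Returns a list probs where probs[k] = P(k processors are working).
    """
    weights = [0.0] * (n + 1)
    weights[n] = 1.0  # unnormalized weight of the fully working state
    for k in range(n, 0, -1):
        # Balance equation: probs[k] * k * fail_rate = probs[k-1] * repair_rate
        weights[k - 1] = weights[k] * k * fail_rate / repair_rate
    total = sum(weights)
    return [w / total for w in weights]


def expected_reward(probs, reward_per_config):
    """Average a per-configuration performance index over the configurations."""
    return sum(p * r for p, r in zip(probs, reward_per_config))


if __name__ == "__main__":
    # Illustrative numbers only: 4 processors and made-up throughput values
    # for 0..4 working processors (not results from the paper).
    probs = steady_state_working(4, fail_rate=0.01, repair_rate=1.0)
    throughput = [0.0, 0.9, 1.7, 2.4, 3.0]
    print(expected_reward(probs, throughput))
```

The separation into two functions mirrors the decomposition in the abstract: the fault/repair side only yields configuration probabilities, so the performance side can be solved once per configuration and reused.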

De Nitto Persone', V., Grassi, V. (1999). An analytical model for a parallel fault-tolerant computing system. PERFORMANCE EVALUATION, 38(3-4), 201-218 [10.1016/S0166-5316(99)00047-4].

De Nitto Persone', Vittoria; Grassi, Vincenzo

1999-01-01

Publication date: 1999
Publication status: Published
Article DOI: https://dx.doi.org/10.1016/S0166-5316(99)00047-4
Relevance: International
Type: Article
Referee: Anonymous experts
Disciplinary sector of the article (valid until 24/06/2024): Settore ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI (Information Processing Systems)
Content language: English
ISI Impact Factor: With ISI Impact Factor
Keywords: parallel system; fork-join; blocking; performability; maintenance policy
Citation: De Nitto Persone', V., Grassi, V. (1999). An analytical model for a parallel fault-tolerant computing system. PERFORMANCE EVALUATION, 38(3-4), 201-218 [10.1016/S0166-5316(99)00047-4].
All authors: De Nitto Persone', V; Grassi, V
Publication type: Journal article
Appears in types: 01 - Journal article

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/49493
