We present an analytical model of a parallel computing system. Since the probability of fault occurrence is non-negligible, the model accounts for fault tolerance by combining results obtained from a performance model with a fault/repair model. For this purpose, the system performance must be evaluated under several different configurations, which arise from the occurrence of faults and repairs. This requires efficient solution techniques for the performance model. The model we adopt is based on an extended queueing network. The queueing network includes a fork/join subnetwork with finite capacity, and three different blocking models to manage saturation conditions: Blocking Before Service (BBS), Repetitive Service (RS), and Blocking After Service (BAS). We prove that the underlying Markov process has a particular structure suitable for efficient solution. To show a possible use of such a model, we present numerical results for a particular maintenance policy, looking for the optimal trade-off between the frequency of service interruptions due to repair operations and the need to avoid excessive performance degradation. © 1999 Elsevier Science B.V. All rights reserved.

De Nitto Personè, V., Grassi, V. (1999). An analytical model for a parallel fault-tolerant computing system. Performance Evaluation, 38(3-4), 201-218 [10.1016/S0166-5316(99)00047-4].

An analytical model for a parallel fault-tolerant computing system

De Nitto Personè, Vittoria; Grassi, Vincenzo
1999-01-01

1999
Published
International relevance
Article
Anonymous peer review
Sector ING-INF/05 - Information Processing Systems
English
With ISI Impact Factor
parallel system; fork-join; blocking; performability; maintenance policy
De Nitto Personè, V.; Grassi, V.
Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/49493
Citations
  • PMC: not available
  • Scopus: 4
  • ISI: 3