We present an analytical model of a parallel computing system. Since the probability of fault occurrence is non-negligible, the model accounts for fault tolerance by combining results obtained from a performance model with a fault/repair model. To this end, the system performance must be evaluated under several different configurations, caused by the occurrence of faults and repairs. This requires efficient solution techniques for the performance model. The model we adopt is based on an extended queueing network. The queueing network includes a fork/join subnetwork with finite capacity, and three different blocking models to manage saturation conditions: blocking before service (BBS), repetitive service, and blocking after service. We prove that the underlying Markov process has a particular structure suitable for efficient solution. To show a possible use of such a model, we present numerical results for a particular maintenance policy, looking for the optimal trade-off between the frequency of service interruptions due to repair operations and the need to avoid excessive performance degradation. © 1999 Elsevier Science B.V. All rights reserved.
De Nitto Personè, V., Grassi, V. (1999). An analytical model for a parallel fault-tolerant computing system. Performance Evaluation, 38(3-4), 201-218 [10.1016/S0166-5316(99)00047-4].
An analytical model for a parallel fault-tolerant computing system
De Nitto Personè, Vittoria; Grassi, Vincenzo
1999-01-01