This paper describes a novel architecture for a fault-tolerant Solid State Mass Memory (SSMM) for satellite applications. Mass memories with low latency, high throughput, and high storage capacity cannot easily be implemented using space-qualified components, because this kind of component inevitably lags behind current technology. For this reason, Commercial Off The Shelf (COTS) components are mandatory for this application. The design of an electronic system for space applications based on commercial components must therefore meet the reliability requirements using system-level methodologies [1], [2]. In the proposed architecture, error-correcting codes are used to strengthen the commercial Dynamic Random Access Memory (DRAM) chips, while the system controller is developed by applying fault-tolerant design solutions. The main features of the SSMM are its dynamic reconfiguration capability and its high performance, which can be gracefully reduced in case of permanent faults while maintaining part of the system functionality. This paper presents the system design methodology, the architecture, and the simulation results of the SSMM. The building blocks are described in detail, both in their functionality and in their fault-tolerance capabilities. A detailed analysis of the system reliability and data integrity is reported. The graceful degradation capability of the system allows different levels of acceptable performance, in terms of active I/O link Interfaces and storage capacity. The results also show that the overall reliability of the SSMM is almost the same for different Reed-Solomon (RS) coding schemes, which allows the coding to be dynamically reconfigured either to reduce latency (shorter codewords) or to improve data integrity (longer codewords). A scrubbing technique can be useful if a high Single Event Upset (SEU) rate is expected, or if the data must be stored in the SSMM for a long period.
The reported simulations show the behavior of the SSMM in the presence of permanent and transient faults. In particular, they show that the SCU is able to recover from transient faults, and that hard faults can also be tolerated by using a spare microcontroller. The distributed file system confines the effects of unrecoverable faults to a single I/O Interface, so the SSMM maintains its capability to store and read data. The proposed system makes it possible to obtain an SSMM characterized by high reliability and high speed, the latter due to the intrinsic parallelism of the switching matrix.
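The trade-off between shorter and longer RS codewords mentioned above can be illustrated with a small calculation. The following is a minimal sketch, not the analysis used in the paper: it assumes independent symbol errors with a fixed probability `p`, and the code parameters (36, 32) and (72, 64) are hypothetical values chosen only because they share the same code rate. It relies on the standard property that an RS(n, k) code corrects up to floor((n - k) / 2) symbol errors per codeword.

```python
from math import comb


def rs_correctable_symbols(n: int, k: int) -> int:
    """Number of symbol errors an RS(n, k) code can correct per codeword."""
    return (n - k) // 2


def codeword_failure_probability(n: int, k: int, p: float) -> float:
    """Probability that a codeword holds more symbol errors than it can correct.

    Assumption (not from the paper): each of the n symbols is corrupted
    independently with probability p, so the error count is binomial.
    """
    t = rs_correctable_symbols(n, k)
    # Sum the binomial tail directly to avoid the precision loss of 1 - P(ok).
    return sum(
        comb(n, i) * p**i * (1.0 - p) ** (n - i)
        for i in range(t + 1, n + 1)
    )


# Hypothetical parameters: two codes with the same rate, different lengths.
symbol_error_probability = 1e-3
for n, k in [(36, 32), (72, 64)]:
    failure = codeword_failure_probability(n, k, symbol_error_probability)
    print(f"RS({n},{k}): corrects {rs_correctable_symbols(n, k)} symbols, "
          f"failure probability {failure:.3e}")
```

Under these assumptions the longer codeword is less likely to be uncorrectable at the same code rate, at the cost of having to read a longer block before decoding, which is consistent with the latency versus data integrity trade-off stated in the abstract.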

Cardarilli, G.C., Leandri, A., Marinucci, P., Ottavi, M., Pontarelli, S., Re, M., et al. (2003). Design of a fault tolerant solid state mass memory. IEEE TRANSACTIONS ON RELIABILITY, 52(4), 476-491 [10.1109/TR.2003.821938].

Design of a fault tolerant solid state mass memory

CARDARILLI, GIAN CARLO;OTTAVI, MARCO;PONTARELLI, SALVATORE;RE, MARCO;SALSANO, ADELIO
2003-01-01

Published
International relevance
Article
Anonymous expert reviewers
Sector ING-INF/01 - Electronics
English
With ISI Impact Factor
Cardarilli, Gc; Leandri, A; Marinucci, P; Ottavi, M; Pontarelli, S; Re, M; Salsano, A
Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/93627