Threshold-based reconfiguration strategies for gracefully degradable parallel computations / Colajanni, Michele; Grassi, V.; Angelaccio, M. - In: JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING. - ISSN 0743-7315. - Print. - 55 (1):(1998), pp. 138-151.

Threshold-based reconfiguration strategies for gracefully degradable parallel computations

COLAJANNI, Michele
1998

Abstract

Faults are likely in multicomputers with hundreds or thousands of nodes, and they can be handled with hardware or software fault-tolerant approaches. This paper presents a unifying model that describes software reconfiguration strategies for parallel applications with a regular computational pattern. We show that most existing strategies can be obtained as instances of the proposed threshold-based reconfiguration meta-algorithm. Moreover, this approach helps to discover several as yet unexplored strategies, among which we consider the class of adaptive threshold-based strategies. The performance optimization analysis demonstrates that these strategies, applied to data-parallel regular computations, give optimal results for worst-case fault patterns. A wide spectrum of simulations, in which the system parameters are set to those of actual multicomputers, confirms that adaptive threshold-based strategies yield the most stable performance for a variety of workloads, independently of the number and pattern of faults.
Year: 1998
Volume (issue): 55 (1)
First page: 138
Last page: 151
Colajanni, Michele; Grassi, V.; Angelaccio, M.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/449885