Checkpointing is a very well known mechanism to achieve fault tolerance. In distributed applications, a local checkpoint is useful for fault tolerance purposes only if can belong to at least one consistent global checkpoint and then, execution can be restarted from it without needing to roll back the execution in the past. The paper introduces a theoretical framework that facilitates the definition and the analysis of distributed checkpoint algorithms to avoid roll backpropagation. On this base, several algorithms are presented and evaluated in a set of testbed applications.
Distributed checkpoint algorithms to avoid roll-back propagation / Zambonelli, F.. - 1:(1998), pp. 403-410. (Intervento presentato al convegno 24th EUROMICRO Conference, EURMIC 1998 tenutosi a swe nel 1998) [10.1109/EURMIC.1998.711833].
Distributed checkpoint algorithms to avoid roll-back propagation
Zambonelli F.
1998
Abstract
Checkpointing is a very well known mechanism to achieve fault tolerance. In distributed applications, a local checkpoint is useful for fault tolerance purposes only if can belong to at least one consistent global checkpoint and then, execution can be restarted from it without needing to roll back the execution in the past. The paper introduces a theoretical framework that facilitates the definition and the analysis of distributed checkpoint algorithms to avoid roll backpropagation. On this base, several algorithms are presented and evaluated in a set of testbed applications.Pubblicazioni consigliate
I metadati presenti in IRIS UNIMORE sono rilasciati con licenza Creative Commons CC0 1.0 Universal, mentre i file delle pubblicazioni sono rilasciati con licenza Attribuzione 4.0 Internazionale (CC BY 4.0), salvo diversa indicazione.
In caso di violazione di copyright, contattare Supporto Iris