A survey of rollback-recovery protocols in message-passing systems
BAR fault tolerance for cooperative services
ACM Computing Surveys
Proceedings of the twentieth ACM symposium on Operating systems principles - SOSP ’05
Carl Porth
Jean-Philippe Martin
Mike Dahlin
Allen Clement
Amitanand S. Aiyer
David B. Johnson
Beyond Nash equilibrium
Globally precise-restartable execution of parallel programs