23rd Iranian Conference on Electrical Engineering - ICEE2015 , 2015-05-10

Title : ( A Diskless Checkpointing Approach for Failure Recovery in Multiprocessor Safety-Critical Embedded Systems )

Authors: Sima Nokarizi , Yasser Sedaghat , Reza Ramezani

Citation: BibTeX | EndNote

Abstract

Backward recovery is one of the most important error recovery techniques in safety-critical systems, and it usually relies on nonvolatile memory. Since storing checkpoints on hard disks, a form of nonvolatile memory, imposes considerable timing overhead on the system, diskless checkpointing is a good solution for low-cost fault tolerance in parallel and distributed systems. This paper proposes an algorithm that can recover a multiprocessor system from failure when up to half of the processors fail simultaneously. In contrast to many existing works, each processor can host more than one task. By grouping tasks and coding checkpoints, the algorithm also eliminates the need for hard disks or other nonvolatile storage to hold checkpoints. Simulation results show that the proposed algorithm can recover the system when up to half of the processors fail simultaneously, without using any extra dedicated checkpointing processor. Compared to existing approaches, the presented method also requires fewer processors.
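
The abstract does not give the details of the coding scheme, so the following is only a minimal sketch of the general idea behind coded, diskless checkpoints, not the authors' algorithm. It assumes a single group of equal-length in-memory checkpoints protected by one XOR parity block, which tolerates the loss of one checkpoint per group; the function names and sample data are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("checkpoints must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode_parity(checkpoints: list[bytes]) -> bytes:
    """Parity checkpoint for one group: XOR of all member checkpoints."""
    return reduce(xor_bytes, checkpoints)


def recover(survivors: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single lost checkpoint from the survivors and the parity."""
    return reduce(xor_bytes, survivors, parity)


# Hypothetical in-memory checkpoints of three tasks in one group.
group = [b"task-A:0017", b"task-B:0042", b"task-C:0099"]
parity = encode_parity(group)

# Suppose the processor holding task-B's checkpoint fails.
restored = recover([group[0], group[2]], parity)
```

Because the parity block lives in the memory of another processor rather than on disk, recovery needs no nonvolatile storage. Tolerating more simultaneous failures, as the paper claims, would require a stronger code or a different grouping than this single-parity sketch.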

Keywords

Fault Tolerance, Backward Recovery, Multiprocessor Error Recovery, Diskless Checkpointing.

@inproceedings{paperid:1049800,
author = {Nokarizi, Sima and Sedaghat, Yasser and Ramezani, Reza},
title = {A Diskless Checkpointing Approach for Failure Recovery in Multiprocessor Safety-Critical Embedded Systems},
booktitle = {23rd Iranian Conference on Electrical Engineering - ICEE2015},
year = {2015},
location = {Tehran, Iran},
keywords = {Fault Tolerance; Backward Recovery; Multiprocessor Error Recovery; Diskless Checkpointing.},
}


%0 Conference Proceedings
%T A Diskless Checkpointing Approach for Failure Recovery in Multiprocessor Safety-Critical Embedded Systems
%A Nokarizi, Sima
%A Sedaghat, Yasser
%A Ramezani, Reza
%J 23rd Iranian Conference on Electrical Engineering - ICEE2015
%D 2015
