Journal of Computer Systems, Networks, and Communications
Volume 2009 (2009), Article ID 409873, 8 pages
http://dx.doi.org/10.1155/2009/409873
Research Article

A Novel Low-Overhead Recovery Approach for Distributed Systems

Computer Science Department, Southern Illinois University, Carbondale, IL 62901, USA

Received 24 November 2008; Accepted 31 March 2009

Academic Editor: Urs Bapst

Copyright © 2009 B. Gupta and S. Rahimi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. R. Koo and S. Toueg, “Checkpointing and rollback-recovery for distributed systems,” IEEE Transactions on Software Engineering, vol. 13, no. 1, pp. 23–31, 1987.
  2. Y. M. Wang, A. Lowry, and W. K. Fuchs, “Consistent global checkpoints based on direct dependency tracking,” Information Processing Letters, vol. 50, no. 4, pp. 223–230, 1994.
  3. K. M. Chandy and L. Lamport, “Distributed snapshots: determining global states of distributed systems,” ACM Transactions on Computer Systems, vol. 3, no. 1, pp. 63–75, 1985.
  4. Y.-M. Wang, “Consistent global checkpoints that contain a given set of local checkpoints,” IEEE Transactions on Computers, vol. 46, no. 4, pp. 456–468, 1997.
  5. B. Gupta, S. K. Banerjee, and B. Liu, “Design of new roll-forward recovery approach for distributed systems,” IEE Proceedings: Computers & Digital Techniques, vol. 149, no. 3, pp. 105–112, 2002.
  6. B. Gupta, S. Rahimi, and Z. Liu, “Novel low-overhead roll-forward recovery scheme for distributed systems,” IET Computers & Digital Techniques, vol. 1, no. 4, pp. 397–404, 2007.
  7. D. Manivannan and M. Singhal, “Asynchronous recovery without using vector timestamps,” Journal of Parallel and Distributed Computing, vol. 62, no. 12, pp. 1695–1728, 2002.
  8. E. N. Elnozahy, D. B. Johnson, and W. Zwaenepoel, “The performance of consistent checkpointing,” in Proceedings of the 11th Symposium on Reliable Distributed Systems (RELDIS '92), pp. 86–95, Houston, Tex, USA, October 1992.
  9. G. Cao and M. Singhal, “On coordinated checkpointing in distributed systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 9, no. 12, pp. 1213–1225, 1998.
  10. G. Cao and M. Singhal, “Mutable checkpoints: a new checkpointing approach for mobile computing systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 2, pp. 157–172, 2001.
  11. S. Venkatesan, T. T.-Y. Juang, and S. Alagar, “Optimistic crash recovery without changing application messages,” IEEE Transactions on Parallel and Distributed Systems, vol. 8, no. 3, pp. 263–271, 1997.
  12. M. Singhal and N. G. Shivaratri, Advanced Concepts in Operating Systems, McGraw-Hill, New York, NY, USA, 1994.
  13. P. Jalote, Fault Tolerance in Distributed Systems, Prentice-Hall, Upper Saddle River, NJ, USA, 1994.
  14. M. L. Powell and D. L. Presotto, “Publishing: a reliable broadcast communication mechanism,” in Proceedings of the 9th ACM Symposium on Operating Systems Principles (SOSP '83), pp. 100–109, Bretton Woods, NH, USA, October 1983.
  15. Q. Jiang, Y. Luo, and D. Manivannan, “An optimistic checkpointing and message logging approach for consistent global checkpoint collection in distributed systems,” Journal of Parallel and Distributed Computing, vol. 68, no. 12, pp. 1575–1589, 2008.
  16. D. B. Johnson and W. Zwaenepoel, “Sender-based message logging,” in Proceedings of the 17th International Symposium on Fault-Tolerant Computing (FTCS '87), pp. 14–19, Pittsburgh, Pa, USA, July 1987.
  17. O. P. Damani and V. K. Garg, “How to recover efficiently and asynchronously when optimism fails,” in Proceedings of the 16th International Conference on Distributed Computing Systems (ICDCS '96), pp. 108–115, Hong Kong, May 1996.
  18. R. E. Strom and S. Yemini, “Optimistic recovery in distributed systems,” ACM Transactions on Computer Systems, vol. 3, no. 3, pp. 204–226, 1985.
  19. T.-Y. Juang and S. Venkatesan, “Efficient algorithm for crash recovery in distributed systems,” in Proceedings of the 10th Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS '90), pp. 349–361, Bangalore, India, December 1990.