VLSI Design
Volume 9 (1999), Issue 3, Pages 271-290

Design and Implementation of Dynamic Load Balancing Algorithms for Rollback Reduction in Optimistic PDES

1Department of Computer Science, University of North Texas, P.O. Box 311366, Denton, TX 76203-1366, USA
2Wireless Networks, Nortel, P.O. Box 833871, Richardson, USA

Received 26 May 1998

Copyright © 1999 Hindawi Publishing Corporation. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


In an optimistic parallel simulation, logical processes (lps) proceed with their computation without any constraints. However, if the computing requirements of different lps are not balanced, or if the processors are not homogeneous, some lps may lag behind in simulation time while others surge forward. In other words, if the simulation clocks of different lps do not progress at the same rate, cascading rollbacks may occur, nullifying the potential benefit of an optimistic parallel discrete event simulation (PDES). Hence it is necessary to balance the computational load on different lps in such a way that their local simulation clocks advance at almost the same rate. In this paper, we propose two algorithms for dynamic load balancing which reduce the number of rollbacks in an optimistic PDES system. Our first algorithm is based on a load transfer mechanism between lps, while the second algorithm, based on the principle of evolutionary strategy, migrates logical processes between several pairs of physical processors. We have implemented both of these algorithms on a cluster of heterogeneous workstations and studied their performance. The experimental results show that the algorithm based on load transfer is effective when the grain size is greater than 10 milliseconds. The algorithm based on process migration yields good performance only for grain sizes of 20 milliseconds or larger. In both of these cases the speedup ranges mostly between and 2 using four processors.
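The balancing goal stated above, that local simulation clocks should advance at almost the same rate, can be illustrated with a small sketch. This is not the paper's algorithm: the `LP` class, the `load` and `speed` fields, the model that an lp's clock rate equals `speed / load`, and the 5% tolerance are all hypothetical assumptions chosen only to show what transferring load between a lagging lp and a leading lp might look like.

```python
from dataclasses import dataclass


@dataclass
class LP:
    """Hypothetical logical process; all fields are illustrative assumptions."""
    name: str
    load: float   # work units needed per unit of simulation time
    speed: float  # work units its processor completes per wall-clock second

    def clock_rate(self) -> float:
        # Assumed model: simulation time advanced per wall-clock second.
        return self.speed / self.load


def balance_pair(fast: LP, slow: LP) -> float:
    """Shift load from the slow lp to the fast lp until their rates match.

    Equal rates require load proportional to speed, so the combined load
    is split in the ratio of the two processor speeds.
    Returns the amount of load moved onto the fast lp.
    """
    total = fast.load + slow.load
    new_fast_load = total * fast.speed / (fast.speed + slow.speed)
    moved = new_fast_load - fast.load
    fast.load = new_fast_load
    slow.load = total - new_fast_load
    return moved


def rebalance(lps: list[LP], tolerance: float = 0.05) -> float:
    """Balance the fastest and slowest lps if their rates differ too much."""
    fastest = max(lps, key=lambda lp: lp.clock_rate())
    slowest = min(lps, key=lambda lp: lp.clock_rate())
    gap = fastest.clock_rate() - slowest.clock_rate()
    if gap <= tolerance * fastest.clock_rate():
        return 0.0  # clocks already advance at nearly the same rate
    return balance_pair(fastest, slowest)


if __name__ == "__main__":
    # Two lps with equal load on processors of different speed.
    lps = [LP("lp0", load=10.0, speed=2.0), LP("lp1", load=10.0, speed=1.0)]
    moved = rebalance(lps)
    print(f"moved {moved:.3f} load units")
    for lp in lps:
        print(lp.name, f"rate={lp.clock_rate():.3f}")
```

In this toy model, a single pairwise transfer equalizes the two clock rates exactly. A real optimistic PDES would have to measure clock progress at run time and account for the cost of the transfer itself, which is where grain size becomes relevant.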