Scientific Programming
Volume 2015, Article ID 860891, 12 pages
http://dx.doi.org/10.1155/2015/860891
Research Article

Global Scheduling Heuristics for Multicore Architecture

1Department of Computer Science and Information Systems, Birla Institute of Technology and Science Pilani, Rajasthan 333031, India
2Department of Electrical Electronics and Instrumentation, Birla Institute of Technology and Science Pilani, Rajasthan 333031, India
3Oracle India Pvt. Ltd., Bangalore, Karnataka 560076, India

Received 22 July 2014; Revised 26 March 2015; Accepted 27 April 2015

Academic Editor: Jan Weglarz

Copyright © 2015 D. C. Kiran et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. A. Vajda, Programming Many-Core Chips, Springer, New York, NY, USA, 2011.
  2. V. S. Pai, P. Ranganathan, and S. V. Adve, “Impact of instruction-level parallelism on multiprocessor performance and simulation methodology,” in Proceedings of the 3rd International Symposium on High-Performance Computer Architecture (HPCA '97), pp. 72–83, February 1997.
  3. W. Stallings, Computer Organization and Architecture, Pearson Education, 8th edition, 2010.
  4. V. S. Pai, P. Ranganathan, H. Abdel-Shafi, and S. Adve, “The impact of exploiting instruction-level parallelism on shared-memory multiprocessors,” IEEE Transactions on Computers, vol. 48, no. 2, pp. 218–226, 1999.
  5. Y. Sazeides, S. Vassiliadis, and J. E. Smith, “The performance potential of data dependence speculation & collapsing,” in Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-29 '96), pp. 238–247, IEEE, Paris, France, December 1996.
  6. D. H. Friendly, S. J. Patel, and Y. N. Patt, “Putting the fill unit to work: dynamic optimizations for trace cache microprocessors,” in Proceedings of the 31st Annual ACM/IEEE International Symposium on Microarchitecture, pp. 173–181, December 1998.
  7. D. A. Patterson, “Reduced instruction set computers,” Communications of the ACM, vol. 28, no. 1, pp. 8–21, 1985.
  8. J. B. Dennis and G. R. Gao, “An efficient pipelined dataflow processor architecture,” in Proceedings of the ACM/IEEE Conference on Supercomputing (Supercomputing '88), pp. 368–373, November 1988.
  9. J. A. Fisher, “The VLIW machine: a multiprocessor for compiling scientific code,” Computer, vol. 17, no. 7, pp. 45–52, 1984.
  10. J. E. Smith and G. S. Sohi, “The microarchitecture of superscalar processors,” Proceedings of the IEEE, vol. 83, no. 12, pp. 1609–1624, 1995.
  11. P. Faraboschi, J. A. Fisher, and C. Young, “Instruction scheduling for instruction level parallel processors,” Proceedings of the IEEE, vol. 89, no. 11, pp. 1638–1658, 2001.
  12. D. Bernstein and M. Rodeh, “Global instruction scheduling for superscalar machines,” in Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 241–255, Toronto, Canada, June 1991.
  13. P. B. Gibbons and S. S. Muchnick, “Efficient instruction scheduling for a pipelined architecture,” SIGPLAN Notices, vol. 21, no. 7, pp. 11–16, 1986.
  14. T. L. Adam, K. M. Chandy, and J. R. Dickson, “A comparison of list schedules for parallel processing systems,” Communications of the ACM, vol. 17, no. 12, pp. 685–690, 1974.
  15. M. C. Golumbic and V. Rainish, “Instruction scheduling beyond basic blocks,” IBM Journal of Research and Development, vol. 34, no. 1, pp. 93–97, 1990.
  16. R. P. Colwell, R. P. Nix, J. J. O'Donnell, D. B. Papworth, and P. K. Rodman, “A VLIW architecture for a trace scheduling computer,” in Proceedings of the 2nd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-II '87), IEEE Computer Society Press, Los Alamitos, Calif, USA, 1987.
  17. M. Lee, P. Tirumalai, and T. Ngai, “Software pipelining and superblock scheduling: compilation techniques for VLIW machines,” in Proceedings of the 26th Hawaii International Conference on System Sciences, vol. 1, pp. 202–213, Wailea, Hawaii, USA, January 1993.
  18. S. A. Mahlke, D. C. Lin, W. Y. Chen, R. E. Hank, and R. A. Bringmann, “Effective compiler support for predicated execution using the hyperblock,” in Proceedings of the 25th Annual International Symposium on Microarchitecture, pp. 45–54, December 1992.
  19. C. McCann, R. Vaswani, and J. Zahorjan, “A dynamic processor allocation policy for multiprogrammed shared-memory multiprocessors,” ACM Transactions on Computer Systems, vol. 11, no. 2, pp. 146–178, 1993.
  20. C. Chekuri, R. Motwani, R. Johnson, B. Ramakrishna Rau, and B. Natarajan, “Profile-driven instruction level parallel scheduling,” HP Laboratories Technical Report HPL-96-16, 1996.
  21. J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, Morgan Kaufmann, San Francisco, Calif, USA, 2011.
  22. D. C. Kiran, S. Gurunarayanan, and J. P. Misra, “Taming compiler to work with multicore processors,” in Proceedings of the International Conference on Process Automation, Control and Computing (PACC '11), pp. 1–6, IEEE, Coimbatore, India, July 2011.
  23. D. C. Kiran, S. Gurunarayanan, F. Khaliq, and A. Nawal, “Compiler efficient and power aware instruction level parallelism for multicore architecture,” in Eco-Friendly Computing and Communication Systems: Proceedings of the International Conference, ICECCS 2012, Kochi, India, August 9–11, 2012, vol. 305 of Communications in Computer and Information Science, pp. 9–17, Springer, Berlin, Germany, 2012.
  24. D. C. Kiran, S. Gurunarayanan, J. P. Misra, and F. Khaliq, “An efficient method to compute static single assignment form for multicore architecture,” in Proceedings of the 1st IEEE International Conference on Recent Advances in Information Technology, pp. 776–781, March 2012.
  25. D. C. Kiran, B. Radheshyam, S. Gurunarayanan, and J. P. Misra, “Compiler assisted dynamic scheduling for multicore processors,” in Proceedings of the IEEE International Conference on Process Automation, Control and Computing (PACC '11), pp. 1–6, IEEE, Coimbatore, India, July 2011.
  26. D. C. Kiran, S. Gurunarayanan, and J. P. Misra, “Compiler driven inter block parallelism for multicore processors,” in Wireless Networks and Computational Intelligence: 6th International Conference on Information Processing, ICIP 2012, Bangalore, India, August 10–12, 2012. Proceedings, vol. 292 of Communications in Computer and Information Science, pp. 426–435, Springer, Berlin, Germany, 2012.
  27. The Jackcc Compiler, http://jackcc.sourceforge.net.
  28. R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck, “Efficiently computing static single assignment form and the control dependence graph,” ACM Transactions on Programming Languages and Systems, vol. 13, no. 4, pp. 461–490, 1991.
  29. J. M. Tendler, J. S. Dodson, J. S. Fields Jr., H. Le, and B. Sinharoy, “POWER4 system microarchitecture,” IBM Journal of Research and Development, vol. 46, no. 1, pp. 5–25, 2002.
  30. C. Cascaval, J. Castanos, M. Denneau et al., “Evaluation of a multithreaded architecture for cellular computing,” in Proceedings of the 8th International Symposium on High Performance Computer Architecture, pp. 311–322, January 2002.
  31. E. Waingold, M. Taylor, D. Srikrishna et al., “Baring it all to software: raw machines,” Computer, vol. 30, no. 9, pp. 86–93, 1997.
  32. W. Lee, R. Barua, M. Frank et al., “Space-time scheduling of instruction-level parallelism on a raw machine,” in Proceedings of the 8th ACM Conference on Architectural Support for Programming Languages and Operating Systems, pp. 46–57, 1998.
  33. The gem5 Simulator System, http://www.gem5.org/.
  34. J. Babb, M. Frank, V. Lee et al., “RAW benchmark suite: computation structures for general purpose computing,” in Proceedings of the 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, pp. 134–143, April 1997.
  35. The Raw Benchmark Suite, http://groups.csail.mit.edu/cag/raw/benchmark/.
  36. M. D. Hill and M. R. Marty, “Amdahl's law in the multicore era,” IEEE Computer, vol. 41, no. 7, pp. 33–38, 2008.
  37. D. H. Woo and H.-H. S. Lee, “Extending Amdahl's law for energy-efficient computing in the many-core era,” Computer, vol. 41, no. 12, pp. 24–31, 2008.