Scientific Programming
Volume 2015, Article ID 157305, 10 pages
http://dx.doi.org/10.1155/2015/157305
Research Article

Automated Design Space Exploration with Aspen

Oak Ridge National Laboratory, One Bethel Valley Road, Building 5100, MS-6173 Oak Ridge, TN 37831-6173, USA

Received 22 April 2014; Accepted 2 November 2014

Academic Editor: Roman Wyrzykowski

Copyright © 2015 Kyle L. Spafford and Jeffrey S. Vetter. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. K. L. Spafford and J. S. Vetter, “Aspen: a domain specific language for performance modeling,” in Proceedings of the 24th International Conference for High Performance Computing, Networking, Storage and Analysis (SC '12), pp. 1–11, IEEE, Salt Lake City, Utah, USA, November 2012.
  2. J. J. Yi, L. Eeckhout, D. J. Lilja, B. Calder, L. K. John, and J. E. Smith, “The future of simulation: a field of dreams?” Computer, vol. 39, no. 11, pp. 22–29, 2006.
  3. D. Campbell, D. Cook, and B. Mulvaney, “A streaming sensor challenge problem for ubiquitous high performance computing,” in Proceedings of the 15th Annual Workshop on High Performance Embedded Computing (HPEC '11), November 2011.
  4. L. G. Valiant, “Bridging model for parallel computation,” Communications of the ACM, vol. 33, no. 8, pp. 103–111, 1990.
  5. A. Alexandrov, M. F. Ionescu, K. E. Schauser, and C. Scheiman, “LogGP: incorporating long messages into the LogP model,” in Proceedings of the 7th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA '95), pp. 95–105, July 1995.
  6. D. Culler, R. Karp, D. Patterson et al., “LogP: towards a realistic model of parallel computation,” in Proceedings of the 4th ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, pp. 1–12, May 1993.
  7. M. Curtis-Maury, J. Dzierwa, C. D. Antonopoulos, and D. S. Nikolopoulos, “Online power-performance adaptation of multithreaded programs using hardware event-based prediction,” in Proceedings of the 20th Annual International Conference on Supercomputing (ICS '06), pp. 157–166, Association for Computing Machinery, July 2006.
  8. S.-J. Lee, H.-K. Lee, and P.-C. Yew, “Runtime performance projection model for dynamic power management,” in Advances in Computer Systems Architecture, vol. 4697 of Lecture Notes in Computer Science, pp. 186–197, Springer, Berlin, Germany, 2007.
  9. D. Snowdon, G. Van Der Linden, S. Petters, and G. Heiser, “Accurate runtime prediction of performance degradation under frequency scaling,” in Proceedings of the Workshop on Operating Systems Platforms for Embedded Real-Time Applications, 2007.
  10. S. Song and K. W. Cameron, “System-level power-performance efficiency modeling for emergent GPU architectures,” in Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques (PACT '12), pp. 473–474, ACM, September 2012.
  11. S. Song, M. Grove, and K. W. Cameron, “An iso-energy-efficient approach to scalable system power-performance optimization,” in Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER '11), pp. 262–271, September 2011.
  12. S. Song, C.-Y. Su, R. Ge, A. Vishnu, and K. W. Cameron, “Iso-energy-efficiency: an approach to power-constrained parallel computation,” in Proceedings of the 25th IEEE International Parallel and Distributed Processing Symposium (IPDPS '11), pp. 128–139, IEEE, May 2011.
  13. B. Rountree, D. K. Lowenthal, M. Schulz, and B. R. de Supinski, “Practical performance prediction under dynamic voltage frequency scaling,” in Proceedings of the International Green Computing Conference (IGCC '11), July 2011.
  14. B. Rountree, D. K. Lowenthal, S. Funk, V. W. Freeh, B. R. de Supinski, and M. Schulz, “Bounding energy consumption in large-scale MPI programs,” in Proceedings of the ACM/IEEE Conference on Supercomputing (SC '07), pp. 49:1–49:9, ACM, November 2007.
  15. B. Rountree, D. K. Lowenthal, B. R. de Supinski, M. Schulz, V. W. Freeh, and T. Bletsch, “Adagio: making DVS practical for complex HPC applications,” in Proceedings of the 23rd International Conference on Supercomputing (ICS '09), pp. 460–469, ACM, Newport Beach, Calif, USA, June 2009.
  16. B. Rountree, D. H. Ahn, B. R. de Supinski, D. K. Lowenthal, and M. Schulz, “Beyond DVFS: a first look at performance under a hardware-enforced power bound,” in Proceedings of the IEEE 26th International Parallel and Distributed Processing Symposium Workshops (IPDPSW '12), pp. 947–953, Shanghai, China, May 2012.
  17. H. Gahvari and W. Gropp, “An introductory exascale feasibility study for FFTs and multigrid,” in Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS '10), pp. 1–9, IEEE, Atlanta, Ga, USA, April 2010.
  18. A. Bhatele, P. Jetley, H. Gahvari, L. Wesolowski, W. D. Gropp, and L. Kalé, “Architectural constraints to attain 1 exaflop/s for three scientific application classes,” in Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS '11), pp. 80–91, IEEE, Anchorage, Alaska, USA, May 2011.
  19. K. Czechowski, C. Battaglino, C. McClanahan, K. Iyer, P.-K. Yeung, and R. Vuduc, “On the communication complexity of 3D FFT and its implications for exascale,” in Proceedings of the 26th ACM International Conference on Supercomputing (ICS '12), pp. 205–214, June 2012.
  20. K. S. Chatha and R. Vemuri, “An iterative algorithm for hardware-software partitioning, hardware design space exploration and scheduling,” Design Automation for Embedded Systems, vol. 5, no. 3, pp. 281–293, 2000.
  21. E. Sotiriades and A. Dollas, “Design space exploration for the BLAST algorithm implementation,” in Proceedings of the 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '07), pp. 323–325, April 2007.
  22. A. Stammermann, L. Kruse, W. Nebel et al., “System level optimization and design space exploration for low power,” in Proceedings of the 14th International Symposium on System Synthesis (ISSS '01), pp. 142–146, ACM, October 2001.
  23. J. Keinert, M. Streubuhr, T. Schlichter et al., “SystemCoDesigner: an automatic ESL synthesis approach by design space exploration and behavioral synthesis for streaming applications,” ACM Transactions on Design Automation of Electronic Systems, vol. 14, no. 1, article 1, 2009.
  24. K. Lahiri, A. Raghunathan, and S. Dey, “Efficient exploration of the SoC communication architecture design space,” in Proceedings of the IEEE/ACM International Conference on Computer Aided Design (ICCAD '00), pp. 424–430, IEEE, Piscataway, NJ, USA, 2000.
  25. K. Lahiri, A. Raghunathan, and S. Dey, “Design space exploration for optimizing on-chip communication architectures,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 23, no. 6, pp. 952–961, 2004.
  26. M. Palesi and T. Givargis, “Multi-objective design space exploration using genetic algorithms,” in Proceedings of the 10th International Symposium on Hardware/Software Codesign (CODES '02), pp. 67–72, May 2002.
  27. H. P. Peixoto and M. F. Jacome, “Algorithm and architecture-level design space exploration using hierarchical data flows,” in Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP '97), pp. 272–282, July 1997.
  28. P. Mishra, N. Dutt, and A. Nicolau, “Functional abstraction driven design space exploration of heterogeneous programmable architectures,” in Proceedings of the 14th International Symposium on System Synthesis (ISSS '01), pp. 256–261, ACM, New York, NY, USA, October 2001.
  29. I. Karkowski and H. Corporaal, “Design space exploration algorithm for heterogeneous multi-processor embedded system design,” in Proceedings of the 35th Annual Design Automation Conference (DAC '98), pp. 82–87, San Francisco, Calif, USA, June 1998.
  30. R. Szymanek, F. Catthoor, and K. Kuchcinski, “Time-energy design space exploration for multi-layer memory architectures,” in Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, vol. 1, pp. 318–323, February 2004.
  31. C. L. Janssen, H. Adalsteinsson, and J. P. Kenny, “Using simulation to design extreme-scale applications and architectures: programming model exploration,” ACM SIGMETRICS Performance Evaluation Review, vol. 38, no. 4, pp. 4–8, 2011.
  32. A. F. Rodrigues, K. S. Hemmert, B. W. Barrett et al., “The structural simulation toolkit,” ACM SIGMETRICS Performance Evaluation Review, vol. 38, no. 4, pp. 37–42, 2011.
  33. J. W. Cooley and J. W. Tukey, “An algorithm for the machine calculation of complex Fourier series,” Mathematics of Computation, vol. 19, no. 90, pp. 297–301, 1965.
  34. S. G. Johnson and M. Frigo, “A modified split-radix FFT with fewer arithmetic operations,” IEEE Transactions on Signal Processing, vol. 55, no. 1, pp. 111–119, 2007.
  35. M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran, “Cache-oblivious algorithms,” in Proceedings of the IEEE 40th Annual Conference on Foundations of Computer Science, pp. 285–297, October 1999.
  36. T. Hoefler, W. Gropp, W. Kramer, and M. Snir, “Performance modeling for systematic performance tuning,” in Proceedings of the State of the Practice Reports (SC '11), pp. 6:1–6:12, 2011.
  37. T. P. Runarsson and X. Yao, “Search biases in constrained evolutionary optimization,” IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, vol. 35, no. 2, pp. 233–243, 2005.
  38. S. Johnson, “The NLopt nonlinear optimization package,” http://ab-initio.mit.edu/nlopt.
  39. J. S. Vetter, R. Glassbrook, J. Dongarra et al., “Keeneland: bringing heterogeneous GPU computing to the computational science community,” Computing in Science and Engineering, vol. 13, no. 5, pp. 90–95, 2011.
  40. A. Danalis, G. Marin, C. McCurdy et al., “The scalable heterogeneous computing (SHOC) benchmark suite,” in Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU '10), pp. 63–74, ACM, March 2010.
  41. S. W. Keckler, W. J. Dally, B. Khailany, M. Garland, and D. Glasco, “GPUs and the future of parallel computing,” IEEE Micro, vol. 31, no. 5, pp. 7–17, 2011.
  42. P. Kogge, K. Bergman, S. Borkar et al., “Exascale computing study: technology challenges in achieving exascale systems,” Tech. Rep., DARPA Information Processing Techniques Office, 2008.