Review Article

Energy-Aware High-Performance Computing: Survey of State-of-the-Art Tools, Techniques, and Environments

Table 2

Computing environment.

| Optimization level | Works | Description |
| --- | --- | --- |
| (1) Single device | [33] | A platform based on ARM Cortex-A9 with 4-, 8-, and 16-core architectures |
| | [34] | Scheduling kernels on a GPU and frequency scaling |
| | [35] | A chip with k cores with specific frequencies is considered, and chips with 36 cores are simulated |
| | [36] | Finding the best application configuration and settings on a GPU |
| | [37] | Server-type NVIDIA Tesla K20m/K20c GPUs |
| | [38] | Exploration of thermal-aware scheduling for tasks to minimize peak temperature in a multicore system through selection of core speeds |
| | [39] | Comparison of energy/performance trade-offs for various GPUs |
| | [40] | Server multicore and manycore CPUs, desktop CPU, mobile CPU |
| | [41] | Single CPU under Linux kernel 2.6–11 |
| | [42] | Intel Xeon Phi KNL 7250 computing platform, flat memory mode |
| | [43] | Exploration of execution time and energy on a multicore Intel Xeon CPU |
| (2) Multiprocessor system | [44] | Task scheduling with thermal consideration for a heterogeneous real-time multiprocessor system-on-chip (MPSoC) system |
| | [30] | Presents PUPiL (Performance under Power Limits), a hybrid software/hardware power capping system based on a decision framework that goes through nodes and makes decisions on configuration; considered for single- and multi-application scenarios (cooperative and oblivious applications) |
| | [45] | With notes specific to clusters |
| | [14] | Systems with 2-socket Westmere-EP, 2-socket Sandy Bridge-EP, and 1-socket Ivy Bridge-HE CPUs |
| | [46] | Dual-socket server with two Intel Xeon CPUs |
| (3) Cluster | [47] | Proposes integration of power limitation into a job scheduler and implementation in SLURM |
| | [48] | Proposes the enhanced power adaptive scheduling (E-PAS) algorithm with integration of a power-aware approach into SLURM for limiting power consumption |
| | [49] | Approach applicable to MPI applications but focusing on states of processes running on CPUs, i.e., reducing power consumption of CPUs on which processes are idle or perform I/O operations |
| | [50] | Proposes a DVFS-aware profiling approach that uses design-time profiling and a nonprofiling approach that performs computations at runtime |
| | [51] | Split compilation with offline and online phases; results from the offline phase are passed to runtime optimization; grey-box approach to autotuning; assumes code annotations |
| | [52] | Proposes a runtime library that performs power-aware optimization at runtime and searches for good configurations with DFS/DCT for application regions |
| | [53] | Approaches for modeling, monitoring, and tracking HPC systems using performance counters and optimization of energy used in a cluster environment with consideration of CPU, memory, disk, and network |
| | [54] | Proposes an energy-saving framework that ranks and correlates counters important for improving energy efficiency |
| | [55, 56] | Energy savings on a cluster with Sandy Bridge processors |
| | [57] | With consideration of disk and network scaling |
| | [58] | Including disk, memory, processor, or even fans |
| | [24] | Analysis of performance vs. power of a 32-node cluster running a NAS parallel benchmark |
| | [59] | A procedure for a single device (a compute node with a CPU); however, it is dedicated to such devices coupled into a cluster (tested on 8-9 nodes) |
| | [60] | Homogeneous multicore cluster |
| | [61] | Cluster |
| | [62] | Computer system with several nodes, each with multicore CPUs |
| | [63, 64] | Cluster with several nodes, each with multicore CPUs |
| | [65] | Cluster with several nodes with CPUs |
| | [66, 67] | Cluster in a data center |
| | [68] | Sandy Bridge cluster |
| | [69] | Cluster with InfiniBand |
| | [70] | Overprovisioned cluster which can run a certain number of nodes at peak power and more at lower power caps |
| | [71] | Cluster with 1056 Dell PowerEdge SC1425 nodes |
| | [72] | A cluster with 9421 servers connected by InfiniBand |
| (4) Grid | [73] | A cluster or collection of clusters allowed in the model and implementation |
| | [74] | Implementations of a hierarchical genetic strategy-based grid scheduler and algorithms evaluated against genetic algorithm variants |
| (5) Cloud | [75] | Meant for cloud storage systems |
| | [76] | Related to assignment of applications to virtual and physical machines |
| | [77] | Used as IaaS for computations |
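
The overprovisioned cluster noted for [70] in the table can be illustrated with simple arithmetic: under a fixed cluster-wide power budget, lowering the per-node power cap allows more nodes to run at once. The sketch below is only an illustration and is not taken from [70]; the function name `nodes_within_budget` and all wattage values are hypothetical assumptions.

```python
def nodes_within_budget(cluster_budget_w: float, per_node_cap_w: float) -> int:
    """Return how many nodes can run at once under a cluster-wide power budget.

    Assumes every node draws exactly its power cap (a simplification).
    """
    if cluster_budget_w < 0 or per_node_cap_w <= 0:
        raise ValueError("budget must be non-negative and cap must be positive")
    return int(cluster_budget_w // per_node_cap_w)


# Hypothetical values, chosen only for illustration.
CLUSTER_BUDGET_W = 10_000.0   # total power available to the cluster
PEAK_NODE_W = 250.0           # assumed per-node draw at peak power

for cap_w in (PEAK_NODE_W, 200.0, 125.0):
    count = nodes_within_budget(CLUSTER_BUDGET_W, cap_w)
    print(f"cap {cap_w:6.1f} W -> {count} nodes")
```

With these assumed numbers, 40 nodes fit at peak power, while capping each node at 125 W lets 80 nodes run within the same budget. Whether the extra nodes improve throughput depends on how much each application slows down under the cap, which this sketch does not model.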