|
Technology/API | Abstraction level/group | Programming model | Programming language | Supported platforms/target parallel system | License/standard |
|
OpenMP | Library | Multithreaded application | C/C++/Fortran | Heterogeneous system with CPU(s), accelerators including GPU(s) [54], supported by, e.g., gcc | OpenMP is a standard [7] |
|
CUDA | Library | CUDA model: computations launched as kernels executed by multiple threads grouped into blocks; global and shared memory on the GPU, as well as host memory, for data management | C | Server or workstation with 1+ NVIDIA GPU(s) | Proprietary NVIDIA solution, NVIDIA EULA [8] |
|
OpenCL | Library | OpenCL model: computations launched as kernels executed by multiple work items grouped into work groups; memory objects for data management | C/C++ | Heterogeneous platform including CPUs, GPUs from various vendors, FPGAs, etc., supported by, e.g., gcc | OpenCL is a standard [9] |
|
Pthreads | Library | Multithreaded application; provides thread management routines and synchronization mechanisms including mutexes and condition variables | C | Widely available on UNIX platforms; implementations include, e.g., NPTL | Part of the POSIX standard |
|
OpenACC | Library | Multithreaded application | C/C++/Fortran | Heterogeneous architectures, e.g., a server or workstation with x86/POWER + NVIDIA GPUs, supported by compilers such as PGI, gcc, accULL, etc. | OpenACC is a standard [10] |
|
Java Concurrency | JVM [14] specific | Multithreaded application | Java, Scala | Server, workstation, mobile device | Open standards: [12, 13] |
|
TCP/IP | Network stack | Multiprocess | C, Fortran, C++, Java, and others | Cluster, server, workstation, mobile device, and others | TCP/IP [15] is a standard broadly implemented by OS developers |
|
RDMA | Network stack | Multiprocess | C | Cluster | RDMA [17] is a standard implemented over InfiniBand and Converged Ethernet protocols |
|
UCX | Network stack | Multiprocess, multithreaded | C, Java, Python | Cluster, server, workstation | UCX [21] is a set of network APIs with a reference implementation |
|
MPI | Library | Multiprocess, also multithreaded if the implementation supports it | C/Fortran | Cluster, server, workstation | MPI is a standard [26]; several implementations are available, e.g., Open MPI and MPICH |
|
OpenSHMEM | Library | Multiprocess application | C, Fortran | Cluster | Open standard with reference implementation |
|
PCJ | Java library | Multiprocess application | Java | Cluster | Open source Java library [29] |
|
Apache Hadoop | Set of applications | YARN-managed resource negotiation, multiprocess MapReduce tasks [41] | Core functionality in Java, also C, Bash, and others | Cluster, server, workstation | Open source implementation of Google’s MapReduce [40], Apache Software License (ASL) 2.0 |
|
Apache Spark | Set of applications | Resource negotiation based on the selected resource manager (YARN, Spark Standalone, etc.), executors run workers in threads [49] | Scala | Cluster, server, workstation | Apache Software License (ASL) 2.0 [55] |
|
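
The thread-level synchronization that the table lists for Pthreads (mutexes and condition variables) can be illustrated with a small bounded producer-consumer buffer. The sketch below is only an illustration, not code from any of the surveyed technologies: it uses Python's standard `threading` module as a stand-in, since its `Lock` and `Condition` objects play roles analogous to a Pthreads mutex and condition variable. The names `BoundedBuffer` and `run_pipeline` are hypothetical and introduced here only for the example.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Fixed-capacity FIFO guarded by one mutex and two condition variables."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self._lock = threading.Lock()                       # the mutex
        self._not_full = threading.Condition(self._lock)    # notified when space frees up
        self._not_empty = threading.Condition(self._lock)   # notified when an item arrives

    def put(self, item):
        with self._not_full:                                # acquires the shared mutex
            while len(self._items) >= self._capacity:
                self._not_full.wait()                       # releases the mutex while waiting
            self._items.append(item)
            self._not_empty.notify()

    def get(self):
        with self._not_empty:
            while not self._items:
                self._not_empty.wait()
            item = self._items.popleft()
            self._not_full.notify()
            return item


def run_pipeline(values, capacity=2):
    """Run one producer and one consumer thread; return the sum of the values."""
    buffer = BoundedBuffer(capacity)
    sentinel = object()                                     # marks the end of the stream
    result = []

    def producer():
        for value in values:
            buffer.put(value)
        buffer.put(sentinel)

    def consumer():
        total = 0
        while True:
            item = buffer.get()
            if item is sentinel:
                break
            total += item
        result.append(total)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return result[0]
```

Calling `run_pipeline(range(100))` pushes the values through a two-slot buffer. The `while` loops around `wait()` recheck the predicate after every wake-up, which is the same pattern recommended for condition variables in Pthreads.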