Scientific Programming
Volume 22, Issue 2, Pages 125-139
http://dx.doi.org/10.3233/SPR-140384

Exploring the Future of Out-of-Core Computing with Compute-Local Non-Volatile Memory

Myoungsoo Jung,1 Ellis H. Wilson III,2 Wonil Choi,1,2 John Shalf,3,4 Hasan Metin Aktulga,3 Chao Yang,3 Erik Saule,5 Umit V. Catalyurek,5,6 and Mahmut Kandemir2

1Department of Electrical Engineering, The University of Texas at Dallas, Richardson, TX, USA
2Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA
3Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
4National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
5Biomedical Informatics, The Ohio State University, Columbus, OH, USA
6Electrical and Computer Engineering, The Ohio State University, Columbus, OH, USA

Copyright © 2014 Hindawi Publishing Corporation. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Much as general-purpose graphics processing units (GPGPUs) have risen as accelerators for specific high-performance computing (HPC) workloads, non-volatile memory (NVM) is increasingly used as an accelerator for I/O-intensive scientific applications. However, existing work has explored the use of NVM within dedicated I/O nodes, which are distant from the compute nodes that actually need such acceleration. As NVM bandwidth begins to outpace point-to-point network capacity, we argue for a break from the archetype of completely separated storage. In this work, we therefore investigate co-location of NVM and compute by varying I/O interfaces, file systems, types of NVM, and both current and future SSD architectures, uncovering numerous bottlenecks implicit in these levels of the I/O stack. We present novel hardware and software solutions, including the new Unified File System (UFS), to enable fuller utilization of compute-local NVM storage. Our experimental evaluation, which employs a real-world Out-of-Core (OoC) HPC application, demonstrates throughput increases in excess of an order of magnitude over current approaches.