Scientific Programming
Volume 21, Issue 3-4, Pages 109-121
http://dx.doi.org/10.3233/SPR-130368

MPI Runtime Error Detection with MUST: Advances in Deadlock Detection

Tobias Hilbrich,1 Joachim Protze,1,3,4 Martin Schulz,2 Bronis R. de Supinski,2 and Matthias S. Müller1,3,4

1Technische Universität Dresden, Dresden, Germany
2Lawrence Livermore National Laboratory, Livermore, CA, USA
3RWTH Aachen University, Aachen, Germany
4JARA – High Performance Computing, Aachen, Germany

Copyright © 2013 Hindawi Publishing Corporation. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The widely used Message Passing Interface (MPI) is complex and rich. As a result, application developers require automated tools to avoid and detect MPI programming errors. We present the Marmot Umpire Scalable Tool (MUST), which detects such errors with significantly increased scalability. We present improvements to our graph-based deadlock detection approach for MPI that cover future MPI extensions. Our enhancements also check complex MPI constructs that no previous graph-based detection approach handled correctly. Finally, we present optimizations for the processing of MPI operations that reduce the overhead of runtime deadlock detection. Existing approaches often require 𝒪(p) analysis time per MPI operation for p processes. We empirically observe that our improvements lead to sub-linear or better analysis time per operation for a wide range of real-world applications.
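To make the idea of graph-based deadlock detection concrete, the following is a minimal sketch and not MUST's actual algorithm. It assumes a simplified wait-for model in which each blocked process waits either for all of its targets ("AND", for example a blocking send or a specific-source receive) or for any one of them ("OR", for example a wildcard receive). The function name `find_deadlocked` and the input layout are hypothetical and chosen only for illustration.

```python
def find_deadlocked(waits):
    """Return the sorted ranks that can never be released.

    `waits` maps a process rank to a pair (mode, targets), where mode is
    "AND" (needs every target) or "OR" (needs any one target).
    A rank that has no entry, or an empty target list, is not blocked.
    This layout is an assumption made for this sketch.
    """
    # Collect every rank mentioned, either as a waiter or as a target.
    all_ranks = set(waits)
    for _, targets in waits.values():
        all_ranks.update(targets)

    # Ranks that are not waiting on anything can make progress.
    released = {p for p in all_ranks if p not in waits or not waits[p][1]}

    # Repeatedly release ranks whose wait condition is satisfied.
    changed = True
    while changed:
        changed = False
        for p, (mode, targets) in waits.items():
            if p in released:
                continue
            if mode == "AND":
                ok = all(t in released for t in targets)
            else:  # "OR"
                ok = any(t in released for t in targets)
            if ok:
                released.add(p)
                changed = True

    # Whatever is left waits on a cycle or knot and is deadlocked.
    return sorted(all_ranks - released)


if __name__ == "__main__":
    # Ranks 0 and 1 each wait for the other: a classic send-send cycle.
    print(find_deadlocked({0: ("AND", [1]), 1: ("AND", [0])}))
    # Rank 0 posts a wildcard receive; rank 2 is free, so nothing deadlocks.
    print(find_deadlocked({0: ("OR", [1, 2]), 1: ("AND", [0])}))
```

Note that this sketch rescans every blocked process on each pass, so its cost grows with the number of processes; it illustrates the kind of per-operation 𝒪(p) work that the abstract says existing approaches often incur, not the optimized processing the paper proposes.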