Scientific Programming
Volume 2015, Article ID 576498, 16 pages
http://dx.doi.org/10.1155/2015/576498
Research Article

Optimized Data Transfers Based on the OpenCL Event Management Mechanism

1Tohoku University/JST CREST, Sendai, Miyagi 980-8579, Japan
2Tohoku University, Sendai, Miyagi 980-8578, Japan
3NVIDIA Research, Santa Clara, CA 95050, USA
4The University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA

Received 15 May 2014; Accepted 29 September 2014

Academic Editor: Sunita Chandrasekaran

Copyright © 2015 Hiroyuki Takizawa et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

In standard OpenCL programming, a host is supposed to control its compute devices. Because compute devices are dedicated to kernel computation, only the host can perform certain kinds of data transfers, such as internode communication and file access. These data transfers require the host and the devices to collaborate, so a single host must play two or more roles simultaneously. The code for such data transfers tends to be system-specific, which results in low portability. This paper proposes an OpenCL extension that incorporates such data transfers into the OpenCL event management mechanism. Unlike in the current OpenCL standard, the main thread running on the host does not have to be blocked in order to serialize dependent operations. Hence, an application can easily exploit opportunities to overlap the parallel activities of hosts and compute devices. In addition, the implementation details of the data transfers are hidden behind the extension, so application programmers can use optimized data transfers without resorting to tricky programming techniques. The evaluation results show that the proposed extension can use an optimized data transfer implementation and thereby increase the sustained data transfer performance by about 18% for a real application accessing a big data file.
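The idea of serializing dependent operations through events, rather than by blocking the host's main thread, can be illustrated with a small sketch. This is a plain Python analogy under stated assumptions, not OpenCL code and not the extension proposed in the paper: `Event`, `enqueue`, `device_to_host_copy`, `file_write`, and `run_pipeline` are hypothetical names chosen for illustration only. A file write is made to depend on a device-to-host copy through an event, while the main thread keeps doing its own work.

```python
import threading
import time


class Event:
    """Hypothetical stand-in for an event object (not a real OpenCL API)."""

    def __init__(self):
        self._done = threading.Event()

    def complete(self):
        self._done.set()

    def wait(self):
        self._done.wait()


def enqueue(command, wait_list, log):
    """Schedule `command` to run after all events in `wait_list` complete.

    Returns the command's own event immediately, so the caller never blocks.
    """
    event = Event()

    def run():
        for dependency in wait_list:
            dependency.wait()  # the worker waits, not the main thread
        command(log)
        event.complete()

    threading.Thread(target=run).start()
    return event


def device_to_host_copy(log):
    time.sleep(0.05)  # pretend the transfer takes a while
    log.append("device->host copy")


def file_write(log):
    log.append("file write")


def run_pipeline():
    log = []
    # Both calls return at once; ordering is expressed only through events.
    copy_done = enqueue(device_to_host_copy, [], log)
    write_done = enqueue(file_write, [copy_done], log)

    # The main thread is free to overlap its own work with the transfers.
    host_result = sum(range(1000))

    # Synchronize once, at the point where the results are actually needed.
    write_done.wait()
    return host_result, log


if __name__ == "__main__":
    print(run_pipeline())
```

In this sketch the dependency between the two transfers is carried by `copy_done`, so the main thread waits only once, at the end, instead of after every dependent step.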