International Journal of Biomedical Imaging

Volume 2015 (2015), Article ID 590727, 13 pages

http://dx.doi.org/10.1155/2015/590727

## High Performance GPU-Based Fourier Volume Rendering

Biomedical Engineering Department, Cairo University, Giza 12613, Egypt

Received 18 November 2014; Revised 28 December 2014; Accepted 9 January 2015

Academic Editor: Tzung-Pei Hong

Copyright © 2015 Marwan Abdellah et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Fourier volume rendering (FVR) is a significant visualization technique that has been used widely in digital radiography. With its O(N² log N) time complexity, it provides a faster alternative to spatial-domain volume rendering algorithms, which are computationally more complex at O(N³). Relying on the *Fourier projection-slice theorem*, this technique operates on the spectral representation of a 3D volume instead of processing its spatial representation to generate attenuation-only projections that look like *X-ray radiographs*. Due to the rapid evolution of its underlying architecture, the graphics processing unit (GPU) has become an attractive compute platform that can deliver massive raw computational power compared to the central processing unit (CPU) on a per-dollar basis. The introduction of the compute unified device architecture (CUDA) enables embarrassingly parallel algorithms to run efficiently on CUDA-capable GPU architectures. In this work, a high performance GPU-accelerated implementation of the FVR pipeline on CUDA-enabled GPUs is presented. By executing the rendering pipeline entirely on recent GPU architectures, the proposed implementation achieves a speed-up of 117x over a single-threaded hybrid implementation that uses the CPU and GPU together.

#### 1. Introduction

Volume visualization is an essential tool for exploring and analysing the anatomy of complex structures and phenomena. It has been extensively used in various scientific and engineering arenas such as medical imaging, geoscience, microscopy, mechanical engineering, and others [1–3]. Several volume visualization techniques have been developed and extensively investigated. The two main categories that have gained broad acceptance by scientific communities are *volume rendering* and *surface rendering*. Each technique has its specific areas of application and associated advantages, but also disadvantages that leave room for other techniques to survive [4–6].

Volume rendering has shown great significance for the visual interpretation of the large amounts of 3D scalar and vector data generated by multidimensional sensors, acquisition devices, and supercomputer simulations. It concentrates on visualizing the desired internal features of volumetric objects and their bounding surfaces at the same time. In the literature, several volume rendering algorithms have been presented either to improve the rendering speed for large datasets or to enhance the rendering quality of their reconstructed images. By and large, rendering speed and quality are *traded off*, and no single algorithm exists that can deliver optimum quality together with maximum interactivity [7].

A radical *domain-based* categorization of volume rendering algorithms classifies them into *spatial-domain* and *other-domain-based* techniques, the latter operating in, for example, the frequency domain, the compression domain, or the wavelet domain. The rendering pipeline of spatial-domain techniques runs entirely in that domain. Other-domain-based methods operate by switching part of their computations to a different domain, either to reduce the complexity of the rendering algorithm or to improve the performance of operations that take a considerable amount of time in the spatial domain. For a volume of size N³, the complexity of spatial-domain algorithms is of order O(N³), since all the voxels composing this volume must be visited at least once to render a correct image. Although some algorithms use additional optimization techniques to reduce the number of traversed samples, this optimization is in general data dependent and subject to the size of the datasets. This time complexity limits the usage of spatial-domain rendering algorithms in interactive environments for some applications. In such cases, *frequency-domain-based* techniques can be used alternatively.

Frequency-domain volume rendering (FDVR) uses a 3D spectral representation of the volume to compute an image that looks like an X-ray radiograph in O(N² log N) time, relying on the *projection-slice theorem*. It works by transforming the spatial volume into the frequency domain. It then reconstructs a 2D projection at any viewing angle by resampling an extracted projection-slice along a plane perpendicular to the viewing direction, followed by backtransforming this resampled slice to the spatial domain [8, 9].
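The projection-slice theorem is easy to verify numerically. The following NumPy sketch (illustrative only, and not part of the paper's CUDA pipeline) checks the 2D analogue of the theorem: the central row of an image's 2D FFT equals the 1D FFT of the image's line-integral projection.

```python
# Numerical check of the 2D projection-slice theorem (2D analogue of the
# 3D theorem used by FDVR). Illustrative sketch, not the paper's code.
import numpy as np

# A test "volume" reduced to 2D: project (sum) along one axis in the
# spatial domain ...
img = np.random.rand(64, 64)
projection = img.sum(axis=0)      # line integrals along the first axis

# ... which, in the frequency domain, corresponds to the central slice of
# the 2D FFT taken perpendicular to the projection direction.
spectrum = np.fft.fft2(img)
central_slice = spectrum[0, :]    # the zero-frequency row of the spectrum

# The 1D FFT of the spatial projection matches the extracted slice.
assert np.allclose(np.fft.fft(projection), central_slice)
```

The same identity in 3D is what allows FDVR to replace an O(N³) spatial projection with the extraction and inverse 2D FFT of a single slice.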

Obtaining the spectral representation of the volume is the most computationally intensive step in the algorithm due to the O(N³ log N) complexity of the 3D FFT. Nevertheless, the rendering algorithm is extremely efficient because this step is executed only once, in a preprocessing stage. The rendering loop of the pipeline takes far less computing time.

FDVR is a generic technique that can work with any frequency transform to switch between the spatial and frequency domains. Our rendering pipeline is based on the Fourier transform, so the rendering method can be called Fourier volume rendering.

The rest of the paper is organized as follows. Section 2 summarizes the related work in the literature. In Section 3, the rendering algorithm is demystified and our implementation strategy is presented. Section 4 elaborates the results gained by mapping the entire rendering pipeline to run on the GPU and Section 5 concludes the paper.

#### 2. Fourier Volume Rendering Literature

Although the technique was introduced around 20 years ago, the literature behind FDVR in general, and FVR in particular, is quite scarce. In this section, we summarize the most notable contributions that have been presented since then.

In 1992, Dunne et al. introduced the fundamental idea behind frequency domain volume rendering and its advantages [8]. Malzbender systematically extended the projection-slice theorem to 3D as a basis for volume rendering [9]. Based on the FFT, he presented an implementation of the basic FVR pipeline and briefly discussed some considerations of resampling the frequency domain and their significant effects on the reconstruction quality of the generated projection image. He proposed a Hamming-windowed reconstruction filter to reduce the aliasing that accompanies resampling the frequency spectrum. For the same purpose, Grosso and Ertl alternatively proposed biorthogonal wavelet reconstruction filters [10]. As a drawback, the basic FVR integral does not provide a direct way of modelling emission and scattering in the participating medium. This limitation was a fundamental disadvantage of frequency domain rendering, reflected as a lack of occlusion in the resulting reconstructions [11]. Levoy partially restored the lost visual cues in the basic pipeline by applying several shading models that are linear combinations of Fourier projections [12]. His extension included depth cueing, directional shading, and Lambertian reflection with spherical illumination. The results of Levoy’s work did not exhibit real occlusion, yet they provided acceptable depth and shape cues that can simulate the missing occlusion. One significant disadvantage that limited the usage of his implementation was the need for several copies of the volume, which imposes high memory requirements for the initial preprocessing. In cooperation with Totsuka, the same results were later obtained with alternative frequency domain methods that process the frequency response of the volume data, which dramatically reduced the memory requirements [13].

Until that time, FVR had not gained broad acceptance by the scientific visualization and medical communities due to the lack of illumination models in the Fourier domain. In 2002, Entezari et al. enhanced the projection quality by incorporating various illumination models into the FVR pipeline [14]. The first model was based on the Gamma-corrected hemispherical shading proposed by Scoggins et al. [15], and the second adopted spherical harmonic functions for approximating cubic illumination shading. FVR also suffered from reduced reconstruction quality that limited its usefulness in particular medical applications. In [16], a solution for enhancing FVR using contour extraction was proposed. It provided a flexible method for extracting material boundaries on surfaces in the Fourier space. Additionally, it enhanced several features for revealing important spatial relationships between interior and exterior structures, making it an attractive tool for improved X-ray-like investigations of a given dataset. Cheng and Ching [17] also presented various methods for designing FVR transfer functions based on Bézier curves and B-splines. Jansen et al. [18] partially accelerated the rendering pipeline on the GPU by mapping the Split-Stream-FFT onto the GPU. As a major tool in digital radiography, Ntasis et al. [19] provided a web-based Fourier volume renderer for real-time preview of digitally reconstructed radiographs (DRRs). Their implementation carefully examined the resampling issues of the frequency domain to generate high quality DRRs remotely. Extending the technique beyond volume data with regular grids, FVR was adapted by Corrigan et al. [20, 21] to deal directly with meshless data. A high-level MATLAB-based FVR framework was presented in [22]. This framework abstracts the complex implementation details of the rendering pipeline to allow imaging researchers to develop image enhancement techniques without prior knowledge of the OpenGL pipeline. Viola et al. implemented a heterogeneous FVR rendering pipeline that uses the GPU to execute the rendering stage relying on fragment shaders [23]. However, this approach is limited when the capabilities of current unified computing GPU architectures are considered.

Practical implementation of FVR is complicated by two main factors that arise when the projection-slice theorem is applied to discrete sampled data [8]. First, conventional FFT algorithms yield frequency-domain output that is not ideally structured for resampling, since the zero-frequency component is placed at the corner of the array rather than at its center. To mitigate this effect, high performance multidimensional fft-shift stages must be added to the rendering pipeline to rearrange the data. Moreover, sampling the frequency domain is equivalent to replicating the signal in the spatial domain, which ultimately accounts for the appearance of ghosting artifacts in the reconstructed images. This issue is resolved by zero-padding the volume in the spatial domain and using high-order interpolation filters in the frequency domain. Our CUDA-based implementation addresses all of these issues.
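Both issues can be illustrated with NumPy acting as a stand-in for the CUDA kernels (this sketch is ours, not the paper's implementation). The first assertion shows the fft-shift relocating the zero-frequency component to the array center; the second shows zero-padding enlarging the volume so that its spectrum is sampled more densely.

```python
# Illustrative sketch of the two FVR pre-processing fixes (NumPy stand-in
# for the paper's CUDA kernels; names and sizes are our own choices).
import numpy as np

vol = np.random.rand(32, 32, 32)

# (1) FFT output ordering: np.fft.fftn places the zero-frequency (DC)
# component at index (0, 0, 0). An fft-shift moves it to the center so a
# projection-slice through the middle of the array passes through DC.
spectrum = np.fft.fftn(vol)
centered = np.fft.fftshift(spectrum)
assert np.isclose(centered[16, 16, 16], spectrum[0, 0, 0])

# (2) Ghosting: zero-padding the volume in the spatial domain samples the
# spectrum more densely, which pushes the implied spatial-domain replicas
# apart and reduces ghosting artifacts in the reconstruction.
padded = np.pad(vol, 16)            # 32^3 -> 64^3, zeros on every side
dense_spectrum = np.fft.fftn(padded)
assert dense_spectrum.shape == (64, 64, 64)
```

The high-order interpolation filters mentioned above then operate on this shifted, more densely sampled spectrum during slice extraction.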

#### 3. Algorithm and Implementation

The plain FVR algorithm is briefly illustrated to simplify the explanation of the contextual classification of the rendering pipeline. Based on this classification, an efficient strategy has been considered to map this pipeline to run entirely on the GPU.

##### 3.1. Algorithm

The spatial volume must first be shifted by a 3D *fft-shift* operation to set its center at the origin of the 3D space. The frequency spectrum of this volume is then obtained by a forward 3D FFT operation. The resulting spectrum is not centered at the origin of the frequency space. Consequently, another 3D fft-shift operation is required to center its zero-frequency component, preparing it for correct slice extraction. Afterwards, a 2D projection-slice passing through the origin of the spectral volume, with a normal parallel to the viewing direction, is carefully extracted. This slice represents the 2D FFT of the desired projection, and thus the reconstructed image can be directly obtained by a 2D inverse FFT operation. To reduce the aliasing and ghosting artifacts in the reconstructed image, the projection-slice is resampled in a further step. This resampling stage is only mandatory if the desired projection is not orthogonal. After resampling, the extracted slice is backtransformed to the spatial domain by an inverse 2D FFT operation. The resulting image from this inverse transformation is shifted by another 2D fft-shift operation to move the center of the image from the edge to the origin of the grid used to display the image. Changing the viewing angle implies rotating the 3D spectrum to extract a new projection-slice that corresponds to this angle. The algorithm sequence is graphically depicted in Figure 1.
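For the simplest case of an orthogonal, axis-aligned projection (where the rotation and resampling steps are not needed), the sequence above can be prototyped in a few lines of NumPy. This is an illustrative sketch, not the paper's CUDA implementation; the function name is our own.

```python
# Minimal NumPy prototype of the FVR pipeline for an orthogonal,
# axis-aligned projection (no rotation or resampling needed).
import numpy as np

def fvr_orthogonal_projection(vol):
    """Reconstruct the X-ray-like projection of `vol` along its first axis."""
    # Preprocessing (executed once): center the volume, take the forward
    # 3D FFT, then fft-shift the spectrum so its zero-frequency component
    # sits at the center of the array.
    spectrum = np.fft.fftshift(np.fft.fftn(np.fft.ifftshift(vol)))

    # Rendering: extract the central projection-slice whose normal is
    # parallel to the viewing direction (axis 0 here).
    n = vol.shape[0]
    proj_slice = spectrum[n // 2, :, :]

    # Backtransform the slice to the spatial domain, with a final 2D
    # fft-shift to re-center the image on the display grid.
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(proj_slice))).real

vol = np.random.rand(32, 32, 32)
img = fvr_orthogonal_projection(vol)
# The result matches a direct spatial-domain line-integral projection.
assert np.allclose(img, vol.sum(axis=0))
```

Note that the per-frame cost here is dominated by the slice extraction and the 2D inverse FFT, which is exactly the O(N² log N) advantage over the O(N³) spatial projection that the closing assertion computes for comparison.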