Table of Contents Author Guidelines Submit a Manuscript
Mathematical Problems in Engineering
Volume 2018, Article ID 2135453, 12 pages
Research Article

Multimodal Medical Image Registration Based on an Information-Theory Measure with Histogram Estimation of Continuous Image Representation

1School of Electronic and Information Engineering, Zhongyuan University of Technology, Zhengzhou 450007, China
2Laboratory of Image Science and Technology, School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
3Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing 210096, China
4Centre de Recherche en Information Médicale Sino-Français (CRIBs), Nanjing 210096, China
5INSERM, U1099, 35000 Rennes, France
6LTSI, Université de Rennes 1, 35042 Rennes, France
7Centre de Recherche en Information Médicale Sino-Français (CRIBs), 35000 Rennes, France

Correspondence should be addressed to Bicao Li; nc.ude.tuz@cbl

Received 27 October 2017; Revised 27 February 2018; Accepted 13 March 2018; Published 29 April 2018

Academic Editor: Bogdan Dumitrescu

Copyright © 2018 Bicao Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


This work presents a novel method for multimodal medical registration based on histogram estimation of continuous image representation. The proposed method, regarded as “fast continuous histogram estimation,” employs continuous image representation to estimate the joint histogram of two images to be registered. The Jensen–Arimoto (JA) divergence is a similarity measure to measure the statistical dependence between medical images from different modalities. The estimated joint histogram is exploited to calculate the JA divergence in multimodal medical image registration. In addition, to reduce the grid effect caused by the grid-aligning transformations between two images and improve the implementation speed of the registration method, random samples instead of all pixels are extracted from the images to be registered. The goal of the registration is to optimize the JA divergence, which would be maximal when two misregistered images are perfectly aligned using the downhill simplex method, and thus to get the optimal geometric transformation. Experiments are conducted on an affine registration of 2D and 3D medical images. Results demonstrate the superior performance of the proposed method compared to standard histogram, Parzen window estimations, particle filter, and histogram estimation based on continuous image representation without random sampling.

1. Introduction

Image registration is the task of finding the optimal geometric transformation between two images. Image registration is widely used in numerous fields, such as medical imaging, computer vision, and remote sensing. Medical images from different modalities can provide various complementary information. Therefore, the registration of multimodal medical images is valuable in multimodal diagnosis and computer-aided surgery [13].

Information-theory-based image registration has become a popular method for multimodal medical images. In these methods, mutual information (MI) is a frequently used similarity measure simultaneously proposed by Collignon et al. [4], Maes et al. [5], Wells III et al. [6], and Viola and Wells III [7]. These approaches do not require any preprocessing and have been found to hold for a range of applications. Many similarity measures based on information theory have been employed for medical image registration, for example, normalized MI [8], MI combined with gradient information [9], cumulative residual entropy (CRE) [10, 11], and cross-CRE [12].

The critical element for all these information-theory-based approaches for image registration is the estimation of probability density functions (PDFs). Standard histograms have been the most frequently used for estimating probability distributions [4], which is a straightforward algorithm and easily implemented. However, Rajwade et al. [13] pointed out that the number of histogram bins determines the estimation of PDFs, and a relatively small number can provide the preferable PDFs. Standard histogram method also yields discontinuous probability estimates, and there are no principle techniques to select the suitable number of bins for calculating the PDFs of images to be registered. The Parzen window (PZW) [7] has been exploited for probability density estimation in image registration. This method was developed further by Thévenaz and Unser [14]. However, the optimal selection of kernel function and kernel width is a challenging task when utilizing PZW to estimate PDFs. Nonparametric (NP) windows [15] were introduced to compute PDFs more accurately than PZW. However, the computational and implementation complexity of NP windows limits the method’s actual application. Joshi et al. [16] simplified the implementation of NP windows by using planar interpolation. Standard histogram and PZW methods possess one common property; that is, both approaches regard an image as a set of pixels or samples [17]. Discrete histogram transform (DHT) was introduced in [18]. This method averages the estimations from several multivariate histograms to alleviate the discontinuities in the boundaries of histogram bins. However, the computational cost of DHT increases quickly as the number of histograms grows. López-Rubio and Muñoz-Pérez [19] proposed the use of multivariate frequency polygons (MFP) to reduce computational complexity. By contrast, probability estimations based on DHT and MFP also rely on training samples and bin width. Maes et al. [5] introduced the partial volume estimation (PVE) to calculate probability distributions. Chen and Varshney proposed the generalized PVE [20] technique, which applies a general kernel function to determine fractional votes of each intensity pair. The estimated methods based on partial volume construct joint probability distribution through continuous image representation (CIR), whereas the PVE approach still substantially requires the choice of kernel function [17].

In our earlier work [21], a novel divergence measure called the Jensen–Arimoto (JA) divergence was presented as registration criterion for 2D-2D rigid medical registration. To calculate the JA divergence, the joint probability distributions between two images to be registered need to be estimated using the standard/simple histogram (SH) algorithm, PVE, or PZW method. Whereas the critical element for SH based estimation methods is selecting the suitable number of bins, PVE and PZW both belong to the continuous estimation of densities. A comparison between these two methods to density estimation has been introduced in [22]. Darkner and Sporring [23] also concluded that PZW is algorithmically a superior estimator for NMI compared to GPV. Nevertheless, these two approaches both need to select the appropriate kernel function. In a word, the number of bins for SH and selection of kernel function for PVE and PZW are the current challenges and limitations in estimation of densities. To overcome these problems, we introduce a CIR for estimating the joint histogram of two images to be registered. Different from the aforementioned approaches, our method avoids binning problem and the choice of kernel function. The most straightforward strategy to estimate joint histograms by CIR is to use all voxels from aligned images. However, this way is time consuming for the images with large sizes. Additionally, the grid effect caused by grid-aligning transformations and the discontinuities of similarity measures are two major problems associated with these information-theory-based methods [24]. Therefore, to solve these issues, we choose a certain number of random samples from the images to be registered, instead of all pixel points, to estimate the joint histogram. The computation time for histogram estimation is then reduced by random sampling strategy. Our proposed histogram estimation method is regarded as “fast continuous histogram estimation” (FCHE). We apply the FCHE algorithm to estimate joint probability and then calculate the JA divergence for multimodality medical registration. Affine transformation models are employed, and the simplex Nelder–Mead method [25] is used to search the maximum of the JA divergence. To evaluate the performance of our registration technique based on FCHE, we implement experiments on 2D and 3D medical images. The results demonstrate that FCHE method can provide better registration accuracy and improve the implementation efficiency.

The remainder of this paper is organized as follows. In Section 2, the FCHE method for joint histogram estimations and a review of the JA divergence measure are introduced. The subsequent image registration technique is detailed in Section 3. Section 4 provides our experimental results on 2D and 3D medical images. This section also compares these findings with those of the registration strategies based on continuous histogram estimation (CHE) without random sampling, SH, PF, and PZW. Concluding remarks and perspectives are presented in Section 5.

2. Theory

2.1. FCHEs

In information-theory-based registration approaches, the joint probabilities between two images to be registered are the critical element to successfully calculate the similarity measure. As mentioned above, several algorithms for computing joint probabilities include SH, PZW, and PVE. In this work, a spatially continuous representation of an image is adopted to estimate a joint histogram for obtaining joint probabilities.

2.1.1. Joint Histogram in 2D

Two images, namely, the reference image () and the floating image (), are considered along with their intensity values normalized into a specified number of bins. For every grid point in each image, around which the intensity values of four neighborhood points were estimated by Rajwade et al. [13], here the distances between the grid point and four neighborhood points are all 0.5 pixels (Figure 1(a)). These intensity values of four neighbors in nongrid points are estimated through interpolation methods, which inevitably increase the computational cost and introduce interpolation error into registration results. To address this problem, four grid points are used to compose a square for histogram estimations (Figure 1(b)), and then the square (a pixel) is split into two triangles (half-pixels).

Figure 1: Process of joint histogram estimations in 2D. (a) Four neighborhood points around the central point on nongrid locations (cross) and nine grid points (black dots). (b) Four points on the grid (four dots). (c) The square constituted by four grid points in (b) is divided into two triangles. (d) The intersection point (black circle) of two isointensity lines. The solid line and dashed line represent the respective isocurves of two images.

The intensity values of the three vertices in each triangle are regarded as linear functions of the vertices’ respective coordinates. The related equations are as follows:where (, ) denotes the coordinates of three vertices (no. ) of the triangle shown in Figure 1(d) and , and represent the corresponding intensity values of images and , respectively. Additionally, , , and are the coefficients obtained by solving the equations defined in (1). For the values of the three coefficients, the intensity values of any coordinate () within each triangle of two images are calculated using . Conversely, given a pair of intensity values (one intensity value in and one in , signified as ) within the range of specified bins, we can yield two isointensity lines, and . The joint histogram is updated if the intersection of two lines () falls into a pair of corresponding triangles from two images as shown below:where represents the joint histogram of the entry () between two images and . We repeat the aforementioned process and cover all entries in the range of bins to acquire the entire joint histograms.

2.1.2. Joint Histogram in 3D

Similar to the joint histogram in 2D, we still account for the CIR when using 3D images. Eight vertices of a voxel in grid points are selected to constitute one cube, which is divided into 12 or 24 tetrahedrons through the cube center [17]. The intensity values of the eight vertices (grid points) can be directly acquired, whereas that of one center is evaluated through interpolation because of its location on a nongrid point. Nevertheless, we partition the cube (or one voxel) into six equal volumes using the eight vertices of this cube. Hence, we obtain six tetrahedrons (Figure 2(b)) without the need of the cube’s center. Similar to the estimation in 2D, the intensity values of the four vertices of each tetrahedron are deemed as linear functions of the vertices’ respective coordinates as shown below:

Figure 2: Process of joint histogram estimations in 3D. (a) Cube. (b) Six tetrahedrons divided from the cube. (c) Intersecting line between two isointensity surfaces (plotted by the solid and dashed lines) with the distance of two black dots (two crossing points between the intersecting line and the tetrahedron) of the segment length.

Likewise, and are the coordinates and intensity values of the four vertices of each tetrahedron, along with of the two images and as the four vertices. In addition, , , , and are the coefficients that determine the linear equations in (3). The values of these coefficients are computed through these equations. On the basis of the coefficients, the two isointensity planes and are established. For the 3D case, we first consider the segment of the intersecting line of two planes. Then, the joint histogram of () is accumulated by measuring the length of the segment within each pair of homologous tetrahedrons from two respective images. The related equation is as follows:where and denote the segment length and the joint histogram, respectively. Considering one tetrahedron, the calculation process for the joint histogram in 3D is shown in Figure 2(c).

2.1.3. Random Sample

Information-theory-based similarity measures rely on the computation of a joint histogram between two images or volumes to be registered. The straightforward strategy for calculating the joint histogram is applying all pixels or voxels. However, the high implementation cost impedes the application of this strategy for large images or 3D images. An alternative approach is selecting a subset of pixels or voxels from an entire image. Generally, the subset is chosen from a uniform grid or random coordinate. Notably, random samples can be obtained from the grid points or the nongrid coordinates [25]. Bhagalia et al. [26] proposed selecting samples that lie on notable image features.

Press et al. [25] demonstrated that random samples on nongrid locations can improve the smoothness of MI. They also compared their experimental results to those obtained from several sample methods provided by Klein et al. [27]. The random sampling off the grid can reduce the grid effect and discontinuities of information-theory-based similarity measures. Accordingly, while applying our registration method, we adopt a subset of samples selected from the nongrid coordinates in the fixed image to estimate the joint histogram in 2D or 3D. Finally, under FCHE (Algorithm 1), the execution time of histogram estimations is substantially shortened while the registration accuracy is preserved. For instance, estimating the joint histogram between two 3D images with dimensions of 41 × 41 × 41 voxels approximately lasts for 21.7 s without the need of a random sample. By contrast, the evaluated time under the FCHE strategy is only about 1.5 s when choosing 5000 random samples from the fixed image.

Algorithm 1: Fast continuous histogram estimation.
2.2. JA Divergence

The Arimoto entropy [28], a generalization of Shannon entropy, was introduced by Arimoto and further developed by Liese and Vajda [29]. Its definition is given by

Unlike Shannon entropy, Arimoto entropy is a nonextensive entropy. The parameter α reflects the degree of nonextensivity. The nonextensivity gradually weakens as the parameter approaches 1, and the limit of the Arimoto entropy equals the Shannon entropy when α→ 1.

We then follow the derivation of the Jensen–Shannon divergence reported by Lin [30]. In our previous work [21], a divergence measure based on the Arimoto entropy called the JA divergence was introduced. The JA divergence was used to measure the distance of two random variables and showed a superior performance in registering medical images to those of other information-theory measures.

Given a random variable with probability distributions of , we define the JA divergence [21] aswhere denotes the Arimoto entropy for , , and represent weight factors, such that with . When the parameter approaches 1, the Arimoto entropy converges to the standard Shannon entropy, and the limitation of the JA divergence in (6) resembles that of the traditional MI obtained by L’Hopital’s rule.

3. Description of the Proposed Method

Herein, we described our medical registration method. First, the FCHE method was applied to estimate the joint histogram of two images to be registered and calculate their joint probability distributions. Then, the JA divergence used as similarity measure was derived in an affine transformation space. Finally, the simplex optimization scheme was employed to search for the maximum of the JA divergence, along with the optimal transformation being obtained. A block diagram of our registration algorithm (Algorithm 2) consisting of five stages is displayed in Figure 3, where the sequence numbers⑤ denote these stages, respectively.

Algorithm 2: Multimodel medical image registration.
Figure 3: Block diagram of our registration algorithm.
3.1. Registration Algorithm

For two misaligned images to be registered, one is selected as the floating image F and the other is considered as the reference image R. As an example, Figure 4 describes the corresponding slices in T1-weighted magnetic resonance (MR) and T2-weighted MR volumes ((a) denotes the reference image, and (b) denotes the floating image). The goal of image registration is to search the optimal transformation between and . We denote and by an arbitrary point in and its corresponding point in , respectively. The spatial transformation relation between the corresponding points in and can be formulated aswhere represents the mapping function and is a vector of parameters.

Figure 4: Two corresponding slices in two registered volumes. (a) Central slice in MR T1 volume. (b) Corresponding slice in the transformed MR T2 volume registered by a known 3D rigid transformation method.

Image registration aimed to align the floating image to the reference image by maximizing a similarity measure between and . In other words, the registration problem is formulated as the following optimization problem:where denotes a suitable similarity measure, which achieves its maximum when the reference image and floating image are completely registered, and stands for the floating image transformed by the space mapping .

3.2. Transformation Model

Rigid, affine, perspective, and curve (elastic) transformation models were reported to simulate geometric transformations of images [3]. Rajwade et al. [17] reported that the CHEs are not suitable for deformation registration because the derivatives of histograms acquired by CHE are not analytically derived. Therefore, in this paper, we restrict the transformation to rigid cases. The parameter is the vector set of 6 degrees of freedom (DoFs) for 3D rigid transformation. These 6 DoFs include three displacements in the , , and directions and three rotation angles around the -, -, and -axes. Hence, we model the transformation and rewrite (7) as follows:where and hold the same denotations as those in Section 3.1, , , and are the translation parameters, and , , and denote the rotation parameters. In the following section, we adopt the transformation model defined by (9).

3.3. Similarity Measure

In image registration, the value of the similarity measure for two misaligned images is expected to be maximum when the images are perfectly aligned. Hence, choosing an appropriate similarity measure is an important task in image registration. As mentioned in Section 2.2, the JA divergence can reach the maximum when the difference between two random variables is minimal. Therefore, the JA divergence measure in (6) is applied as a similarity measure, and is given as follows:

By combining (8) and (10), we formulate the optimal spatial mapping between and aswhere denotes the set of estimated parameters and )), , represents the conditional probability distribution of the transformed floating image given the reference image.

The intensity values in and are denoted by and ), respectively. is the number of the chosen intensity bins for estimating the joint probabilities. By substituting and into the conditional probability , we can rewrite the formula as = . We choose the JA divergence with different weights and values as the similarity measure shown in (10). In (6), the reference image is adopted as basis for image registration, and the marginal probability distributions of the reference image are chosen as the weights; that is, . By substituting and into (6), we rewrite the similarity measure as

The similarity measure defined in (12) is then maximized using an optimization method to search for the optimal transformation parameters.

3.4. Optimization

The optimization algorithm mainly influences the convergence speed of a similarity measure. Hence, an appropriate optimization method is valuable to a registration framework. We do not obtain the derivatives of the probability distributions using FCHE method. Hence, gradient-based optimization techniques are not available herein. In this work, we exploit the downhill simplex method (Nelder–Mead [31] or Amoeba optimization) to maximize the similarity measure defined in (12), considering that this method only requires function evaluations without need for derivatives.

Notably, the downhill simplex optimization scheme is terminated if the difference between the current function value and the best function value is less than 0.01 and the maximum of difference between the current value and the best value for all parameters is less than 0.1. The scheme is also terminated if the number of iterations reaches a previously fixed value MAX. In our work, MAX is set to 100.

4. Experiments and Results

To evaluate the FCHE method, two groups of experiments were implemented on 2D and 3D medical images. In Section 4.1, the datasets used are described. The performance of the FCHE method in registration accuracy is exemplified through 2D registration in Section 4.2. Meanwhile, the method addresses the rigid registration of 3D medical images in Section 4.3.

The FCHE method, CHE algorithm without random sampling, SH, and PZW algorithms were coded by Visual C++ and implemented in Visual Studio 2012. The classes and functions were employed in the Insight Toolkit (ITK) [32]. In this work, all experiments were performed on a 3.10 GHz and 8 GB-memory personal computer. Meanwhile, we set the nonextensive parameter in the JA divergence as = 1.50. We also used the downhill simplex optimization to maximize the JA measure in all experiments.

4.1. Test Images

The datasets for 2D tests were obtained from the Visible Human Project [33]. We chose the head subset, including three protocols of MR images, T1-weighted MR, T2-weighted MR, and proton density- (PD-) weighted MR images (Figure 5) to enforce 2D registration. These images were acquired in the Portable Network Graphics (.png) format and aligned originally with one another at dimensions of 256 × 256 pixels. 3D registration experiments were carried out on simulated 3D datasets. The simulated data were obtained from the Brainweb database of the Montreal Neurological Institute [34]. These data contained 3D brain MR imaging (MRI) volumes simulated through the following three different protocols: T1-weighted, T2-weighted, and PD-weighted with various slice thicknesses, noise levels, and intensity nonuniformities. In the simulated experiments for 3D, the size of the volumes (with voxel coding on 8 bits) was 181 × 217 × 181 voxels at a voxel size of 1 mm × 1 mm × 1 mm. All corresponding slices in these volumes were aligned with each other.

Figure 5: Real 2D brain images. (a) MR T1 image. (b) MR T2 image. (c) MR PD image.
4.2. Registration for 2D Images

The performance of our registration scheme based on FCHE method was first assessed using the real 2D brain data. The role of the number of bins and random samples employed in our method was examined and compared with those in the CHE without random sampling, SH, PZW, and particle filter (PF) [35].

4.2.1. Parameter Setting

The registration parameters were chosen by trial and error from the real data. To demonstrate the effect of these parameters, we performed the experimental process as follows. Using the pair of MR T1 and MR PD volumes mentioned in Section 4.1, we chose the former volume as reference image. By contrast, the latter volume was randomly transformed and considered as the floating image. Twenty initial transformations were generated, where the and coordinates were derived from the range 0–10 mm. To quantitatively evaluate the registration results, we calculated the difference between the true values and the estimated values as the registration error.

Figure 6 displays the mean and standard deviation of the registration errors when varying the number of bins and the number of random samples . The lowest mean of the registration error was obtained when = 32 (herein, = 3000) (Figure 6(a)). By contrast, the number of random samples = 3000 achieved the relatively low errors (Figure 6(b)). In the sequel, we used = 32 and = 3000 for 2D images.

Figure 6: Rigid registration results of the real 2D brain MR T1 and MR PD images. Various values of two parameters and were chosen. The transformation parameters were generated randomly from the range 0–10 mm, with the unit of errors also in millimeters. (a) Registration errors for several various values, and = 3000. (b) Results of different numbers of random samples, with .
4.2.2. Accuracy

To illustrate the registration accuracy of our method, we compared the CHE, SH, PZW, and PR methods with the FCHE in terms of real 2D brain datasets MR T1 and MR PD. In the process, the MR T1 image was also applied as the reference image. We selected 20 affine transformations, which involved displacements along the and directions generated randomly from the range 10–20 mm, rotation angles generated randomly from 0° to 20°, and scales generated randomly from 0.9 to 1.1. Consequently, a total of 20 pairs of test images and 100 affine registrations were obtained for the five methods. Table 1 shows the means and standard deviations of the registration errors in all 100 trials that exploited the five algorithms. Notably, the registration errors incurred under the FCHE method are lower than those acquired under the other four methods. Furthermore, the FCHE method improved the execution efficiency in estimating joint distributions compared with the CHE algorithm.

Table 1: Statistics of the registration errors (mm) for 20 trials on the real MR T1 and MR PD brain images shown in Figure 4. The second and fourth columns provide the means of the errors in the and y directions. The third and fifth columns display the respective standard deviation of the errors incurred under the five methods. The sixth to eleventh columns show the means and standard deviation of rotation and scales errors.
4.2.3. Noise Resistance

This section assesses the performance of our method, as well as the CHE, SH, PZW, and PF algorithms, in the presence of different noise levels. Similar to Section 4.2.1, MR T1 and MR PD images were chosen as the reference and floating images. Three levels of Gaussian noise with zero mean were added to the MR PD image by applying the imnoise function of Matlab. The respective variances were 0.01, 0.05, and 0.1. Then, 20 initial affine transformations were obtained to transform each noisy MR PD image. As a result, 20 pairs of test images were acquired for each noise level. Altogether, 300 registrations were carried out for the five approaches.

The resulting registration errors are displayed in Table 2. In each cell, the numbers in round brackets denote the means of the translation errors in the and directions. The rotation angles with their standard deviations are in square brackets. As shown in Table 2, the registration accuracies of FCHE are superior to those of the other four algorithms for each noise level.

Table 2: Mean registration errors (mm) for 20 tests of the real 2D brain MR T1 and MR PD images using three approaches in the presence of different noise levels with their variances σ = 0.01, 0.05, and 0.1. The numbers in square brackets are the standard deviations of the errors.
4.3. Registration for 3D Images

To evaluate the performance of the FCHE approach applied to 3D image registration, we compared the method with CHE, SH, and PZW on three groups of simulated 3D brain datasets introduced in Section 4.1, namely, MR T1 and MR T2, MR T1 and MR PD, and MR T2 and MR PD. In each pair of test images, the subvolumes of size 41 × 41 × 41 voxels were exploited to implement the registration experiments along with the former image applied as reference image. Then, the zero-mean Gaussian noise with the variance of 0.05 was added to the latter of each image pair using the imnoise function of Matlab. We selected 20 rigid transformations with , , and coordinates generated randomly from the range 0–10 mm to transform the noisy images. Consequently, 20 pairs of test images for each group of data were generated, and a total of 240 3D rigid registrations for the four methods were implemented. Table 3 displays the statistics of the registration errors of the successful experiments in 20 tests conducted on three groups of images using four algorithms. Here, an experiment is regarded as “success” if the registration error in every direction is less than 1 mm. It is observed from Table 3 that the registration accuracies using the FCHE and CHE methods are comparable and better than those employing SH and PZW approaches. Meanwhile, the FCHE method also improved the registration efficiency compared to CHE.

Table 3: Mean and standard deviations (Std.) of 20 registrations of the real 3D brain MR T1 and MR T2, MR T1 and MR PD, and MR T2 and MR PD images using three strategies (noise = 0.01). For each group of tests, the translations parameters in the -, -, and -axes were selected from the range 0–10 mm, and the rotations angles around the -, -, and -axes were selected from the range 0–10 degrees.
4.4. Discussion of Experimental Results

Experiments were conducted on 2D and 3D datasets; the former used real brain data (MR T1, MR T2, and MR PD images), and the latter included simulated 3D images (MR T1, MR T2, and MR PD volumes). In 2D registration experiments, to compare the registration accuracy of our method with those of other four algorithms, 100 trials were carried out based on affine transformations using parameters randomly generated within a certain range. To access the immunity of noise, three levels of Gaussian noise were added to the floating images and 300 trials were performed. The results obtained for the 2D real images showed that our alignment technique provided lower registration errors and a better robustness to noise than those exploiting SH, PZW, and PF strategies and CHE without random sampling.

For 3D case, three groups of simulated 3D brain datasets, MR T1 and MR T2, MR T1 and MR PD, and MR T2 and MR PD, were employed to evaluate the performance of our method. For each group of test images, 20 pairs of test images were generated and totally 240 3D registrations for four methods were implemented. Experimental results on simulated 3D data show a better behavior of FCHE method with low registration errors when compared to the other methods, and the subvoxel registration accuracies are achieved by FCHE method in almost all cases.

5. Conclusion

This study introduced the FCHEs to estimate the joint probability distribution and accelerate the registration process by employing CIR and random samples on nongrid points. Estimation methods for 2D and 3D joint histograms were also elaborated. This work reduced the grid effect caused by grid-aligning transformations between two images to be registered. The reduction of such effect was achieved when a joint histogram was calculated from a subset of pixels randomly selected from nongrid coordinates of the fixed image instead of entire images. Then, a similarity measure based on the Arimoto entropy, called the JA divergence, was introduced for medical image registration. This similarity measure is a generalization of the well-known Jensen–Shannon divergence measure. We employed affine transformation as the parameter space, along with the JA divergence as the registration criteria, to carry out the registration experiments. To search the maxima of this similarity measure, we used the downhill simplex optimization approach. Then, we applied the FCHE estimation algorithm to estimate the joint probability distribution of two images to be registered.

As mentioned above, the similarity measure computed by adopting the FCHE estimation algorithm was not differentiable, which is an important property in nonrigid registrations. As a result, gradient-based optimization schemes were not available for the FCHE-based registration method. In the future, we intend to extend our application to other multimodal medical images, such as ultrasound and MR images, and to other organs.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper. They also solidly confirm that the mentioned received grants in Acknowledgments did not lead to any conflicts of interest regarding the publication of this paper.


This work was supported by the Natural Science Foundation of Henan Province under Grant 162300410338, by the National Natural Science Foundation of China under Grant 61772576, by Key Scientific Research Projects of Henan Province under Grant 17A510006, by Scientific and Technological Program Projects of Henan Province under Grant 172102210071, by the project sponsored by SRF for ROCS, SEM, by Natural Science Foundation of Jiangsu Province under Grants DZXX-031 and BY2014127-11, and by the Qing Lan Project.


  1. J. B. A. Maintz and M. A. Viergever, “A survey of medical image registration,” Medical Image Analysis, vol. 2, no. 1, pp. 1–36, 1998. View at Publisher · View at Google Scholar · View at Scopus
  2. D. L. G. Hill, P. G. Batchelor, M. Holden, and D. J. Hawkes, “Medical image registration,” Physics in Medicine and Biology, vol. 46, no. 3, pp. R1–R45, 2001. View at Publisher · View at Google Scholar · View at Scopus
  3. J. P. W. Pluim, J. B. A. A. Maintz, and M. A. Viergever, “Mutual-information-based registration of medical images: a survey,” IEEE Transactions on Medical Imaging, vol. 22, no. 8, pp. 986–1004, 2003. View at Publisher · View at Google Scholar · View at Scopus
  4. A. Collignon, F. Maes, D. Delaere et al., “Automated multi-modality image registration based on information theory,” Information Processing in Medical Imaging, vol. 3, no. 6, pp. 263–274, 1995. View at Google Scholar
  5. F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens, “Multimodality image registration by maximization of mutual information,” IEEE Transactions on Medical Imaging, vol. 16, no. 2, pp. 187–198, 1997. View at Publisher · View at Google Scholar · View at Scopus
  6. W. M. Wells III, P. Viola, H. Atsumi, S. Nakajima, and R. Kikinis, “Multi-modal volume registration by maximization of mutual information,” Medical Image Analysis, vol. 1, no. 1, pp. 35–51, 1996. View at Publisher · View at Google Scholar · View at Scopus
  7. P. Viola and W. M. Wells III, “Alignment by Maximization of Mutual Information,” International Journal of Computer Vision, vol. 24, no. 2, pp. 137–154, 1997. View at Publisher · View at Google Scholar · View at Scopus
  8. C. Studholme, D. L. G. Hill, and D. J. Hawkes, “An overlap invariant entropy measure of 3D medical image alignment,” Pattern Recognition, vol. 32, no. 1, pp. 71–86, 1999. View at Publisher · View at Google Scholar · View at Scopus
  9. J. P. W. Pluim, J. B. A. Maintz, and M. A. Viergever, “Image registration by maximization of combined mutual information and gradient information,” IEEE Transactions on Medical Imaging, vol. 19, no. 8, pp. 809–814, 2000. View at Publisher · View at Google Scholar · View at Scopus
  10. F. Wang, B. C. Vemuri, M. Rao et al., “Cumulative residual entropy, a new measure of information & its application to image alignment,” in Proceedings of the Ninth IEEE International Conference on Computer Vision, pp. 548–553, IEEE, Nice, France, 2003.
  11. M. Rao, Y. Chen, B. C. Vemuri, and F. Wang, “Cumulative residual entropy: a new measure of information,” Institute of Electrical and Electronics Engineers Transactions on Information Theory, vol. 50, no. 6, pp. 1220–1228, 2004. View at Publisher · View at Google Scholar · View at MathSciNet · View at Scopus
  12. F. Wang and B. C. Vemuri, “Non-rigid multi-modal image registration using cross-cumulative residual entropy,” International Journal of Computer Vision, vol. 74, no. 2, pp. 201–215, 2007. View at Publisher · View at Google Scholar · View at Scopus
  13. A. Rajwade, A. Banerjee, and A. Rangarajan, “Continuous image representations avoid the histogram binning problem in mutual information based image registration,” in Proceedings of the 3rd IEEE International Symposium on Biomedical Imaging: From Nano to Macro, pp. 840–843, 2006. View at Scopus
  14. P. Thévenaz and M. Unser, “Optimization of mutual information for multiresolution image registration,” IEEE Transactions on Image Processing, vol. 9, no. 12, pp. 2083–2099, 2000. View at Publisher · View at Google Scholar · View at Scopus
  15. N. Dowson, T. Kadir, and R. Bowden, “Estimating the joint statistics of images using nonparametric windows with application to registration using mutual information,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 10, pp. 1841–1857, 2008. View at Publisher · View at Google Scholar · View at Scopus
  16. N. Joshi, T. Kadir, and S. M. Brady, “Simplified computation for nonparametric windows method of probability density function estimation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 8, pp. 1673–1680, 2011. View at Publisher · View at Google Scholar · View at Scopus
  17. A. Rajwade, A. Banerjee, and A. Rangarajan, “Probability density estimation using isocontours and isosurfaces: Applications to information-theoretic image registration,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 3, pp. 475–491, 2009. View at Publisher · View at Google Scholar · View at Scopus
  18. E. López-Rubio, “A histogram transform for probabilitydensity function estimation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 4, pp. 644–656, 2014. View at Publisher · View at Google Scholar · View at Scopus
  19. E. López-Rubio and J. Muñoz-Pérez, “Probability density function estimation with the frequency polygon transform,” Information Sciences, vol. 298, pp. 136–158, 2015. View at Google Scholar
  20. H.-M. Chen and P. K. Varshney, “Mutual Information-Based CT-MR Brain Image Registration Using Generalized Partial Volume Joint Histogram Estimation,” IEEE Transactions on Medical Imaging, vol. 22, no. 9, pp. 1111–1119, 2003. View at Publisher · View at Google Scholar · View at Scopus
  21. B. Li, G. Yang, H. Shu, and J. L. Coatrieux, “A new divergence measure based on Arimoto entropy for medical image registration,” in Proceedings of the 22nd International Conference on Pattern Recognition, ICPR 2014, pp. 3197–3202, Stockholm, Sweden, August 2014. View at Publisher · View at Google Scholar · View at Scopus
  22. S. Darkner and J. Sporring, “Locally orderless registration,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 6, pp. 1437–1450, 2013. View at Publisher · View at Google Scholar · View at Scopus
  23. S. Darkner and J. Sporring, “Generalized partial volume: An inferior density estimator to parzen windows for normalized mutual information,” in Biennial International Conference on Information Processing in Medical Imaging, vol. 6801, pp. 436–447, Springer, Berlin, Germany, 2011. View at Publisher · View at Google Scholar · View at Scopus
  24. P. Thévenaz, M. Bierlaire, and M. Unser, “Halton sampling for image registration based on mutual information,” Sampling Theory Signal Image Process, vol. 7, no. 2, pp. 141–171, 2008. View at Google Scholar
  25. W. H. Press, B. P. Flannery, S. A. Teukolsky et al., Numerical Recipes in C, Cambridge University Press, Cambridge, UK, 3rd edition, 2007.
  26. R. Bhagalia, J. A. Fessler, and B. Kim, “Accelerated nonrigid intensity-based image registration using importance sampling,” IEEE Transactions on Medical Imaging, vol. 28, no. 8, pp. 1208–1216, 2009. View at Publisher · View at Google Scholar · View at Scopus
  27. S. Klein, M. Staring, K. Murphy, M. A. Viergever, and J. P. W. Pluim, “Elastix: a toolbox for intensity-based medical image registration,” IEEE Transactions on Medical Imaging, vol. 29, no. 1, pp. 196–205, 2010. View at Publisher · View at Google Scholar · View at Scopus
  28. S. Arimoto, “Information-theoretical considerations on estimation problems,” Information and Control, vol. 19, no. 3, pp. 181–194, 1971. View at Publisher · View at Google Scholar · View at Scopus
  29. F. Liese and I. Vajda, “On divergences and informations in statistics and information theory,” Institute of Electrical and Electronics Engineers Transactions on Information Theory, vol. 52, no. 10, pp. 4394–4412, 2006. View at Publisher · View at Google Scholar · View at MathSciNet · View at Scopus
  30. J. Lin, “Divergence measures based on the Shannon entropy,” Institute of Electrical and Electronics Engineers Transactions on Information Theory, vol. 37, no. 1, pp. 145–151, 1991. View at Publisher · View at Google Scholar · View at MathSciNet · View at Scopus
  31. J. A. Nelder and R. Mead, “A simplex method for function minimization,” The Computer Journal, vol. 7, no. 4, pp. 308–313, 1965. View at Publisher · View at Google Scholar
  32. L. Ibß, W. Schroeder, L. Ng, and L. Ibáñez, The ITK Software Guide Kitware, 2005.
  33. The Visible Human Project,
  34. BrainWeb: Simulated Brain Database,
  35. E. R. Arce-Santana, D. U. Campos-Delgado, and A. Alba, “Affine image registration guided by particle filter,” IET Image Processing, vol. 6, no. 5, pp. 455–462, 2012. View at Publisher · View at Google Scholar · View at MathSciNet · View at Scopus