Abstract

We propose a joint segmentation and groupwise registration method for cardiac perfusion images that uses temporal information. The nature of perfusion images makes groupwise registration especially attractive, as temporal information from the entire image sequence can be used. Registration aims to maximize the smoothness of the intensity signal, while segmentation minimizes a pixel's dissimilarity with other pixels having the same segmentation label. The cost function is optimized iteratively using B-splines. Tests on real patient datasets show that, compared with existing methods, our method achieves lower registration error and higher segmentation accuracy. We attribute this to the use of temporal information for groupwise registration and to combining mutually complementary registration and segmentation information in one framework, whereas other methods solve the two problems separately.

1. Introduction

Dynamic contrast-enhanced (DCE) magnetic resonance (MR) imaging (or perfusion MRI) has become a popular noninvasive tool for the functional analysis of internal organs. A contrast agent is injected intravenously into the patient, and a series of MR images is acquired over a period of time. As the contrast agent flows through the bloodstream, the intensity of the corresponding regions increases. Since the image acquisition process can take up to 20 minutes, patient movement is inevitable; additionally, elastic deformation of cardiac tissues due to patient breathing needs to be corrected. Perfusion images are characterized by rapid intensity change over time, low spatial resolution, and noise, all of which make registration challenging. Previous techniques mostly employed a pairwise registration approach, that is, all images of a sequence are individually registered to a fixed reference image [1-3]. The success of such approaches depends upon the robustness of the cost function to intensity change. Although intensity change due to contrast agent flow poses challenges, it also provides important temporal information for registration and segmentation of the perfusion image sequence. In this work, we propose a method which makes use of the temporal dynamics of contrast agent flow to achieve joint segmentation and groupwise registration of an MR cardiac image sequence.

The changing intensity due to contrast agent flow provides valuable segmentation information by highlighting the organ of interest. It is interesting to note that different regions of a cardiac image sequence have different intensity-time characteristics (Figure 4). Temporal flow information has been used for registration [4] and segmentation [5] of perfusion images. In [6], the registration framework constrained the deformation field such that different regions follow a particular intensity-time profile. It is widely accepted that improved registration leads to more accurate segmentation and vice versa. In previous works [7, 8], we approached the problem of joint registration and segmentation of cardiac MRI; however, we did not make use of the temporal information from the image sequence, focusing instead on pairwise registration of images. In this paper, we make use of temporal information to achieve registration and segmentation of the entire image sequence. Instead of pairwise registration, we solve the problem using groupwise registration, as it allows us to impose constraints based on temporal information.

Of late, groupwise registration methods have gained popularity because of the need to register large numbers of datasets for atlas construction [9, 10]. Groupwise registration is generally approached by two techniques. The first approach uses pairwise registration between a template and all other images in the population, as in [11-13]. Yang et al. in [13] utilize voxelwise geometry and orientation information for registration: different measurements and rotation-invariant features are used in a deformable matching mechanism for image alignment. However, pairwise registration has two limitations. First, a fixed template may not accurately represent the population. Second, pairwise registration is not very effective when registering two images with significant anatomical differences, as only those subjects that are close to the template are registered properly.

To overcome the above limitations, many methods achieve registration using all images from the population; this approach is more faithful to the term "groupwise registration." The goal is to warp all subjects in a population towards a hidden common space simultaneously within a single framework [14, 15]. The groupwise registration problem is formulated as an optimization problem, with a global cost function defined on all aligned images [16, 17]. The cost function in [16, 17] is the stack entropy, that is, the entropy of corresponding voxels in different volumes; the justification is that if the images are properly aligned, the intensity values at corresponding voxels across volumes form a low-entropy distribution. With this constraint, groupwise registration is achieved within a B-spline-based free-form deformation framework. A minimal sketch of the stack-entropy idea is given below.
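
For intuition, the following NumPy sketch illustrates the stack-entropy criterion; the function name, histogram binning, and array layout are our illustrative assumptions, not details of the implementation in [16, 17].

```python
import numpy as np

def stack_entropy(volumes, bins=32):
    """Sum of per-voxel entropies over a stack of aligned volumes.

    volumes: (N, ...) array holding one registered volume per subject.
    Well-aligned stacks give low-entropy intensity distributions at
    each voxel, so lower values indicate better groupwise alignment.
    """
    flat = volumes.reshape(volumes.shape[0], -1)    # (N, n_voxels)
    total = 0.0
    for stack in flat.T:                            # intensities of one voxel
        hist, _ = np.histogram(stack, bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]                                # avoid log(0)
        total += -(p * np.log(p)).sum()
    return total
```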

When across-subject variation is very large, it is generally difficult to achieve good registration by simply registering each image to a template image. Therefore, many approaches make use of intermediate templates for registration [18-21]. In [18, 21], an intermediate template which does not belong to the original dataset is created to aid registration between two images. Tang et al. in [21] warp the template image with a set of simulated deformation fields learnt by applying principal component analysis (PCA) to a set of training deformation fields. Such approaches do not guarantee that the intermediate template is realistic, which may affect registration results. Jia et al. in [19] construct a tree structure where each node is an image and similar images are connected nodes; each image is registered with the help of intermediate templates determined along its own path to the final template.

Groupwise registration methods are particularly suitable for perfusion DCE-MRI, where each region of the scanned organ is characterized by a different intensity profile over time. For example, in cardiac perfusion MRI, the contrast agent first flows into the right ventricle (RV) and then into the left ventricle (LV) before being flushed out of the cardiovascular system. Thus, a pixel within the RV shows its peak intensity early in the scanning sequence, while for the LV blood pool the intensity peak occurs later (Figure 4). With this information, we can formulate the cost function such that, after registration, pixels from certain regions follow a particular intensity-time profile. This is achieved by joint segmentation and groupwise registration of DCE-MRI. Few works deal with joint segmentation and groupwise registration of cardiac perfusion MRI. Zhang et al. in [22] describe a method for the rigid registration of brain perfusion images whose cost function is derived from the total quadratic variation of image intensity. Metz et al. in [23] propose a method for groupwise registration of dynamic lung data using both spatial and temporal constraints, based on groupwise optimization of B-splines.

The temporal intensity patterns of pixels also determine their segmentation labels (i.e., RV, LV blood pool, myocardium, background, etc.); thus, the image sequence can be segmented along with groupwise registration. It is well known that registration and segmentation are mutually complementary, and many works have combined them in a joint registration and segmentation framework [24-26]. Including segmentation information in the cost function reduces registration error and improves segmentation accuracy. In this paper, we propose a joint segmentation and groupwise registration (JSGR) approach for cardiac perfusion MRI. Our method combines intensity information from the entire image sequence (for registration) and maximizes the similarity between a pixel and other pixels belonging to the same class (for segmentation). We describe our method in Section 2, present experimental results in Section 3, and conclude in Section 4.

2. Theory

JSGR aims to find a transformation for every image that minimizes a cost function. We do not have an explicitly defined reference image; instead, we constrain the transformations such that the registered images approach a common image space, which acts as the implicit reference and approximates the center of the images being registered. Wu et al. in [27] have highlighted the importance of a sharp mean image for accurate groupwise registration. Their observations derive from constructing atlases for large populations of brain images; such datasets show a lot of variability, and a fixed reference image is sure to introduce bias. Our method, in contrast, achieves registration and segmentation using images from the same dataset, where each patient has multiple frames from different time points.

The general objective function for JSGR consists of two terms:

$$E = E_{\text{data}} + E_{\text{smooth}}, \quad (1)$$

where $E_{\text{data}}$ is the data cost and $E_{\text{smooth}}$ is the smoothness cost. The data cost depends upon the type of images being registered, and the smoothness cost depends upon the optimization framework.

Perfusion images are characterized by rapid intensity change over time. Instead of relying on low-level information, we aim to exploit the temporal information for groupwise registration. First, we give a brief description of B-splines and their optimization; then we explain the formulation of our data cost ($E_{\text{data}}$).

2.1. B-Spline-Based Registration

A B-spline-based free-form deformation (FFD) transformation model was presented in [28] for the elastic registration of breast images. The basic idea of FFDs is to deform an object by manipulating an underlying mesh of control points. The resulting deformation controls the shape of the 3D (or 2D) object and produces a smooth and continuous transformation. The transformation field consists of a global and a local component and is defined as

$$T(\mathbf{x}) = T_{\text{local}}\left(T_{\text{global}}(\mathbf{x})\right), \quad (2)$$

where $T_{\text{global}}$ is an affine transform obtained using [1], and $T_{\text{local}}$ is the deformation based on B-splines. The DCE-MR images are first affinely registered to a chosen reference image. Note that this reference image serves only for rigid registration and is usually the scan showing all tissues without ambiguity; for elastic registration, there is no explicitly defined reference image. Further discussion is restricted to $T_{\text{local}}$.

We define an initial $n_x \times n_y \times n_z$ grid of control points denoted as $\Phi$. The grid points are denoted as $\Phi_{i,j,k}$ and have uniform spacing. The free-form deformation can be written as the 3D tensor product of 1D cubic B-splines,

$$T_{\text{local}}(\mathbf{x}) = \mathbf{x} + \sum_{l=0}^{3} \sum_{m=0}^{3} \sum_{n=0}^{3} B_l(u)\, B_m(v)\, B_n(w)\, \Phi_{i+l,\, j+m,\, k+n}, \quad (3)$$

where $\mathbf{x} = (x, y, z)$ denotes a pixel position, $i = \lfloor x/n_x \rfloor - 1$, $j = \lfloor y/n_y \rfloor - 1$, $k = \lfloor z/n_z \rfloor - 1$, $u = x/n_x - \lfloor x/n_x \rfloor$, $v = y/n_y - \lfloor y/n_y \rfloor$, $w = z/n_z - \lfloor z/n_z \rfloor$, and $B_l$ is the $l$th cubic B-spline basis function:

$$B_0(u) = \frac{(1-u)^3}{6}, \qquad B_1(u) = \frac{3u^3 - 6u^2 + 4}{6}, \qquad B_2(u) = \frac{-3u^3 + 3u^2 + 3u + 1}{6}, \qquad B_3(u) = \frac{u^3}{6}. \quad (4)$$

Since our datasets are 2D, the corresponding transformation is

$$T_{\text{local}}(\mathbf{x}) = \mathbf{x} + \sum_{l=0}^{3} \sum_{m=0}^{3} B_l(u)\, B_m(v)\, \Phi_{i+l,\, j+m}. \quad (5)$$

B-splines are locally controlled, which makes them computationally efficient even for a large number of control points. In particular, the basis functions of cubic B-splines have limited support; that is, changing a control point $\Phi_{i,j}$ affects the transformation only in the local neighborhood of that control point. A sketch of the 2D transformation follows.
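
As a concrete illustration, equations (4)-(5) might be implemented as in the following sketch; the (rows, cols, 2) layout of the displacement grid, padded by one control point on each border, is our assumption rather than a detail from the paper.

```python
import numpy as np

def bspline_basis(u):
    """The four cubic B-spline basis functions of equation (4)."""
    return np.array([
        (1 - u) ** 3 / 6.0,
        (3 * u ** 3 - 6 * u ** 2 + 4) / 6.0,
        (-3 * u ** 3 + 3 * u ** 2 + 3 * u + 1) / 6.0,
        u ** 3 / 6.0,
    ])

def ffd_2d(x, y, phi, sx, sy):
    """Evaluate the 2D FFD of equation (5) at pixel (x, y).

    phi: (rows, cols, 2) array of control-point displacements, assumed
    padded by one point on each border so that i = floor(x/sx) - 1
    never indexes outside the array; sx, sy are the control-point
    spacings in pixels.
    """
    i, u = int(x // sx) - 1, (x / sx) % 1.0
    j, v = int(y // sy) - 1, (y / sy) % 1.0
    Bu, Bv = bspline_basis(u), bspline_basis(v)
    disp = np.zeros(2)
    for l in range(4):
        for m in range(4):
            disp += Bu[l] * Bv[m] * phi[j + m + 1, i + l + 1]  # +1: padding
    return np.array([x, y], dtype=float) + disp
```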

The local deformation of the cardiac tissues should be characterized by a smooth transformation. To constrain the spline-based FFD transformation to be smooth, one can introduce a penalty term which regularizes the transformation. In 3D, this penalty takes the following form:

$$E_{\text{smooth}} = \frac{1}{V} \int_0^X \int_0^Y \int_0^Z \left[ \left(\frac{\partial^2 T}{\partial x^2}\right)^2 + \left(\frac{\partial^2 T}{\partial y^2}\right)^2 + \left(\frac{\partial^2 T}{\partial z^2}\right)^2 + 2\left(\frac{\partial^2 T}{\partial x \partial y}\right)^2 + 2\left(\frac{\partial^2 T}{\partial x \partial z}\right)^2 + 2\left(\frac{\partial^2 T}{\partial y \partial z}\right)^2 \right] dx\, dy\, dz, \quad (6)$$

where $V$ denotes the volume of the image domain. Our registration algorithm deals with 2D images in the spatial domain. Therefore, the smoothness cost can be formulated as

$$E_{\text{smooth}} = \frac{1}{A} \int_0^X \int_0^Y \left[ \left(\frac{\partial^2 T}{\partial x^2}\right)^2 + \left(\frac{\partial^2 T}{\partial y^2}\right)^2 + 2\left(\frac{\partial^2 T}{\partial x \partial y}\right)^2 \right] dx\, dy, \quad (7)$$

where $A$ is the area of the image domain.
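
A simple finite-difference approximation of (7), evaluated on a densely sampled transformation field with unit pixel spacing (our simplifying assumption), could look as follows.

```python
import numpy as np

def bending_energy_2d(tx, ty):
    """Discrete approximation of the 2D smoothness cost (7).

    tx, ty: (H, W) arrays holding the x- and y-components of the
    transformation T sampled on the pixel grid.
    """
    energy = 0.0
    for t in (tx, ty):
        d_xx = t[:, 2:] - 2 * t[:, 1:-1] + t[:, :-2]     # d2T/dx2
        d_yy = t[2:, :] - 2 * t[1:-1, :] + t[:-2, :]     # d2T/dy2
        d_xy = (t[2:, 2:] - t[2:, :-2]
                - t[:-2, 2:] + t[:-2, :-2]) / 4.0        # d2T/dxdy
        energy += (d_xx ** 2).sum() + (d_yy ** 2).sum() + 2 * (d_xy ** 2).sum()
    return energy / tx.size                              # 1/A normalization
```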

2.2. Similarity Measure

The images are first affinely aligned with respect to a reference image using [1]. Note that this reference image is used only for rigid alignment and not for groupwise registration. Seed points belonging to the RV, LV blood pool, myocardium, and background are identified (as shown by the red arrows in the first image of the first row of Figure 3), and the labels of the other pixels are determined using graph cuts. This initial labeling is used to calculate the cost functions in the first iteration. After the B-spline grid of each image is updated, the images are transformed and the segmentation labels are updated based on the transformed images. We define the data cost as a combination of two terms which individually exploit different characteristics of the perfusion datasets:

$$E_{\text{data}} = w_1 E_W + w_2 E_Q, \quad (8)$$

where $E_Q$ is the total quadratic variation of the dataset, $E_W$ is the within-class distance of each pixel, and $w_1, w_2$ are weights that determine the relative contribution of each term; we set $w_1 = 0.4$ and $w_2 = 1$. Two weights are used in order to examine the relative contribution of each term to the results (discussed in Section 3.4). Below, we explain each term in greater detail.

2.3. Total Quadratic Variation ($E_Q$)

After motion correction, we expect the intensity-time signal of each pixel in the DCE-MRI sequence to be smooth. The total quadratic variation, $E_Q$, measures the smoothness of the intensity signal through a combination of its first- and second-order derivatives. The sum of first derivatives over the entire sequence encourages a smooth signal, while the sum of second derivatives favours a piecewise linear model of the signal. During the precontrast and postcontrast stages, or for regions without contrast enhancement, the first-order derivative is relevant because we expect the intensity of the same tissue to remain constant (the first-order derivative of a constant signal is zero). During the wash-in and wash-out stages, we expect the intensity of the same tissue to increase or decrease at an approximately constant rate. In an unregistered image sequence, the derivatives during the wash-in and wash-out stages alternate between positive and negative values; once the sequence is registered, the intensity changes are gradual and the derivative magnitudes are minimal. We also use the second-order derivative of the intensity vector to encourage a piecewise linear intensity signal, since the second derivative of a linear signal is zero; except at the position of peak enhancement, the second derivative is then close to zero.

Let $I_n$ denote the image at the $n$th time point (i.e., the $n$th frame of the dynamic sequence). The intensity of its $i$th pixel at time $t$ is given by $I_{n,i}(t)$, where $t = 1, \ldots, T$. Thus $E_Q$ is given by

$$E_Q = \sum_{n=1}^{N} \sum_{i} \int_{t=0}^{T} \left[ \big(I'_{n,i}(t)\big)^2 + \big(I''_{n,i}(t)\big)^2 \right] dt. \quad (9)$$

Here, $I'$ and $I''$ are, respectively, the first and second derivatives of the intensity signal, and $N$ is the total number of pixels in each image. Note that both the first and second derivatives are computed over the entire image sequence.
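
A discrete sketch of (9), with the image sequence stacked along the time axis and derivatives approximated by temporal finite differences, follows; the squared-derivative form is our reading of the quadratic variation.

```python
import numpy as np

def total_quadratic_variation(seq):
    """Discrete version of E_Q in (9).

    seq: (T, H, W) array, one spatially aligned frame per time point.
    First and second temporal finite differences stand in for I' and
    I''; a well-registered sequence has smooth, piecewise linear pixel
    profiles and therefore a low E_Q.
    """
    d1 = np.diff(seq, n=1, axis=0)    # first temporal derivative
    d2 = np.diff(seq, n=2, axis=0)    # second temporal derivative
    return (d1 ** 2).sum() + (d2 ** 2).sum()
```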

2.3.1. Within-Class Distance ($E_W$)

The within-class distance, $E_W$, integrates segmentation information into the JSGR framework. $E_W$ ensures that pixels of the same class have similar intensity-time profiles; in other words, pixels with the same label are drawn towards a representative signal of that class. We use the mean intensity vector to represent each class of pixels. For a pixel $i$ with current label $l$, the within-class distance is calculated as the difference between its intensity vector $\mathbf{I}_i$ and the mean intensity vector $\overline{\mathbf{I}}_l$ of class $l$:

$$E_W = \sum_i \left\| \mathbf{I}_i - \overline{\mathbf{I}}_l \right\|, \quad (10)$$

where $\|\cdot\|$ denotes the Euclidean distance. Note that if pixel $i$ belongs to class $l$ (as determined from the current labels), then only its difference with respect to the mean intensity vector of class $l$ is calculated. If a pixel has been correctly labeled as LV blood pool (or RV), then the residual error from the mean of the LV blood pool (or RV) will be low; if the labeling is wrong, the corresponding error is high. Initially, due to many unregistered pixels, the mean intensity vector may not be a very accurate representation of the class. But with every iteration, the mean intensity vector becomes smoother as the segmentation labels are updated and starts to truly represent the particular class. In an iterative scheme, these constraints ($E_Q$ and $E_W$) ensure that the labels converge correctly. Although the blood pool shows substantial intensity change, this does not adversely affect our method: by combining the contributions of the registration and segmentation terms, we overcome the effects of intensity change.
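
A minimal sketch of (10) over per-pixel intensity-time profiles might look as follows; the array layout and integer label encoding are our assumptions.

```python
import numpy as np

def within_class_distance(profiles, labels):
    """Discrete version of E_W in (10).

    profiles: (N, T) array, the intensity-time vector of each pixel.
    labels:   (N,) array of current class labels (RV, LV blood pool,
              myocardium, background encoded as integers).
    """
    e_w = 0.0
    for l in np.unique(labels):
        members = profiles[labels == l]
        mean_profile = members.mean(axis=0)       # class representative
        # Euclidean distance of each member to its class mean
        e_w += np.linalg.norm(members - mean_profile, axis=1).sum()
    return e_w
```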

Here, we need a representative intensity vector for each class. Although the mean vector is not necessarily the best representative at the beginning of registration, it offers a good balance between registration accuracy and computational complexity. PCA could provide a more accurate representative vector but significantly increases computational complexity. Moreover, we observe that with an increasing number of iterations, the mean vector does converge to a true representative of the class: the registration and segmentation accuracies are the same when using the mean vector (after convergence) and principal components, while using PCA increases the computation time by more than 1.5 times.

2.4. Optimization

B-splines are used to optimize the cost function in (1). A uniform grid is initialized for all images. The first image's grid coefficients are updated based on the current value of the energy function; the image is then transformed, and the segmentation labels are immediately updated. The next image's grid coefficients are then updated, followed by image transformation and a label update. This is repeated for all images of the sequence corresponding to different time points and constitutes one iteration of JSGR. The segmentation labels obtained at the end of an iteration serve as the starting point for the next iteration. We repeat the process until the cost function does not decrease further.

The advantage of dynamically updating the segmentation labels is that the energy function always reflects the latest transformations of the image sequence. Thus, if the energy function value does not change while updating the grid coefficients of three consecutive images, JSGR can be terminated immediately. Once the process has terminated, the final segmentation labels are obtained using graph cuts.

To ensure that the images are registered to a common image space, the average deformation of each pixel is constrained to be zero. This is achieved by forcing the coefficients of corresponding grid points, summed over all images, to be zero. After the grid of every image is updated, the images are transformed and the segmentation labels are dynamically updated, ensuring that changes in grid coefficients are reflected immediately in the registration process. This also leads to more accurate registration than updating the segmentation labels only at the end of each iteration. The algorithm is summarized in Algorithm 1; a high-level code sketch follows the listing.

(1) Determine the initial labeling $L_f$ of the image sequence.
(2) Set $t = 1$ for the first registration round.
(3) Initialize a uniform B-spline grid for all images.
(4) Calculate the value of the energy function using the data cost ((9) and (10)) and the smoothness cost (7).
(5) Update the B-spline grid for the current image, transform the image, and update the segmentation labels.
(6) Repeat step (5) for all images of the sequence.
(7) Repeat steps (4)-(6) until convergence.
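
A high-level sketch of this loop is given below. The callables segment, update_grid, warp, and energy are placeholders for graph-cut labeling, B-spline coefficient updates, image warping, and the cost in (1); they are our abstraction, not the paper's implementation.

```python
import numpy as np

def jsgr(images, segment, update_grid, warp, energy,
         grid_shape, max_iters=20, tol=0.1):
    """Sketch of Algorithm 1; images is a mutable list of frames."""
    labels = segment(images)                         # step (1)
    grids = np.zeros((len(images),) + grid_shape)    # step (3)
    prev_e = energy(images, labels, grids)           # step (4)
    for _ in range(max_iters):
        for k in range(len(images)):                 # steps (5)-(6)
            grids[k] = update_grid(images, labels, grids, k)
            # zero-mean constraint: coefficients of corresponding grid
            # points must sum to zero over all images so the sequence
            # converges to a common (implicit) reference space
            grids -= grids.mean(axis=0)
            images[k] = warp(images[k], grids[k])
            labels = segment(images)                 # dynamic label update
        e = energy(images, labels, grids)            # step (7)
        if prev_e - e < tol:                         # threshold of Section 3.2
            break
        prev_e = e
    return images, labels
```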

3. Experiments and Results

Cardiac images were acquired on a 1.5 T Siemens Sonata MR scanner following bolus injection of Gd-DTPA contrast agent. There are 12 sequences acquired from 12 patients in whom assessment of myocardial perfusion was clinically important. Each sequence comprised 60 frames, for a total of 720 frames. The datasets were acquired with electrocardiographic (ECG) gating such that the images were acquired during the same phase of the cardiac cycle. This minimized cardiac motion, but some deformation was still observed due to patient breathing. The pixel spacing ranges from 1.5 × 1.5 to 2.8 × 2.8 mm². The acquired images were from the same midcavity slice. The images were corrected for rotation and translation before segmentation. The initial labeling was obtained using graph cuts [5]. The B-spline grid was of size 10 × 10, with the spacing between grid control points varying from 7 to 9 pixels.

3.1. B-Spline Grids

An example of B-spline-based registration is shown in Figure 1: a reference image, a floating image, the difference images before and after registration, and the deformed grid obtained after registration. The initial grid is uniform and without any deformation. This example is for illustration only; here, a reference image was fixed and all other images were registered to it using JSGR. From the difference images and the deformed grid, we can conclude that the deformations of the floating image are captured by B-spline registration.

3.2. Registration Results

We compare the registration error of JSGR with [16] (Met1), [28] (Met2), and our pairwise joint registration and segmentation method in [7] (JRS). Met1 does not define an explicit reference image and uses the entropy of the pixel stack as the cost function. Met2 uses pairwise registration with normalized mutual information (NMI) as the similarity measure; it has previously been used for registering contrast-enhanced breast MRI. JRS is a joint registration and segmentation method, but it uses information only from a pair of images and not from the entire image sequence. Our algorithm converged after 6 iterations; the threshold cost difference above which registration continues is set at 0.1. We present qualitative and quantitative results of our method (JSGR) in terms of registration error and segmentation accuracy. Met1 required 5 iterations for convergence, while Met2 needed on average 4 iterations when registering a pair of images. JRS is a one-time optimization method and is not iterative. The reported results are after convergence of all algorithms.

The outlines of the LV (epicardium and endocardium) and the RV are manually identified by expert observers in the original image sequence; these contours are denoted as $C_{\text{org}}$. The transformation of each image is used to map the contours to the registered image space, where they are denoted as $C_{\text{trans}}$. Since it is practically impossible to have a ground truth for elastic registration, we take a different approach to calculating registration error. In a perfusion image sequence, the shape of the heart should be the same in all images under ideal registration. To calculate the registration error, we fix one image as the reference (usually the first image of the sequence) and calculate the contour distance between the LV (or any object of interest) in all other images and the reference image. At the end of our JSGR method, not only has the sequence been registered but the segmentation labels of each frame have also been determined; thus, we only need to find the contour distance between regions with the same labels. By contour, we refer to the outline of the segmented region. Note that the same procedure can be used to calculate the registration error before registration. The contour distance is calculated using the following steps (a code sketch follows the list):

(1) let the reference contour for the registered image sequence be denoted as $C_{\text{trans}}^{\text{ref}}$;
(2) let the $i$th point on $C_{\text{trans}}$ be denoted by $C_{\text{trans}}(i)$ and the $j$th point on $C_{\text{trans}}^{\text{ref}}$ by $C_{\text{trans}}^{\text{ref}}(j)$;
(3) for every $C_{\text{trans}}(i)$, find the point on $C_{\text{trans}}^{\text{ref}}$ at minimum distance:

$$d(i,j) = \min_j \left\| C_{\text{trans}}(i) - C_{\text{trans}}^{\text{ref}}(j) \right\|; \quad (11)$$

(4) the contour distance (CD) is the average of these minimum distances over all points $C_{\text{trans}}(i)$:

$$\text{CD} = \frac{\sum_i d(i,j)}{\sum_i 1}. \quad (12)$$
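
The following sketch computes CD for two sampled contours; the (N, 2) point-array representation is our assumption.

```python
import numpy as np

def contour_distance(c_trans, c_ref):
    """Equations (11)-(12): mean nearest-point distance from contour
    c_trans to reference contour c_ref (both (N, 2) point arrays;
    the two contours may have different numbers of points)."""
    diffs = c_trans[:, None, :] - c_ref[None, :, :]  # all pairs (i, j)
    d = np.sqrt((diffs ** 2).sum(axis=2))            # pairwise distances
    return d.min(axis=1).mean()                      # min over j, mean over i
```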

The average error measures for the 12 datasets are given in Table 1; the lower the registration error, the better the method's performance. We observe that JSGR has the lowest registration error, while Met2 has the highest. Met2 registers all images to a fixed reference image, and intensity change between two images of a dataset is common; consequently, registration with a simple NMI-based similarity measure is prone to error. Groupwise registration has the advantage that information from the entire sequence can be exploited, which is particularly important for DCE-MRI. The importance of dynamic information is highlighted by the fact that JSGR shows slightly higher accuracy than JRS, our pairwise joint registration and segmentation method. JRS in turn performs better than Met1 even though Met1 is a groupwise registration method, indicating that the inclusion of segmentation information improves registration more than the inclusion of dynamic information. Between Met1 and JSGR, the latter performs better because it combines segmentation and registration information in the cost function; consequently, the final registration errors for JSGR are lower than those of Met1. In Figure 2(a), we show the error measures for the LV for each of the 12 datasets.

The time taken to register one full dataset (60 images) is 1 hr 33 mins using JSGR, 58 mins using Met1, 45 mins using Met2, and 55 mins using JRS. JSGR and JRS were implemented in MATLAB 7.5 on a PC with a Pentium 4, 3 GHz processor. Met1 was implemented by its authors using ITK and thus has a low execution time. Met2 was also implemented in MATLAB for pairwise registration. Since JSGR is a joint segmentation and groupwise registration method, it takes more than twice the time of Met2.

3.3. Segmentation Results

Segmentation accuracy is calculated using Dice metric (DM) values between manual and automatic segmentations for the different methods. Manual segmentations for each frame are obtained after registration by each method. After registration is complete for each method, the segmentation labels are obtained by applying graph cuts to the intensity vectors; the segmentation labels for corresponding pixels in different frames are then the same. These labels are compared with the manual segmentations to obtain the segmentation accuracy using DM. For JSGR, the labels are already available at the end of the registration process.
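
For reference, the Dice metric for a binary segmentation is computed as follows (a standard definition, not code from the paper).

```python
import numpy as np

def dice_metric(auto_mask, manual_mask):
    """DM = 2|A ∩ M| / (|A| + |M|) for boolean masks A (automatic)
    and M (manual); 1.0 means perfect overlap."""
    intersection = np.logical_and(auto_mask, manual_mask).sum()
    return 2.0 * intersection / (auto_mask.sum() + manual_mask.sum())
```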

The average DM values for the LV over all 12 datasets are shown in Table 1. Figure 2(b) shows the average DM value of the LV for each dataset over all frames of the sequence. The DM values are highest for JSGR, indicating the best accuracy among the compared methods.

Figure 3 shows the segmented contours of the LV endocardium overlaid on a representative image of each dataset, chosen such that the blood pool is visible without ambiguity. The manual segmentations are shown in red and the automatic segmentations in green; the first row also shows the initial segmentation in yellow. The initial segmentation is the result of applying graph cuts to the unregistered image sequence using the intensity vector of each pixel. Note that the automatic segmentations are the average contours over all frames of the sequence, and the manual segmentations are likewise the average of the manually drawn contours. Since the segmentation labels are calculated from the intensity vectors, the labels are the same for corresponding pixels in all frames. The first column shows results for JSGR, the second for Met1, the third for Met2, and the fourth for JRS; each row shows results for a different dataset. Again, we observe that JSGR shows the best agreement with the manual segmentations due to the combination of registration and segmentation information. Met1 and Met2, being solely focused on registration, perform inadequately for cardiac perfusion images. The accuracy measures highlight the importance of integrating registration and segmentation information, which is particularly valuable when images have low contrast and when segmentation information is available to be exploited for registration. Although the DM values and registration errors for JSGR and JRS are similar, a t-test gives $P < 0.032$, indicating statistically different results and hence an improvement from using dynamic information.

Figure 4 shows the intensity variation with time for pixels in the LV blood pool, RV, and endocardium. Edge pixels are chosen for the epicardium and RV; therefore, the intensity signal before registration (Figure 4(a)) is noisy. After registration using JSGR, the intensity variation is smoother (Figure 4(b)), highlighting the success of our method on time-varying data.

3.4. Importance of $E_Q$ and $E_W$

It is important to examine the contribution of $E_Q$ and $E_W$ to the overall registration procedure. $E_Q$ can be termed the registration energy, while $E_W$ is the segmentation energy. We examine the improvement in registration accuracy brought about by $E_W$, and likewise the improvement in segmentation accuracy due to $E_Q$. We vary $w_1$ in (8) from 0 to 5 (keeping $w_2 = 1$ fixed) and calculate the registration and segmentation accuracy values for the LV (Table 2). As $w_1$ increases from 0, both registration and segmentation performance improve. In fact, for $w_1 < 0.4$, the registration accuracy is below or comparable to that of Met1 but improves with a greater contribution of $E_W$. However, for $w_1 > 2$, the registration accuracy starts to degrade (evident from higher registration error), and the DM values also decrease due to the unbalanced contribution of the two terms.

Similarly, when we increase $w_2$ (with $w_1 = 0.4$) from 0 to 5, we observe low DM values for $w_2 < 1$. The best segmentation accuracy is obtained for $1 \le w_2 \le 3$. However, for $w_2 > 3$, the DM values start to decrease and the registration error increases. Table 3 shows the change in registration error and DM values with $w_2$. The best results are obtained for $w_2 = 1$ and $w_1 = 0.4$.

4. Conclusion

We have proposed a novel method for joint segmentation and groupwise registration of the LV in cardiac perfusion images. By maximizing the smoothness of the temporal intensity signal, our method uses the temporal information available from the entire image sequence, which helps to overcome the effects of intensity change. Segmentation information is incorporated by minimizing the error between each pixel's intensity vector and the mean intensity vector of its class. Compared with manual segmentations, our method gives higher segmentation accuracy than the other methods, and its registration errors are the lowest among the compared methods. Because ours is a joint segmentation and groupwise registration approach, both registration and segmentation performance are better than those of conventional methods. This is due to two factors: (1) the exploitation of temporal information from the DCE-MRI sequence in a groupwise registration framework and (2) the use of mutually complementary registration and segmentation information, while most other methods solve registration and segmentation separately. Our method has the potential to be used for other data with time-varying characteristics, and in the future, we aim to apply it to other dynamic datasets.