Abstract

This paper proposes a novel algorithm to sparsely represent a deformable surface (SRDS) with low dimensionality based on spherical harmonic decomposition (SHD) and orthogonal subspace pursuit (OSP). The key idea of the SRDS method is to identify subspaces from a training data set in the transformed spherical harmonic domain and then cluster each deformation into its best-fit subspace for fast and accurate representation. The algorithm is also generalized to organs with both interior and exterior surfaces. To test its feasibility, we first use computer models to demonstrate that the proposed approach matches the accuracy of complex mathematical modeling techniques; both ex vivo and in vivo experiments are then conducted using 3D magnetic resonance imaging (MRI) scans for verification in practical settings. All results demonstrate that the proposed algorithm yields sparse representations of deformable surfaces with low dimensionality and high accuracy. Specifically, the precision, evaluated as the maximum error distance between the reconstructed surface and the MRI ground truth, is better than 3 mm in real MRI experiments.

1. Introduction

Organ deformation during operations imposes substantial challenges for precise diagnosis and surgery in minimally invasive surgery (MIS), such as natural orifice transluminal endoscopic surgery (NOTES) [1]. The deformation markedly decreases the precision of a prior surgical plan based on preoperative images (e.g., computed tomography (CT) or magnetic resonance imaging (MRI)), so it must be effectively compensated to lower surgical risks. However, this is not a trivial task due to the high degree of freedom of 3D nonrigid deformation and the limited field of view [1] for observation during MIS. To recover the 3D deformation with high resolution in real time, a critical issue is to seek an efficient representation of the deformable surface, according to which the sampling and surface recovery strategy can be designed for updating the 3D visualization. This paper focuses on the block sparse representation of deformable surfaces. The related topic of real-time tracking of a deformable organ with limited access to the organ is explored further in [24].

Various techniques have been proposed for surface description, and each has its own advantages and disadvantages depending on the application requirements. Broadly speaking, there are two major categories of surface representation methods: local feature-based models and global or parametric models. The work in [5], based on geometric partial differential equations (PDEs), belongs to the former category: it derives the Euler-Lagrange equation and then a geometric evolution equation (or geometric flow) to describe the surfaces. Similarly, the method in [6] treats the whole surface as a union of localized patches. Global surface representation methods [7-10], particularly those decomposing surfaces into other primitive shapes, are more appropriate for shape analysis and classification due to the lower dimensionality of the parameter space. This paper falls into the category of parametric global surface description.

Parametric surface representation describes a surface in a single functional form, such that the surface is fully characterized by a set of parameters in a particular domain. Surface harmonics such as spheroidal harmonics, cylindrical harmonics, and spherical harmonics (SH) [7] are widely used as building blocks for global surface description. Each harmonic does not carry localized features but contributes to the entire shape description. Among the different types of harmonics, a well-known approach is the spherical harmonic decomposition (SHD), which has the advantages of smoothness and high accuracy [7, 11]. With proper parameterization [11, 12], any genus-0 surface (the genus of a connected surface is the maximum number of cuts along nonintersecting closed simple curves that can be made without disconnecting the resulting manifold; it equals the number of handles, so a sphere has genus 0 and a torus has genus 1) can be analyzed in the harmonic domain with reduced data dimensions. SHD has been widely used in applications related to surface description, including static modeling of the kidney [8] and brain [9, 10], as well as spatiotemporal modeling of the left ventricle with a known motion period. In [13], hemispherical harmonics are also applied to open surfaces.

Sparse signal representation has steadily gained attention over the years in the signal processing community. The aim is to find a representation which is sparse, or compact, such that most of the energy of a signal can be captured with only a few nonzero coefficients in a given dictionary. The first widely applied methods for seeking sparse representations are greedy approaches, including matching pursuit (MP) [14], orthogonal matching pursuit (OMP) [15], and orthogonal least squares (OLS) [16]. These methods iteratively select the most correlated element from the dictionary, remove its contribution through decorrelation, and then search for the next atom; the iteration terminates when a stopping criterion is met. The second type comprises global optimization algorithms, such as basis pursuit (BP) [17], FOCUSS [18], and iterative thresholding [19]. Global optimization, in the approximate sense, relaxes the sparsity constraint, and sparsity arises as a side effect of the optimization. For example, the basis pursuit (BP) method approximates the $\ell_0$-norm sparsity constraint with an $\ell_1$-norm criterion, which effectively converts the problem into a convex optimization solved globally with linear programming. The orthogonal subspace pursuit (OSP) method [20] used in our paper belongs to the greedy category; it does not require prior knowledge of the dimensions of the subspaces and combines the learned subspaces to produce a data-driven dictionary with good sparseness and generalizability.
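For concreteness, the select-then-decorrelate loop shared by MP/OMP can be sketched as follows. This is a minimal NumPy illustration of the greedy idea only, not the OSP variant adopted later; the function name and its arguments are ours.

```python
import numpy as np

def omp(D, x, k_max, tol=1e-6):
    """Minimal orthogonal matching pursuit sketch (illustrative only).
    D: dictionary with unit-norm columns (n x p); x: signal of length n.
    Greedily selects the atom most correlated with the residual, then
    re-fits all selected atoms by least squares before iterating."""
    residual = x.copy()
    support = []
    coeffs = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(k_max):
        idx = int(np.argmax(np.abs(D.T @ residual)))  # most correlated atom
        if idx not in support:
            support.append(idx)
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol            # decorrelation step
        if np.linalg.norm(residual) < tol:            # stopping criterion
            break
    coeffs[support] = sol
    return coeffs
```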

Besides the sparse decomposition algorithms mentioned above, an equally important issue for sparse representation is how to select a dictionary for an application. The two main groups of dictionary design methods are structured dictionaries built from common bases and trained dictionaries inferred from training data. For the common bases, it is well known that the wavelet transform can generate sparse multiscale representations of images, the short-time Fourier transform (STFT) generates sparse time-frequency representations of speech signals, and the discrete cosine transform (DCT) has been used for compression in audio coding algorithms due to its good compaction property. For dictionary learning, applicable approaches include maximum likelihood estimation (MLE) [21], the method of optimal directions (MOD) [22], maximum a posteriori (MAP) estimation [23], and so forth. These methods attempt to generalize the considered type of signal using a basis identified from a representative training data set. The proposed approach is based on a trained dictionary, since, to the best of our knowledge, there is no common basis in which random surfaces can be sparsely represented.

Although sparse representation has been widely applied in the fields of signal compression, image denoising, blind source separation, and compressed sensing, there is still very limited application to 3D surfaces [24]. In fact, the main statistically motivated surface modeling methods are based on principal component analysis (PCA), which is not sparse [25-29]. These methods first compute the mean shape and then build the model by establishing legal variations learned from a set of training data for a given type of images, such as bone [25]. With PCA, the major variations of the shape population are described by the first few basis vectors, such that any surface of that shape population can be projected into an orthogonal subspace spanned by the retained vectors. More advanced techniques, such as the multiresolution deformable model [27], have been proposed to improve the accuracy given a limited sample size.

Most of the previous surface modeling/representation works are designed either for static models [8-10] or for particular deformable organs with known physical properties (such as motion cycle) [30]. Further, the computational bottleneck caused by a large spherical harmonic basis hampers applicability in real-time settings. For PCA-based modeling methods, the resulting space that captures the variation in the population is either a super subspace including all training data or a truncated subspace with sacrificed generalization. In addition, PCA tends to become computationally expensive when performing eigenvector decomposition as the training data dimension increases, and it does not lead to any structure in the representation.

To bring the demonstrated merits of sparse coding to 3D surface representation, we propose a generally applicable algorithm for parametric sparse representation of deformable surfaces based on SHD and orthogonal subspace pursuit. The main contributions of this paper are as follows.
(i) We propose an algorithm for sparse representation of deformable surfaces.
(ii) We generalize the representation approach to organs involving both interior and exterior surfaces.
(iii) We present evaluation results obtained with computer models, ex vivo experiments based on 3D MRI scans of freshly excised porcine kidneys, and in vivo cardiac MRI scans of real patients.

This paper is organized as follows: in Section 2, we describe the proposed algorithm for sparse representation of deformable surfaces, denoted as SRDS henceforth. Section 3 presents experimental results using finite element model (FEM) data as well as ex vivo and in vivo experimental data. Finally, Section 4 concludes with a discussion.

2. The SRDS Algorithm

The SRDS algorithm consists of three main steps to achieve sparse representations of deformable surfaces, as outlined in Figure 1. Initially, SHD is performed to describe the deformable surfaces in the training set in the harmonic domain. Then OSP is applied in the transformed domain to identify the subspaces in which the SH coefficient vectors of the deformations can be linearly represented. Finally, each deformation is clustered to the proper subspace and represented with the corresponding coefficient vector with block sparsity. This representation method is also extended to organ deformations occurring on both interior and exterior layers, as described in Section 2.4. Furthermore, as a practical issue, point-wise surface alignment among all the 3D surfaces is addressed in Section 2.5.

2.1. Step 1: SHD

Spherical harmonics are solutions to Laplace's equation expressed in the spherical coordinate system, defined as
$$Y_l^m(\theta,\varphi) = (-1)^m \sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\; P_l^m(\cos\theta)\, e^{im\varphi}, \tag{1}$$
where $\theta$ is the polar angle within $[0, \pi]$, $\varphi$ is the azimuthal angle within $[0, 2\pi)$, $l$ is the harmonic degree within $[0, +\infty)$, and $m$ is the harmonic order varying in $[-l, l]$. $P_l^m$ is the associated Legendre function. After proper parameterization [11, 12], a 3D surface $\mathbf{x}$ with finite energy can be expanded in an SH series as
$$\mathbf{x}(\theta,\varphi) = \sum_{l=0}^{+\infty}\sum_{m=-l}^{l} f_{lm}\, Y_l^m(\theta,\varphi). \tag{2}$$
Each harmonic coefficient $f_{lm}$ is calculated as the inner product of the function $\mathbf{x}(\theta,\varphi)$ and the basis $Y_l^m(\theta,\varphi)$:
$$f_{lm} = \int_{\varphi=0}^{2\pi}\!\int_{\theta=0}^{\pi} \mathbf{x}(\theta,\varphi)\, Y_l^{m*}(\theta,\varphi)\, \sin\theta\, d\theta\, d\varphi. \tag{3}$$

Assume that harmonics up to degree $L$ ($l \le L$) are involved in the transformation. Let $\mathbf{Y}$ denote the matrix composed of all $(L+1)^2$ discretized harmonics,
$$\mathbf{Y} = \Big[\,Y_{0,0}\;\; Y_{1,-1}\;\; Y_{1,0}\;\; Y_{1,1}\;\; \cdots\;\; Y_{L,L}\,\Big]_{N\times(L+1)^2}. \tag{4}$$
Then a surface can be represented in matrix form as
$$\mathbf{x} = \mathbf{Y} f, \tag{5}$$
where $\mathbf{x}$ stands for a surface with $N$ samples and $f = [\,f_{0,0}\; f_{1,-1}\; f_{1,0}\; f_{1,1}\; \cdots\; f_{L,L}\,]^T$ is the harmonic coefficient vector. Note that this relation is approximate rather than exact; for simplicity, we retain the equality sign with the understanding of a least-squares fit. The linear problem in (5) can be solved under the least-squares (LS) criterion, yielding
$$f = \big(\mathbf{Y}^T\mathbf{Y}\big)^{-1}\mathbf{Y}^T\mathbf{x}. \tag{6}$$

Performing SHD for each of the $K$ training deformed surfaces in $\mathbf{X} = \{\mathbf{x}_k\}_{k=1}^{K}$, the group of deformations can be described by the matrix
$$\mathbf{F} = \Big[\, f_1\;\; f_2\;\; \cdots\;\; f_K \,\Big]_{(L+1)^2\times K}. \tag{7}$$
Consequently, the training set of deformations is fully characterized by the columns of $\mathbf{F}$ as
$$\mathbf{X} = \mathbf{Y}\mathbf{F}. \tag{8}$$
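As a concrete illustration of Step 1, the following NumPy/SciPy sketch builds the discretized harmonic matrix $\mathbf{Y}$ of (4) and the training coefficient matrix $\mathbf{F}$ of (7) by least squares. The real spherical harmonic convention and the helper names (`real_sph_harm`, `sh_basis`, `shd_training_matrix`) are our assumptions, not part of the original implementation.

```python
import numpy as np
from scipy.special import sph_harm

def real_sph_harm(l, m, theta, phi):
    # Real-valued Y_l^m at polar angle theta and azimuth phi.
    # SciPy's sph_harm(m, l, azimuth, polar) returns the complex harmonic.
    Ylm = sph_harm(abs(m), l, phi, theta)
    if m > 0:
        return np.sqrt(2.0) * (-1) ** m * Ylm.real
    if m < 0:
        return np.sqrt(2.0) * (-1) ** m * Ylm.imag
    return Ylm.real

def sh_basis(theta, phi, L):
    # N x (L+1)^2 matrix of discretized harmonics, as in (4).
    cols = [real_sph_harm(l, m, theta, phi)
            for l in range(L + 1) for m in range(-l, l + 1)]
    return np.stack(cols, axis=1)

def shd_training_matrix(surfaces, theta, phi, L):
    # surfaces: list of K arrays of shape (N,), one coordinate at a time
    # (x, y, and z are decomposed separately). Returns Y and F of (7).
    Y = sh_basis(theta, phi, L)
    F = np.stack([np.linalg.lstsq(Y, x, rcond=None)[0] for x in surfaces],
                 axis=1)                      # least-squares fit of (6)
    return Y, F
```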

2.2. Step 2: Subspace Identification with OSP

The aim of the second step is to explore the structure in the training deformations in the transformed harmonic domain and recognize the inherent subspaces onto which the SHD coefficient vectors of the training deformations can be projected with high accuracy. The recently developed OSP algorithm [20] is adopted since it features better generalization and lower computational cost compared to the gold-standard K-SVD algorithm [31]. OSP is an iterative process that terminates when one of the predefined criteria is met. In this paper, we specify two stopping criteria: (1) an error threshold $\varepsilon$ for subspace selection and (2) a maximum number of iterations $E_{\max}$, which both controls the subspace dimensions and avoids deadlock in the search. Further, the threshold for vector clustering is denoted as $\eta$; that is, we declare that a vector lives in a subspace if it can be projected onto that subspace with error ($\ell_2$ distance) no larger than $\eta$.

A. Subspace Pursuit
Initially, each vector $f_k$ of length $(L+1)^2$ in $\mathbf{F}$ is normalized to unit $\ell_2$ norm. For convenience, we still use $\mathbf{F}$ to denote the SH coefficient matrix after normalization. The algorithm first identifies a subspace from $\mathbf{F}$ based on the stopping criteria, then finds all vectors in $\mathbf{F}$ that can be represented in that subspace with an error level below $\eta$, and removes those vectors from $\mathbf{F}$ to prepare for the next subspace pursuit. The process is summarized as follows, in which $A \setminus B$ means that the elements of $B$ are excluded from $A$, and $A \cup B$ stands for inclusion.
(1) Initialization: $i = 0$, $\mathbf{D} = \emptyset$, $\mathbf{F}_0 = \mathbf{F}$.
(2) Subspace searching and clustering:
(i) $i = i + 1$; choose a vector $f_i$ from $\mathbf{F}_{i-1}$ (e.g., its first column) and let $\mathbf{F}_i = \mathbf{F}_{i-1} \setminus f_i$ to remove $f_i$.
(ii) Find $n_i$ vectors from $\mathbf{F}_i$ that represent $f_i$ with error no larger than $\varepsilon$ within $E_{\max}$ iterations; these $n_i$ vectors form $S_i$.
(iii) Perform an SVD on $S_i$: $U\Sigma V^T = S_i$; let $\mathbf{A}_i$ contain the first $n_i$ columns of $U$; update $\mathbf{D} = \mathbf{D} \cup \mathbf{A}_i$.
(iv) Select the vectors in $\mathbf{F}_i$ that can be represented by $\mathbf{A}_i$ with error no larger than $\eta$, and remove them from $\mathbf{F}_i$.
(v) Repeat the above steps until all vectors are clustered.
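The searching and clustering loop above can be sketched in NumPy as follows. This is an illustrative reading of steps (i)-(v), not the exact OSP implementation of [20]; the greedy recruitment used in step (ii) and the helper names are our assumptions.

```python
import numpy as np

def project_error(A, f):
    # l2 error of projecting f onto the column space of A.
    c, *_ = np.linalg.lstsq(A, f, rcond=None)
    return np.linalg.norm(f - A @ c)

def osp_subspaces(F, eps, eta, e_max):
    """Sketch of the subspace searching/clustering loop.
    F: (L+1)^2 x K matrix of l2-normalized SH coefficient vectors.
    Returns a list of orthonormal subspace bases A_i."""
    remaining = [F[:, k] for k in range(F.shape[1])]
    subspaces = []
    while remaining:
        f_seed = remaining.pop(0)                       # step (i)
        selected = []
        for _ in range(e_max):                          # step (ii)
            if not remaining:
                break
            errs = [project_error(np.column_stack(selected + [g]), f_seed)
                    for g in remaining]
            best = int(np.argmin(errs))
            selected.append(remaining.pop(best))
            if errs[best] <= eps:
                break
        if not selected:                                # isolated last vector
            selected = [f_seed]
        S = np.column_stack(selected)
        U, _, _ = np.linalg.svd(S, full_matrices=False) # step (iii)
        A_i = U[:, :S.shape[1]]
        subspaces.append(A_i)
        # step (iv): cluster vectors explained by A_i and remove them
        remaining = [g for g in remaining if project_error(A_i, g) > eta]
    return subspaces                                    # step (v): loop ends
```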

B. Subspace Pruning
One disadvantage of the traditional OSP algorithm is the presence of "spurious" or redundant subspaces, especially as the dimension of the training data set increases. Subspaces identified in earlier iterations can often be better represented by subspaces identified later. Therefore, a postprocessing step is used to identify and then discard the redundancy among the subspaces without degrading performance. This is implemented by repartitioning the training data among the initial subspaces and then eliminating subspaces into which very few or no vectors are clustered. In some cases, where the subspace size is subject to constraints, an optimization step can be applied in conjunction with the pruning step. The details of the subspace optimization design are described in [20].
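A minimal sketch of this pruning step, assuming a simple repartition-and-count rule; the `min_members` parameter and the helper names are ours.

```python
import numpy as np

def residual_norm(A, f):
    # l2 error of the least-squares projection of f onto span(A).
    c, *_ = np.linalg.lstsq(A, f, rcond=None)
    return np.linalg.norm(f - A @ c)

def prune_subspaces(subspaces, F, eta, min_members=1):
    # Reassign every training vector to its best-fit subspace, then
    # discard subspaces that attract no (or too few) vectors.
    counts = np.zeros(len(subspaces), dtype=int)
    for k in range(F.shape[1]):
        errs = [residual_norm(A, F[:, k]) for A in subspaces]
        best = int(np.argmin(errs))
        if errs[best] <= eta:
            counts[best] += 1
    return [A for A, c in zip(subspaces, counts) if c >= min_members]
```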

C. Matrix 𝐅 Factorization
After identifying the inherent subspaces, the coefficient matrix $\mathbf{F}$ of the training set can be factorized into a union of subspaces and the corresponding coefficients via the following procedure.
Since each vector $f_k$ has been clustered into its subspace during the subspace identification process, the corresponding coefficients for each $f_k$ can be obtained accordingly. Suppose that $f_k$ lives in subspace $\mathbf{A}_i$, which is spanned by $n_i$ orthogonal basis vectors; its coefficient vector $c_k$ is then calculated with the LS estimator
$$c_k = \big(\mathbf{A}_i^T\mathbf{A}_i\big)^{-1}\mathbf{A}_i^T f_k. \tag{9}$$
Then $f_k$ is characterized by $c_k$ in its subspace:
$$f_k = \mathbf{A}_i c_k. \tag{10}$$
If a total of $J$ subspaces are identified from $\mathbf{F}$, a structured dictionary is constructed by concatenating all deformation subspaces, $\mathbf{D} = \bigcup_{i=1}^{J}\{\mathbf{A}_i\}$, with dimension $I = \sum_{i=1}^{J} n_i$. Since each vector $f_k$ lies in one of the subspaces, $f_k$ can also be represented in the structured dictionary by a block sparse vector $\tilde{c}_k$, obtained by extending the coefficients $\{c_k\}_{k=1}^{K}$ with zero padding at the positions corresponding to the other subspaces in $\mathbf{D}$. Figure 2 provides an example with 3 subspaces to illustrate the sparsity of the coefficient vector $\tilde{c}_1$: if $f_1$ lies in subspace $\mathbf{A}_2$, which is spanned by the 5th, 6th, and 7th columns of $\mathbf{D}$, then $\tilde{c}_1$ has nonzero values only at indices 5, 6, and 7. Consequently, $\mathbf{F}$ can be factorized as
$$\mathbf{F} = \mathbf{D}\tilde{\mathbf{C}}, \tag{11}$$
where $\tilde{\mathbf{C}} = \big[\,\tilde{c}_1\;\tilde{c}_2\;\cdots\;\tilde{c}_K\,\big]_{I\times K}$ is the corresponding coefficient matrix with block sparsity.
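In code, the factorization in (9)-(11) amounts to assigning each training vector to its subspace and zero-padding its LS coefficients into the concatenated dictionary. The sketch below assumes a best-fit assignment; the function name and interface are ours.

```python
import numpy as np

def factorize(F, subspaces):
    """Build the structured dictionary D = [A_1 ... A_J] and the
    block-sparse coefficient matrix C_tilde of (11)."""
    D = np.concatenate(subspaces, axis=1)              # (L+1)^2 x I
    offsets = np.cumsum([0] + [A.shape[1] for A in subspaces])
    C = np.zeros((D.shape[1], F.shape[1]))
    for k in range(F.shape[1]):
        f = F[:, k]
        # assign f to its best-fit subspace
        errs = [np.linalg.norm(f - A @ np.linalg.lstsq(A, f, rcond=None)[0])
                for A in subspaces]
        i = int(np.argmin(errs))
        c = np.linalg.lstsq(subspaces[i], f, rcond=None)[0]   # LS of (9)
        C[offsets[i]:offsets[i + 1], k] = c            # block-sparse placement
    return D, C
```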

2.3. Step 3: Structured Sparse Surface Representation

A. Sparse Representation of Training Surfaces
Integrating the subspace pursuit results in the harmonic domain in (11) with the initial SHD process in (8), the training deformations $\mathbf{X}$ can be sparsely represented in the original spatial domain as
$$\mathbf{X} = \mathbf{Y}\mathbf{D}\tilde{\mathbf{C}} = \mathbf{G}\tilde{\mathbf{C}}, \tag{12}$$
where $\mathbf{G} = \mathbf{Y}\mathbf{D}$, of size $N\times I$, is the desired structured dictionary in the spatial domain. Since $\mathbf{D} = \bigcup_{i=1}^{J}\{\mathbf{A}_i\}$, $\mathbf{G}$ is inherently structured into subspaces, $\mathbf{G} = \bigcup_{i=1}^{J}\{\mathbf{G}_i\}$ with $\mathbf{G}_i = \mathbf{Y}\mathbf{A}_i$ of size $N\times n_i$.
Up to this point, with the matrix $\mathbf{G}$ that captures the deformation features of the considered population, each training deformation $\mathbf{x}_k$ in $\mathbf{X}$ can be fully characterized by a sparse coefficient vector $\tilde{c}_k$ as
$$\mathbf{x}_k = \mathbf{G}\tilde{c}_k. \tag{13}$$
The sparsity of $\tilde{c}_k$ has already been illustrated in Figure 2.

B. Sparse Representation of Testing Surfaces
For testing deformations beyond the training set, we utilize the fact that a dictionary identified from an extensive training data set generalizes well, such that any deformation of that particular population can be represented in the subspaces with high accuracy. This is justified because organs deform only in limited ways due to their mechanical properties, so the deformation variations can be learned from a training data set. This structure allows fast deformation representation in subspaces of low dimensionality.
The testing set is denoted as $\mathbf{H} = \{\mathbf{h}_m\}_{m=1}^{M}$, where $M$ is the number of deformations to be represented. The straightforward strategy is to find a best-fit subspace for $\mathbf{h}_m$ by projecting it onto every subspace $\{\mathbf{G}_i\}_{i=1}^{J}$ and choosing the subspace with the minimal projection error. Since the number of subspaces $J$ and the dimensions of the subspaces $\{n_i\}_{i=1}^{J}$ are both small after the subspace pruning postprocessing, this best-fit strategy still has low computational cost. However, when the number of subspaces is large, an alternative thresholding approach can be applied: find a subspace $\mathbf{G}_i$ in which $\mathbf{h}_m$ can be represented with an error level smaller than $\eta$. The best-fit method is the one implemented in Section 3 for performance validation.
Suppose that $\mathbf{G}_i$ is the chosen subspace; the coefficient vector $c_m$ is then estimated with LS as
$$c_m = \big(\mathbf{G}_i^T\mathbf{G}_i\big)^{-1}\mathbf{G}_i^T\mathbf{h}_m. \tag{14}$$
The block sparse vector
$$\tilde{c}_m = \big[\,0\;\cdots\;0\;\; c_m^T\;\; 0\;\cdots\;0\,\big]^T \tag{15}$$
is then obtained according to the rules described in Section 2.2. Furthermore, the sparsity of $\tilde{c}_k$ or $\tilde{c}_m$ can be increased by trimming off nonzero elements whose absolute value is below a given threshold $\delta$.
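A sketch of this best-fit testing procedure of (14)-(15), including the optional $\delta$-trimming, is given below; the function name and interface are illustrative assumptions.

```python
import numpy as np

def represent_test_surface(h, G_blocks, delta=0.0):
    """Project a testing surface h (length N) onto every spatial-domain
    subspace G_i, keep the one with minimal error, and return its index
    together with the block-sparse coefficient vector of (15)."""
    errs, coefs = [], []
    for G_i in G_blocks:
        c, *_ = np.linalg.lstsq(G_i, h, rcond=None)      # LS of (14)
        errs.append(np.linalg.norm(h - G_i @ c))
        coefs.append(c)
    i = int(np.argmin(errs))                             # best-fit subspace
    c = coefs[i]
    c[np.abs(c) < delta] = 0.0                           # optional trimming
    offsets = np.cumsum([0] + [G.shape[1] for G in G_blocks])
    c_tilde = np.zeros(offsets[-1])
    c_tilde[offsets[i]:offsets[i + 1]] = c               # zero padding
    return i, c_tilde
```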
It is worth noting that, unlike traditional learning approaches relying on orthogonal least squares (OLS) [32] or matching pursuit (MP) [33], which select atoms from the training set and recombine them to sparsely represent each surface in the testing set, our SRDS algorithm avoids the heavy overhead caused by reshuffling all the atoms. Instead, we exploit the block structure of the dictionary learned from a representative training data set. This enables the representation of each deformable surface compactly and sparsely with high accuracy and low computational cost.

2.4. Extended Sparse Surface Representation

For an organ with both interior and exterior surfaces, such as the bladder, deformations can take place on both layers. The above theory can be extended to achieve sparse surface representation for deformations occurring on both the interior and exterior walls of the object.

Initially, spherical parameterization is conducted on the interior and exterior parts separately. We denote by $x_k^{\mathrm{in}}$ and $x_k^{\mathrm{ex}}$ ($1\le k\le K$) the interior and exterior parts of each training surface $x_k$. Then each pair of $x_k^{\mathrm{in}}$ (with $N_1$ vertices) and $x_k^{\mathrm{ex}}$ (with $N_2$ vertices) can be approximated by the spherical harmonic basis as
$$x_k = \begin{bmatrix} x_k^{\mathrm{in}} \\ x_k^{\mathrm{ex}} \end{bmatrix} = \begin{bmatrix} \mathbf{Y}^{\mathrm{in}} & O \\ O & \mathbf{Y}^{\mathrm{ex}} \end{bmatrix} \begin{bmatrix} f_k^{\mathrm{in}} \\ f_k^{\mathrm{ex}} \end{bmatrix}, \tag{16}$$
where $\mathbf{Y}^{\mathrm{in}}$ of size $N_1\times(L+1)^2$ and $\mathbf{Y}^{\mathrm{ex}}$ of size $N_2\times(L+1)^2$ denote the spherical harmonic bases for the inner and outer surfaces, respectively, $L$ is the highest degree of harmonics included, and $f_k^{\mathrm{in}}$ and $f_k^{\mathrm{ex}}$ are the corresponding harmonic coefficient vectors. Therefore, each deformation is represented by the vector $f_k = [\,{f_k^{\mathrm{in}}}^T\; {f_k^{\mathrm{ex}}}^T\,]^T$, and all $K$ training frames are characterized by $\mathbf{F} = \{f_k\}_{k=1}^{K}$ as
$$\mathbf{X} = \mathbf{Y}\mathbf{F} = \begin{bmatrix} \mathbf{Y}^{\mathrm{in}} & O \\ O & \mathbf{Y}^{\mathrm{ex}} \end{bmatrix} \begin{bmatrix} f_1^{\mathrm{in}} & \cdots & f_K^{\mathrm{in}} \\ f_1^{\mathrm{ex}} & \cdots & f_K^{\mathrm{ex}} \end{bmatrix}. \tag{17}$$

The subsequent procedures of subspace identification and sparse surface representation described in Sections 2.2 and 2.3 then apply directly. After identifying $J$ subspaces $\mathbf{D} = \bigcup_{i=1}^{J}\{\mathbf{A}_i\}$ from the SH coefficient matrix $\mathbf{F}$, each training deformation can be sparsely represented with a block sparse coefficient vector $\tilde{c}_k$ as
$$x_k = \begin{bmatrix} x_k^{\mathrm{in}} \\ x_k^{\mathrm{ex}} \end{bmatrix} = \begin{bmatrix} \mathbf{Y}^{\mathrm{in}} & O \\ O & \mathbf{Y}^{\mathrm{ex}} \end{bmatrix}\mathbf{D}\tilde{c}_k = \mathbf{G}\tilde{c}_k = \mathbf{G}_i c_k, \tag{18}$$
where $\mathbf{G}_i = \begin{bmatrix} \mathbf{Y}^{\mathrm{in}} & O \\ O & \mathbf{Y}^{\mathrm{ex}} \end{bmatrix}\mathbf{A}_i$ is the subspace of size $(N_1+N_2)\times n_i$, and $\mathbf{G} = \bigcup_{i=1}^{J}\{\mathbf{G}_i\}$ is the desired structured dictionary. Accordingly, $\tilde{c}_k$ is the block sparse vector, and $c_k$ contains the nonzero coefficient values in the selected subspace.
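For the two-layer case, the only structural change is the block-diagonal SH basis of (16); a brief sketch follows, assuming SciPy's `block_diag` is used to assemble it (the variable names are ours).

```python
import numpy as np
from scipy.linalg import block_diag

def stacked_basis(Y_in, Y_ex):
    # Block-diagonal SH basis of (16): interior and exterior surfaces are
    # parameterized separately and stacked into one linear system.
    return block_diag(Y_in, Y_ex)      # (N1 + N2) x 2(L+1)^2

# Example usage for one two-layer training surface x_k = [x_in_k; x_ex_k]:
# Y = stacked_basis(Y_in, Y_ex)
# f_k, *_ = np.linalg.lstsq(Y, np.concatenate([x_in_k, x_ex_k]), rcond=None)
```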

2.5. Surface Correspondence

Similar to other surface modeling methods [25, 27, 28], the proposed approach requires point-wise correspondence across different surfaces in addition to rigid registration [34]. Specifically, this point-to-point alignment is established so that an identical spherical parameterization can be applied in the SHD procedure. Figure 3 illustrates the goal of surface correspondence: vertices of the same color on deformations 1 and 2 indicate a matched pair. After establishing correspondence of the point pairs over the two deformations, the vertices on deformation 2 can be numbered in the same order as those on deformation 1.

Different correspondence methods have been proposed, such as minimum description length [35] and SH-coefficient alignment [36]. In this paper, we apply the SH-based method [36], as well as a ray-casting method for simple surfaces. The SH-based method relies on the fact that two points mapped to the same parameter pair on the sphere are considered a corresponding pair; it therefore fixes the parameterization of the template and rotates that of the other surface to optimize the correspondence by minimizing the root mean squared distance between the two SH coefficient vectors. The ray-casting method is introduced in the following section.

2.5.1. Point Correspondence with Ray Casting

For surfaces where a unique intersection exists between the surface and any ray starting from the object center, a ray-casting method can be applied to obtain sample pairs across all the deformations. For illustration, Figures 4(a) and 4(b) depict two different cases of ray-surface intersection in a simplified 2D view. In Figure 4(a), there is only one intersection point (p1 for ray r1 and p2 for ray r2) between each ray and the deformed surface S1. By contrast, Figure 4(b) gives an example in which multiple intersections (p1, p1′, p1′′ for ray r1, and p2 for ray r2) occur between the rays and the surface S2.

If the condition of a single ray-surface intersection holds, deformations can be resampled through the following steps to achieve point correspondence (a code sketch follows this list).
(i) Construct an icosahedron of $W$ vertices with a radius large enough to enclose the largest deformation volume under consideration; a larger $W$ results in denser samples that preserve local details but incurs more computational cost.
(ii) Align the center of the 3D surface with the origin of the icosahedron so that rays cast from the origin intersect the surface.
(iii) For each ray segment from the center to a vertex of the icosahedron, find the triangle of the surface mesh that intersects the segment and use the intersection point as a new surface sample.
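The resampling in step (iii) reduces to a standard ray/triangle test; a sketch using the Möller-Trumbore intersection is shown below. The single-intersection assumption (first hit per ray) and the function names are ours.

```python
import numpy as np

def ray_triangle_intersect(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore ray/triangle intersection; returns the hit point
    or None. Used to resample a mesh along fixed icosahedron rays."""
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direction, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:                      # ray parallel to the triangle
        return None
    inv = 1.0 / det
    s = origin - v0
    u = np.dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = np.dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = np.dot(e2, q) * inv
    return origin + t * direction if t > eps else None

def resample_surface(vertices, faces, directions, center):
    # One new sample per ray direction: the intersection of the ray from
    # the object center with the mesh (assumes a single intersection).
    samples = []
    for d in directions:
        for f in faces:
            hit = ray_triangle_intersect(center, d, *vertices[f])
            if hit is not None:
                samples.append(hit)
                break
    return np.array(samples)
```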

Figure 4(c) illustrates the desired sample pairs (p1,1, p2,1) and (p1,2, p2,2) over the two deformations S1 and S2. This resampling process also establishes a one-to-one map between a point on the object and a point on the sphere (icosahedron), which naturally serves the purpose of spherical parameterization. As a result, point-wise correspondence is achieved across all the resampled surfaces, and a uniform spherical harmonic matrix $\mathbf{Y}$ can be applied. As an example, Figures 5(a) and 5(b) compare an original kidney surface with the corresponding resampled surface; the ray-casting procedure preserves the shape well.

3. Experiments

Three types of experiments are conducted to demonstrate the feasibility of the proposed SRDS algorithm. The computer-generated FEM data are first used to demonstrate that the SRDS approach matches the accuracy of complex mathematical modeling techniques; an ex vivo experiment is then conducted using 3D MRI scans of porcine kidneys for evaluation in practical settings; finally, an in vivo experiment is carried out on dynamic cardiac MRI scans of real patients.

3.1. Experiment with FEM Data

Three representative organs are employed in this FEM experiment: a 3D cortical mesh as an example of a complicated shape, a gallbladder as an instance of a geometrically simple shape, and a bladder consisting of both interior and exterior walls.

3.1.1. Computer Model Setup

The initial 3D models of the different organs are fed into an FEM-based surgical simulation tool to generate deformation data for testing. For instance, Figure 6 shows two examples of shape distortions due to the endoscope poking and grasping one side of the gallbladder.

Table 1 lists the FEM experimental setup for the three organs, including the number of vertices $N$, SH level $L$, number of deformations for training $K$, and number for testing $M$. "GBL" stands for gallbladder in all the tables. The maximum SH level used for the brain model is chosen according to [9], and the levels for the gallbladder and bladder are determined so that the SHD representation error (EOF) is below 0.1%. The complex brain structure requires more vertices and a higher SH level to achieve sufficient representation accuracy. To evaluate the representation precision quantitatively, the evaluation parameter EOF is defined as the normalized Euclidean distance between the original surface and the reconstructed surface,
$$\mathrm{EOF} = \frac{\big\|\hat{\mathbf{x}}_k - \mathbf{x}_k\big\|_2}{\big\|\mathbf{x}_k\big\|_2}. \tag{19}$$
All surfaces are centered at the origin of the coordinate system so that the EOF is not heavily affected by the denominator.
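For reference, the EOF of (19) is a one-line computation in NumPy, assuming surfaces are stored as flattened, origin-centered coordinate vectors; the function name is ours.

```python
import numpy as np

def eof(x_hat, x):
    # Normalized Euclidean distance of (19) between a reconstructed
    # surface x_hat and the original surface x (flattened vectors).
    return np.linalg.norm(x_hat - x) / np.linalg.norm(x)
```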

3.1.2. Results

With the FEM data, the proposed SRDS algorithm is evaluated from three perspectives: (1) subspace dimensionality, (2) sparsity and accuracy of the representations, and (3) the effect of the subspace pursuit threshold $\varepsilon$ and the coefficient truncation threshold $\delta$ on the performance. The sparsity is defined as the $\ell_0$ norm of the coefficient vector $\tilde{c}_m$.

A. Training Results
During the training stage, we set $\varepsilon = 0.005$ for subspace detection, $\eta = 0.01$ for clustering, $E_{\max} = 50$ as the maximum number of iterations, and $\delta = 0.005$ for coefficient truncation. Subspaces along the $X$, $Y$, and $Z$ axes are identified separately. Table 2 shows that the number of subspaces $J$ and the dimension of the resulting dictionary ($\dim(\mathbf{G}) = I$) are markedly small relative to $N$ or $L^2$ in all three tests. We notice that the subspace dimensions for the brain are smaller than those of the other two organs. This is because of the smaller training data size and the smaller extent of deformation considered in the brain experiment, which results in a smaller dictionary needed to capture the deformation features.

B. Sparsity and Accuracy Evaluation
Sparsity is examined in terms of ($\mu/\sigma$), where $\mu$ is the average $\ell_0$ norm of the coefficient vector $\tilde{c}_k$ (training) or $\tilde{c}_m$ (testing) and $\sigma$ is the corresponding standard deviation. To verify whether our method achieves equivalent sparsity and precision when exploiting the structure of the dictionary, we also test the case without relying on any structure learned from the training set, in which a sparse representation of each deformation in the testing set is re-pursued from the training set using the OSP approach. In the following tables, "OSP" refers to the results obtained with this re-pursuing process. Table 3 summarizes the sparsity of the SRDS representation of the three organs for both the training and testing sets. It shows that the number of atoms needed to represent the complex deformations is much smaller than the dimension of the spherical harmonic vectors ($(L+1)^2$); in particular, the sparsity and accuracy of SRDS are very close to those of the OSP re-pursuit for the testing deformations, which indicates good generalization of the structured dictionary.
The reconstruction error in terms of EOF is further compared with that of the standard SHD method, as shown in Figure 7. In general, the accuracy of SRDS is equivalent to that of the SHD method. Specifically, the SRDS method achieves an average EOF of 1.32% (brain) and 0.14% (gallbladder), versus 1.29% (brain) and 0.13% (gallbladder) with the SHD method. For the bladder model with deformations on multiple layers, the overall representation error with SRDS is 0.07%, very close to 0.06% with SHD. Figures 11 and 12 show typical reconstructed deformations from the testing data for the three organs with the SHD and SRDS methods. Figures 12(e) and 12(f) show the interior and exterior representations of the bladder at the same time instance. From these results, we can see that the SRDS algorithm achieves accuracy equivalent to complex mathematical modeling techniques while significantly lowering the representation dimensionality.

C. Effect of 𝜀
The performance of the SRDS algorithm is examined as the subspace pursuit threshold $\varepsilon$ varies. Specifically, we study the effect of $\varepsilon$ on the dimensionality ($I$) of the structured dictionary $\mathbf{G}$ and on the sparsity and accuracy of the surface representation. Figure 8 shows how the subspace dimensions along the three axes change during the training stage as $\varepsilon$ increases from 0.001 to 0.01. Figure 9 displays the influence of $\varepsilon$ on the average sparsity $\mu$ of the surface representation for both the training and testing data sets. In general, a smaller $\varepsilon$ leads to a larger subspace size and a less sparse description, since a lower $\varepsilon$ usually requires more recruited atoms to meet the desired representation accuracy. Therefore, there is a tradeoff between representation accuracy and sparsity. Figure 10 shows the representation EOF as a function of $\varepsilon$. Not surprisingly, the reconstruction error increases as $\varepsilon$ becomes larger. An empirical operating point can be chosen from the training curve where the dictionary dimension $I$ expands significantly but only a trivial EOF improvement is gained; that is, $\varepsilon = 0.005$ is a preferred value in this test according to Figure 10.

D. Effect of 𝛿
The influence of the coefficient truncation threshold $\delta$ on the performance of the SRDS algorithm is also tested, with $\delta$ varied among [0, 0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.08, 0.1]. Figure 13 shows the effect of $\delta$ on the average sparsity $\mu$ of the surface representation. We can see that, as the truncation threshold $\delta$ increases, the sparsity of the representation increases for both the training and testing data sets at the price of a larger representation error, as shown in Figure 14, so there is a tradeoff between sparsity and accuracy. Empirically, one can choose the largest $\delta$ for which the sparsity still increases without a marked deterioration of the representation precision. Therefore, according to Figures 13 and 14, an appropriate value for $\delta$ is between 0.005 and 0.01.

3.2. Ex vivo Experiment Using MRI

To evaluate the proposed algorithm in real applications, an ex vivo experiment using three porcine kidneys was conducted at the Center for Interdisciplinary Applications in Magnetic Resonance (CIA-MR) of the University of Minnesota. Deformations imposed on each kidney were controlled and held still during imaging by the customized nonmagnetic mechanical device shown in Figure 15. Each deformed kidney shape was scanned in 3D MRI mode with a spatial resolution of 1.2 mm to generate both the training and testing sets. The SH degree $L$ of the organ representation is set to 20, and each 3D kidney mesh after surface correspondence has $N = 4002$ vertices. Unlike computer-generated deformations, where surface correspondence is intrinsically established, the shapes from MRI scans are rendered independently, so the method described in Section 2.5 is applied to achieve point-wise correspondence.

Both intramodel and intermodel experiments are conducted. The former uses training and testing deformations from the same kidney; the latter uses two of the three kidneys for training and the third for testing in a cross-evaluation fashion. Besides sparsity and EOF, the Hausdorff distance between the represented shape and the corresponding MRI surface is also examined as a physical measure of error. The Hausdorff distance between surfaces $x$ and $x'$ is defined as
$$d(x, x') = \max_{p\in x} d(p, x'), \tag{20}$$
where $d(p, x')$ is the distance between a point $p$ on surface $x$ and the closest point on surface $x'$, that is,
$$d(p, x') = \min_{p'\in x'} \big\|p - p'\big\|_2, \tag{21}$$
with $\|\cdot\|_2$ denoting the Euclidean norm.
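A direct NumPy sketch of the (one-sided) Hausdorff distance of (20)-(21), assuming surfaces are stored as point arrays; the function name is ours.

```python
import numpy as np

def directed_hausdorff(x, x_prime):
    """One-sided Hausdorff distance of (20)-(21): for every point p on
    surface x, find the closest point on x_prime, then take the maximum.
    x: (N, 3) array of surface points; x_prime: (M, 3) array."""
    d = np.linalg.norm(x[:, None, :] - x_prime[None, :, :], axis=2)  # N x M
    return d.min(axis=1).max()
```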

3.2.1. Intramodel Test

In the intramodel experiment, 31 deformations of the same kidney were generated and scanned on the MRI machine; 20 frames were randomly selected as the training set, and the remaining 11 were used to test the generalization of the learned subspaces.

Table 4 shows the trained subspace dimensions ($J$ as the number of subspaces, $I$ as the dictionary size of $\mathbf{G}$), the sparsity of the descriptors along each axis for both the training and testing sets, and the corresponding errors in terms of EOF and Hausdorff distance. As in the FEM experiment, the sparsity obtained with the OSP re-pursuit process on the testing set is also evaluated for comparison. The table shows that the sparsity and accuracy achieved with SRDS are very close to those from the OSP re-pursuit process; however, the SRDS method provides delay-free surface representation by exploiting the structure of the identified dictionary. Further results on computational efficiency are given in Section 3.4. One may notice that the training data size in the MRI experiment is smaller than that in the FEM test due to the limited availability of 3D MRI images. As a rule of thumb, a larger training set carries richer deformation information and thus leads to better generalization of the dictionary. Nevertheless, given the training data size and the extent of deformation involved in the ex vivo experiment, high representation precision is still achieved.

Figure 16 illustrates the accuracy of the surface representation in the intramodel test. The average EOF in Figure 16(a) is 0.24% for the training set and 0.64% for the testing set, with a maximum below 1%. Further, the error in terms of Hausdorff distance (Figure 16(b)) is 0.55±0.23 mm with a 95th percentile error of 0.86 mm for the training set, and 0.87±0.10 mm with a 95th percentile error of 0.96 mm for the testing set. This intramodel experiment demonstrates that the SRDS algorithm identifies subspaces generalizable enough to accurately represent deformations beyond the training set for the same object.

Figure 17 visualizes the color-coded error distribution at all vertices of the surface represented with SRDS relative to the actual MRI scans. Figure 17(a) gives the error range for each color. Figures 17(b) and 17(d) show the error distribution for a typical reconstruction in the training and testing sets, respectively. Figures 17(c) and 17(e) show the maximum errors at the 90% level; that is, 90% of all deformations in the training or testing set have representation point errors smaller than the values shown. Consistent with the EOF and Hausdorff distance results, the color diffusion in Figure 17 indicates that the precision in the testing group is somewhat lower than in the training group. Nevertheless, among all the point-wise errors shown in Figure 17(e), fewer than 3% of the surface points have an error distance larger than 0.5 mm.

3.2.2. Intermodel Test

Three intermodel experiments are performed to further validate the proposed SRDS method when applied to organs from different subjects. In the following, "Ex1" denotes the experiment trained on Kidneys 2 and 3 plus one initial shape of Kidney 1 and tested on the deformations of Kidney 1, and similarly for "Ex2" and "Ex3." In each experiment, both sparsity and accuracy are examined.

The number of subspaces ($J$) and the dimension ($I$) of the identified dictionary are listed in Table 5. The training results vary among the three experiments, but all feature low subspace dimensions. Table 6 shows the sparsity of the intermodel experiments using the SRDS algorithm, with the error level evaluated in terms of EOF and Hausdorff distance. Each testing deformation is also sparsely re-pursued using OSP for comparison. We can see that the sparsity and representation error resulting from the SRDS method are very close to those obtained with OSP.

Figure 18 shows the representation accuracy using the SRDS algorithm on the training and testing sets for the three tests. In general, the error on the testing set is larger than that on the training set. In particular, for the EOF evaluation, "Ex1" leads to the largest EOF relative to "Ex2" and "Ex3," but the average error rate is still as low as 0.3% for the training set and 2.0% for the testing set. Table 7 lists the specific Hausdorff measurements corresponding to the boxplots in Figures 18(c) and 18(d), including the minimum, 95th percentile, and mean. The 95th percentile Hausdorff distance across all experiments is below 3 mm, and the mean is below 2.1 mm. Comparing these error levels with the intramodel test, one can see that the homology between the training and testing deformations in the intramodel case contributes to better dictionary generalization and thus leads to higher representation accuracy.

Figures 19, 20, and 21 show the color-coded error fields of a typical representation and at the maximum 90% level for the three intermodel experiments. In both the training and testing sets, large errors are mostly distributed around the edge areas where local details are rich. Consistent with the previous EOF and Hausdorff distance measurements, the color diffusion in Figures 19-21 indicates that the errors in the testing set are larger than those in the training set and that "Ex1" generates relatively larger errors compared to "Ex2" or "Ex3."

3.3. In Vivo Experiment Using MRI

The proposed approach is also tested on in vivo cardiac MR images [37], consisting of automatically segmented images from volumetric MRI scans of a diastole-systole-diastole cycle. For each patient, there are around 22 phases in a cardiac cycle. Surface correspondence of the LV shapes within and across patients is established using the approach described in Section 2.5. Since the surfaces generated by the automatic segmentation software are quite rough, we use the spherical harmonic representation as a filter to smooth out the surface noise and then use the smoothed surfaces as training and testing data. Therefore, the errors reported in this section are relative to the SHD surfaces, not to the original raw surfaces. The inter-patient results are reported as follows.

Similar to the ex vivo test, we use the segmented left ventricles (LV) of two patients plus an initial LV surface of the third patient as training data, and the remaining LV shapes over a beating cycle of the third patient are used to test the generalization of the identified subspaces. These tests are denoted "Ex1," "Ex2," and "Ex3." Table 8 lists the sparsity results of the three cross-validations for both the training and testing sets. The sparsity on the training set is close to that on the testing set, but the former achieves much higher accuracy. This is because the identified subspaces generalize perfectly to the selected atoms of the training set after spherical harmonic smoothing. Consistent with the previous experiments, the representation of testing surfaces using SRDS is also compared with that using re-pursuing OSP. According to the results, SRDS achieves performance slightly worse than, but close to, that of OSP. However, as demonstrated in Section 3.4, without relying on the structured dictionary learned from the training population, OSP is computationally expensive, since for each testing surface it must re-search the training set for atoms to achieve a sparse representation.

Figure 22 provides boxplots of the representation accuracy on the testing set in terms of EOF and Hausdorff distance. Table 9 provides the minimum, 95th percentile, and mean Hausdorff measurements corresponding to Figure 22(b). As in the ex vivo experiments, "Ex1" leads to slightly larger errors than the other two tests, with an average EOF of 3.2% and a mean Hausdorff distance of 1.67±0.39 mm. The 95th percentile Hausdorff distance across all experiments is below 2.2 mm. Figures 23(b) and 23(c) show the color-coded error field of a typical representation and at the maximum 90% level for the testing set in one interpatient experiment. As indicated by the color distribution, the majority of the point errors are below 0.9 mm; in particular, in the 90th percentile evaluation in Figure 23, only 3% of all point-wise errors are above 0.9 mm.

3.4. Efficiency

To quantitatively examine the efficiency of the proposed SRDS method, the computational time needed to represent each surface in the testing set using SRDS is compared with that of the OSP re-pursuing approach for the above five organs. The results are summarized in Table 10, including the training set size $K$, the maximum SH level $L$, the average time (in seconds) required with SRDS ($t_1$) and OSP ($t_2$), and the ratio between the two. As shown in Table 10, the time needed to obtain a sparse representation of the testing surfaces using SRDS is at least 10 times lower than that of the original OSP method, which does not rely on the dictionary structure learned from the training data set. The advantage is more pronounced when the training data size $K$ or the SH level $L$ is large. For example, in the brain model, the high SH level $L$ causes substantial computational delay during the search for proper atoms, so SRDS is orders of magnitude faster than the OSP method without training. Similarly, for the gallbladder, the large training size increases the time used by re-pursuing OSP, which runs 65 times slower than SRDS.

To summarize, considering the results for sparsity, accuracy, and efficiency reported in this section, the proposed SRDS method achieves sparse surface representation with high computational efficiency and accuracy.

4. Conclusions and Discussion

This paper introduced a new algorithm for block sparse representation of deformable organ surfaces with high accuracy. The proposed SRDS design first identifies the deformation subspaces from the training data set in the transformed spherical harmonic domain and then represents each deformed surface with a block sparse vector in the structured dictionary. SRDS is generalized to applications involving organs with multiple surface layers, such as the bladder. The algorithm has been validated with FEM data and real 3D MRI scans under both ex vivo and in vivo conditions. The FEM test results demonstrate that SRDS achieves accuracy matching that of complex mathematical modeling techniques. Further, the maximum representation error in the ex vivo experiment is below 1 mm for the intramodel test and below 3 mm for the intermodel test. For the in vivo experiment, SRDS achieves an accuracy better than 2.5 mm.

The SRDS algorithm has already been used for tracking organ deformations in minimally invasive surgery [24]. The structure introduced in the dictionary enables efficient surface recovery from limited samples. In addition, the merits of block sparse surface representation presented here can be applied to various medical organ modeling, shape classification, and similarity retrieval tasks, where the reduced parameter dimensionality can potentially speed up implementations.