Abstract

Diffuse optical imaging (DOI) for detecting and locating targets in a highly scattering turbid medium is treated as a blind source separation (BSS) problem. Three matrix decomposition methods, independent component analysis (ICA), principal component analysis (PCA), and nonnegative matrix factorization (NMF) were used to study the DOI problem. The efficacy of resulting approaches was evaluated and compared using simulated and experimental data. Samples used in the experiments included Intralipid-10% or Intralipid-20% suspension in water as the medium with absorptive or scattering targets embedded.

1. Introduction

Diffuse optical imaging (DOI) for detection and retrieval of location information of targets in a highly scattering turbid medium may be treated as a blind source separation (BSS) problem [1, 2]. Various matrix decomposition methods, such as, independent component analysis (ICA) [3], principal component analysis (PCA) [4], and nonnegative matrix factorization (NMF) [5, 6] have been developed for solving the BSS problem and retrieving desired information.

Xu et al. adapted ICA of information theory to develop optical tomography using independent component analysis (OPTICA) and demonstrated its application for diffuse imaging of absorptive, scattering, and fluorescent targets [711]. ICA assumes the signals from different targets to be independent of each other and optimizes a relevant measure of independence to obtain the ICs associated with different targets. The position coordinates of targets in three dimensions are determined from the individual components separately.

PCA assumes that the PCs contributing to the signal are uncorrelated and explain the most variance in the signal. PCA has been widely used in various applications, such as spectroscopy [12], face recognition [13], and neuroimaging [14]. NMF seeks to factorize a matrix into two nonnegative matrices (component signals and weights) and requires the contributions to signal and the weights of the components to be non-negative. It does not imply any relationship between the components. NMF has also been widely used in biological analysis [15] and spectral analysis [16].

The objective of this study is to test and compare the efficacy of these algorithms when used to solve the DOI problem. Results are presented and compared using simulative data and experimental data using absorptive and scattering targets embedded in model scattering media. Our interest in solving the DOI problem derives from the need for a noninvasive modality for detecting, locating, and diagnosing breast tumors in early stages of growth.

The remainder of the paper is organized as follows. In Section 2, the formalisms of the three methods are introduced. Section 3 evaluates the resulting imaging approaches using simulated data. The approaches are further examined in Section 4 for experimental data acquired using absorptive and scattering targets embedded in model scattering media. Section 5 summarizes and discusses the results.

2. Formalism

2.1. Blind Source Separation Problem

Blind source separation (BSS), also known as blind signal separation, is a general problem in information theory that seeks to separate different individual signals from the measured signals, which are weighted mixtures of those individual signals. Assuming individual signals, , , are linearly mixed instantaneously, the BSS problem is modeled as follows. The dimension of is , the number of sampling times. In this study, will be replaced by spatial positions of the excitation light sources. A total of detectors sense different mixtures of . The mixture measured by the th detector can be presented as , or , in a matrix notation, where is a mixing or weighting matrix, , , and . The objective of BSS is to retrieve the signals and their weights, . ICA, PCA, and NMF are statistical analysis methods used to solve the BSS problem.

2.2. Diffuse Optical Imaging Problem

In DOI, one measures the signal at the sample boundary, which includes a weighted mixture of contributions from embedded targets. One uses the diffusion approximation [1719] of the radiative transfer equation [20, 21] as the forward model to describe light propagation in a highly scattering turbid medium. The perturbation in the light intensity distribution measured on the boundary of the sample due to the presence of the targets (which are localized inhomogeneities in the optical properties within the sample volume) may be written, in the first-order Born approximation, as [22, 23] where , , and are the positions of a source of unit power, detector and target, respectively; and are the Green’s functions that describe light propagation from the source to the target and from the target to the detector, respectively; and are the differences in absorption coefficient and diffusion coefficient between the targets and the background medium, respectively; and is the light speed in the medium.

A multisource illumination and multidetector signal acquisition scheme is used to acquire light transmitted through a scattering medium. For small absorptive targets, a perturbation data matrix is constructed using for all sources. The elements of the data matrix pertaining to absorptive targets represented by the first term in (1) may be written in a discrete form as where , , and are the locations of the th detector, th source and th target, respectively; , , and are the numbers of sources, detectors, and targets, respectively; is the optical absorption strength of the th target of volume ; and are the Green’s functions that describe light propagation from th source to th target and from th target to th detector, respectively. The number of targets is assumed to be less than that of sources and detectors, .

The th target may be considered to be a virtual source of strength excited by the real light source located at . The data matrix may be considered to be a set of combinations of light signals from all virtual sources mixed by a mixing matrix . Therefore, this problem can be treated as a BSS problem.

As the second term in (1) suggests, each scattering target is represented by three colocated virtual sources of strength: , where , , and , is the optical scattering strength of the th target [8]. The mixing matrices become for the three virtual sources generated by the th target. The elements of the data matrix for scattering targets may be written as

Since one absorptive target is represented by one centrosymmetric virtual source, while three virtual sources (one centrosymmetric and two dumb-bell shaped) represent one scattering target [7, 8], the number and patterns of virtual sources may be used, in favorable situations, to identify the target as absorptive or scattering in nature. In this paper, only small targets are considered since all three algorithms are suited for small targets, and early detection, when the tumors are more amenable to treatment, is of practical interest.

2.3. DOI as a BSS Problem

The data matrix for the DOI problem may be written as where , , and . For absorptive targets, while for scattering targets, and () are two-dimensional intensity distributions on the source and detector planes, respectively. Source and detector planes are the boundaries of the sample through which light enters and exits the sample volume, respectively. The scaling factors and are related to the target optical strength, . The location of the target and the scaling factors can be retrieved using a least squares fitting viafor absorptive and scattering targets, respectively. However, when a scattering target is embedded deep in a turbid medium, only the virtual source remains significant. So, only may be used for fitting in (6b) [8].

2.3.1. ICA

OPTICA assumes that the virtual sources are independent of each other [8]. So, they can be retrieved through an iterative process which seeks to maximize the independence among the components. In practice, the independent components are found by maximizing some measure of non-Gaussianity, such as kurtosis (the fourth-order cumulant), of the unmixed components. A Matlab program for ICA was adopted from http://sccn.ucsd.edu/eeglab/. The location of the target can be retrieved by fitting the independent component intensity distributions (ICIDs) to Green’s functions or derivatives of Green’s functions using (6a) and (6b).

2.3.2. PCA

PCA assumes that the virtual sources are uncorrelated so that the correlation (covariance) between them is ideally zero and minimal in practice. The covariance matrix of , should be diagonal. The general process of PCA is as follows. The data matrix , where is random noise added to the data, and the same as defined in (4). When is mean centered, elements of the mean-centered matrix are defined as Similarly,PCA looks for a matrix that decomposes into virtual sources, . It also holds that , since is just a rotation matrix which does not change the center of the data. where . The eigenvalues are variances in the covariance matrix. Therefore, , where is orthonormal. PCA is realized by eigenvalue decomposition (EVD) of the covariance matrix of . The eigenvectors with leading eigenvalues (largest variances) are selected to be the PCs using the L-curve [24].

Since , is determined as a matrix including only PCs. is calculated as . Rows of and columns of represent principal component intensity distributions (PCIDs) on the source plane and detector plane, respectively and are proportional to the images of the virtual sources projected on the source and detector planes. The target positions are determined using (6a) and (6b).

2.3.3. NMF

NMF is a group of multivariate analysis algorithms that factorize a matrix into , and , and are nonnegative [6]. Unlike ICA and PCA, NMF does not imply any relationship between the retrieved components; instead, it just enforces non-negativity constraints on and . There are various algorithms developed to solve NMF, such as the multiplicative update method [5] and alternating least squares method [25, 26].

In the multiplicative update implementation of NMF, and can be found by minimizing the square of Euclidean distance as the cost function, where and , using the multiplicative update rule:

The alternating least squares implementation of NMF uses alternate least squares steps to estimate (or ), and use that estimate to optimize (or ), repeating the alternative steps until the desired optimization is obtained. Nonnegativity is ensured by setting any negative element of or equal to 0.

An NMF toolbox was obtained from http://cogsys.imm.dtu.dk/toolbox/ to perform NMF computation. A built-in command nnmf is also available in Matlab (R2011a).

NMF algorithm requires that the non-negativity assumption must hold in the problem. In particular, for absorptive targets, when is constructed with , should be positive, that is, the targets should be more absorbing than the background. If the targets have weaker attenuation properties than the background, needs to be constructed with instead. For scattering targets, should be treated similarly to keep its elements positive.

When NMF is applied to a scattering target, only the centrosymmetric component shows up properly, since the other two components have dumb-bell shape which includes negative values [8]. So without any prior knowledge or some other experimental means to assess if the target is absorptive or scattering, NMF may not distinguish between the two possibilities.

The decomposition methods can be applied with different sample geometries such as slab and cylindrical geometries, and different measurement domains such as time-resolved domain, frequency domain, and continuous wave (CW). In this paper, Green’s functions for slab geometry [23] with CW measurement were used for simulation and experiments.

3. Simulation

The sample was considered to be a 40 mm thick uniform scattering slab with lateral dimension of , as shown in Figure 1. Its absorption and diffusion coefficients were taken to be  mm−1 and  mm (transport mean free path,  mm), respectively, which are similar to the average value of those parameters for human breast tissue. An absorptive and a scattering point targets were placed at (50, 60, 15) mm and (30, 30, 25) mm, respectively. The index of refraction of the medium was taken to be 1.33. The speed of light is  m/s or 299.8 mm/ns in vacuum, and 225.4 mm/ns in the medium. The absorption coefficient of the absorptive target was set to be higher than the background with  mm−1, while the diffusion coefficient was taken to be the same as that of background. The diffusion coefficient of the scattering target was set to be lower than the background (higher scattering coefficient) with  mm ( mm), while the absorption coefficient was taken to be the same as that of the background. The volumes of both targets are set to be 8 mm3. The optical strengths of the absorptive and scattering targets were  mm3/ns and  mm5/ns, respectively. The incident CW beam step scanned the sample at grid points covering an  mm2 area, with a step size of 4 mm. Light on the opposite side was recorded at grid points covering the same area. Multiplicative Gaussian noise of 5% was added to the simulated data. The data matrix was then obtained using (2) and (3) directly and analyzed using the three different algorithms.

3.1. ICA Analysis

One independent component for the absorptive target and three independent components for the scattering target were retrieved by ICA. The independent component intensity distributions (ICIDs) on the detector plane are shown in Figures 2(a), 2(c), 2(d), and 2(e). Similar ICIDs were obtained on the source plane. Figure 2(g) shows the centrosymmetric ICID for the scattering target, and Figure 2(i) shows the ICID for the absorptive target, on the source plane.

The components in either the detector plane or the source plane can, in principle, be used to extract position and optical strength of the target(s). However, in our experimental arrangement signal is collected by a  pixels CCD camera, while the source plane is scanned in an - array of points, which is much smaller than the number of pixels in the CCD camera. Consequently, the resolution in the detector plane is much better, and the data set more robust than the source side. So, we used the images on the detector plane for retrieving target information using experimental data. While it would not matter in simulation, to be consistent with experimental situations, we employed detector plane images when using simulated data as well for all three algorithms. Table 1 lists the locations and strengths of the absorptive and scattering targets retrieved by fitting the spatial intensity profile of the centrosymmetric components on the detector plane to Green’s functions and derivatives of Green’s functions using (6a) and (6b), respectively, as shown in Figures 2(b) and 2(f). Figures 2(h) and 2(j) show the corresponding fits to the profiles on the source plane.

3.2. PCA Analysis

Eigenvalue equation of the covariance matrix of was solved. The eigenvalues found by PCA were sorted in a descending order. Figure 3 shows a plot of leading 20 eigenvalues on a logarithmic scale.

First four leading eigenvalues were selected for PCs. The corresponding PCIDs were calculated. The PCIDs on the detector plane are shown in Figure 4. Similar images for PCIDs on the source plane were obtained. The scattering target has one centrosymmetric (Figure 4(a)) component and two dumb-bell shaped (Figures 4(c) and 4(d)) components, while the absorptive target has only one component (Figure 4(e)).

Figures 4(b) and 4(f) show fits to the spatial intensity profile of the centrosymmetric component of the scattering target and that of the absorptive target, respectively, to retrieve the locations of the two targets. The locations and optical strengths of the targets retrieved by PCA are also shown in Table 1.

3.3. NMF Analysis

The mixing matrix and virtual sources were retrieved from the data matrix using NMF as explained in Section 2.3.3. As in the other two approaches, only one component is extracted for the absorptive target. Since NMF has a non-negativity constraint, only the centrosymmetric component for the scattering target is obtained. Nonnegative component intensity distributions (NCIDs) on detector planes are shown in Figure 5. Similar images for NCIDs on source plane were also obtained using the virtual sources in . The results are also shown in Table 1.

3.4. Results and Discussion

The positions and optical strengths of the targets retrieved by ICA, PCA, and NMF algorithms are shown in Table 1, and compared to the known values. The retrieved results using all three algorithms from this simulated data are in excellent agreement with the known values.

4. Experiments

4.1. Experimental Materials and Methods

In this Section, the algorithms are evaluated using experimental data for absorptive and scattering targets embedded in model scattering media whose absorption and scattering properties are adjusted to mimic the average values of those parameters for human breast tissues. Two different experiments were carried out with two different samples. The first sample used a transparent plastic container filled with Intralipid 10% suspension in water as the background medium. The concentration of Intralipid 10% was adjusted to provide [27, 28] an absorption coefficient of  mm−1, and a transport mean-free path  mm at 785 nm. The second sample used a similar container with dimension of filled with Intralipid 20% suspension in water. The concentration of Intralipid 20% was adjusted to provide [27, 28]  mm−1, and  mm at 785 nm. These optical parameters of the medium were selected to be similar to the average values of those parameters for human breast tissue. The thickness of the samples was also comparable to that of a typical compressed female human breast.

In the first experiment, two absorptive targets were embedded in the medium. The targets were ~10-mm diameter glass spheres filled Indocyanine green (ICG) dye dissolved in Intralipid-20% suspension in water to obtain an absorption coefficient  mm−1 at 785 nm, and to match the background scattering coefficient of 1.97 mm−1. The targets were placed at (57.2, 18.1, 20.0) mm and (19.9, 48.1, 25.0) mm, respectively.

In the second experiment, two scattering targets were embedded, which were also ~10 mm diameter glass spheres, filled with Intralipid-20% suspension in water. The transport mean free path, was adjusted to be 0.25 mm, with scattering coefficient  mm−1, and absorption coefficient same as that of the background medium. The targets were placed in the middle plane ( mm) in the container with a lateral distance of 40 mm from each other (center to center).

The experimental setup is schematically shown in Figure 6. A 10 mW 785 nm diode laser beam was used to illuminate the first sample, while a 100 mW 785 nm diode laser beam was used for the second sample. The input surface (source plane) of the samples was scanned across the laser beam in an - array of grid points to realize the multi-source interrogation of the samples. The transmitted light from the exit surface (detector plane) was recorded by a (pixel size = 24 μm) CCD camera (Photometrics CH350) equipped with a 60 mm focal-length camera lens. Each pixel of the CCD camera can be considered to be a detector implementing the multidetector signal acquisition arrangement. A set of 16 bit images were acquired. The two samples were scanned in an array of and grid points, respectively, with a step size of 5 mm in both cases. The processes of scanning and data acquisition were controlled by a personal computer. At all scan positions, raw transillumination images of the samples were recorded by the computer for further analysis.

4.2. Analysis and Results

A region of interest (ROI) was cropped out from each image. Then, every  pixels in each cropped image were binned to one pixel to enhance signal-to-noise ratio. A background image was generated by calculating an average image for all scan positions to approximate the transillumination image without target(s) embedded.

This averaging method for generating background image is suitable for small targets used in our experiments, as the ratio of the volume of the sample to that of the target was quite high (~500 : 1). For in vivo imaging of tumors in early stages of growth, the breast-to-tumor volume ratio will be similarly high, and the averaging method will be applicable. Alternative approaches for generating a background image include using image of (a) a phantom that has the same average optical properties as the sample [29]; (b) the healthy contralateral breast for breast imaging [30]; (c) the sample obtained using light of wavelength for which the target(s) and the background have identical optical properties [31]. Still another approach is to compute the background using an appropriate forward model [32]. A more detailed discussion of this important issue appears in one of our earlier publications [33].

The background image was also cropped and binned corresponding to the ROI for each scan position. Perturbation in the light intensity distribution, due to targets in each image was found by subtracting the background image from the image. The data matrix was then constructed using the light intensity perturbations at all scan positions. ICA, PCA, and NMF decomposition algorithms were performed on the data matrix separately. Results are shown and discussed below.

4.2.1. Absorptive Targets

The images on the detector plane obtained using the ICA, PCA, and NMF algorithms are shown in Figures 7, 8, and 9, respectively. Similar images on the source plane were also obtained using all three algorithms. The right side of each figure shows the corresponding spatial intensity profile. Locations of the targets are extracted from fits to these spatial intensity profiles, as described in Section 2.3 using (6a) and (6b). The results are presented in Table 2. In Figure 7, images on the source plane are shown in (e) and (g), and Green’s function fits to their spatial profiles are shown in (f) and (h) for comparison.

It follows from the comparison of the results in Table 2 that the positions retrieved by all three algorithms are in good agreement with the known positions. The errors in the retrieved locations of the two targets were within 1.7 mm. The PCIDs were not totally separated. Some “residue” was observed in one PCID from the other. ICA and NMF separated two components from this dataset more clearly.

4.2.2. Scattering Targets

The “images” corresponding to the centrosymmetric components of the virtual sources (targets) on the detector plane obtained using the ICA, PCA, and NMF algorithms are shown in Figures 10, 11, and 12, respectively. Similar images on the source plane were also obtained. The right side of each figure shows the corresponding spatial intensity profile. Locations of the targets are extracted from fits to these spatial intensity profiles, as described in Section 2.3 using (6a) and (6b). The results are presented in Table 3.

Both targets were detected by all three algorithms. The target locations retrieved by three algorithms are shown in Table 3 and compared with known locations. Overall, all three algorithms detect and locate the scattering and the absorptive targets with good accuracy, the maximum deviation of any one coordinate from the known value being ~3 mm. Since the maximum difference between the known and retrieved position coordinates was larger for the scattering targets, we calculated the squared correlation coefficient to assess the fitting quality. NMF retrieves the position coordinates better (within 0.5 mm) for the scattering Target 2 than done by ICA and PCA (deviation from known values being between 2-3 mm). NMF retrieved the position coordinates for Target 1 with 3.0 mm error in z direction, which is not as good as that done by ICA and PCA. But is 0.783 and 0.778 in the fittings for ICA and PCA, respectively, as compared to 0.993 for NMF, indicating that the quality of the fitting is better for NMF. The quality of fitting is presumably affected by the efficacy of decomposition. The decomposed NCIDs by NMF were more “clean” than those decomposed by ICA and PCA. We ascribe the observed higher errors in ICA and PCA estimates of the position coordinates of the scattering Target 2 than the NMF estimates to the interference from the other virtual source (corresponding to Target 1) in ICA (Figure 10(c)) and PCA (Figure 11(c)) images. It is commonly believed that errors in locating a scattering target are higher than that for locating an absorptive target, and the results of this study conform to that notion.

5. Summary and Discussion

Diffusive optical imaging was modeled as a BSS problem. ICA, PCA, and NMF were used to decompose the data matrix and locate the targets embedded in a highly scattering turbid medium. Only the components corresponding to the targets were extracted from a large dataset for target detection and localization.

It may be instructive to compare the objectives, scope, and computational complexity of these decomposition methods with model-based reconstruction methods. Decomposition methods obtain the 3D locations of targets (the number of targets is generally small). Based on the retrieved locations, the methods may then be further extended to retrieve size and optical property information of the targets [9]. The common practice of model-based inverse reconstruction methods is to discretize the sample volume into voxels and estimate absorption and/or scattering coefficient in each voxel iteratively. Voxels with significantly different optical properties than the surrounding are regions of interest and may be identified as targets. While estimating the optical properties, the forward model is solved repeatedly to calculate the intensity of the multiply scattered light on the sample boundary. The difference between the intensity of the multiply scattered light predicted by the forward model and the experimental measurements is minimized by seeking an optimal set of the optical properties of every voxel in the sample volume. The number of variables is thus, on the order of . To determine location(s) of target(s) in three dimensions, the decomposition methods process the data matrix to retrieve the main components ( and ). Here, and are two-dimensional matrices with the number of unknowns on the order of . The number of unknowns is, hence, reduced times in the decomposition methods compared to the model-based approaches, which leads to a substantial saving in the computational time when is large. No repeated solution of the forward model is involved in decomposition methods. Consequently, decomposition methods are considerably faster.

A comparison of the computational complexity of these two types of approaches may shed further light on their relative computation economy. For a model-based iterative reconstruction method, an equation of the form is solved to find the targets, where is a weight matrix of size , , , and are the numbers of detectors, sources and voxels, respectively, is an vector describing the perturbation in the detected light intensity due to the presence of targets, and is the perturbation in the optical properties from the background values with dimension of . The computational complexity is typically for a single iteration. For the decomposition approach, is written as a 2D matrix with dimension . To decompose matrix , the computational complexity per iteration is typically of order for ICA [34], and for NMF [16], where is the number of components that relates to the number of targets and is usually a small number. For PCA using SVD, the complexity is [34]. The computational complexity of the intrinsic iterative process involved in the matrix decomposition algorithms is much lower than that in the model-based inverse reconstruction methods.

All three matrix decomposition methods presented in this manuscript can potentially be used in in-vivo real-time breast cancer imaging. The three algorithms have different assumptions, which may lead to different favored conditions. In this study, the algorithms were evaluated using simulative and experimental data using model scattering media and absorptive and scattering targets. The positions of the targets were retrieved with good accuracy. The decomposition provided by ICA is “cleaner” than that of the PCA. PCA did not clearly separate the two absorptive targets used in the first experiment. NMF decomposition seems to provide residue-free “cleaner” images than the other two methods in this study. However, since NMF is based on nonnegativity assumption, the results might deteriorate when such a non-negativity assumption does not hold well. While continuous wave measurements were used in the work presented in this paper, the approaches could be used with frequency domain and time domain measurements as well.

The work presented here focuses on detecting and locating small targets, which derive impetus from the need to detect tumors in early stages of growth when those are more amenable to treatment. All three methods are applicable for extended targets as well and are expected to provide the “center of optical strength” as the location of the target.

All three approaches are applicable for both scattering and absorbing targets and may be used in clinical setting. The contrast between a tumor and surrounding normal tissue can be due to differences in absorption, scattering, or both absorption and scattering properties and may depend significantly on the wavelength of light used. However, a priori knowledge of the optical characteristics (absorptive or scattering) is not crucial. As has been shown in (2) and (3), the expression for elements of the data matrix for absorptive targets involves Green’s Functions , while that for scattering targets involves where in CW [9]. This relationship with provides basis for detection and localization of target(s), whether contrast is due to absorption, scattering, or both. We are using transillumination geometry, which is one of the approaches used by other researchers, and adequate signal for in vivo breast imaging is obtained [29, 3538].

In this paper, we presented results when the approaches were used to detect and obtain three-dimensional location information of the targets. We have demonstrated, while developing OPTICA that a backprojection formalism can be further implemented to get a cross-section image of the target [11], or the retrieved target locations can be fed into other DOI methods as a priori information to get three-dimensional tomographic images. Since the approaches are suited for small targets, this hold promise for detecting and locating breast tumors in early stages of growth, which is crucially important for effective treatment. Further work involving ex vivo (model) and in vivo imaging of cancerous breast will be needed to establish the full potential of these approaches.

Acknowledgment

The paper is supported in part by US Army Medical Research and Materiel Command under Contract no. W81XWH-07-1-0454.