Abstract

Hyperspectral imaging has proved to be an effective way to explore the useful information behind land objects. It can also be adopted for biological information extraction, in which the original information can be acquired from the image repeatedly without contamination. In this paper, we propose a target detection method based on background self-learning to extract biological information from hyperspectral images. Conventional unstructured target detectors have great difficulty estimating the background statistics accurately, whether globally or locally. By taking the spatial-spectral information into account, the proposed method avoids this problem and its performance is further improved. It is especially designed to extract fingerprint and tumor regions from hyperspectral biological images. The experimental results show the validity and superiority of our method in detecting biological information from hyperspectral images.

1. Introduction

Direct physical extraction is the traditional way to extract attached biological information or to check whether abnormal characteristic information exists on the body surface or in a body region, and it always damages the information carrier. Recent works on graph theory and its applications have drawn great attention [1–3]. A reliable alternative [4] is to capture an image containing the target characteristic information and then process the image. In general, image-based biological feature extraction works on a single-band full-color image and obtains the shape of the target region [5–9] by computing shape features or by filtering and enhancing the image. However, the traditional single-band full-color image has obvious defects: its characteristics change greatly with differences in the external environment, such as the angle of the light, which increases the difficulty of extraction.

With the advent of imaging spectrometers, we can obtain hyperspectral images with many bands, continuous spectral curves, and consistent features [10]. Imaging the biological carrier with hyperspectral technology yields more abundant and stable information and facilitates the further extraction of biological information. The extraction of characteristic information on the biological carrier can be done by target detection [11].

In recent years, many object extraction and target detection methods for hyperspectral images have emerged. The most common are the matched filter (MF) [12], Constrained Energy Minimization (CEM) [13], the Adaptive Cosine Estimator (ACE) [14], the Spectral Angle Mapper (SAM) [15], and Orthogonal Subspace Projection (OSP) [16]. Among these, the MF and SAM methods do not suppress the background information. The CEM, ACE, and OSP methods suppress the background information but are themselves affected by the target information, so their estimate of the background information is not accurate enough. In this paper, we propose a Background Self-Learning Target Detection (BSLD) algorithm for hyperspectral images, which makes the target information to be extracted more prominent by suppressing the background information accurately.

2. Information Extraction Methods by Background Information Self-Learning

Target information extraction is the process of separating the target area from the nontarget area, that is, the background. To distinguish the target information from the background information, the key is an accurate estimate of the background, which directly determines the performance of the extraction algorithm. Methods based on an unstructured background, such as ACE, are commonly used for hyperspectral object extraction. They assume that the region of no interest is homogeneous and can be represented by a multivariate normal distribution, so that the background covariance matrix can be estimated from sample data when the background information and a priori target information are known [17]. Kelly first proposed the generalized likelihood ratio detection algorithm based on an unstructured background, and the adaptive cosine estimator (ACE) and adaptive matched filter (AMF) were derived on this basis [18]. These methods estimate the background covariance directly from the entire image, which introduces error because the target information to be extracted is not separated from the background.

In recent years, many local background estimation methods have emerged, which take the spatial information in the data into account to improve the accuracy of the background estimate. One common approach is to segment the image and then estimate the background statistics from the segment closest to the test pixel [19]. Another is to use a sliding window of limited size together with an “internal window” that excludes information that may not be background. With this approach, the sizes of the inner and outer windows affect the result of the whole algorithm, and computing background statistics over irregular areas of different sizes is very time-consuming, which reduces the efficiency of the algorithm.

To solve these problems, this paper proposes a background self-learning method that constructs a separate multivariate normal distribution model for each type of background spectral information. The background pixels are clustered according to their spectral information, and, when the information is extracted, the background class used by the detector is determined from the spectral information of the test pixel and of its spatially neighboring pixels. A method based on this background self-learning framework can use all kinds of statistical information flexibly, suppress the background better, and highlight the target to be extracted. At the same time, it reduces the computational complexity of the methods above. Specifically, the self-learning of the background information presented in this paper is divided into the following five steps.

2.1. Estimating the Number of End Elements in the Image

The end element, commonly called an endmember, is the basic component unit of the image. The spectral information of each end element approximately represents one class of signals in the image. The hyperspectral signal identification by minimum error method (HySime) was used to estimate the number of end elements in the image [20].

HySime takes the original hyperspectral image as input and computes its autocorrelation matrix. The noise is then estimated, and the noise autocorrelation matrix is calculated from the estimate. Removing the estimated noise from the original image gives the autocorrelation matrix of the noise-free image, whose eigenvectors form the candidate basis of the signal subspace.

HySime then selects the subset of eigenvectors that minimizes the mean squared error between the original signal and its noisy projection onto the subspace. The minimized criterion is the sum of a constant that does not depend on the subspace dimension and one term per selected eigenvector, each term being determined by the projections of the image and noise autocorrelation matrices onto that eigenvector. The dimension of the resulting subspace is the estimate of the number of end elements.

The resulting number of end elements is used as the cluster-number parameter of the K-means method in Section 2.2.
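The HySime steps of this section can be sketched in a few lines of NumPy. This is a minimal sketch rather than the authors' implementation: it assumes that a per-pixel noise estimate is already available (HySime normally estimates it from the data), and the function name `estimate_num_end_elements` and the array layout are hypothetical. It relies on the standard HySime rule that an eigenvector is kept when its term in the error criterion, the negative projected image power plus twice the projected noise power, is negative.

```python
import numpy as np

def estimate_num_end_elements(Y, noise):
    """HySime-style estimate of the signal subspace dimension.

    Y     : (bands, pixels) observed spectra
    noise : (bands, pixels) noise estimate (assumed to be given here)
    """
    n_pixels = Y.shape[1]
    R_y = Y @ Y.T / n_pixels            # autocorrelation of the observed image
    R_n = noise @ noise.T / n_pixels    # autocorrelation of the noise
    X = Y - noise                       # image with the noise estimate removed
    R_x = X @ X.T / n_pixels            # autocorrelation of the denoised image

    # Columns of E are the eigenvectors of R_x (candidate subspace basis).
    _, E = np.linalg.eigh(R_x)

    # Quadratic forms e_i^T R e_i for every eigenvector e_i.
    p = np.sum(E * (R_y @ E), axis=0)
    sigma2 = np.sum(E * (R_n @ E), axis=0)

    # Per-eigenvector term of the error criterion.
    delta = -p + 2.0 * sigma2

    # The criterion is minimized by keeping every eigenvector with a negative term.
    return int(np.sum(delta < 0))
```

With a synthetic image mixed from three spectra and a known noise array, the function should return 3; on real data the quality of the result depends mainly on the noise estimate.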

2.2. Background Clustering Fusion

We cluster the image to obtain the statistical information of each background subspace formed by the same background element. The number of end elements obtained by the HySime method in Section 2.1 is used as the cluster number $K$, and the background information is clustered with the K-means method. Assume that we have $N$ data points $x_1, \dots, x_N$ to be clustered into $K$ classes with centers $\mu_1, \dots, \mu_K$. K-means minimizes the objective

$$J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \, \lVert x_n - \mu_k \rVert^2. \tag{7}$$

The value of $r_{nk}$ is 1 when $x_n$ is assigned to cluster $k$; otherwise, it is 0. K-means is not described in detail in this paper; the specific implementation can be found in [21].
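For orientation, the K-means iteration referred to above can be sketched as follows. This is a minimal sketch assuming Euclidean distance between spectra and caller-supplied initial centers; the function name `kmeans` and the `init_centers` parameter are illustrative and are not taken from [21].

```python
import numpy as np

def kmeans(X, init_centers, n_iter=50):
    """Plain Lloyd iterations.

    X            : (pixels, bands) spectra to cluster
    init_centers : (K, bands) starting centers (an assumption of this sketch)
    """
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Update step: each center becomes the mean of its assigned points.
        new_centers = centers.copy()
        for k in range(len(centers)):
            if np.any(labels == k):
                new_centers[k] = X[labels == k].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

In the setting of this section, the number of rows of `init_centers` would be the number of end elements estimated in the previous step.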

After the clustering information is obtained, the spectrum of each cluster center is regarded as a vector and the spectral angle between cluster centers is calculated. Two clusters are merged if their spectral angle is smaller than a given threshold. For two cluster centers $c_i$ and $c_j$, the spectral angle is calculated by

$$\theta(c_i, c_j) = \arccos\left( \frac{c_i^{T} c_j}{\lVert c_i \rVert \, \lVert c_j \rVert} \right). \tag{8}$$
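The spectral angle and the merge rule can be sketched as follows. The threshold name `max_angle` and the greedy single-pass merge order are assumptions made for illustration; the paper does not specify how chains of similar clusters are resolved.

```python
import numpy as np

def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def merge_similar_clusters(labels, centers, max_angle):
    """Relabel clusters whose centers are within max_angle of an earlier cluster.

    Greedy single pass: a cluster that has already been merged is not
    considered again as a merge candidate.
    """
    n_clusters = len(centers)
    merged_into = list(range(n_clusters))
    for i in range(n_clusters):
        for j in range(i + 1, n_clusters):
            if merged_into[j] != j:
                continue  # cluster j was already merged
            if spectral_angle(centers[i], centers[j]) < max_angle:
                merged_into[j] = merged_into[i]
    return np.array([merged_into[k] for k in labels])
```

Because the spectral angle ignores vector length, two clusters that differ only in overall brightness are treated as the same background type.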

To avoid generating ill-conditioned matrices when calculating the covariance matrix, we also merge classes with few pixels into the nearest class. In this experiment, each cluster must contain at least as many pixels as the image has bands. After these adjustments, we obtain the clustering information we need and calculate the covariance matrix of each cluster on this basis.
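A minimal sketch of this adjustment is given below. It assumes that "nearest" means the smallest Euclidean distance between cluster means and that at least one cluster is large enough; both function names and the `min_pixels` parameter are hypothetical.

```python
import numpy as np

def absorb_small_clusters(X, labels, min_pixels):
    """Reassign the pixels of clusters smaller than min_pixels to the nearest large cluster.

    X : (pixels, bands) spectra; labels : cluster index of each pixel.
    """
    labels = np.asarray(labels).copy()
    ids, counts = np.unique(labels, return_counts=True)
    large = ids[counts >= min_pixels]
    means = {k: X[labels == k].mean(axis=0) for k in ids}
    for k in ids[counts < min_pixels]:
        nearest = min(large, key=lambda g: np.linalg.norm(means[k] - means[g]))
        labels[labels == k] = nearest
    return labels

def cluster_covariances(X, labels):
    """Covariance matrix (bands x bands) of every remaining cluster."""
    labels = np.asarray(labels)
    return {int(k): np.cov(X[labels == k], rowvar=False) for k in np.unique(labels)}
```

Setting `min_pixels` to the number of bands follows the condition stated above; in practice a larger margin may be needed for a well-conditioned covariance estimate.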

2.3. Excluding Target Classes

To prevent nonbackground information (target information) from being included in the set of background classes, we find the class closest to the spectral feature of the prior information and compute the spectral angle between its cluster center and the prior information by formula (8). If the spectral angle is smaller than the threshold, the class is determined to be the target information class and is discarded; otherwise, no class is similar to the prior information and this step is skipped.
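This exclusion step can be sketched as follows, assuming that the comparison is a spectral angle below a threshold; the function name `exclude_target_class` and the parameter `max_angle` are hypothetical.

```python
import numpy as np

def exclude_target_class(centers, prior, max_angle):
    """Return the indices of the clusters kept as background classes.

    centers   : (K, bands) cluster centers
    prior     : (bands,) prior target spectrum
    max_angle : angle threshold in radians (an assumed parameter)
    """
    centers = np.asarray(centers, dtype=float)
    prior = np.asarray(prior, dtype=float)
    cos = centers @ prior / (np.linalg.norm(centers, axis=1) * np.linalg.norm(prior))
    angles = np.arccos(np.clip(cos, -1.0, 1.0))
    closest = int(np.argmin(angles))
    keep = list(range(len(centers)))
    if angles[closest] < max_angle:
        keep.remove(closest)  # the closest class is treated as the target class
    return keep
```

At most one class is removed, which matches the description above: only the class closest to the prior information is tested.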

2.4. The Choice of the Best Background Class

Before extracting the target information, we first determine a background class for each pixel. The statistics of this background class are used in the subsequent target information extraction. There are two ways to determine the background class for each pixel, as shown in Figure 1. The first selects the background class based on the prior information: the class whose cluster center is closest to the prior spectral information is chosen as the background estimate. In Figure 1, the dark red cluster center is closest to the orange prior information, so that background class is chosen to calculate the covariance matrix. This choice better suppresses false alarm pixels that are too similar to the prior information to be distinguished from it.

The second selects the background class based on the test pixel: the class whose cluster center is closest to the spectral information of the test pixel is chosen as the background estimate. In Figure 1, the green test pixel is closest to the blue cluster center, so the blue background class is used to calculate the background information of the test pixel. Because pixels at different locations have different background statistics, the second way is more feasible than the first. Through the above steps, we obtain the background pixel set and the background covariance matrix.
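Both selection rules reduce to finding the cluster center nearest to a reference spectrum, which is either the prior information or the test pixel. The sketch below assumes that "closest" means the smallest spectral angle; the function names are hypothetical.

```python
import numpy as np

def nearest_class(reference, centers):
    """Index of the cluster center with the smallest spectral angle to reference."""
    centers = np.asarray(centers, dtype=float)
    reference = np.asarray(reference, dtype=float)
    cos = centers @ reference / (
        np.linalg.norm(centers, axis=1) * np.linalg.norm(reference)
    )
    return int(np.argmax(cos))  # largest cosine means smallest angle

def background_statistics(X, labels, k):
    """Mean spectrum and covariance matrix of the pixels in background class k."""
    members = X[np.asarray(labels) == k]
    return members.mean(axis=0), np.cov(members, rowvar=False)
```

Passing the prior spectrum as `reference` gives the first rule, and passing the test pixel gives the second, so the background class can differ from pixel to pixel.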

2.5. Target Information Extraction

To describe the pixel information more accurately, we also need to make use of the spatial information of the pixel spectra in the image. Here we introduce a sliding window: a square area is delineated with the current pixel as its center, and the information of all the pixels in the area jointly determines the background class of the current pixel, as shown in Figure 2. Using reasonable spatial information avoids the spectral fluctuations caused by the pixel itself, unknown information, and abnormal mixed information, and gives a more realistic description of the spectrum of the test pixel. Because the conventional calculation is time-consuming, this paper averages all the pixels in the window. The window size can be adjusted to the actual situation, with values such as 1, 2, and 3 often used. Each pixel in the window is assigned a weight, and the pixel at the center of the window has its own weight.

Given the input threshold, the window size, and the weight of the window center element, the method proposed in this paper completes the target information extraction process shown in Figure 3.

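The weighted average of the pixels within the window is

$$\bar{x} = \frac{\sum_{i \in W} w_i \, x_i}{\sum_{i \in W} w_i}, \tag{10}$$

where $W$ is the set of pixels in the window, $x_i$ is the spectrum of pixel $i$, and $w_i$ is its weight.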
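A minimal sketch of the window average follows. It assumes an odd window side length, a weight of 1 for every pixel except the center, and a window clipped at the image border; none of these details are stated in the text, and the function name is hypothetical.

```python
import numpy as np

def window_weighted_mean(cube, row, col, size, center_weight):
    """Weighted mean spectrum of the square window centered at (row, col).

    cube          : (rows, cols, bands) hyperspectral image
    size          : side length of the window in pixels (assumed odd)
    center_weight : weight of the center pixel; all other weights are 1
    """
    half = size // 2
    r0, r1 = max(row - half, 0), min(row + half + 1, cube.shape[0])
    c0, c1 = max(col - half, 0), min(col + half + 1, cube.shape[1])
    window = cube[r0:r1, c0:c1, :]
    weights = np.ones(window.shape[:2])
    weights[row - r0, col - c0] = center_weight
    # Sum of weight * spectrum over the window, divided by the total weight.
    return np.tensordot(weights, window, axes=([0, 1], [0, 1])) / weights.sum()
```

A larger `center_weight` keeps the result closer to the original pixel spectrum, while a larger `size` smooths more strongly.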

After the spectral information of the test pixel is combined with its spatial information, the detection statistic is expressed with the generalized likelihood ratio method as formula (11).

In (11), the weighted average from (10) is used as the spectrum of the test pixel, together with the background spectral information and the covariance of the background class determined by the test pixel. A decision value is calculated for each pixel in this way, and the pixels whose values pass the threshold are taken as the target information to be extracted.
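As an illustration of a generalized likelihood ratio statistic of this kind, the sketch below uses the adaptive cosine estimator (ACE) form with the mean and covariance of the selected background class. This is an assumption: the paper's own statistic may differ in detail, and the function name and argument names are hypothetical.

```python
import numpy as np

def ace_score(x, target, bg_mean, bg_cov):
    """ACE-style decision value in [0, 1] for one pixel.

    x       : (bands,) spectrum of the test pixel, e.g. the window average
    target  : (bands,) prior target spectrum
    bg_mean : (bands,) mean of the selected background class
    bg_cov  : (bands, bands) covariance of the selected background class
    """
    cov_inv = np.linalg.inv(bg_cov)
    xc = x - bg_mean        # test pixel relative to the background mean
    sc = target - bg_mean   # target relative to the background mean
    num = (sc @ cov_inv @ xc) ** 2
    den = (sc @ cov_inv @ sc) * (xc @ cov_inv @ xc)
    return float(num / den)  # undefined when x equals bg_mean exactly
```

Pixels whose score exceeds the chosen threshold would be marked as target; because the covariance comes from the class selected for each pixel, the same target spectrum is tested against different backgrounds in different parts of the image.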

Assume that the original hyperspectral image data is data.

3. Experiment and Analysis

To verify the feasibility and effectiveness of the proposed method, two sets of hyperspectral biological images were used. We compare the BSLD method with the adaptive cosine estimator (ACE), constrained energy minimization (CEM), and the matched filter (MF).

3.1. Indicators of Evaluation Performance

In this paper, the ROC (Receiver Operating Characteristic) curve is used to evaluate the performance of the algorithm. The ROC curve plots the correct extraction rate against the false alarm rate in the unit coordinate system. In Section 2.5, the extraction algorithm yields a decision value for each pixel. By varying the threshold, we obtain the corresponding correct extraction rates and false alarm rates. The higher the threshold, the lower both the correct extraction rate and the false alarm rate. We want a high correct extraction rate at a low false alarm rate; that is, the more the curve bulges toward the upper left corner, the better the performance of the algorithm. Another way to compare methods is the area under the ROC curve (AUC): the larger the AUC, the better the performance of the algorithm [22].
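The ROC curve and the AUC can be computed from the per-pixel decision values and the reference ROI mask as sketched below. The function names are illustrative, tied decision values are not grouped (each pixel is treated as its own threshold), and the AUC uses the trapezoidal rule.

```python
import numpy as np

def roc_curve(scores, truth):
    """False alarm rates and correct extraction rates for all thresholds.

    scores : decision value of each pixel
    truth  : 1 for target pixels in the reference mask, 0 for background
    """
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    order = np.argsort(-scores, kind="stable")  # highest score first
    hits = np.cumsum(truth[order])
    false_alarms = np.cumsum(~truth[order])
    tpr = np.concatenate(([0.0], hits / truth.sum()))
    fpr = np.concatenate(([0.0], false_alarms / (~truth).sum()))
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

Lowering the threshold moves along the curve from the origin toward the point (1, 1), which matches the relation between threshold, correct extraction rate, and false alarm rate described above.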

3.2. Experiment of Fingerprint Information Extraction

The data for the fingerprint information extraction experiment is a hyperspectral image of part of an envelope, with 31 bands and a size of 640 × 512. Figure 4 shows the gray-scale images of bands 1, 11, 21, and 31. In the gray-scale images, the fingerprint can be clearly identified only in the first band, and even there it overlaps with or is blocked by the background information, which greatly increases the difficulty of extracting the fingerprint shape.

An ROI (Region of Interest) [23] of 8694 pixels in total, provided with the data, serves as the reference for correct fingerprint extraction, and we selected one pixel as the prior information for the experiment. The window size is set to 5, and the best experimental results are obtained when the weight of the window center pixel is 3. As shown in Figure 5, the fingerprint information is clearly extracted except in the area covered by ink, and even that area is partly extracted. In Figure 6, the ROC curve of the BSLD method (red curve) lies above those of the other methods. The AUC values in Table 1 also show that this method has the best fingerprint information extraction performance.

3.3. Experimental Study on Tumor Area Information Extraction in Medical Images

In the tumor area information extraction experiment, the data is a hyperspectral image of the trunk of a mouse, with 13 bands, a spectral resolution of 5 nm, and a size of 220 × 200. Figure 7 shows the gray-scale images of bands 1 and 5 and the color image composed of the red, green, and blue bands. There is silver powder on the back and accumulated food in the abdomen. The data provides samples of prostate tumor cells, but the exact and complete position of the tumor area cannot be clearly identified from the band images. With the proposed algorithm, the region is extracted accurately.

As shown in Figure 8, the distribution of the prostate area extending upward from the root of the hind leg is very obvious, and the tumor area, which is difficult to locate in the original hyperspectral image, is clearly extracted. Although there are some false alarms around the body, their effect in medical applications is not significant. A higher decision threshold can be set if a more accurate image is needed. The ROC curve plotted from the ROI information given by the control data is shown in Figure 9. The BSLD method is still superior to the other three methods. The AUC in Table 2 is also slightly improved, reaching 99.1%. We can therefore conclude that BSLD can accurately determine the tumor area.

4. Conclusion

In this paper, a new method of biometric information extraction from hyperspectral images is proposed. It makes up for the deficiencies of existing background estimation in hyperspectral image processing, so that the background information is suppressed and the target information is easier to extract. The two groups of experiments show that the method extracts the target biological information completely and clearly and that its extraction performance is superior to that of the classical ACE, CEM, and MF methods.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the CRSRI Open Research Program (Program no. CKWV2016380/KY).