Medical Image Segmentation Algorithm Based on Feedback Mechanism CNN

An, Feng-Ping; Liu, Zhi-Wen

doi:https://doi.org/10.1155/2019/6134942

Contrast Media & Molecular Imaging


Research Article | Open Access

Volume 2019 | Article ID 6134942 | https://doi.org/10.1155/2019/6134942


Feng-Ping An^1,2 and Zhi-Wen Liu^2

Academic Editor: Barbara Palumbo

Received 22 Dec 2018

Revised 18 May 2019

Accepted 16 Jun 2019

Published 01 Aug 2019

Abstract

With the development of computer vision and image segmentation technology, medical image segmentation and recognition has become an important part of computer-aided diagnosis. Traditional image segmentation methods rely on manually extracting and selecting information such as edges, colors, and textures in the image. This consumes considerable effort and time and also requires domain expertise to obtain useful feature information, so it no longer meets the practical requirements of medical image segmentation and recognition. As an efficient image segmentation method, convolutional neural networks (CNNs) have been widely applied in the field of medical image segmentation. However, CNNs that rely on simple feedforward computation have not met the needs of the rapidly developing medical field. Inspired by the feedback mechanism of the human visual cortex, this paper proposes a computational model and operating framework for an effective feedback mechanism and formulates the feedback optimization problem. A new feedback convolutional neural network algorithm based on neuron screening and neuron visual information recovery is constructed, and on this basis a medical image segmentation algorithm based on a feedback mechanism convolutional neural network is proposed. The basic idea is as follows: the model classifies the pixel block samples of the image to be segmented to obtain an initial segmented region. Then, the initial results are optimized by threshold segmentation and morphological methods to obtain accurate medical image segmentation results. Experiments show that the proposed segmentation method has not only high segmentation accuracy but also extremely high adaptive segmentation ability for various medical images. The research in this paper provides a new perspective for medical image segmentation research. It is a new attempt to explore more advanced intelligent medical image segmentation methods. It also provides technical approaches and methods for further development and improvement of adaptive medical image segmentation technology.

1. Introduction

Medical image processing enhances the sharpness of the original image through denoising, restoration, and similar operations, or highlights certain information in the image so that organs and tissues of interest can be segmented for quantitative analysis [1–3]. Although there are many medical image segmentation methods, they fall mainly into the following five categories: segmentation based on thresholds, segmentation based on region growing, segmentation based on deformation models, segmentation based on graph theory, and segmentation based on machine learning [4–6].

The first category is a threshold-based medical image segmentation method. It splits the image based on the difference in grayscale values of the pixels in the image. The method has the advantages of being simple and fast and can achieve efficient segmentation when the image quality is high, the grayscale difference between the target and the background to be segmented is large, and the boundary is clearly distinguished. For example, Wilkins et al. [7] used a threshold method to segment the retinal cystic edema region on the OCT image. However, when the histogram has no obvious troughs, the threshold segmentation method usually does not have a suitable threshold, leading to false segmentation. The threshold segmentation method is sensitive to noise and unevenness. In the segmentation application of medical images, threshold segmentation is usually used as a preprocessing method.

The second category is a medical image segmentation method based on region growing arithmetic. Its basic principle is the process of aggregating image pixels or subregions into larger regions based on user-defined similarity functions. The advantage of this algorithm is that it is simple to calculate and has good accuracy and high efficiency for uniform connected targets. It is sensitive to noise, resulting in a certain void or disconnection in the extracted area. Therefore, it needs to be used in combination with other image processing operations [8, 9].

The third category is a medical image segmentation method based on a deformation model. It comprehensively utilizes regional and boundary information and is the most widely used method. The most typical is the active appearance model (AAM), in which “appearance” [10] refers to outlines and textures, which is an extension of the active shape model. It combines the features of contours and image textures to segment the image more accurately. Mitchell [11] and others used a three-dimensional active appearance model for left and right ventricle segmentation on MR images and achieved good results. The advantages of this method are that it can produce parameter curves or surfaces that are closed in parameters (such as heart contours) or not closed (five description lines in face recognition) and can improve accuracy by using prior knowledge learned in training sets. The disadvantage is that the parameters in the model constrain its flexibility. When the test data are significantly different from the training set data, the accuracy will be greatly reduced.

The fourth category is a medical image segmentation method based on graph theory. It is a new unsupervised image segmentation technique that does not require initialization. Its basic idea is to treat each pixel in the medical image as a node in a graph structure, which can therefore contain a large number of nodes. The nodes are also connected to the foreground and background seed points, and the entire graph is partitioned by the maximum flow/minimum cut method. Both graph cut and graph search methods are effective segmentation methods based on graph theory, and their effectiveness has been proven in images of different modalities [13–15]. One advantage is that the global optimal solution can be obtained according to the designed energy function, which avoids falling into a local optimum. One disadvantage is that the result is too sensitive to the choice of seed points.

The machine learning-based method can effectively solve the problems that many traditional medical image segmentation methods have had difficulty solving in the past. The main methods are as follows: (1) Medical image segmentation based on boosting: Boosting [12] is a method used to improve the accuracy of learning algorithms by constructing a series of predictive functions and then combining them into a predictive function in a certain way. Freund and Schapire proposed the AdaBoost algorithm; it solved many practical difficulties in the early boosting algorithm. However, it has a weak adaptive ability. (2) Medical image segmentation based on a support vector machine: For example, a deep learning model is used to perform segmentation of MRI images [13] and segmentation of the knee articular cartilage [14]. The support vector machine has a certain effect on medical image segmentation. However, the kernel function selection and weak adaptive ability involved in this method affect its further promotion and application. (3) Medical image segmentation based on neural networks: This method has been promoted and applied in the field of medical image segmentation [15, 16]. However, this kind of method cannot better reflect the characteristics of association and multiscale in the process of medical image segmentation. (4) Medical image segmentation based on deep learning: The convolutional neural network (CNN) was proposed by Lecun et al. It was the first true multilayer structure-learning algorithm that could reduce the number of parameters by using spatial relative relations to improve the training performance [17–19]. It is characterized by the creation and simulation of a neural network for human brain analysis and learning, and it is used to simulate human brain analysis and interpretation of data. 
Deep learning has also been successfully applied in medical image segmentation, such as prostate segmentation in MR images [20], knee cartilage segmentation [21], ventricular segmentation in ultrasound images [22], and tissue segmentation in breast images [23]. However, there are still many problems in the deep learning method. That is, the information transfer between neurons is one way. In the human visual neural network, in addition to the feedforward connection, there are a large number of feedback connections and lateral connections [24, 25]. Studies have shown that the number of feedback connections is several times the number of feedforward connections [26, 27]. The feedforward neural network can perform certain sensing tasks, but there are still problems with it being sensitive to noise, relying on a large number of samples, poor interpretability, and lack of adaptability and robustness. The feedback neural circuit constitutes an important feedback regulation mechanism. The feedback mechanism plays a very important role in selective attention [28], target location and segmentation [29], feature clustering [30], long-term and short-term memory [31], multitask coordination, and environmental adaptability. Thus, to achieve more advanced intelligence, it is not sufficient to rely solely on a feedforward network with powerful mapping capabilities. For this reason, many scholars began to study the feedback convolutional neural network. For example, Cao proposed a feedback convolutional neural network. In this feedback mechanism, the advanced semantic tag is set as a preamble to infer the activation state of hidden layer neurons. It helps us better visualize and understand the deep work of neural networks and capture the visual attention of intended objects [32]. However, it does not propose a targeted solution to the optimization of the feedback mechanism. Wang et al. 
proposed a feedback mechanism for convolutional neural networks by constructing a residual attention network. Although the network has improved in object recognition accuracy, there are no corresponding explanation and optimization operation for the feedback mechanism [33]. Amir et al. proposed a network architecture based on general feedback, using existing RNNs for instantiation, and the experimental results are improved compared with the existing feedforward network. But it does not explain the specific logic of the feedback mechanism [34]. At the same time, relevant scholars have proposed a number of convolutional neural networks with feedback mechanisms and applied the network to image classification and image recognition [35–38]. Although these practical applications have achieved certain effects (such as improved accuracy), there are still problems such as unclear feedback mechanism, optimization of feedback mechanism, and computational efficiency. Moreover, there is no corresponding feedback mechanism convolutional neural network model for medical image segmentation. These defects of deep learning theory limit its further development and application in the field of medical image segmentation and have become the technical bottleneck of medical image segmentation technology that affects modern medical diagnosis. Therefore, a feedback convolutional neural network with higher optimization characteristics must be established.

In view of this, this paper analyzes the working mechanism of visual attention in the deep convolutional neural network, uses a target-driven method to perform neuron screening and construct the feedback adjustment mechanism, and then formulates the feedback optimization problem in deep convolutional neural networks. To solve the feedback optimization problem, two feedback optimization algorithms based on a greedy strategy are proposed. Through these two algorithms, a new feedback adjustment mechanism is realized in the deep convolutional network. This paper refers to this as the convolutional neural network with a feedback adjustment mechanism (FCNN). Finally, the algorithm is applied to actual medical image segmentation, and the results are analyzed and summarized.

Section 2 of this paper explains the deep learning model with the feedback adjustment mechanism. Section 3 systematically expounds the greedy methods proposed in this paper to solve the feedback optimization problem. Section 4 applies the deep learning model with the feedback mechanism to medical image segmentation and compares it with mainstream segmentation algorithms. Finally, the full paper is summarized and discussed.

2. Mathematical Modeling of Feedback Mechanism CNN

2.1. Human Vision and Feedback Mechanism

Although the feedforward convolutional neural network has achieved great success in computer vision tasks, it has no feedback connections. A CNN is a model that simulates characteristics of human vision, yet it lacks the feedback mechanism found in the human visual system. In view of this, this paper adds a feedback connection model to the convolutional neural network so that it more closely resembles human vision and achieves better application results.

Through target-driven feedback control, the accuracy of human detection and recognition of targets is improved in complex scenes. It enables the vision system to generate selectivity for neuron responses when processing visual information [39]. In addition, convolutional neural networks have powerful object recognition capabilities [40]. Recent studies have shown that [41, 42] internal neurons of convolutional neural networks for classification purposes can learn to express a variety of visual semantic patterns from massive images, for example, from simple edge features and color features to complex target local features or even complete targets. It shows that the convolutional neural network can segment objects in the image from layers that are simple to complex mode representations.

Inspired by the above phenomena, this paper imitates the working mechanism of visual attention in a deep convolutional neural network trained for classification and performs neuron screening in a target-driven manner to construct a feedback adjustment mechanism. A simple example clarifies the feedback-modeling problem. As shown in Figures 1(a) and 1(b), the input image contains a simple face. Assuming that a convolutional neural network has been trained to determine whether there is a human face in the image, the image is sent to the network. Among the classification neurons at the highest level of the network, the neuron corresponding to the face category will be highly activated. This neuron is the target neuron in this paper, denoted as F. There are multiple paths between a pixel in the input image and the target neuron F. We abstract all of these paths into a single connecting pathway (CP), which indicates that the pixel is connected to the target neuron. Typically, all pixels within the field of view of the target neuron are connected to it. This paper assumes that the field of view of the target neuron covers the full image. Therefore, every pixel in the figure has its own connecting pathway to F, as shown in Figure 1(c). Let P be the set of all these pathways. The visual information of the face and the background in the image is transmitted to the target neuron F through P in a bottom-up manner. In this paper, R denotes a rule for determining whether a connecting pathway links a target pixel to the target neuron. The set P can then be divided into two subsets, T and B, according to rule R. Therefore, the following problem is defined in this paper.

The abstract definition of the feedback-modeling problem is as follows: let P = {CPs from all pixels to F}, T = {CPs from all target pixels to F}, and B = {CPs from all nontarget pixels to F}; find a rule R such that P = T ∪ B and T ∩ B = ∅.

2.2. Mathematical Modeling of the Feedback Adjustment Mechanism

2.2.1. New Explanation of Deep Neural Networks

Excellent deep convolutional network models are constructed by stacking simple operational layers, including the convolutional layer, the ReLU layer, and the max layer. For each layer, the input x is either the input signal or the output of the previous layer. It is assumed that x is composed of C channels with length and width S and T, that is, $x \in \mathbb{R}^{C \times S \times T}$. The output y is assumed to be composed of C′ channels with length and width S′ and T′, that is, $y \in \mathbb{R}^{C' \times S' \times T'}$. On this basis, the convolutional layer, the ReLU layer, and the max layer can be modeled separately.

The convolutional layers are used to extract different features of the input. The convolutional layer consists of C′ convolution kernels, one per output channel, and the operation of the convolutional layer is described by the following formula:

$$y_{c'} = \sum_{c=1}^{C} k_{c',c} * x_c, \quad c' = 1, \ldots, C' \qquad (1)$$

where $k_{c',c}$ is the part of the $c'$-th kernel that acts on input channel $c$ and $*$ denotes two-dimensional convolution.

The ReLU layer is mainly used to increase the nonlinearity of the network without affecting the receptive field of the convolutional neurons. Its input-output relationship is as follows:

$$y = \max(0, x) \qquad (2)$$

The max layer is mainly used to reduce the dimensions of the output vector and to obtain a degree of invariance, so that similar structures produce the same output. Max acts on the neighborhood N of each position (i, j), specifically

$$y_{i,j} = \max_{(u,v) \in N} x_{i+u,\, j+v} \qquad (3)$$

To better understand how selectivity works in the feedforward process of neural networks, and to model the feedback mechanism, this paper reinterprets the role of the ReLU and max layers. The max( ) operation in equations (2) and (3) can be replaced by a set of binary switches $z \in \{0, 1\}$, so that the ReLU and max layers can both be represented as a gate z acting on the input x. More specifically, the ReLU layer is represented as $y = z \circ x$, in which $\circ$ represents element-wise multiplication, and the max layer is represented as $y = z * x$, in which $*$ represents the convolution operation and z represents a convolution kernel whose entries are 0 or 1.

By reinterpreting the ReLU and max layers as gate operations controlled by input x, deep convolutional neural networks can be understood as a bottom-up approach to select the useful information for decision-making in the feedforward process by these gate operations. Then, the information that contributes little to the decision is discarded to achieve the final decision. To ensure versatility and generalization, a large amount of information can be filtered through the ReLU and max layers, and thus, a large number of neurons are activated.
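The gate interpretation can be illustrated with a small NumPy sketch. It is only an illustration under simplifying assumptions: a single 2-D feature map, non-overlapping pooling windows, and the hypothetical helper names `relu_as_gate` and `maxpool_as_gate`, none of which come from the paper.

```python
import numpy as np

def relu_as_gate(x):
    """Return (y, z) with y = z * x, where z is the binary switch of the ReLU."""
    z = (x > 0).astype(x.dtype)          # a switch is open only where the input is positive
    return z * x, z

def maxpool_as_gate(x, size=2):
    """Non-overlapping max pooling on a 2-D map, expressed through a 0/1 mask z."""
    h, w = x.shape
    assert h % size == 0 and w % size == 0
    # Rearrange into (window row, window col, size*size) so each window is one vector.
    blocks = x.reshape(h // size, size, w // size, size).transpose(0, 2, 1, 3)
    flat = blocks.reshape(h // size, w // size, size * size)
    winners = flat.argmax(axis=-1)                     # position of the max in each window
    z = np.eye(size * size, dtype=x.dtype)[winners]    # one-hot switch per window
    y = (z * flat).sum(axis=-1)                        # only the selected entry passes
    return y, z

x = np.array([[1.0, -2.0, 0.5, 3.0],
              [-1.0, 4.0, -0.5, 2.0],
              [0.0, -3.0, 6.0, -1.0],
              [2.0, 1.0, -4.0, 5.0]])
relu_y, relu_z = relu_as_gate(x)
pool_y, pool_z = maxpool_as_gate(x)
```

In both cases the output is a product of the input with a 0/1 pattern that is itself determined by the input, which is what allows the feedback layers to be modeled later as additional switches of the same kind.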

2.2.2. Basic Ideas and Mathematical Modeling of the Feedback Adjustment Mechanism

To ensure the versatility and generalization of the model, the deep convolutional neural network opens up almost all gate operations for the input signal and allows as much information as possible to pass through the entire network.

In this paper, a binary switch-type hidden variable is introduced for each hidden layer of neurons, which is called a feedback neuron. The control of these new neurons is realized by constructing the feedback connection between the target neurons and all feedback neurons. Figure 2 shows this process in brief. Figure 2(a) shows the original CNN, and Figure 2(b) shows the addition of switch-type feedback neurons to each hidden layer neuron. Figure 2(c) shows the construction of a feedback connection between the target neuron and all feedback neurons. Bottom-Up. It inherits the feature selectivity of the ReLU and max layers and passes the image information to the next layer. Top-Down. It is implemented by the feedback layer, which passes high-level semantic information to the data layer through gate operations. These gate operations only allow neurons associated with the target to be activated.

The connection path-clipping problem in the previous section is equivalent, under this basic idea, to a neuron-screening problem and hence to a feedback neuron state control problem. That is, given an input signal, all neurons associated with a given target signal must be screened out from the activated neurons. The connecting pathways formed by these neurons are then the pathways that need to be selected. Therefore, the feedback neuron state control problem must be stated more precisely.

As mentioned earlier, this paper introduces a large number of switch-type feedback neurons into deep convolutional neural networks. A simple feedback connection is constructed between the target neuron and these feedback neurons to indicate that the state of each feedback neuron is controlled by the target neuron. By introducing the binary switches z, this paper transforms the feedback mechanism into a numerical optimization problem. Consider a signal I and a well-trained neural network with fixed parameters and a set of binary switches z. The target neuron output is denoted S, and the mapping function from the signal I to the target neuron is T_s(I). This paper attempts to maximize the target output by adjusting the switch states of all feedback layers, subject to each switch taking the value 0 or 1; this is equation (4), in which the binary switch at position (i, j) of channel c in feedback layer l is an individual optimization variable. Because the goal is to maximize the target output by activating the fewest neurons, the L1 norm is used to constrain the number of activated switches in z. This is the mathematical model of the feedback mechanism in the deep convolutional neural network, and the problem described in equation (4) is called the feedback optimization problem. It is not easy to solve, and the global optimal solution is difficult to obtain. As for the feedback connections mentioned above, their purpose is to transmit a feedback control signal to the feedback neurons so that each feedback neuron works in a predetermined manner. Therefore, the feedback connections can be constructed in the process of solving the feedback optimization problem. However, because the global optimum is difficult to obtain, different solution methods compute the feedback control signal differently and therefore correspond to different feedback adjustment mechanisms.
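For reference, the feedback optimization problem described in this section can be written out as below. This is a hedged reconstruction from the surrounding definitions, not the authors' typeset formula: the trade-off weight $\lambda$, the explicit argument $z$ in $T_s(I, z)$, and the index layout $z^{l}_{i,j,c}$ are assumptions chosen to match the text.

```latex
\max_{z} \; T_s(I, z) \;-\; \lambda \,\lVert z \rVert_1
\qquad \text{s.t.} \quad z^{l}_{i,j,c} \in \{0, 1\} \quad \forall\, l,\, i,\, j,\, c
```

Here $z^{l}_{i,j,c}$ stands for the switch at position $(i, j)$ of channel $c$ in feedback layer $l$, and the L1 term penalizes the number of open switches, in line with the stated goal of maximizing the target output while activating as few neurons as possible.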

3. Feedback Optimization Problem by the Greedy Method

3.1. Linear Approximation of the Objective Function

CNN is a nonlinear mapping function with a large number of nonlinear mapping layers, for example, the ReLU layer and the max layer. Therefore, T_s(I) is highly nonlinear with respect to the input image I. Given an input image I₀, a Taylor expansion of T_s(I) is performed near I₀ so that T_s(I) is linearly approximated [42–44]. The first-order Taylor expansion is as follows:

$$T_s(I) \approx T_s(I_0) + T_s'(I_0)^{\top} (I - I_0) \qquad (5)$$

In this paper, two approximations are used to linearize T_s(I). Specifically, once the input image is given, the activation state of every neuron inside the network is determined when the network completes its first forward pass. (1) At this point, the gate states of the ReLU and max layers are fixed: closed gates stay closed and open gates stay open, so the state of these two types of layers no longer changes. (2) The remaining nonlinear layers are approximated by Taylor expansion. After these operations, T_s(I) becomes the output of a linear neural network. A feedback layer is then added to each ReLU layer, and the target function becomes a linear nested function. For any feedback layer l, T_s can be expressed as a linear combination over that layer (formula (6)): each term is the product of three factors, namely the input of the feedback neuron at position (i, j) of channel c in feedback layer l, the feedback neuron (binary switch) at the same position, and a coefficient called the contribution weight (CW). The contribution weight is determined by the neural connection pathways between the feedback neuron and the target neuron T_s. The flow of the contribution weights is shown in Figure 3. In the linear neural network of Figure 3, there are two paths between the target neuron T and a certain neuron in the middle layer, and each connection along a path has its own weight. The contribution weight of the middle-layer neuron is obtained from the weights along these paths, so that the target neuron T and that neuron can be abstracted into a single connection path carrying the contribution weight.
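The idea of collapsing several paths into one contribution weight can be checked numerically on a toy linear network. The sketch below is illustrative only: the network shape, the weight values, and the names `W1`, `w2`, `alpha`, and `contribution_weights` are hypothetical, and the path-product rule used is the standard property of linear networks rather than a formula quoted from the paper.

```python
import numpy as np

# Hypothetical linear network: 3 gated neurons -> 2 intermediate units -> 1 target.
W1 = np.array([[0.5, -1.0, 2.0],
               [1.5, 0.5, -0.5]])    # intermediate-from-gated weights (2 x 3)
w2 = np.array([2.0, -1.0])            # target-from-intermediate weights

def contribution_weights(W1, w2):
    """Sum over paths of the product of weights along each path.

    Neuron k reaches the target through each intermediate unit h, so its
    contribution weight is sum_h w2[h] * W1[h, k].
    """
    return w2 @ W1

x = np.array([1.0, 2.0, 0.5])   # inputs of the gated neurons
z = np.ones_like(x)             # all feedback switches open
alpha = contribution_weights(W1, w2)

# The target computed layer by layer equals the gated linear combination.
target_forward = float(w2 @ (W1 @ (z * x)))
target_linear = float(np.sum(alpha * z * x))
```

With these numbers each gated neuron has two paths to the target, and the two ways of computing the target agree, which is the sense in which the neuron and the target can be abstracted into one weighted connection.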

In this paper, we obtain the linear approximation of the objective function in the feedback optimization problem by formula (6). To further simplify the problem, this paper abandons the regularization requirement of Z so that all Z’s that contribute to the objective function are opened. At this time, the feedback optimization problem is transformed into the following problem:

Formula (7) has no constant term because in is the output of the ReLU neuron. The constant term is calculated in .

3.2. Feedback Optimization Problem by the Greedy Method

3.2.1. Feedback Recovery Algorithm

This paper proposes a top-down, layer-by-layer optimization method that updates the feedback neurons Z of each layer to maximize the objective function T. For a particular feedback layer l, the input of each feedback neuron corresponds to some type of visual pattern in the image space, and its contribution coefficient expresses the contribution of that visual pattern to the target neuron. Therefore, proceeding in top-down order, neurons with a positive contribution are retained while those with a negative contribution are eliminated, thereby maximizing the target neuron T. Specifically, at a given layer, the state of each switch is updated according to the sign of its contribution, and neurons whose contribution is negative are turned off. At this point, the part of the network between this layer and the target neuron has been roughly cropped. The network retained at this layer is then extended to the next layer, and the contribution coefficient of each neuron in that layer is recalculated in the new network structure and processed by the same strategy. This strategy is applied to each feedback layer in a top-down manner and iterated until convergence. This algorithm is called the feedback recovery (FR) algorithm. It is assumed that the neural network has a total of N feedback layers, and the target function after updating feedback layer l is recorded as T_l. For convenience of description, a single subscript k is used instead of i, j, and c. The mathematical proof of the algorithm is given below.

To prove that the feedback recovery algorithm can make the feedback optimization problem obtain the local optimal solution, this paper needs to prove that the objective function T will increase the value of the objective function after each iteration; that is, this paper needs to prove T_N ≤ T₁. We can use mathematical induction to complete the proof. To this end, this paper first proves that T ≤ T_N and T_N ≤ T_N−1 and proves that T_l ≤ T_l−1 under the assumption that T_l+1 ≤ T_l is established.

(1) Assumption l = N

This paper expands T through the Nth feedback layer, namely,where is the Nth ReLU layer output neuron; thus, . Available from ,

Let and , and then

After updating all of the feedback layer, T_N can be expressed by the feedback layer. Here, depends on . Therefore, when is adjusted, is also updated to ; thus,

Then, this paper uses the above method to update and to obtain T_N−1:

(2) Assumption

Fix , , and then

can be represented by together with the convolutional layer weight :

If , then T_l is equivalent to one or more 0 items, which can be ignored. Therefore, this article only needs to focus on the situation when , and then

Therefore,

Since ,

Update the control gate state of the feedback layer on the basis of S_ll so that

It is available by updating as follows:

Since and ,

That is,

In summary, after the first iteration is completed,

Because the number of neurons given in this paper is limited, each iteration performs a cropping operation, so the value of the objective function T will continue to increase until it converges.
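The monotone behavior argued above can be observed on a toy example. The following sketch is a simplified stand-in for the feedback recovery procedure, not the authors' implementation: it assumes an already linearized network with only two feedback layers, closes gates but never reopens them, and uses hypothetical names (`feedback_recovery`, `target`, `z0`, `z1`) and arbitrary weights.

```python
import numpy as np

def target(x0, W1, w2, z0, z1):
    """Output of a gated linear network: T = w2 . (z1 * (W1 @ (z0 * x0)))."""
    return float(w2 @ (z1 * (W1 @ (z0 * x0))))

def feedback_recovery(x0, W1, w2, max_iters=10):
    z0 = np.ones_like(x0)           # lower feedback layer, all gates open
    z1 = np.ones(W1.shape[0])       # upper feedback layer, all gates open
    history = [target(x0, W1, w2, z0, z1)]
    for _ in range(max_iters):
        prev0, prev1 = z0.copy(), z1.copy()
        # Upper layer first: the contribution of unit k is w2[k] * h[k].
        h = W1 @ (z0 * x0)
        z1 = z1 * (w2 * h >= 0)     # close gates whose contribution is negative
        history.append(target(x0, W1, w2, z0, z1))
        # Lower layer: recompute contribution weights in the cropped network.
        alpha0 = (w2 * z1) @ W1
        z0 = z0 * (alpha0 * x0 >= 0)
        history.append(target(x0, W1, w2, z0, z1))
        if np.array_equal(z0, prev0) and np.array_equal(z1, prev1):
            break                   # no gate changed: converged
    return z0, z1, history

x0 = np.array([1.0, 2.0, 0.5])
W1 = np.array([[0.5, -1.0, 2.0],
               [1.5, 0.5, -0.5]])
w2 = np.array([1.0, 1.0])
z0, z1, history = feedback_recovery(x0, W1, w2)
```

Each update only removes terms with negative contribution from a sum, so the recorded objective values never decrease, and because there are finitely many gates the loop must stop.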

To qualitatively understand the feedback effect of FR, this paper combines the FR algorithm with the commonly used deep convolutional neural network framework VGGNet [45], pretrained on the object classification task of the ImageNet 2012 dataset. The visualization is generated after the network is cropped using the FR algorithm. The visualization and energy maps are explained in detail here.

(1) Visualization Map and Energy Map. When the FR algorithm converges, the gradient of the target neuron is set to 1, and the gradient backtransfer calculation is started from the target neuron. Finally, a gradient map is obtained in the image space. The gradient map is also 3-channel, which is consistent with the input image size. To visualize the gradient map, the normalization process is performed by a min-max method with a constant, specifically , which in turn forms a visualization map. At the same time, to measure the target correlation of each pixel, the sum of the absolute values of the three channels of the gradient map at each position is calculated, and then the energy map is normalized by the L2 norm.
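As a rough illustration of this post-processing step, the sketch below turns a gradient map into a visualization map and an energy map. It assumes the gradient is already available as an (H, W, 3) array; the random array stands in for a real backpropagated gradient, the small constant `eps` is an assumed guard against division by zero (the paper's exact constant is not recoverable from the text), and the function names are hypothetical.

```python
import numpy as np

def visualization_map(grad, eps=1e-12):
    """Min-max normalize a (H, W, 3) gradient map into [0, 1]."""
    lo, hi = grad.min(), grad.max()
    return (grad - lo) / (hi - lo + eps)

def energy_map(grad, eps=1e-12):
    """Per-pixel sum of absolute channel values, scaled to unit L2 norm."""
    energy = np.abs(grad).sum(axis=-1)
    return energy / (np.linalg.norm(energy) + eps)

rng = np.random.default_rng(0)
grad = rng.standard_normal((4, 4, 3))   # placeholder for a backpropagated gradient
vis = visualization_map(grad)
en = energy_map(grad)
```

The visualization map keeps the three channels for display, while the energy map collapses them into a single nonnegative score per pixel that can be compared across positions.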

3.2.2. Feedback Selective Algorithm

Because the FR algorithm constantly adjusts the contribution coefficient of each neuron during optimization, it loses the neuron-screening ability, which motivates changing the contribution coefficients as little as possible during optimization. This section proposes another optimization method that updates the feedback neuron state of each layer without changing the contribution coefficients within an iteration. To achieve this goal, the objective function T is optimized in a bottom-up manner, layer by layer. One iteration of the FS algorithm is shown in Figure 4. Given an input image containing an aircraft, this article uses the category neuron corresponding to “bus” as the target neuron and then optimizes each feedback layer in turn in a bottom-up manner, repeating this process until convergence. Once the optimization is complete, the selective filtering of the network is complete. Finally, a visualization map and an energy map are obtained by backpropagating the gradient from the target neuron to the image input space.

The proof of convergence of the FS algorithm is given below. That the FS algorithm brings the feedback optimization problem to a local optimal solution is again proven by mathematical induction. In this case, it must be shown that T₁ ≤ T_N. Thus, it is first necessary to prove that T ≤ T₁ and T₁ ≤ T₂, and then that T_l ≤ T_l+1 holds on the premise that T_l−1 ≤ T_l.

(1) l = 1

This paper expands T through the first feedback layer, namely, where represents that the first ReLU layer is an output neuron, so .

Available from ,

Let , then

After updating all of the first feedback layer, T₁ can be expressed by the second feedback layer. Here, depends on . Therefore, when is adjusted, will also be updated to , so

Then, this paper uses the above method to update and to obtain T₂, so

(2) Assumption : Fix , ,…, , then

S_l can also be represented by , so

Among them,

If , then T_l is equivalent to one or more 0 items, which can be ignored. Therefore, this article only needs to focus on the situation when , and then

Therefore,

That is, .

Update by the following formula:

Based on mathematical induction, after the first iteration, the following can be obtained:

Because the number of neurons in the network is finite and each iteration performs a selection operation that never decreases the objective function T, the value of T is nondecreasing across iterations and can take only finitely many values. The algorithm therefore eventually converges.
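As a compact restatement of this argument, the layer-wise inequalities can be chained together. The notation here is an assumption introduced for illustration: T^{(k)} denotes the objective value at the start of iteration k, T_l^{(k)} denotes its value after the l-th of N feedback layers has been updated in that iteration, and each feedback neuron is assumed to take one of finitely many states.

$$T^{(k)} \le T_1^{(k)} \le T_2^{(k)} \le \dots \le T_N^{(k)} = T^{(k+1)}$$

Under these assumptions, the sequence T^{(1)}, T^{(2)}, … is nondecreasing. Since finitely many neuron states allow only finitely many values of T, a nondecreasing sequence must become constant after finitely many iterations, which is the sense in which the FS algorithm converges.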

4. Medical Image Segmentation Based on FCNN

4.1. Medical Image Segmentation Design

In this paper, a convolutional neural network with a feedback mechanism is constructed. First, fixed-size image block samples are extracted from the preprocessed training image set. Feature learning is performed on unlabeled image block samples to train the initial parameters of each layer of the network. The network is then fine-tuned on labeled image block samples so that it can perform classification. Next, the image block samples to be segmented are classified, and the pixels assigned to the target class are marked in a black-and-white binary image, which serves as the initial segmentation result. Finally, threshold segmentation and morphological processing are applied to refine the initial result into an accurate segmentation of the medical image. The technical details of the medical image segmentation method in this paper are shown in Figure 5.

4.2. Medical Image Segmentation Process Based on FCNN

First, the original sample data are generated. The ultimate goal of constructing the deep neural network in this paper is to classify the image pixels and then use the classification results to achieve medical image segmentation, taking into account the neighborhood relationship of similar pixels in the segmentation task. Therefore, a 25 × 25 image block centered on the target pixel is used as the sample to be tested, and the category of the central pixel is determined from the classification result of the image block. To standardize the data and eliminate the influence of differing scales, the image is first normalized so that the gray values of the image block lie in the range 0∼1. The formula is as follows:

$$x' = \frac{x - \min}{\max - \min}$$

where x is the gray value of a pixel, x′ is its normalized value, and max and min represent the maximum and minimum gray values, respectively, in the image.
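The sample generation step can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the image is a synthetic array, and reflecting the image at its borders (so that edge pixels also receive a full 25 × 25 block) is a choice made here, since the text does not say how borders are handled.

```python
import numpy as np

def min_max_normalize(image):
    """Rescale gray values to [0, 1] using the image's own minimum and maximum."""
    image = image.astype(np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)

def extract_patch(image, row, col, size=25):
    """Return the size x size block centered on (row, col); borders are reflected."""
    half = size // 2
    padded = np.pad(image, half, mode="reflect")
    # In padded coordinates the target pixel sits at (row + half, col + half).
    return padded[row:row + size, col:col + size]

# Synthetic 40 x 40 image used only to exercise the functions.
image = np.arange(40 * 40, dtype=np.float64).reshape(40, 40)
norm = min_max_normalize(image)
corner_patch = extract_patch(norm, 0, 0)
inner_patch = extract_patch(norm, 20, 20)
```

Each patch is then classified, and the predicted label is assigned to the pixel at the center of the patch.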

To obtain a robust learning network, a certain degree of noise is added to the original sample data, and the contaminated data are used as the input to the first-layer learning network. The first-layer network is trained to minimize the error between the reconstructed output and the uncontaminated raw data.
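The denoising objective described above can be illustrated with a short sketch. Several details are assumptions made for this example and are not given in the text: Gaussian noise as the corruption, a fully connected layer with tied weights and sigmoid activations in place of a convolutional layer, and mean squared error as the reconstruction error.

```python
import numpy as np

def corrupt(x, noise_std=0.1, rng=None):
    """Add zero-mean Gaussian noise and clip back to the [0, 1] gray range."""
    rng = np.random.default_rng() if rng is None else rng
    return np.clip(x + rng.normal(0.0, noise_std, size=x.shape), 0.0, 1.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reconstruction_error(x_clean, x_noisy, W, b_hidden, b_out):
    """Encode the noisy input, decode with tied weights, compare with the clean input."""
    hidden = sigmoid(x_noisy @ W + b_hidden)
    recon = sigmoid(hidden @ W.T + b_out)
    return np.mean((recon - x_clean) ** 2)

rng = np.random.default_rng(0)
x = rng.random((8, 625))                 # 8 flattened 25 x 25 patches in [0, 1]
W = rng.normal(0.0, 0.01, (625, 64))     # hypothetical layer with 64 hidden units
x_noisy = corrupt(x, rng=rng)
err = reconstruction_error(x, x_noisy, W, np.zeros(64), np.zeros(625))
```

The key point is that the network sees `x_noisy` but is scored against `x`, so it cannot succeed by simply copying its input.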

Using the trained parameters, the output of the hidden (middle) layer of the first-layer network is calculated and used as the input for training the second layer, which yields the parameters of the second-layer network. Then, the output of the hidden layer of the second-layer network is used as the input to the third layer. During this iterative training, the feedback mechanism is constructed as described in Section 3, and finally, the convolutional neural network with the feedback mechanism is obtained. In this process, the hidden layer output of each layer of the network is a “depth feature” of the image. To give the deep neural network a classification function, supervised learning is used to fine-tune the whole network so that the features correspond to the categories. Specifically, the complete feedback mechanism convolutional neural network is assembled from the pretrained parameters of each layer, and an output layer is added at the end of the whole network to form a feedforward deep neural network. Then, the output is compared with the true label of the data, and the network parameters are adjusted according to the difference between them so that each input sample is mapped through the network to its corresponding category, which gives the network the ability to classify samples.
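The layer-by-layer pretraining described above can be sketched as a greedy loop in which each layer is trained on the hidden output of the previous one. This is a simplified illustration, not the paper's network: it uses fully connected sigmoid layers trained by plain gradient descent, omits the feedback mechanism and the supervised fine-tuning stage, and all names, layer sizes, and hyperparameters are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_denoising_layer(x, n_hidden, epochs=200, lr=0.5, noise_std=0.1, seed=0):
    """Train one denoising autoencoder layer; return encoder parameters and losses."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        x_noisy = x + rng.normal(0.0, noise_std, x.shape)   # corrupt the input
        h = sigmoid(x_noisy @ W1 + b1)                      # hidden representation
        r = sigmoid(h @ W2 + b2)                            # reconstruction
        losses.append(0.5 * np.mean(np.sum((r - x) ** 2, axis=1)))
        # Backpropagate the squared error measured against the clean input.
        d2 = (r - x) * r * (1 - r) / n
        d1 = (d2 @ W2.T) * h * (1 - h)
        W2 -= lr * (h.T @ d2)
        b2 -= lr * d2.sum(axis=0)
        W1 -= lr * (x_noisy.T @ d1)
        b1 -= lr * d1.sum(axis=0)
    return (W1, b1), losses

def pretrain_stack(x, layer_sizes):
    """Greedy pretraining: each layer is trained on the previous layer's hidden output."""
    params, features = [], x
    for n_hidden in layer_sizes:
        (W, b), _ = train_denoising_layer(features, n_hidden)
        params.append((W, b))
        features = sigmoid(features @ W + b)   # clean hidden output feeds the next layer
    return params, features

rng = np.random.default_rng(1)
x = 0.1 + 0.1 * rng.random((64, 16))           # toy inputs with low gray values
_, losses = train_denoising_layer(x, 12)
params, features = pretrain_stack(x, [12, 8, 4])
```

In a full pipeline, the returned `params` would initialize the network before an output layer is appended and the whole model is fine-tuned with labels.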

4.3. Image Segmentation Result Optimization

After the classification result of each image block is obtained, the corresponding central pixel is assigned to a category according to the category label of its image block, which gives an initial segmentation result. However, because the gray level of the image block is used as the classification basis, some pixels may be missegmented. To reduce missegmentation, the initial segmentation result is first processed by threshold segmentation: the threshold is set according to the grayscale distribution of all pixels classified into a given feature region, and the pixels that obviously do not belong to the tumor tissue are then deleted.
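One way to realize this threshold step is sketched below. The text does not give the exact thresholding rule, so the rule used here (keep a pixel only if its gray value lies within k standard deviations of the mean gray value of the region) is an assumption, as are the function name and the value of `k`.

```python
import numpy as np

def threshold_refine(image, mask, k=2.0):
    """Drop mask pixels whose gray value is more than k std from the region mean."""
    values = image[mask]
    if values.size == 0:
        return mask.copy()
    mean, std = values.mean(), values.std()
    keep = np.abs(image - mean) <= k * std
    return mask & keep

# Synthetic example: a bright 5 x 5 region containing one dark, misclassified pixel.
image = np.zeros((10, 10))
image[2:7, 2:7] = 0.8
image[4, 4] = 0.1
mask = np.zeros((10, 10), dtype=bool)
mask[2:7, 2:7] = True
refined = threshold_refine(image, mask)
```

In this example the dark pixel at (4, 4) is removed from the region while the 24 bright pixels are kept.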

To obtain a better segmentation effect, this paper uses the opening and closing operations of morphological processing to optimize the segmentation results [46]. The main function of the opening operation is to eliminate bulging edges and isolated spots in the image. The main function of the closing operation is to fill gaps and concavities inside the image. In this paper, the two morphological operations are used in combination to optimize the segmentation results.
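The effect of the two operations can be illustrated with a small NumPy sketch. The square (2r+1) × (2r+1) structuring element, the treatment of pixels outside the image as background, and the order in which the operations are combined (closing first, then opening) are assumptions for this example; the text only states that both operations are used together.

```python
import numpy as np

def _shifted_views(mask, radius):
    """Yield every translate of the mask within a (2r+1) x (2r+1) square window."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    for dr in range(2 * radius + 1):
        for dc in range(2 * radius + 1):
            yield padded[dr:dr + h, dc:dc + w]

def dilate(mask, radius=1):
    out = np.zeros_like(mask, dtype=bool)
    for view in _shifted_views(mask, radius):
        out |= view            # on if any neighbor is on
    return out

def erode(mask, radius=1):
    out = np.ones_like(mask, dtype=bool)
    for view in _shifted_views(mask, radius):
        out &= view            # on only if all neighbors are on
    return out

def opening(mask, radius=1):
    """Erosion followed by dilation: removes small isolated spots and thin bulges."""
    return dilate(erode(mask, radius), radius)

def closing(mask, radius=1):
    """Dilation followed by erosion: fills small holes and gaps."""
    return erode(dilate(mask, radius), radius)

# Synthetic mask: a 6 x 6 block with a one-pixel hole plus an isolated speck.
mask = np.zeros((14, 14), dtype=bool)
mask[3:9, 3:9] = True
mask[5, 5] = False
mask[11, 11] = True
cleaned = opening(closing(mask))
```

Here the closing fills the hole at (5, 5) and the subsequent opening removes the speck at (11, 11), leaving the solid 6 × 6 block.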

5. Experimental Analysis

5.1. Tumor Image Segmentation Experiment

In this experiment, to verify the validity and robustness of the proposed method, the image segmentation algorithm proposed in this paper is used to segment tumor images. The same images are also segmented by other methods, and a performance comparison table of these algorithms is given. The Dice ratio is used to evaluate the accuracy of the segmentation result; it indicates the similarity between the experimental segmentation result and the gold standard of expert manual segmentation. The MRI images used in this paper are from the BRATS [47–50] contest and contain the four modalities T1, T1c, T2, and FLAIR. The training data contain real datasets from 30 patients and 50 simulated patient datasets. In this paper, 70% of the data are used as training data, 30% are used as test data, and a cross-validation method is used to obtain the segmentation results. All training data are standard data that have been segmented by professionals in advance, wherein pixel values 1, 2, 3, and 4 represent necrotic tissue, edema tissue, nonenhanced tumor, and enhanced tumor, respectively, and 0 represents normal tissue.
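For reference, the Dice ratio between a predicted mask A and a gold-standard mask B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for binary masks; the function name, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for this example.

```python
import numpy as np

def dice_ratio(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom

# Toy masks: 8 pixels each, 4 of them shared.
truth = np.zeros((4, 4), dtype=np.uint8)
truth[0:2, :] = 1
pred = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, :] = 1
score = dice_ratio(pred, truth)   # 2 * 4 / (8 + 8) = 0.5
```

For a multi-label ground truth such as the one described above, the same function can be applied per tissue label, for example `dice_ratio(pred_labels == 2, truth_labels == 2)` for edema, where `pred_labels` and `truth_labels` are hypothetical label images.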

The comparison between the method proposed in this paper and other methods in the BRATS [49, 50] contest is shown in Table 1. The best-performing Zhao method (the Monte Carlo random-based supervoxel clustering method), the Baner method, the ordinary Menze algorithm, and the CNN method are selected as baselines for comparison.

Table 1 shows that the accuracy of the proposed convolutional neural network segmentation method is as high as 85.9%. This method is not only superior to the general tumor image segmentation methods in segmentation and recognition accuracy but also 4.2 percentage points better than the best-performing (Zhao) algorithm. At the same time, the stability of the method is superior to that of the Zhao method (the variance of the proposed method is 0.08, and the variance of the Zhao method is 0.09). In addition, the feedback mechanism convolutional neural network proposed in this paper is also superior to the general CNN method (the accuracy of the traditional CNN method is 80.2%). Although the traditional CNN segmentation method is less effective than the proposed method, it comes close to the accuracy of the Zhao method (80.2% for the CNN method versus 81.7% for the Zhao method), which demonstrates the advantages of deep learning in medical image segmentation.

5.2. Spine Image Segmentation Experiment

Spinal CT images for deep learning are taken from two datasets from SpineWeb [51]. The first dataset contains the data of five subjects, labeled to segment only the vertebral body without the transverse processes, spinous processes, and pedicles. Its image resolution is 1.0 × 1.0 × 1.0 mm³, the scan matrix size is 512 × 512, and the number of slice images is between 30 and 88. The second dataset contains the data of 20 subjects, in which the complete vertebral body is segmented. Its resolution is 0.35 × 0.35 × 1 mm³, the acquired slice image size is 512 × 512, and the number of slice images is between 255 and 950.

In the dataset of 5 subjects, 3, 1, and 1 subjects are used for network training, testing, and verification, respectively. In the dataset of 20 subjects, the data of one subject are retained for verification, and all other data are used for network training and cross-testing at a ratio of 8 : 2. Since the black background area is large and unevenly distributed in the images of the 5-subject dataset, the data are preprocessed, and the training images are cropped to 128 × 128 without reducing the image resolution. Each batch of images is normalized as it is read, and the specific processing method is shown in formula (36). There are insufficient annotated images for network training, so the original training dataset must be expanded. In each training iteration, the input training image is elastically deformed by a dense deformation field obtained from 3 × 3 grid control points and fractal interpolation. A new variant of the training dataset is thus derived, which is mainly used to verify the validity and reliability of the data expansion method.
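The cropping and elastic deformation steps can be sketched as follows. This is an illustration under several assumptions that differ from or go beyond the text: the crop is taken at the image center, the 3 × 3 control grid is expanded to a dense field with bilinear interpolation (swapped in here for the interpolation scheme named above), displacements are Gaussian with an arbitrary `sigma`, and pixels are resampled by nearest neighbor so that label values stay discrete.

```python
import numpy as np

def center_crop(image, size=128):
    """Crop a size x size window around the image center (no resampling)."""
    h, w = image.shape
    top, left = (h - size) // 2, (w - size) // 2
    return image[top:top + size, left:left + size]

def bilinear_upsample(grid, h, w):
    """Interpolate a small control grid to a dense (h, w) field."""
    gh, gw = grid.shape
    rows = np.linspace(0, gh - 1, h)
    cols = np.linspace(0, gw - 1, w)
    r0 = np.clip(np.floor(rows).astype(int), 0, gh - 2)
    c0 = np.clip(np.floor(cols).astype(int), 0, gw - 2)
    fr = (rows - r0)[:, None]
    fc = (cols - c0)[None, :]
    top = grid[r0][:, c0] * (1 - fc) + grid[r0][:, c0 + 1] * fc
    bottom = grid[r0 + 1][:, c0] * (1 - fc) + grid[r0 + 1][:, c0 + 1] * fc
    return top * (1 - fr) + bottom * fr

def elastic_deform(image, label, sigma=4.0, rng=None):
    """Warp image and label with one random dense field from 3 x 3 control points."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    dy = bilinear_upsample(rng.normal(0.0, sigma, (3, 3)), h, w)
    dx = bilinear_upsample(rng.normal(0.0, sigma, (3, 3)), h, w)
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.rint(yy + dy), 0, h - 1).astype(int)
    src_x = np.clip(np.rint(xx + dx), 0, w - 1).astype(int)
    # Nearest-neighbor sampling keeps label values discrete.
    return image[src_y, src_x], label[src_y, src_x]

# Synthetic slice and label used only to exercise the functions.
rng = np.random.default_rng(0)
image = rng.random((512, 512))
label = (image > 0.5).astype(np.uint8)
crop, crop_lab = center_crop(image), center_crop(label)
warped_img, warped_lab = elastic_deform(crop, crop_lab, rng=rng)
```

Applying the same displacement field to the image and its label keeps the annotation aligned with the deformed anatomy, which is what makes the warped pair usable as an additional training sample.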

Figure 6 shows the segmentation results of a randomly selected set of verification data that was fed to the model trained on the small dataset. The figure shows the network prediction, the overlap between the segmentation result and the standard (intersection over union, IoU), and the difference between them. The red outline in the second column of Figure 6 is the standard outline drawn by hand, the blue area is the prediction result, and the third column is the segmentation result of the algorithm. In the difference map in the fourth column, the white area represents the coincident area, the pink area is undersegmented, and the green area is oversegmented. The trained network model was used to analyze a data volume that was not involved in training and testing. It contained 58 slice images, the average IoU was 0.8037, and the average Dice value was 0.8579. The verification output indicates that the data expansion method is effective and, to some extent, can compensate for the lack of training data.
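The IoU value and the difference map described above can be computed as in the following sketch. The function names and the integer codes used for the three regions are assumptions for illustration; the colors of Figure 6 are only mentioned in the comments.

```python
import numpy as np

def iou(pred, truth):
    """Intersection over union of two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return np.logical_and(pred, truth).sum() / union

def difference_map(pred, truth):
    """0 = background, 1 = coincident, 2 = undersegmented, 3 = oversegmented."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    out = np.zeros(pred.shape, dtype=np.uint8)
    out[pred & truth] = 1     # white in Figure 6: prediction and standard agree
    out[~pred & truth] = 2    # pink: standard pixels the prediction missed
    out[pred & ~truth] = 3    # green: predicted pixels outside the standard
    return out

# Toy masks: 8 pixels each, 4 of them shared.
truth = np.zeros((4, 4), dtype=bool)
truth[0:2, :] = True
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, :] = True
score = iou(pred, truth)          # 4 shared pixels / 12 pixels in the union
diff = difference_map(pred, truth)
```

Averaging `iou` over the slices of a volume gives a per-volume figure comparable in kind to the average IoU reported above.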

6. Conclusion

Inspired by the visual attention mechanism in the human visual system, this paper defines the problem of a feedback adjustment mechanism in a deep convolutional neural network whose task is object classification. The essential goal of feedback is to screen neurons in a target-driven manner. From the perspective of feedback, the composition of the convolutional neural network is reinterpreted, and the mechanism of stimulus-driven neuron screening and its problems in the feedforward process of convolutional neural networks are noted. Then, the paper introduces the feedback neuron and the feedback layer with the goal of maximizing the target neuron output, constructs the mathematical model of the whole feedback mechanism, and clarifies the feedback optimization problem. On this basis, this paper proposes solving the feedback optimization problem with a greedy method. Two solving algorithms are given: a feedback selective (FS) algorithm and a feedback recovery (FR) algorithm. This paper proposes a new framework for the FCNN method based on the FR and FS algorithms. It can effectively capture high-level semantic concepts and project them back into the image space to generate various energy maps with great practical value. The feedback convolutional neural network has the ability to locate and segment the target objects in the image.

At the same time, because it is difficult to find and extract effective features for medical image segmentation, a medical image segmentation algorithm based on the feedback mechanism convolutional neural network presented in this paper is proposed. To verify the reliability and advantages of the proposed algorithm, it is used to segment tumor images and spine images and is compared with the CNN method and the top-performing algorithms from the BRATS contest. The experimental results show that the proposed method can not only segment the tumor image more accurately but also segment the spine image. Its recognition performance is superior to both the CNN method and the top-performing algorithm of the BRATS contest.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (No. 61701188) and China Postdoctoral Science Foundation (No. 2019M650512).

References

F. Milletari, N. Navab, and S. A. Ahmadi, “V-net: fully convolutional neural networks for volumetric medical image segmentation,” in Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571, IEEE, Stanford, CA, USA, October 2016.
E. Smistad, T. L. Falch, M. Bozorgi, A. C. Elster, and F. Lindseth, “Medical image segmentation on GPUs—a comprehensive review,” Medical Image Analysis, vol. 20, no. 1, pp. 1–18, 2015.
O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241, Springer, Munich, Germany, October 2015.
Y. Xue, T. Xu, H. Zhang et al., “SegAN: adversarial network with multi-scale L1 loss for medical image segmentation,” Neuroinformatics, pp. 1–10, 2018.
W. Zhang, R. Li, H. Deng et al., “Deep convolutional neural networks for multi-modality isointense infant brain image segmentation,” NeuroImage, vol. 108, pp. 214–224, 2015.
D. Shen, G. Wu, and H.-I. Suk, “Deep learning in medical image analysis,” Annual Review of Biomedical Engineering, vol. 19, no. 1, pp. 221–248, 2017.
G. R. Wilkins, O. M. Houghton, and A. L. Oldenburg, “Automated segmentation of intraretinal cystoid fluid in optical coherence tomography,” IEEE Transactions on Biomedical Engineering, vol. 59, no. 4, pp. 1109–1114, 2012.
K. Cleary and T. M. Peters, “Image-guided interventions: technology review and clinical applications,” Annual Review of Biomedical Engineering, vol. 12, no. 1, pp. 119–142, 2010.
N. Senthilkumaran and S. Vaithegi, “Image segmentation by using thresholding techniques for medical images,” Computer Science and Engineering: An International Journal, vol. 6, no. 1, pp. 1–13, 2016.
T. F. Cootes, G. J. Edwards, and C. J. Taylor, “Active appearance models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 6, pp. 681–685, 2001.
L. Grady and E. L. Schwartz, “Isoperimetric graph partitioning for image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 3, pp. 469–475, 2006.
T. G. Dietterich, “An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization,” Machine Learning, vol. 40, no. 2, pp. 139–157, 2000.
J. Zhang, K. K. Ma, M. H. Er et al., “Tumor segmentation from magnetic resonance imaging by learning via one-class support vector machine,” in Proceedings of the International Workshop on Advanced Image Technology (IWAIT’04), pp. 207–211, Singapore, January 2004.
K. Zhang, J. Deng, and W. Lu, “Segmenting human knee cartilage automatically from multi-contrast MR images using support vector machines and discriminative random fields,” in Proceedings of the 2011 18th IEEE International Conference on Image Processing (ICIP), pp. 721–724, IEEE, Brussels, Belgium, September 2011.
X. He, H. Zhang, M. Landis, M. Sharma, J. Warrington, and S. Li, “Unsupervised boundary delineation of spinal neural foramina using a multi-feature and adaptive spectral segmentation,” Medical Image Analysis, vol. 36, pp. 22–40, 2017.
Q. Li, Z. Gao, Q. Wang et al., “Glioma segmentation with a unified algorithm in multimodal MRI images,” IEEE Access, vol. 6, pp. 9543–9553, 2018.
A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, pp. 1097–1105, 2012.
G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.
Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
S. Liao, Y. Gao, A. Oto et al., “Representation learning: a unified deep learning framework for automatic prostate MR segmentation,” in Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 254–261, Springer, Nagoya, Japan, September 2013.
A. Prasoon, K. Petersen, C. Igel et al., “Deep feature learning for knee cartilage segmentation using a triplanar convolutional neural network,” in Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 246–253, Springer, Nagoya, Japan, September 2013.
G. Carneiro, J. C. Nascimento, and A. Freitas, “The segmentation of the left ventricle of the heart from ultrasound data using deep learning architectures and derivative-based search methods,” IEEE Transactions on Image Processing, vol. 21, no. 3, pp. 968–982, 2012.
R. Rouhi, M. Jafari, S. Kasaei, and P. Keshavarzian, “Benign and malignant breast tumors classification based on region growing and CNN segmentation,” Expert Systems with Applications, vol. 42, no. 3, pp. 990–1002, 2015.
A. Angelucci and P. C. Bressloff, “Contribution of feedforward, lateral and feedback connections to the classical receptive field center and extra-classical receptive field surround of primate V1 neurons,” Visual Perception-Fundamentals of Vision: Low and Mid-Level Processes in Perception, vol. 154, pp. 93–120, 2006.
V. A. Lamme, H. Supèr, and H. Spekreijse, “Feedforward, horizontal, and feedback processing in the visual cortex,” Current Opinion in Neurobiology, vol. 8, no. 4, pp. 529–535, 1998.
E. M. Callaway, “Feedforward, feedback and inhibitory connections in primate visual cortex,” Neural Networks, vol. 17, no. 5-6, pp. 625–632, 2004.
J.-M. Hupé, A. C. James, P. Girard, S. G. Lomber, B. R. Payne, and J. Bullier, “Feedback connections act on the early part of the responses in monkey visual cortex,” Journal of Neurophysiology, vol. 85, no. 1, pp. 134–145, 2001.
J. Moran and R. Desimone, “Selective attention gates visual processing in the extrastriate cortex,” Science, vol. 229, no. 4715, pp. 782–784, 1985.
A. Torralba, A. Oliva, M. S. Castelhano, and J. M. Henderson, “Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search,” Psychological Review, vol. 113, no. 4, pp. 766–786, 2006.
I. S. Dhillon, S. Mallela, and R. Kumar, “A divisive information-theoretic feature clustering algorithm for text classification,” Journal of Machine Learning Research, vol. 3, pp. 1265–1287, 2003.
N. Cowan, “Chapter 20 what are the differences between long-term, short-term, and working memory?” Progress in Brain Research, vol. 169, pp. 323–338, 2008.
R. Rouhi, M. Jafari, S. Kasaei, and P. Keshavarzian, “Benign and malignant breast tumors classification based on region growing and CNN segmentation,” Expert Systems with Applications, vol. 42, no. 3, pp. 990–1002, 2015.
A. Angelucci and P. C. Bressloff, “Contribution of feedforward, lateral and feedback connections to the classical receptive field center and extra-classical receptive field surround of primate V1 neurons,” Visual Perception-Fundamentals of Vision: Low and Mid-Level Processes in Perception, vol. 154, pp. 93–120, 2006.
V. A. Lamme, H. Supèr, and H. Spekreijse, “Feedforward, horizontal, and feedback processing in the visual cortex,” Current Opinion in Neurobiology, vol. 8, no. 4, pp. 529–535, 1998.
E. M. Callaway, “Feedforward, feedback and inhibitory connections in primate visual cortex,” Neural Networks, vol. 17, no. 5-6, pp. 625–632, 2004.
J.-M. Hupé, A. C. James, P. Girard, S. G. Lomber, B. R. Payne, and J. Bullier, “Feedback connections act on the early part of the responses in monkey visual cortex,” Journal of Neurophysiology, vol. 85, no. 1, pp. 134–145, 2001.
J. Moran and R. Desimone, “Selective attention gates visual processing in the extrastriate cortex,” Science, vol. 229, no. 4715, pp. 782–784, 1985.
A. Torralba, A. Oliva, M. S. Castelhano, and J. M. Henderson, “Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search,” Psychological Review, vol. 113, no. 4, pp. 766–786, 2006.
N. Kruger, P. Janssen, S. Kalkan et al., “Deep hierarchies in the primate visual cortex: what can we learn for computer vision?” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1847–1871, 2013.
M. D. Zeiler and R. Fergus, “Visualizing and understanding convolutional networks,” in Proceedings of the European Conference on Computer Vision, pp. 818–833, Springer, Zurich, Switzerland, September 2014.
B. Zhou, A. Khosla, A. Lapedriza et al., “Object detectors emerge in deep scene CNNs,” 2014, https://arxiv.org/abs/1412.6856.
K. Simonyan, A. Vedaldi, and A. Zisserman, “Deep inside convolutional networks: visualising image classification models and saliency maps,” 2013, https://arxiv.org/abs/1312.6034.
M. Figurnov, A. Ibraimova, D. P. Vetrov et al., “PerforatedCNNs: acceleration through elimination of redundant convolutions,” in Advances in Neural Information Processing Systems, 2016.
G. Montavon, S. Lapuschkin, A. Binder, W. Samek, and K.-R. Müller, “Explaining nonlinear classification decisions with deep Taylor decomposition,” Pattern Recognition, vol. 65, pp. 211–222, 2017.
K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” 2014, https://arxiv.org/abs/1409.1556.
R. C. Gonzalez, R. E. Woods, and S. L. Eddins, Digital Image Processing Using MATLAB, Pearson-Prentice-Hall, Upper Saddle River, NJ, USA, 2004.
B. H. Menze, A. Jakab, S. Bauer et al., “The multimodal brain tumor image segmentation benchmark (BRATS),” IEEE Transactions on Medical Imaging, vol. 34, no. 10, pp. 1993–2024, 2015.
https://sites.google.com/site/braintumorsegmentation/home.
P. Blanc-Durand, A. Van Der Gucht, N. Schaefer et al., “Automatic lesion detection and segmentation of 18F-FET PET in gliomas: a full 3D U-Net convolutional neural network study,” PLoS One, vol. 13, no. 4, Article ID e0195798, 2018.
C. Cao, Y. Huang, Y. Yang et al., “Feedback convolutional neural network for visual localization and segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, pp. 19–32, 2018.
https://www.spineweb.digitalimaginggroup.ca/spineweb/index.php?action=home.

Copyright

Copyright © 2019 Feng-Ping An and Zhi-Wen Liu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
