Research Article | Open Access
Automatic Lung Segmentation Based on Texture and Deep Features of HRCT Images with Interstitial Lung Disease
Lung segmentation in high-resolution computed tomography (HRCT) images is a necessary step before the computer-aided diagnosis (CAD) of interstitial lung disease (ILD). Traditional methods are less automated and segment less accurately. This paper develops a novel automatic segmentation model using radiomics with a combination of hand-crafted features and deep features. The study uses the ILD Database-MedGIFT, with 108 annotated image series from 128 patients, and selects 1946 regions of interest (ROIs) of lung tissue patterns for training and testing. First, images are denoised with a Wiener filter. Then, segmentation is performed by fusing features extracted from the gray-level co-occurrence matrix (GLCM), a classic texture analysis method, and U-Net, a standard convolutional neural network (CNN). The final segmentation result in terms of the dice similarity coefficient (DSC) is 89.42%, which is comparable to state-of-the-art methods. The training performance shows the effectiveness of combining texture and deep radiomics features in lung segmentation.
1. Introduction
Interstitial lung disease (ILD) is a generic term for a heterogeneous group of clinicopathological entities whose basic pathological changes are diffuse lung parenchymal lesions, alveolar inflammation, and interstitial fibrosis. It is estimated that the morbidity of ILD is 26–32 cases per 100,000 people per year. Although ILD develops slowly, without early treatment it may become impossible to cure once it flares up, causing great harm to patients. High-resolution computed tomography (HRCT) shows the fine structures of lung tissue so clearly that it is considered the preferred method for diagnosing ILD [3, 4]; examples are shown in Figure 1. Four main categories of findings may be seen on HRCT in ILD: reticular patterns, nodular patterns, increased lung attenuation, and decreased lung attenuation [5, 6].
However, because of differences in radiologists' experience, the quality of facilities, and nonspecific lung lesion patterns, HRCT interpretation is highly variable. Computer-aided diagnosis (CAD) systems have been widely used to reduce this variability through quantitative analysis of pulmonary lesion characteristics and automatic diagnosis. Segmenting the lung fields in HRCT images into regions of interest (ROIs) is the first step in CAD of lung disease. However, segmentation of HRCT images for ILD faces several challenges: (1) noise in HRCT images, which results in fuzzy edges; (2) the need for low-, middle-, and high-level features to distinguish similar areas; and (3) strict accuracy requirements on the segmentation algorithm.
Radiomics, which extracts large numbers of quantitative features from radiographic images, plays an important role in automatic segmentation. Among the various categories of radiomics features, both texture features and deep features can capture significant information from the ROI for accurate segmentation. Because texture is formed by gray-level distributions that recur across spatial positions, the gray levels in an image have a certain spatial correlation. The gray-level co-occurrence matrix (GLCM) can be used to extract texture features from abnormal tissues that describe this spatial gray-level relationship. Recently, deep learning, an end-to-end approach built from multiple neural network layers, has become widespread in medical image processing. It can extract deep features using convolutional neural networks (CNNs), the most popular architecture.
In this work, we build an automatic segmentation model based on radiomics with deep features and texture features. The contributions of this work are as follows: (1) we propose a new automatic method that uses noise preprocessing, deep features, and texture features for robust lung segmentation and (2) we extract radiomics features to support ILD diagnosis. The rest of this paper is organized as follows: Section 2 reviews segmentation models used in previous studies. Section 3 describes the proposed method in detail. Section 4 evaluates the feasibility and effectiveness of the method for clinical application to ILD on HRCT images. Section 5 summarizes the research and highlights future work.
2. Related Works
Lung segmentation methods are mainly divided into four categories [10–12]: threshold methods, edge-based methods, region-based methods, and intelligent methods. Because the lung looks clearly different from the surrounding regions in CT scans, threshold-based methods are the easiest to understand and operate: they only need to compute a threshold that separates the lung from other tissues [13–16]. However, the main disadvantage of threshold methods is inaccurate lung segmentation, since some pulmonary components resemble the chest structures. Edge-based segmentation applies edge detection filters in different directions to distinguish the lung boundaries in radiographs [17–19]. The edge points located by the tracing procedure together form a spatially closed outline of the final pulmonary segments. Relying on the fact that adjacent pixels within one region are similar, region-based segmentation compares each pixel with its neighbors to determine whether they belong to the same set. The best-known region-based method is region growing [20, 21]: a seed (a small patch), initialized at the most representative voxel, grows continuously to extract the target lung region [22–24]. Although region-based methods are more efficient than threshold-based methods, they may need preprocessing and postprocessing when the regions to be segmented show high levels of abnormality, for example, noise in the CT data. Intelligent methods combine segmentation with advanced image-processing algorithms, such as pattern recognition, fuzzy theory, Markov random field theory, and wavelet analysis, which achieve more accurate and realistic lung segmentation results.
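To make the region-growing idea concrete, the following is a minimal sketch rather than any of the cited implementations. It assumes a 2D image, a single seed pixel, 4-connectivity, and a fixed intensity tolerance around the seed value; the function name `region_grow` and the toy intensity values are hypothetical.

```python
from collections import deque

import numpy as np


def region_grow(image, seed, tolerance):
    """Grow a region from `seed`, adding 4-connected neighbors whose
    intensity differs from the seed intensity by at most `tolerance`."""
    rows, cols = image.shape
    seed_value = float(image[seed])
    mask = np.zeros(image.shape, dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not mask[nr, nc]:
                if abs(float(image[nr, nc]) - seed_value) <= tolerance:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask


# Toy slice: dark (air-like) pixels inside brighter (soft-tissue-like) pixels.
ct = np.array([
    [40,   40,   40, 40,   40],
    [40, -800, -820, 40,   40],
    [40, -790, -810, 40, -805],
    [40,   40,   40, 40,   40],
])
lung_mask = region_grow(ct, seed=(1, 1), tolerance=50)
print(lung_mask.astype(int))
```

The isolated dark pixel at position (2, 4) has a lung-like intensity but is not connected to the seed, so it is left out. This shows both the strength of region growing (spatial coherence) and its dependence on seed placement.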
Though these segmentation technologies strive to obtain the final output by defining an initial threshold and combining with other methods to constantly optimize it, no single segmentation method achieves globally optimal performance for all cases.
3. The Proposed Automatic Segmentation Method
The goal of the proposed automatic segmentation model is to accurately segment the lung for ILD. The diagram of the method is shown in Figure 2; the procedure consists of preprocessing and segmentation. Preprocessing mainly refers to denoising, and segmentation relies on radiomics features in two stages: texture feature extraction and deep feature extraction. The first stage uses GLCM; its input is the denoised images and its output is the initial segmented images. The second stage uses U-Net (a classic deep learning network); its input is the denoised images together with the output of the first stage, and its output is the final refined segmented images. The procedure finishes when the segmentation contour is the same as the previous contour.
3.1. Preprocessing
The lung images to be segmented may pick up noise during acquisition and transmission, which distorts the HRCT images. However, preserving the original quality of the radiographs is essential to ensure the accuracy of CAD for ILD. Gaussian noise is the most common noise type, caused by poor illumination or high temperature. Gaussian noise is noise whose probability density function follows a Gaussian distribution, defined as follows:

$$p(z) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(z - \mu)^{2}}{2\sigma^{2}}\right),$$

where $z$ is the noise value and $\mu$ and $\sigma$, respectively, are the expectation and standard deviation of the noise. After the Gaussian noise is added, the image is defined as follows:

$$g(x, y) = f(x, y) + n(x, y),$$

where $x$ and $y$ are the position of every pixel on the image, $f(x, y)$, which denotes the original input image, is the pixel for every position, $n(x, y)$ is the Gaussian noise at that position, and $g(x, y)$ is the resulting noisy image.
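The Wiener filter is commonly regarded as an optimal method for CT image denoising. The Wiener filter is also often used to remove Gaussian noise and to better handle blurred edges in image segmentation [32–34]. Therefore, in this paper, we employ the Wiener filter to reduce the Gaussian noise. The Wiener filter is defined by

$$\hat{F}(u, v) = \left[\frac{H^{*}(u, v)}{|H(u, v)|^{2} + S_{n}(u, v)/S_{f}(u, v)}\right] G(u, v),$$

where $G(u, v)$ is the Fourier transform of the input image, $H(u, v)$ is the blurring function with complex conjugate $H^{*}(u, v)$, $S_{n}(u, v)/S_{f}(u, v)$ is the ratio of the noise power spectrum to the power spectrum of the original image, and $\hat{F}(u, v)$ is the Fourier transform of the restored image. The main principle of the Wiener filter is to use linear estimation to minimize the mean square error (MSE) between the original image $f(x, y)$ and its estimate $\hat{f}(x, y)$, i.e., to remove the Gaussian noise.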
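The denoising step can be illustrated with a short sketch. This is not the paper's implementation: it assumes a spatial-domain (locally adaptive) Wiener filter rather than the frequency-domain form, a synthetic smooth test image instead of HRCT data, and a noise power estimated as the average local variance. The function names `add_gaussian_noise` and `adaptive_wiener` are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def add_gaussian_noise(image, mean, std, rng):
    """Return the noisy image: the input plus N(mean, std^2) noise per pixel."""
    return image + rng.normal(mean, std, size=image.shape)


def adaptive_wiener(image, window=5, noise_var=None):
    """Locally adaptive Wiener filter over an odd-sized square window."""
    pad = window // 2
    padded = np.pad(image, pad, mode="reflect")
    patches = sliding_window_view(padded, (window, window))
    local_mean = patches.mean(axis=(-2, -1))
    local_var = patches.var(axis=(-2, -1))
    if noise_var is None:
        # Assumption: estimate the noise power as the average local variance.
        noise_var = local_var.mean()
    # Keep detail where the local variance exceeds the noise power,
    # fall back to the local mean elsewhere.
    gain = np.maximum(local_var - noise_var, 0.0) / np.maximum(local_var, 1e-12)
    return local_mean + gain * (image - local_mean)


rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
clean = np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / (2 * 12.0 ** 2))
noisy = add_gaussian_noise(clean, mean=0.0, std=0.1, rng=rng)
denoised = adaptive_wiener(noisy, window=5)

mse_noisy = float(np.mean((noisy - clean) ** 2))
mse_denoised = float(np.mean((denoised - clean) ** 2))
print(f"MSE noisy: {mse_noisy:.5f}, MSE denoised: {mse_denoised:.5f}")
```

Comparing the two MSE values against the clean image mirrors the minimum-MSE criterion described in the text. On real HRCT slices the window size and the noise estimate would need tuning.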
3.2. Texture Features
The texture features are extracted from the gray-level co-occurrence matrix (GLCM). The GLCM records the co-occurrence of gray levels between pairs of pixels separated by a certain distance $d$ and oriented in a particular direction $\theta$ within an image region (ROI) whose gray levels range from 0 to $N_g - 1$. The GLCM element can be defined as follows:

$$P(i, j \mid d, \theta) = \#\{((x_1, y_1), (x_2, y_2)) \in \mathrm{ROI} \times \mathrm{ROI} : I(x_1, y_1) = i,\ I(x_2, y_2) = j,\ (x_2, y_2) = (x_1, y_1) + (\Delta x, \Delta y)\},$$

where $(x_1, y_1)$ and $(x_2, y_2)$ are the pixels in the ROI, $I(x, y)$ is the gray level of the pixel, and $\#$ is the number of pixel pairs which meet the condition. For direction $\theta$, the values of the offset parameters $(\Delta x, \Delta y)$ at different $\theta$ are given in Table 1. In this paper, for the texture calculation, the GLCM must be symmetrical, and each entry of the GLCM should be a probability value obtained by normalization. The element of the normalized gray-level co-occurrence matrix (NGLCM) is defined as follows:

$$p(i, j) = \frac{P(i, j \mid d, \theta)}{\sum_{m=0}^{N_g - 1} \sum_{n=0}^{N_g - 1} P(m, n \mid d, \theta)}.$$
Figure 3 shows an example of the computation of the GLCM and NGLCM, where every NGLCM cell contains a probability value. Figure 3(a) shows an image with 5 gray levels (from 0 to 4) and a reference pixel (2, 2) with the four directions. For example, the element (0, 2) in Figure 3(b) is 2 because the pair (0, 2) occurs twice in the input image for the direction considered, according to formula (5). The formulas of the 14 features extracted from the GLCM (Angular Second Moment, Contrast, Correlation, Difference Variance, Difference Entropy, etc.) are described in detail in the literature. Then, 120 NGLCMs are computed (over four directions), yielding 1680 scalar feature values (14 features per NGLCM).
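As an illustration of the texture computation, the sketch below builds a symmetric GLCM for each of the four directions on a small 4 × 4 toy image (not the image of Figure 3), normalizes it, and evaluates two of the 14 features, Angular Second Moment and Contrast. The offset convention in `OFFSETS` and the distance of 1 are assumptions, since Table 1 is not reproduced here; all function names are hypothetical.

```python
import numpy as np

# Assumed (dx, dy) offsets at distance 1; rows grow downward, so "up" is dy = -1.
OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}


def glcm(image, angle, levels):
    """Symmetric co-occurrence counts of gray-level pairs for one direction."""
    dx, dy = OFFSETS[angle]
    counts = np.zeros((levels, levels), dtype=np.int64)
    rows, cols = image.shape
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                counts[image[y, x], image[y2, x2]] += 1
    # Adding the transpose counts each pair in both orders (symmetric GLCM).
    return counts + counts.T


def normalize(counts):
    """Turn counts into probabilities that sum to one (NGLCM)."""
    return counts / counts.sum()


def angular_second_moment(p):
    return float((p ** 2).sum())


def contrast(p):
    i, j = np.indices(p.shape)
    return float(((i - j) ** 2 * p).sum())


roi = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
])

counts_0 = glcm(roi, angle=0, levels=4)
p_0 = normalize(counts_0)
asm_0 = angular_second_moment(p_0)
contrast_0 = contrast(p_0)

features = {}
for angle in OFFSETS:
    p = normalize(glcm(roi, angle, levels=4))
    features[angle] = (angular_second_moment(p), contrast(p))

print(counts_0)
print(features)
```

Each direction contributes one NGLCM and one value per feature, so the number of scalar features grows as (number of NGLCMs) × (features per NGLCM), which matches the 120 × 14 = 1680 count stated above.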
3.3. Deep Features
The deep CNN features are extracted from the classic U-Net. U-Net, which yields more accurate segmentation, is based on the fully convolutional network and is suitable for training on a small number of medical images. Figure 4 shows the U-shaped architecture of U-Net. The network consists of two parts, i.e., downsampling and upsampling. In this paper, the downsampling path acts as an encoder made of 3 repeated blocks, each with two 3 × 3 convolutional layers followed by a rectified linear unit (ReLU) and a 2 × 2 max pooling layer. The upsampling path acts as a decoder that also consists of 3 repeated blocks, each with a 2 × 2 upconvolutional layer followed by a cropping operation on the feature map from the downsampling path, two 3 × 3 convolutional layers, and a ReLU. At every cropping step, a concatenation with the cropped feature map is added to make up for the border pixels lost in each convolution. Finally, the network produces a convolutional deep feature map for the segmentation result. The loss function combines softmax and cross-entropy:

$$E = -\sum_{x \in \Omega} w(x) \log\big(p_{\ell(x)}(x)\big), \qquad p_{k}(x) = \frac{\exp(a_{k}(x))}{\sum_{k'=1}^{K} \exp(a_{k'}(x))},$$

where $w(x)$ is the weight function, $a_{k}(x)$ is the activation function for channel $k$ at pixel position $x \in \Omega$, $K$ is the number of classes, and $\ell(x)$ is the true label of pixel $x$.
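The softmax plus cross-entropy loss mentioned above can be sketched in NumPy as follows. This is an illustrative version, not the TensorFlow code used in the experiments: it assumes a pixel-wise softmax over the channel axis, an integer label map, and a per-pixel weight map, and it takes the loss as the negative weighted sum of log-probabilities of the true class. The function names are hypothetical.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along `axis`."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def weighted_pixel_cross_entropy(logits, labels, weights):
    """logits: (H, W, K) activations, labels: (H, W) ints, weights: (H, W)."""
    probs = softmax(logits, axis=-1)
    # Pick, for every pixel, the probability assigned to its true class.
    true_prob = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    return float(-(weights * np.log(true_prob)).sum())


labels = np.array([[0, 1, 1],
                   [1, 0, 0]])
weights = np.ones(labels.shape)

# Uninformative network: equal activations for both classes.
uniform_logits = np.zeros(labels.shape + (2,))
loss_uniform = weighted_pixel_cross_entropy(uniform_logits, labels, weights)

# Confident and correct network: large activation on the true class.
confident_logits = np.where(np.arange(2) == labels[..., None], 5.0, 0.0)
loss_confident = weighted_pixel_cross_entropy(confident_logits, labels, weights)

print(loss_uniform, loss_confident)
```

With two classes and equal activations every pixel contributes log 2, so the uniform case gives 6 · log 2 for the six pixels. A confident, correct prediction drives the loss toward zero.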
Based on the overall segmentation architecture with the denoising and feature fusion after training, we can eventually recognize the ROI from the lung area. The detailed steps are illustrated in Algorithm 1.
4. Experiments and Discussion
In this section, we validate the method on medical images for clinical application. First, we introduce the dataset, technical experiment details, and evaluation standard. Then, we show and discuss in turn the performance of denoising, segmentation, and the training process by comparing with the baseline methods.
4.1. Dataset and Technical Details
We experimented on the ILD Database-MedGIFT, selecting data from 128 patients (47 females and 81 males, mean age of 59 years). The 108 HRCT image series are stored in DICOM format and were converted to 1946 ROIs in PNG format. To balance resolution against the computational complexity of the models, the ROI images are cropped to 512 × 512 pixels, of which 80% (1557) are training data and 20% (389) are testing data. We performed the experiments on a single NVIDIA RTX 2070 GPU using Python, and the CNN was implemented in TensorFlow with a batch size of 20, a learning rate of 1e−4, and 500 epochs. We take the masks annotated in the database (manual lung segmentation) as ground truth. We adopt the dice similarity coefficient (DSC), sensitivity (SEN), and training time (T, one epoch) as evaluation metrics for the proposed method, defined as follows:

$$\mathrm{DSC} = \frac{2\,|A \cap B|}{|A| + |B|},$$

where $A$ is the area of the ground truth and $B$ is the area of the lung segmented by the proposed method. The value of DSC is between zero and one:

$$\mathrm{SEN} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}},$$

where TP is true positive and FN is false negative.
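The two overlap metrics can be computed from binary masks as in the following sketch. The mask values are toy data, the function names are hypothetical, and the return values chosen for empty masks are assumptions that the text does not specify.

```python
import numpy as np


def dice_similarity(truth, pred):
    """DSC = 2 |A ∩ B| / (|A| + |B|) for binary masks A (truth) and B (pred)."""
    truth = np.asarray(truth, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    total = truth.sum() + pred.sum()
    if total == 0:
        return 1.0  # Assumption: two empty masks count as a perfect match.
    return float(2.0 * np.logical_and(truth, pred).sum() / total)


def sensitivity(truth, pred):
    """SEN = TP / (TP + FN), the fraction of true lung pixels that were found."""
    truth = np.asarray(truth, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    tp = np.logical_and(truth, pred).sum()
    fn = np.logical_and(truth, ~pred).sum()
    if tp + fn == 0:
        return 0.0  # Assumption: undefined case reported as zero.
    return float(tp / (tp + fn))


truth_mask = np.array([[0, 1, 1, 0],
                       [0, 1, 1, 0]])
pred_mask = np.array([[0, 1, 1, 0],
                      [0, 1, 0, 0]])

dsc = dice_similarity(truth_mask, pred_mask)
sen = sensitivity(truth_mask, pred_mask)
print(f"DSC = {dsc:.4f}, SEN = {sen:.4f}")
```

Here three of the four ground-truth pixels are recovered with no false positives, so DSC = 6/7 and SEN = 3/4. DSC penalizes both missed and spurious pixels, whereas SEN only reflects missed ones.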
4.2. Segmentation Results
To illustrate the effectiveness of the proposed method, we compared it with the following methods: (1) GLCM, (2) U-Net, (3) fully convolutional networks (FCNs) (another commonly used segmentation method), and (4) GU: GLCM + U-Net (without denoising).
We first show some examples of the segmentation results obtained by the five methods, together with the ground truth, in Figure 5. Figure 5 shows that our method achieves the best segmentation results. Although the results of the other methods are similar to the ground truth, they often contain falsely segmented areas. For example, the regions segmented by GU and our method are more accurate than those of the remaining methods, and ours is slightly better, with less noise. In addition, the second row of Figure 5 shows that the results of GLCM, U-Net, and FCN contain some confounding areas.
Table 2 then presents the DSC (average value) and SEN of the segmentation results on the testing dataset, together with T on the training dataset, for our method and the four compared methods. Notably, our method (in bold) is better than the other four methods in terms of both DSC and SEN. Moreover, its training time is the shortest, indicating that its complexity is lower and that it is easy to run. In particular, the DSC of our method (89.42%) is clearly higher than that of GLCM (80.47%), which suggests that deep features are much more important than texture features for accurate segmentation. On the other hand, the SEN of GLCM is slightly better than that of U-Net and FCN, which implies that texture features miss fewer true lung pixels. Hence, combining deep features and texture features is a necessary step in lung segmentation. In addition, U-Net performs better than FCN, and our method further improves on both of these conventional deep learning methods.
4.3. Influence of Combined Radiomics Strategy
In this group of experiments, we illustrate the effectiveness of combining texture and deep radiomics features in lung segmentation with ILD. We compare the segmentation results of GLCM, U-Net, and our method in terms of DSC and SEN, as shown in Figure 6. Figure 6 shows that our method is significantly better than using only GLCM or U-Net. U-Net yields a much higher DSC than GLCM, while the SEN of U-Net is almost the same as that of GLCM. The combination of the two kinds of features yields better performance.
We further show the training performance (according to equation (5)) of U-Net, GU, and our method in Figure 7. The training loss in Figure 7 also shows that our method, which combines the texture and deep radiomics features, reaches a lower loss and performs better than the other two methods.
5. Conclusions
We propose a novel automatic segmentation method using radiomics for ILD patterns in HRCT images. After denoising with the Wiener filter in preprocessing, we fuse texture features based on GLCM and deep features based on U-Net to obtain the segmentation contour. In the lung segmentation experiments with ILD, the model achieves higher accuracy and better overall performance than the conventional methods. The segmentation results demonstrate both the necessity of denoising and the utility of radiomics features for segmentation. The results for DSC, SEN, and T show the usefulness of combining deep features and texture features. In the future, we will try to combine the segmentation model with lung tissue classification for better CAD of ILD.
Data Availability
The HRCT data used to support the findings of this study have been deposited in the ILD Database-MedGIFT repository (http://medgift.hevs.ch/wordpress/databases/ild-database/).
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
References
- T. E. King Jr., H. R. Collard, and L. Richeldi, “Interstitial lung diseases. Preface,” Clinics in Chest Medicine, vol. 33, no. 1, 2012.
- A. M. Russell and T. M. Maher, “Detecting anxiety and depression in patients diagnosed with an interstitial lung disease. Can we do better?” Respirology, vol. 19, no. 8, pp. 1095-1096, 2014.
- V. Verónica, B. João, M. Luis, and S. S. José, “Enhanced classification of interstitial lung disease patterns in HRCT images using differential lacunarity,” BioMed Research International, vol. 2015, Article ID 672520, 9 pages, 2015.
- I. Sluimer, A. Schilham, M. Prokop, and B. van Ginneken, “Computer analysis of computed tomography scans of the lung: a survey,” IEEE Transactions on Medical Imaging, vol. 25, no. 4, pp. 385–405, 2006.
- J. G. Goldin, D. A. Lynch, D. C. Strollo et al., “High-resolution CT scan findings in patients with symptomatic scleroderma-related interstitial lung disease,” Chest, vol. 134, no. 2, pp. 358–367, 2008.
- J. J. Xiu, Y. X. Li, and Y. F. Cui, “The diagnosis of interstitial lung disease in high resolution CT,” Journal of Medical Imaging, vol. 14, no. 7, pp. 585–588, 2004.
- P. Lambin, E. Rios-Velazquez, R. Leijenaar et al., “Radiomics: extracting more information from medical images using advanced feature analysis,” European Journal of Cancer, vol. 48, no. 4, pp. 441–446, 2012.
- P. Yang and G. Yang, “Feature extraction using dual-tree complex wavelet transform and gray level co-occurrence matrix,” Neurocomputing, vol. 197, pp. 212–220, 2016.
- Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
- R. P. Ammi, B. K. Giri, K. R. E. Venkata, and B. I. Ramesh, “Automated lung segmentation from HRCT scans with diffuse parenchymal lung diseases,” Journal of Digital Imaging, vol. 29, no. 4, pp. 507–519, 2016.
- M. Awais, B. Ulas, F. Brent et al., “Segmentation and image analysis of abnormal lungs at CT: current approaches, challenges, and future trends,” Radiographics, vol. 35, no. 4, pp. 1056–1076, 2015.
- Z. H. Shi, J. J. Ma, M. H. Zhao et al., “Many is better than one: an integration of multiple simple strategies for accurate lung segmentation in CT images,” BioMed Research International, vol. 2016, Article ID 1480423, 13 pages, 2016.
- A. R. Amanda and R. Widita, “Comparison of image segmentation of lungs using methods: connected threshold, neighborhood connected, and threshold level set segmentation,” Journal of Physics Conference Series, vol. 694, pp. 1201–1207, 2016.
- N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
- W. Zhang, X. L. Zhang, J. J. Zhao et al., “A segmentation method for lung nodule image sequences based on superpixels and density-based spatial clustering of applications with noise,” PLoS One, vol. 12, no. 9, Article ID e0184290, 2017.
- R. E. Van, B. DeHoop, D. V. S. Van, M. Prokop, and G. B. Van, “Automatic segmentation of pulmonary segments from volumetric chest CT scans,” IEEE Transactions on Medical Imaging, vol. 28, no. 4, pp. 621–630, 2009.
- A. Qaisar, “Segmentation of differential structures on computed tomography images for diagnosis lung-related diseases,” Biomedical Signal Processing and Control, vol. 33, no. 3, pp. 325–334, 2017.
- A. Gupta, O. Martens, Y. L. Moullec, and T. Saar, “Methods for increased sensitivity and scope in automatic segmentation and detection of lung nodules in CT images,” in Proceedings of the IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), pp. 375–380, Abu Dhabi, UAE, December 2015.
- P. Campadelli, E. Casiraghi, and D. Artioli, “A fully automated method for lung nodule detection from postero-anterior chest radiographs,” IEEE Transactions on Medical Imaging, vol. 25, no. 12, pp. 1588–1603, 2006.
- S. A. Hojjatoleslami and J. Kittler, “Region growing: a new approach,” IEEE Transactions on Image Processing, vol. 7, no. 7, pp. 1079–1084, 1998.
- M. S. Haleem, L. Han, J. van Hemert et al., “A novel adaptive deformable model for automated optic disc and cup segmentation to aid glaucoma diagnosis,” Journal of Medical Systems, vol. 42, no. 1, p. 20, 2018.
- J. K. Dash, V. Madhavi, S. Mukhopadhyay, N. Khandelwal, and P. Kumar, “Segmentation of interstitial lung disease patterns in HRCT images,” in Proceedings of the Medical Imaging: Computer-Aided Diagnosis, Orlando, FL, USA, February 2015.
- D. N. Giorgio, T. Eleonora, A. Antonella et al., “Automatic lung segmentation in CT images with accurate handling of the hilar region,” Journal of Digital Imaging, vol. 24, no. 1, pp. 11–27, 2011.
- Z. Shi, J. Ma, M. Zhao, Y. Liu, Y. Feng, and M. Zhang, “[Article withdrawn] novel method using multiple strategies for accurate lung segmentation in computed tomography images,” Journal of Medical Imaging and Health Informatics, vol. 6, no. 5, pp. 1271–1275, 2016.
- J. R. Ferreira, M. Koenigkam-Santos, F. E. G. Cipriano, A. T. Fabro, and P. M. de Azevedo-Marques, “Radiomics-based features for pattern recognition of lung cancer histopathology and metastases,” Computer Methods and Programs in Biomedicine, vol. 159, pp. 23–30, 2018.
- Q. Mao, S. Zhao, T. Gong, and Q. Zheng, “An effective hybrid windowed fourier filtering and fuzzy C-mean for pulmonary nodule segmentation,” Journal of Medical Imaging and Health Informatics, vol. 8, no. 1, pp. 72–77, 2018.
- A. Soliman, F. Khalifa, A. Elnakib et al., “Accurate lungs segmentation on CT chest images by adaptive appearance-guided shape modeling,” IEEE Transactions on Medical Imaging, vol. 36, no. 1, pp. 263–276, 2017.
- O. Talakoub, J. Alirezaie, and P. Babyn, “Lung segmentation in pulmonary CT images using wavelet transform,” in Proceedings of the 2007 IEEE International Conference on Acoustics, Speech, and Signal Processing, Honolulu, HI, USA, June 2007, vol. 1, Pts 1–3, p. 453.
- O. Ronneberger, P. Fischer, and T. Brox, “U-Net: convolutional networks for biomedical image segmentation,” in Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany, October 2015.
- B. Tudor, “Variational image denoising approach with diffusion porous media flow,” Abstract and Applied Analysis, vol. 2013, Article ID 856876, 8 pages, 2013.
- S. P. Gou, Y. Y. Wang, Z. L. Wang et al., “CT image sequence restoration based on sparse and low-rank decomposition,” PLoS One, vol. 8, no. 9, Article ID e72696, 2013.
- H. Qin and S. X. Yang, “Adaptive neuro-fuzzy inference systems based approach to nonlinear noise cancellation for images,” Fuzzy Sets and Systems, vol. 158, no. 10, pp. 1036–1063, 2007.
- T. D. Pham, “Estimating parameters of optimal average and adaptive wiener filters for image restoration with sequential Gaussian simulation,” IEEE Signal Processing Letters, vol. 22, no. 11, pp. 1950–1954, 2015.
- S. L. Yi and J. F. He, “Image denoising method based on BEMD and adaptive Wiener filter,” Computer Engineering and Applications, vol. 49, no. 10, pp. 156–158, 2013.
- R. M. Haralick, K. Shanmugam, and I. H. Dinstein, “Textural features for image classification,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 3, no. 6, pp. 610–621, 1973.
- S. Beura, B. Majhi, and R. Dash, “Mammogram classification using two dimensional discrete wavelet transform and gray-level co-occurrence matrix for detection of breast cancer,” Neurocomputing, vol. 154, pp. 1–14, 2015.
- J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3431–3440, Boston, MA, USA, June 2015.
- A. Depeursinge, A. Vargas, A. Platon, A. Geissbuhler, P.-A. Poletti, and H. Müller, “Building a reference multimedia database for interstitial lung diseases,” Computerized Medical Imaging and Graphics, vol. 36, no. 3, pp. 227–238, 2012.
- H. Salehinejad, S. Valaee, T. Dowdell, E. Colak, and J. Barfett, “Generalization of deep neural networks for chest pathology classification in X-rays using generative adversarial networks,” in Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 990–994, IEEE, Calgary, Canada, April 2018.
- J. Zhang, A. Saha, Z. Zhu, and M. A. Mazurowski, “Hierarchical convolutional neural networks for segmentation of breast tumors in MRI with application to radiogenomics,” IEEE Transactions on Medical Imaging, vol. 38, no. 2, pp. 435–447, 2019.
Copyright © 2019 Ting Pang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.