Abstract

Disease detection, diagnosis, and treatment can all be supported by digitized medical images. Macroscopic medical images are images, obtained using ionizing radiation or magnetism, that depict organs and body structures. In recent years, computational tools such as databases, distributed processing, digital image processing, and pattern recognition applied to digital medical images have contributed to the development of Computer-Aided Diagnosis (CAD), which serves as an auxiliary tool in health care. This work proposes the use of various architectures based on convolutional neural networks (CNNs) for the automatic detection of diseases in medical images. Different types of medical images are used, such as chest tomography for identifying types of tuberculosis and chest X-rays for detecting pneumonia, each posing its own problem of classification or of detecting patterns associated with disease. Finally, an algorithm for the automatic registration of thoracic regions is proposed, which intrinsically identifies the translation, scale, and rotation that align the thoracic regions in X-ray images.

1. Introduction

By analyzing medical images that contain a representation of the human body or parts of it, diseases can be detected, diagnosed, and treated. Even when medical history, vital signs, and laboratory tests are used to confirm diagnoses, medical images can help improve the diagnoses made by highly trained medical specialists. Unfortunately, factors such as the patient's position, poor image quality, or a lack of expertise on the part of the doctor capturing or interpreting the data can lead to a misdiagnosis, potentially resulting in the patient's death. In recent years, digital images have enabled the development of systems for computer-aided diagnosis, which serve as an auxiliary tool in medicine, reducing costs and allowing for more efficient disease detection through pattern recognition. Some systems can automatically detect specific diseases in medical images. They extract significant patterns associated with diseases using a variety of sophisticated algorithms; among the most widely used are segmentation methods such as the watershed transform [1] and image processing techniques such as the Laplacian of Gaussian and the local binary pattern (LBP) [2], to name a few. Convolutional neural networks (CNNs) are a type of deep learning model that is currently among the most successful approaches to medical image analysis. MobileNets are efficient models for mobile and embedded vision applications. They are based on a streamlined architecture that uses depthwise separable convolutions to build light deep neural networks, together with two simple global hyperparameters that efficiently trade off latency and accuracy. These hyperparameters allow the model builder to choose the right model size for their application based on the problem's constraints. MobileNets are used in many applications and use cases, including object detection, classification, face attributes, and large-scale geolocation [3].
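To make the role of the two global hyperparameters concrete, the following is a minimal sketch, assuming the TensorFlow/Keras implementation of MobileNet; the width multiplier and input resolution values are illustrative, not those used in this work.

```python
# Minimal sketch of MobileNet's two global hyperparameters
# (assumed TensorFlow/Keras API; values are illustrative).
import tensorflow as tf

model = tf.keras.applications.MobileNet(
    input_shape=(224, 224, 3),  # resolution multiplier: smaller inputs reduce latency
    alpha=0.5,                  # width multiplier: thins every layer to half the filters
    weights="imagenet",
)
model.summary()  # reports the reduced parameter count
```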

This work proposes the use of convolutional neural network architectures for automatic disease detection in medical images, together with a preprocessing architecture that uses neural networks to perform image registration. CNNs can reach classification accuracies of up to 96% (one reason they are currently used by companies such as Google and Facebook), so understanding a convolutional neural network can contribute to automation and to reductions in processing time and cost, while adding preprocessing stages brings machine learning methods and computer-aided diagnostic systems to the forefront. This paper explains the essential components of a CNN architecture and examines how different parameter variations affect the performance of the disease detection task. The application of computational tools and mathematical methods allows the analysis, recognition, and classification of patterns, so these methods can be applied to the detection of diseases through the analysis of macroscopic medical images. Due to the great success and promising results obtained in image classification, there is currently strong interest in using deep learning techniques in medical imaging. Although not a new invention, convolutional neural networks [4] can classify images and identify specific patterns in them. We now see many CNN-based models achieving state-of-the-art results in classification, localization, segmentation, and action recognition tasks, among others [5]. In what follows, work on the classification task in medical images to detect pneumonia is presented, followed by work on image registration using convolutional neural networks and, finally, chest radiograph registration. This research uses convolutional neural network architecture models both to solve classification problems such as tuberculosis detection and to perform registration on chest X-ray images.

2. CNNs for Medical Image Registration and Classification Problems

Image registration is the process that seeks the geometric transformation that aligns two images in the best possible correspondence. The following sections present a registration proposal based on CNNs and principal component analysis, together with its evaluation on a pneumonia classification task. As shown below, the proposed method eliminates unnecessary information and better delimits the thorax region; although preprocessing the images with the registration method yields only a marginal improvement in the classification of types of pneumonia in X-ray images, the results suggest that the preprocessing must be improved further to reveal the physiological details of the patterns associated with each type of pneumonia.

3. Registration of the Chest in X-Ray Images Using CNN

Image registration is the process of finding the geometric transformation that aligns two images in the best possible correspondence. This process can be applied in medical imaging to improve classification algorithms or the automatic detection of disease-associated patterns. This section proposes using convolutional neural networks and principal component analysis (PCA) to register chest images. The results of pneumonia detection on unregistered images are then compared to those on registered images. The evaluation shows a marginal improvement in classification precision when the images are aligned by the proposed method. Medical images exhibit patterns that depend on factors such as the angle at which the image was taken, the age of the patient, and the quality of the image; for this reason, a registration process is carried out on chest X-ray images. The result of a registration process is the set of parameters of the translation, scaling, and rotation transformations that align the images. Convolutional neural networks have local max-pooling layers that allow a network to be spatially invariant to the position of features. However, due to the typically small spatial support of max pooling (e.g., 2 × 2 pixels), this spatial invariance is only realized through a deep hierarchy of max pooling and convolutions over intermediate feature maps (convolutional layer activations). Moreover, CNNs are not invariant to significant transformations of the input data [6]. This limitation arises because a CNN has only a predefined and limited pooling mechanism to deal with spatial variations in the data [7]. For this reason, we propose carrying out a registration process on medical images, specifically chest X-rays, and then training the Inception V3 convolutional neural network model, which specializes in image classification, to compare the results on unprocessed images against images processed with our proposal. To carry out this process, we combine digital image processing, convolutional neural networks trained for detection, and the PCA method, which allows adjusting the orientation of the images.

3.1. Basis of X-Ray Images of Healthy and Pneumonia Subjects

The Radiological Society of North America (RSNA) is an international society of radiologists, physicians, and other professionals with more than 54,000 members from 146 countries worldwide. This work uses a database provided by the RSNA [8], which contains a set of images showing the presence of pneumonia as well as features of abnormal lungs. Samples of these images are used for the registration experiments in this section.

3.2. Classification of Pneumonia

Worldwide research on the analysis of medical images using convolutional neural networks has shown high-impact results. In 2017, Stanford presented CheXNet, a convolutional neural network for the detection of pneumonia in chest radiographs with deep learning [9], which analyzes patients' images and diagnoses; this translates into an accuracy of 85% in a comparison between the tool and a physician. Research focused on detection shows that current tools are helpful but still fall short of optimal automatic detection (i.e., of minimizing detection errors).

3.2.1. Registration in Images Using Convolutional Neural Networks

To complement convolutional network processing, a Google DeepMind group proposed adding a spatial transformer module in 2015. The basic idea behind this module is to transform the input image so that subsequent layers can classify the examples more easily. The authors propose transforming the image before it enters the relevant convolution layer rather than changing the core CNN architecture. This module addresses pose normalization (where the object is tilted or scaled) and spatial attention (directing attention to the correct object in a multi-object image). Without such a module, traditional CNNs require many training examples to learn to be invariant to images with different scales and rotations.

3.2.2. Record of Chest Radiographs

Reference [10] investigates and compares the performance of various registration algorithms based on different methods of feature extraction and matching in chest radiographic images. In particular, the combination of three interest-point descriptors is studied: SIFT (an algorithm used in computer vision to extract relevant features from images that can later be used in object recognition, motion detection, image registration, and other tasks), SURF (an algorithm capable of obtaining a visual representation of an image and extracting detailed, specific information from its content), and ORB (a fast and robust local feature detector). These were used as feature detectors, and the SIFT and SURF methods were used separately for the correspondence search. The tests were carried out on chest X-ray images obtained from real patients at various times. The highest registration precision is achieved when the SIFT and SURF descriptors are combined for interest-point extraction and the SIFT algorithm is used for feature matching [11–19]. Reference [20] presents a method to detect rotated lungs on chest radiographs for quality control and to improve automated anomaly detection; the method calculates a primary measure of rib orientation using a generalized line histogram technique.
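As an illustration of this family of feature-based methods (a sketch in the spirit of the algorithms compared in [10], not the authors' exact pipeline), the following registers one radiograph to another using ORB features in OpenCV; the similarity-transform assumption (translation, rotation, uniform scale) is ours.

```python
# Illustrative feature-based registration with ORB in OpenCV;
# SIFT/SURF variants are analogous. Inputs are grayscale images.
import cv2
import numpy as np

def register_orb(moving, fixed):
    orb = cv2.ORB_create(500)                      # detect up to 500 keypoints
    k1, d1 = orb.detectAndCompute(moving, None)
    k2, d2 = orb.detectAndCompute(fixed, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    # estimate translation + rotation + uniform scale, robust to outliers
    M, _ = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)
    h, w = fixed.shape[:2]
    return cv2.warpAffine(moving, M, (w, h))
```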

The article on automatic extraction of control points for chest X-ray images and elastic registration of the thorax [21] presents an algorithm for detecting control points and extracting the region of interest (ROI) in X-ray images. Using the mean, variance, and difference of the spatial intensity distribution, a search strategy finds the control points in chest X-ray images [22].

3.3. Proposed Registration Algorithm

This section describes the formulation of the specialized registration process for chest X-ray imaging. The mechanism of the registration process is divided into seven stages, as shown in Figure 1.

The first stage is the preprocessing of the images; this includes equalization of the images and resizing. The diagram then shows two detection processes: one for the thorax and the other for the spinal column. These processes are carried out using the MobileNet 1.0 convolutional network model. The first detection focuses on the rib cage; this model was trained with 140 images, of which 98 were used for training and 42 for testing. The training results are shown in Table 1.
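This first stage can be sketched as follows, under the assumption that standard histogram equalization and a fixed target size are used (the exact parameters are not stated in the text):

```python
# Hedged sketch of the preprocessing stage: histogram equalization
# plus resizing. The target size is an assumption for illustration.
import cv2

def preprocess(path, size=(224, 224)):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)  # X-rays are single channel
    img = cv2.equalizeHist(img)                   # spread the intensity histogram
    return cv2.resize(img, size)                  # readjust the size
```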

The convolutional neural network retraining results can be seen in Figure 2.

With the results of the detection, a cropping operation is carried out and the dimensions of the images are readjusted, which is equivalent to finding the translation and scaling parameters of a parametric model of feature-based registration (thus, the proposed model is partially invariant to the size and position of the thorax in the image). A new detection process was then applied to the resulting images with a CNN model trained to detect the spine; this model was also trained with 140 images. The training results are presented in Table 1.
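A minimal sketch of this crop-and-resize step follows, assuming the detector returns an axis-aligned box in (x, y, width, height) form; the box format and output size are hypothetical, for illustration only.

```python
# Sketch: cropping to the detected rib cage box and resizing, which
# implicitly applies the translation and scale of the registration
# model. The box format is an assumed convention.
import cv2

def crop_to_box(img, box, out_size=(224, 224)):
    x, y, w, h = box                    # top-left corner plus width and height
    roi = img[y:y + h, x:x + w]         # translation: keep only the thorax
    return cv2.resize(roi, out_size)    # scale: map the thorax to a common frame
```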

The convolutional neural network retraining results can be seen in Figure 3. The spine detector outputs rectangular boxes around regions of the spine in the evaluated image. A color selection process extracts the point cloud of the perimeter of each rectangle. From the point cloud coordinates, principal component analysis (PCA) then locates the eigenvector with the largest associated eigenvalue, from which the rotation angle of the spinal column within the rib cage is calculated.

Principal component analysis is one of the key tools of statistical data analysis; it rotates the axes on which the dimensions describing the data are defined. The axes are selected so that the variance of the residuals (the dispersion of the data points with respect to the axes defined by the principal components) is minimized. In this way, the original data are projected onto subspaces in which their variance is maximized. Typically, only the first few principal components are kept, sufficient to represent the variability present in the data, which allows dimensionality reduction.

Let $X = \{x_i\}_{i=1}^{n} \subset \mathbb{R}^2$ be the matrix of point cloud coordinates of the perimeter of the detected regions containing spine portions, for $n$ detected points.

Next, we present the PCA algorithm used to detect the axis corresponding to the spine. The PCA is calculated as follows:
(1) Mean of the points: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
(2) Covariance matrix: $C = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(x_i - \bar{x})^{T}$
(3) Decomposition into eigenvalues: $C v_j = \lambda_j v_j$

Here $\lambda_j$ is an eigenvalue, and $v_j$ is the corresponding eigenvector. The eigenvalue measures the amount of information (variance) along the direction of its eigenvector. Assuming that $\lambda_1 \geq \lambda_2$, then $v_1$ is the main direction. The squares in Figure 1(g) mark the detected regions, and the points are the set of pixels used to calculate the eigenvectors, shown as black dotted lines. Once the main direction is obtained, the image is rotated so that the leading eigenvector coincides with the vertical Cartesian axis, resulting in the alignment of the spine after processing the entire image base.
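Steps (1)–(3) can be sketched directly in NumPy; here `points` is an $n \times 2$ array of the perimeter coordinates, and the sign convention of the returned angle is our assumption.

```python
# Sketch of steps (1)-(3) above: mean, covariance matrix,
# eigendecomposition, and the rotation angle of the spine axis.
import numpy as np

def spine_rotation_angle(points):
    mean = points.mean(axis=0)                       # (1) mean of the points
    centered = points - mean
    cov = centered.T @ centered / (len(points) - 1)  # (2) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)           # (3) eigendecomposition
    v = eigvecs[:, np.argmax(eigvals)]               # leading eigenvector v1
    # angle between the main direction and the vertical image axis
    return np.degrees(np.arctan2(v[0], v[1]))
```

The image is then rotated by this angle (for instance, with `cv2.warpAffine`) so that the spine axis coincides with the vertical axis.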

3.4. Evaluation of CNNs Focused on Detection

The evaluation of the convolutional neural networks focused on detection is carried out in two ways. The first evaluates detection on 20 images, 10 with pneumonia and 10 without; these images were rotated by 45°, 90°, 135°, 180°, 225°, 270°, and 315°, producing a total of 160 images. Table 2 shows the results of the neural networks focused on detecting the rib cage.
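The rotated test set can be reproduced with a sketch like the following; how the rotations were generated is not stated in the text, so the OpenCV calls and border handling here are assumptions.

```python
# Sketch: the seven rotations applied to each base image; together
# with the originals, the 20 base images yield the 160 test images.
import cv2

def rotated_versions(img):
    h, w = img.shape[:2]
    versions = [img]  # keep the unrotated original
    for angle in (45, 90, 135, 180, 225, 270, 315):
        M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        versions.append(cv2.warpAffine(img, M, (w, h)))
    return versions
```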

Sample images illustrating the results of the table above are shown. The same test was performed for spine detection with the same images as in the previous test; the results are also shown in Table 2.

As can be seen, the table reports no incorrect detections. In this evaluation, the first neural network, oriented to the detection of the rib cage, proves largely invariant to image rotation, correctly detecting 145 of the 160 images. The neural network for detecting the spinal column, on the other hand, only manages to detect it in 20 of the 160 images, indicating that it has difficulty detecting the spine due to the cloudiness of the images. The second evaluation is carried out with a total of 1000 images: 500 showing pneumonia and 500 without pneumonia. Table 3 shows the results of evaluating the CNNs retrained to detect the rib cage and the spine on these images.

As can be seen in the previous table, the neural network makes no errors in detecting the rib cage; the neural network for detecting the spine, on the other hand, shows an 82% success rate.

3.5. Evaluation of the Registration Algorithm

The evaluation of the proposed method is carried out with a total of 1000 images: 500 chest X-ray images showing pneumonia and 500 without pneumonia. Evaluating the MobileNet 1.0 CNN models trained for detection, the following is observed.

On the images without pneumonia, the chest and spine detectors achieve detection in 100% of the images. In the pneumonia images, however, the neural networks failed to detect the spine in several cases; these errors are shown in Table 4.

As can be seen in the previous table, the neural network makes no errors in detecting the rib cage; the spine detector, on the other hand, shows an eighteen percent error rate when patients have pneumonia. With these 1000 images processed using the proposed algorithm for registering chest radiographs, three evaluations are carried out. The first two focus on retraining different pretrained classification models, measuring the improvement in the performance of the CNNs; they compare the results of training CNN models with processed and unprocessed images, showing whether improvements exist. The third evaluation analyzes the result of the average sum of the images with and without the registration process, comparing the histograms of the raw and processed images.

3.5.1. First Evaluation

The first evaluation focuses on retraining the Inception v3 model to classify images with and without the presence of pneumonia. This model was retrained in three different ways: the first with images not processed by our registration method, the second with the images resulting from our method without readjusting their sizes (which vary depending on the size of the thoracic box), and the third resizing all the images resulting from our method to a fixed size. Table 5 shows the results of this evaluation. When retraining the Inception v3 model, the data resulting from the training appear in the columns, and the final test accuracy is our primary metric. A marginal improvement of 3% can be seen in this test for the processed images without resizing.
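The retraining can be sketched as standard transfer learning, assuming a Keras setup with ImageNet weights and a new binary head for pneumonia versus no pneumonia; the optimizer, head, and input size are assumptions, as the exact retraining script is not given.

```python
# Hedged sketch of retraining Inception v3 for binary pneumonia
# classification; hyperparameters are illustrative assumptions.
import tensorflow as tf

base = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", input_shape=(299, 299, 3))
base.trainable = False  # freeze the backbone; retrain only the new head

x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
out = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # pneumonia probability
model = tf.keras.Model(base.input, out)
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(...) is then run separately on each of the three image sets
```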

3.5.2. Second Evaluation

The second evaluation focuses on retraining the MobileNet v1 model to classify images with and without the presence of pneumonia, analogous to the previous evaluation but with a different model to retrain. This model was likewise retrained in three different ways: first with images not processed by our registration method, second with the images resulting from our method without readjusting their sizes (which vary depending on the size of the thoracic box), and third resizing all the images resulting from our method to a fixed size. Table 5 shows the results of this evaluation.
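Under the same assumptions as the previous sketch, this evaluation only swaps the backbone; the binary head and training setup are unchanged.

```python
# Hypothetical backbone swap for the MobileNet v1 evaluation; the
# classification head stays as in the Inception v3 sketch above.
import tensorflow as tf

base = tf.keras.applications.MobileNet(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
```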

When retraining the MobileNet 1.0 model, the training results on the test images are shown in the last column; the final accuracy on the test images is the most important metric for these networks. A slight improvement of 0.4% can also be observed in this test with the resized processed images.

3.5.3. Third Evaluation

This evaluation analyzes the result of the average sum of 500 images that were not aligned with our method and 500 images that were registered; the evaluated images show pneumonia. The left side of Figure 4 shows the average sum of the original images without the registration process. The right side shows that when the images are registered, the thorax occupies a greater area in the average image, eliminating unnecessary information and allowing better detection: the dark areas corresponding to the lungs are more evident. Figure 4 thus shows the average image of the 500 raw samples and the average thorax image after registering the samples with the proposed CNN-based method.
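The average-sum image can be computed with a short NumPy sketch, assuming the 500 grayscale images have already been brought to a common size:

```python
# Sketch: pixel-wise mean of a set of equally sized grayscale X-rays,
# as used to build the average images in Figure 4.
import numpy as np

def average_image(images):
    stack = np.stack([img.astype(np.float64) for img in images])
    return stack.mean(axis=0).astype(np.uint8)  # back to 8-bit for display
```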

The histogram in Figure 5 shows the frequency, in pixels, with which the different intensity levels appear in the previous images; the intensity level lies in the range from 0 to 255, where 0 represents black and 255 represents white. Observing the histogram, it can be appreciated that the illumination intensity is distributed more evenly in the average image of the images processed by the registration algorithm.
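The histogram comparison amounts to counting pixels per intensity level; a minimal NumPy sketch (the bin layout is our choice):

```python
# Sketch: 256-bin intensity histogram of an 8-bit average image,
# used to compare the raw and registered averages.
import numpy as np

def intensity_histogram(img):
    hist, _ = np.histogram(img.ravel(), bins=256, range=(0, 256))
    return hist  # hist[i] = number of pixels with intensity i
```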

A similar test is performed with 500 images (Figure 6), this time without pneumonia. Note that the average image obtained by registering the images also improves the definition of the features, making the lung region less blurred.

The corresponding histogram again shows the frequency, in pixels, of the different intensity levels in the range from 0 to 255; it can be seen that the illumination intensity is distributed more evenly in the average of the processed images. This evaluation, carried out by calculating the average sum of 500 images, shows in the first case (images with pneumonia) that there is cloudiness in the lung area. In the second case (images without pneumonia), the cloudiness decreases, and the details are better appreciated in the average image. In both cases, registering chest X-ray images using CNNs delimits the images so as to take better advantage of the information they contain.

4. Conclusions

Image registration is the process that seeks the geometric transformation that aligns two images in the best possible correspondence; this work focused on a registration proposal using CNNs and principal component analysis. The results can be seen visually in the previous section. The proposed method eliminates unnecessary information and better delimits the thorax region. Regarding the classification of types of pneumonia in X-ray images, although the improvement obtained by preprocessing the images with the registration method is marginal, the results suggest that the preprocessing must be improved to reveal the physiological details of the patterns associated with each type of pneumonia. As future work, we propose using PCA (principal component analysis) to decompose the X-ray images into pseudo-color images before the CNN input layer for spine detection, which requires such a decomposition of the images.

Data Availability

The data underlying the results presented in the study are available within the manuscript.

Conflicts of Interest

The authors declare that they have no conflicts of interest.