Abstract

Since manual hemolysis test methods are given priority with practical experience and its cost is high, the characteristics of hemolysis images are studied. A hemolysis image detection method based on generative adversarial networks (GANs) and convolutional neural networks (CNNs) with extreme learning machine (ELM) is proposed. First, the image enhancement and data enhancement are performed on a sample set, and GAN is used to expand the sample data volume. Second, CNN is used to extract the feature vectors of the processed images and label eigenvectors with one-hot encoding. Third, the feature matrix is input to the map in the ELM network to minimize the error and obtain the optimal weight by training. Finally, the image to be detected is input to the trained model, and the image with the greatest probability is selected as the final category. Through model comparison experiments, the results show that the hemolysis image detection method based on the GAN-CNN-ELM model is better than GAN-CNN, GAN-ELM, GAN-ELM-L1, GAN-SVM, GAN-CNN-SVM, and CNN-ELM in accuracy and speed, and the accuracy rate is 98.91%.

1. Introduction

Hemolysis is the process by which the red blood cell membrane breaks down, and the hemoglobin in the cell is released into plasma, serum, and other intercellular substances. This abnormal expression can easily lead to many diseases. In particular, neonatal hemolysis has become one of the most important diseases affecting newborn health [1, 2]. In 2018, Zhang [3] analyzed 200 cases of neonatal jaundice. Neonatal hemolysis occurred in 61.61% of the 99 female infants. Among 101 baby boys, neonatal hemolysis accounted for 50.49%. Thus, it can be seen that the probability of jaundice caused by hemolysis of newborns is still relatively high. However, many patients become sicker and even lose their lives since hemolysis is not detected early enough [4].

In clinical hemolysis tests, the specimen character and quality judgment need to be visually measured and recorded by medical staff one by one, and the degree of hemolysis is easily affected by subjective factors. A Vitros 5600 biochemical immune analyzer is used to detect the hemolysis index, realizing automatic quantitative determination of the hemolysis degree [5] and avoiding errors and missing terms caused by manual operation. The concentration of NSE in neonatal hemolysis samples is calculated by the neuron-specific enolase (NSE) correction formula, which avoided repeating sample collection. It is particularly difficult to collect blood samples from newborns [6]. These methods rely on chemical instruments and the practical experience of medical personnel. The measurement of relevant data is complex and takes a long time.

In recent years, with the rapid development of deep learning technology, computer-aided detection technology has provided better results than traditional methods [7, 8]. In particular, medical image analysis provides medical personnel with more effective diagnostic and therapeutic procedures [9]. However, large-scale tagging data are difficult to obtain. Labeling medical data requires professional knowledge and invests considerable time and manpower. By training the existing data, generative adversarial networks (GANs) can automatically generate a large quantity of data to effectively solve this problem. For example, Qi et al. [10] generated a large number of diverse pulmonary nodule images through GAN, which saved time and cost. Convolutional neural networks (CNNs) also provide great help in medical image analysis. Huynh-the et al. [11] converted skeletal information into an image representation based on CNN. The experimental results with the highest accuracy were obtained on the most challenging datasets. Bueno et al. [12] achieved higher accuracy by using the sequential CNN segmentation-classification strategy. As a new training framework, the extreme learning machine (ELM) has been widely used in classification, identification, and diagnosis [13]. For example, Wang and Li [14] used ELM to classify pulse data, with an accuracy of 90.3%. Wang et al. [15] used the ELM method to quickly and accurately determine whether the human body has fallen. Cai et al. [16] verified the effectiveness and superiority of ELM in lithology identification through comparison.

To detect hemolysis images using computer-aided techniques instead of manual work and to improve the related algorithm, this paper proposes a hemolysis image detection method based on GAN-CNN-ELM. The prediction accuracy of this method is up to 98.91%, and the recognition accuracy and efficiency are better than GAN-CNN, GAN-ELM, GAN-ELM-L1, GAN-SVM, GAN-CNN-SVM, and CNN-ELM models. Furthermore, it realizes the efficient detection of hemolysis image and helps medical staff diagnose the condition in time and strives for more precious treatment time for patients.

This paper has three main contributions. (1)A large number of hemolysis images are generated by GAN to expand the dataset(2)The CNN algorithm is used to extract multilayer features from the preprocessed hemolysis experiment images(3)The ELM classifier is constructed to classify the eigenvalues and obtain more accurate prediction results

The paper is organized as follows. Section 1 briefly describes the problems existing in hemolysis detection and the solution model. In Section 2, some specific methods adopted are introduced. Section 3 gives the experimental process and results. Finally, in Section 4, several conclusions are presented, and the direction of our future work is mentioned.

2. Methodology

This paper presents an effective hemolysis image detection technique. The algorithm framework is shown in Figure 1. The main method includes three stages as follows: (a)Image preprocessing(b)CNN extraction features(c)ELM classification

2.1. Image Preprocessing

The preprocessing of images mainly includes image segmentation and clipping, image enhancement, graphic transformation data enhancement, and generative adversarial network data enhancement. Image preprocessing is carried out with images in original image sets as the training atlas. - images represent the experimental atlas.

2.1.1. Picture Initial Cropping and Labeling

training pictures are segmented and cropped to obtain pictures of the hemolysis area in the test tube. Then, experts judged and labeled the hemolysis area and divided the images into three categories. The first category is insoluble, and the category number is 0, mainly based on the obvious precipitation in the test tube and obvious stratification. The second category is microsolubility, and the category number is 1. This is mainly based on the fact that the hemolysis solution in the test tube is partially precipitated, but the dissolution is relatively uniform in the upper layer, with obvious particles. The third category is the normal dissolution category, and the category number is 2, mainly based on the solution in the test tube dissolved evenly, excluding extreme cases (completely insoluble or clear water test tube); no larger particles or precipitation appeared. The results of the classification are shown in Figures 2 and 3.

2.1.2. Image Enhancement

The collection of hemolysis images may be affected by uncontrollable factors such as light and photo angle, leading to the phenomenon of extremely concentrated grayscale of hemolysis images or unobvious feature areas [17, 18]. Therefore, image enhancement of the collected hemolysis image can cause different angle features and detail features of the image to appear, reduce the impact of irrelevant information, promote the model to extract more features of the image, and identify hemolysis from multiple features.

This algorithm mainly adopts histogram equalization [19] to enhance the image. For example, for a discrete grayscale image , the number of occurrences of grayscale is expressed as , and the probability of occurrence of pixels with grayscale in the image is expressed as where is all the grayscale numbers in the image, and the cumulative distribution function corresponding to is defined as

In this paper, the probability distribution of the pixels of the original image on the grayscale level is changed by the method of histogram equilibrium to achieve the effect of image enhancement.

2.1.3. Data Enhancement

Because of the unique characteristics of medical images, the position and angle of the subject image and other factors directly affect the judgment of the testing personnel. In CNN training, irregular images are likely to make model training unstable, resulting in the model not converging. Therefore, the data after image enhancement are enhanced to highlight the main information of the hemolytic image, standardize the dataset, and reduce the noise interference.

(1) The Graphic Transformation. The algorithm model adopts three methods of translation transformation, rotation transformation, and horizontal flipping to enhance the data [2022].

Image translation transformation: assuming that there is a point in the original image whose coordinates are , which is translated by units along the -axis and units along the -axis and the new coordinate is , its matrix expression is shown as

Horizontal flip: setting its original image width to width and height to height. The coordinates of one point in the original image is . After the horizontal mirror flip coordinates become , the matrix expression is represented as

Rotation transformation: rotating the image clockwise around the origin by angle , then the coordinates of after point in the original image is changed to and the coordinates of is changed to . The matrix expression is given as

In this paper, the original image is first translated and transformed and then rotated at random angles. Finally, the image is randomly cropped to discard some irrelevant information in the image. Then, the data are expanded by means of horizontal inversion, and some error samples caused by excessive enhancement are removed.

(2) Generative Adversarial Networks. After the graph transformation, the unavailable samples are eliminated, resulting in a further reduction in the available training data. To solve the problem of insufficient training data, the existing data are trained by generating models to obtain a large amount of data to enhance the generalization ability of the network and improve the stability of the network and the universality of new samples.

GANs [23] are composed of two networks: a generator and a discriminator. The generator learns the distribution of real data as much as possible. The discriminator learns to judge whether the input data are from real data or generated data, and they play games and train against each other to finally reach the equilibrium state [24, 25].

Let the formula of sample generated by a group of random noise after the generator [17, 18] be as follows: whereis the final output after several convolution, pooling, and activation operations (see Equations (14) and (15), function with discriminator.

The output results of the real sample () and the generated sample () after passing through the discriminator are shown in Equations (7) and (8).

The discriminant result can be expressed as with the dichotomy result. where represents the probability that belongs to the real sample distribution, and the loss function [26, 27] is established. It is expressed as

The GAN is optimized with a backpropagation algorithm.

When optimizing the generator, minimize Maximize when the discriminator is optimized. The gradient descent method can be used to optimize the solution so that the sample distribution generated by the generator is consistent with the actual sample.

2.2. CNN Feature Extraction

CNN is a biologically inspired neural network, which is composed of an input layer, several convolutional layers and pooling layers alternately, a fully connected layer, and an output layer. Among them, the convolution layer is used for convolution calculation, and the pooling layer is used for descending sampling. The main features of the image are extracted through these two layers. This algorithm uses the ResNet50 model [22] to extract features, as shown in Figure 4.

Assume that the input image matrix, the convolution kernel, and the feature map matrix are all in the form of a square. Here, the size of the input image matrix is , the convolution kernel size is , convolution kernels are , convolution stride is , zero-padding layer number is , and the size of the feature graphs generated by the convolution calculation is given as

For an image after several layers of convolution, pooling operation, and the activation function, a number of feature graphs are obtained. Each feature graph is in matrix form, where is the activation function. The Leaky ReLU function is selected in this paper.

After the convolution calculation of each pixel block in the image, the feature map matrix with the size of is obtained, as shown in the following equation:

The pooling operation is carried out. During the pooling process, the size of is selected as the pooling kernel. The calculation formulas of the two pooling modes are shown in Equations (14) and (15).

After several operations, such as layer pooling, convolution, and activation, the last layer feature graph is expanded into the row vector, which is denoted as , as the training data of ELM, where is the mapping function and is the characteristic matrix.

2.3. ELM Classification

By constructing the ResNet50 model, the pretrained model weight was loaded for feature extraction and preservation. The results of the last layer of the neural network are used for output. After each picture goes through the ResNet50 model, the feature vector is extracted, where is the feature number of each picture and is the row vector, as shown in the following equation: corresponding to the one-hot code labeled.

The feature matrix is extracted from all image features to constitute the dataset for ELM [28] classification, where represents the number of pictures and represents the number of features.

The ELM model is built as shown in Figure 5.

The implied nodes in the ELM network do not need to be adjusted through backpropagation (BP) but can be generated randomly based on a set of continuous probability distribution. The ELM contains two stages: a feature mapping stage and a learning stage. In this algorithm, the input data in the input layer is the image feature vector, and the output result of the th node is shown in the following equation:

That is equivalent to mapping a -dimensional vector to an -dimensional vector. where is the th connection between the node of its input layer and the hidden layer, is the bias, is the activation function, and the sigmoid function is used here.

The number of nodes in the output layer is denoted as , and the weight of the th hidden layer node and the th output layer node is . Then, the output (mapping) of node is

In the learning stage, the characteristic matrix provided by the data of a given training samples is denoted as , which contains mappings generated in the mapping phase. The objective function is represented as where is solved by the least squares. where is the generalized inverse moment of .

To improve the generalization ability of the ELM model and avoid overfitting of the model, the regular term [29] was added to the error function.

The solution of is as follows:

is the unit matrix.

In summary, the steps of ELM training can be summarized as follows: (1)Initialize the hidden layer parameter according to the random continuous distribution(2)The output matrix of the hidden layer is calculated according to Equations (15)–(18)(3)The weight matrix is calculated according to Equation (22)

In the recognition stage, given a set of sample , it first becomes after the mapping in the first stage; then, the category of the sample is given as

The argmax function is the index that returns the largest value in the array; that is, the maximum probability category number is the prediction category.

3. Experiment

This experiment is carried out on a computer with 8 GB of 64-bit memory in Windows10, an i5-6200u CPU, and Python 3.6.

The dataset of this experiment is from the local medical institution, and the hemolytic blood samples are collected and labeled by professionals. The collection standard is the hemolysis index . The original dataset for this experiment has 2250 hemolysis images used by medical laboratories with irregular resolution. Eighty percent (1800) of the original data are used as training data, and the remaining 20% (450) are used as final test data. The data distribution is shown in Table 1.

3.1. Dataset Processing

(1)The image is segmented and cropped, and the random cropping resolution of the image is . As a result, the training scale is expanded, and the clipped area is judged and classified. Histogram equalization is used to enhance the classified image, as shown in Figures 6 and 7

By redistributing the gray value of the original image, the gray value of the image is evenly distributed, so as to increase the contrast and enhance the details of the image. The gray histogram of Figure 6(a) is shown in Figure 8, and that of Figure 7(a) is shown in Figure 9. (2)The quantity of data in the sample is initially expanded through graph transformation, and generative adversarial networks are used to conduct confrontation training on the data after graph transformation, and a large number of available training samples are generated, as shown in Figure 10. The number of training images generated in this experiment is 2350, and the original dataset is extended to 4600

3.2. Extraction of Image Features

Through image enhancement and data enhancement, the original data is utilized more efficiently. The generated samples are processed, and the unavailable samples are rejected. Then, feature extraction is conducted. In the process of feature extraction, three layers of feature maps are extracted for visual observation. The original insoluble hemolysis detection image is shown in Figure 11. After convolution and pooling operations at different scales, a number of feature maps are extracted from the original image to reflect the features of the original image from different angles, as shown in Figure 12.

3.3. Comparison of the ELM Training Effect

ELM has two parameters to determine in advance during training including the input to the model and the number of nodes in the hidden layer. Different parameters affect the training efficiency, fitting degree, and generalization effect of the model. The general principle is that the number of hidden nodes is less than the input dimension, and the number of features should not be too large.

In order to determine the appropriate size of the input layer and hidden layer, -fold cross-validation is adopted in the analysis experiment to reduce overfitting and improve the generalization ability of the model, as shown in Figure 13. Experimental results show that 4-fold cross-validation has the highest accuracy, and 4-fold cross-validation is finally used to train GAN-ELM, GAN-ELM-L1, and GAN-CNN-ELM models. The initialization of hidden nodes is randomly generated based on a continuous probability distribution.

The training parameters are the number of selected features, and the influence of the hidden layer on accuracy is studied. The number of hidden nodes is selected as follows: 10, 100, 200, 400, 800, 1000, and 2000. Then, three models of GAN-ELM, GAN-ELM-L1, and GAN-CNN-ELM are trained separately under 4-fold cross-validation. The training results are shown in Figure 14.

As seen in the experimental results, when the number of hidden layer nodes increased from 10 to 200, the accuracy is greatly affected and increased rapidly. From the stage of 400 to 800, the accuracy tended to be stable. After the 800th node, the model started to overfit, and the generalization ability is low. Therefore, the number of optimal hidden nodes is approximately 600, which has good performance in this experiment. The experimental results also show that the single ELM classifier is not effective. The accuracy is improved by adding a regularized ELM classifier, and the accuracy is greatly improved by the GAN-CNN-ELM classifier model.

After determining the optimal number of hidden nodes as 600, this paper adopts 4-fold cross-validation to verify the accuracy of each fold, as well as the accuracy of test set with and without cross-validation of the model. The experimental results are shown in Table 2. The experimental results indicate that the accuracy performance of the model in the final test set is improved after using 4-fold cross-validation.

3.4. Comparisons of Experimental Results

(1)The traditional methods of hemolysis detection mainly include the Vitros 5600 biochemical immune analyzer to detect hemolysis indexes [5], the neuron-specific enolase (NSE) correction formula [6] and microcolumn gel detection card [30]. Because these experimental methods do not rely on human subjective judgment, it avoids the influence of human subjective factors and improves the accuracy of detection. At the same time, it does not rely on chemical instruments, saves chemical reagents, and is simple to operate. Computers operate much faster than physical and chemical analysis instruments. Therefore, it avoids the drawback of time-consuming measurement of relevant data and improves the detection efficiency to a large extent(2)This experiment compared seven different algorithms: GAN-CNN, GAN-ELM, GAN-ELM-L1, GAN-SVM, GAN-CNN-SVM, CNN-ELM, and GAN-CNN-ELM. The results show that the training model of the GAN-CNN-ELM classifier is adopted, since GAN greatly increased the sample data; compared with the CNN-ELM algorithm, the feature extraction time increased, but the accuracy improved. As the full connection layer in CNN network structure requires backpropagation to adjust parameters, ELM is a single hidden layer feedforward neural network, and the weight of the input layer and bias of hidden layer are randomly selected. ELM is used to replace the full connection layer for feature classification, which effectively reduces the computational complexity of the algorithm and improves the efficiency of model detection. The GAN-CNN-ELM model was compared with other models; the training time and accuracy are optimized. In Table 3, the accuracy is increased by at least 0.16%

Four models with relatively high accuracy are selected: GAN-CNN, GAN-CNN-SVM, CNN-ELM, and GAN-CNN-ELM. The remaining 20% of test data are into different models that are trained for prediction. The results are shown in Table 4.

Through the above experimental comparison, the results show that the GAN-CNN-ELM algorithm model is significantly better than the other three models in terms of recognition time and recognition accuracy. The recognition accuracy is increased by at least 2.58%.

4. Conclusion

This paper proposes a GAN-CNN-ELM algorithm model for hemolysis image detection, which uses image enhancement and graphic transformation to enhance image features and expand sample data. Then, we use the GAN to train the existing data to obtain a large quantity of data. The ResNet50 algorithm model is used to extract multilayer features of the pretreated hemolytic experimental atlas. The ELM classifier is constructed to classify the feature values to improve training time efficiency. For the test set, the accuracy of identification is as high as 98.91%. To achieve efficient identification of hemolytic images and accurately determine the hemolytic results.

In the future, multiscale convolution can be used to optimize the generalization degree of the system and the diversity of the recognition. An attempt will be made to improve the accuracy with CNN multiple feature fusion (MFF). The Res2Net framework will be used to extract multilayer features of the image to make the granularity finer and the feature scale more diverse, and the KELM algorithm will be studied to further improve the model training speed.

Data Availability

The image data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work and its data collection and sharing are supported and funded by the research on local stereo matching algorithm based on a vision system (18jk0509).