Abstract

The unique physical properties of graphite enable it to be applied in various fields of the national economy and people's livelihood, giving it very important industrial value, and many countries have listed graphite as a key mineral. To promote the transformation of the mining industry toward informatization and intelligence, the intelligent recognition of graphite is particularly critical. To address the long time and low efficiency of manual graphite identification, an improved AlexNet convolution neural network is proposed for graphite image recognition. First, we preprocess the data set by random cropping, horizontal flipping according to a set probability, and normalization, so as to achieve data enhancement. Then we use the ReLU6 activation function to compress the dynamic range and make the algorithm more robust, apply the batch normalization algorithm to speed up convergence, modify the size of the convolution kernels to enhance generalization ability, and add dropout regularization to the fully connected layers to further prevent overfitting. Finally, simulation experiments show that, compared with existing methods, the proposed method reduces the loss value and improves the average accuracy of graphite identification.

1. Introduction

At present, graphite has become an important raw material of new composite materials in the field of high technology and holds an important strategic position in the national economy [1, 2]. With the rapid development of science and technology, graphite is known as "the strategic resource supporting the development of high and new technology in the 21st century," owing to its good high-temperature resistance, conductivity, chemical stability, and thermal shock resistance [3]. It is not only a key raw material in new energy, new materials, aerospace, and other fields but also an important raw material in traditional industries such as refractories, electrode brushes, and pencils [4]. In the development of the high-tech industry, graphite is becoming more and more important. Its unique physical characteristics make it applicable in various fields of the national economy and people's livelihood, giving it very important industrial value [5, 6].

The deep learning algorithm represented by the convolution neural network (CNN) has developed rapidly in the era of big data and has attracted much attention in the research fields of machine learning and computer vision [7, 8]. As a kind of deep feedforward neural network, CNN has made great breakthroughs in the field of image recognition [9]. Compared with traditional classification methods, the convolution-network-based automatic classification method breaks away from the tedium of manual identification. At the same time, the back propagation algorithm is used to automatically optimize the model parameters and obtain the optimal model under the current conditions. The concept of deep learning was put forward by Hinton and Salakhutdinov in 2006 [10]; it can describe the attributes and features of objects more abstractly and deeply. The subsequent design of the AlexNet network set off a wave of deep learning research.

Although CNN is a powerful algorithm in the field of image recognition [8, 9], few researchers have applied it to the graphite image recognition task. Compared with more advanced models such as GoogleNet and ResNet, AlexNet has a simpler network structure and fewer network parameters [9]. Compared with existing lightweight networks, AlexNet is deeper, is not very difficult to train, has strong representation ability, is convenient to modify in different ways, saves a lot of model training time, and does not overfit easily [8]. Therefore, this paper presents an image recognition technology based on AlexNet and deep learning to recognize and sort graphite and nongraphite. In view of the characteristics of graphite images and the small data set, the data set is preprocessed to achieve data enhancement, and the activation function, normalization layer, convolution kernel, dropout layer, and optimizer are replaced and updated. Simulation results show that, compared with related algorithms, the proposed method has low complexity and fewer parameters, improves the convergence speed of the network, reduces the loss rate, and effectively realizes the graphite image recognition target.

2. Traditional AlexNet Network Model

2.1. Convolutional Neural Network

AlexNet is a kind of CNN, and CNN is an artificial neural network that has been widely used in image recognition and other machine learning fields [8]. Because its network structure adopts weight sharing, it is more similar to a biological neural network. Compared with a general neural network, it reduces the number of network weights. The biggest advantage of CNN is that it reduces the network parameters through receptive fields and shared weights, that is, parameter reduction and weight sharing. This makes training faster and requires fewer samples. The basic network structure of CNN is shown in Figure 1, which is divided into four parts: the input layer (input), convolution layer (conv), fully connected layer (FC), and output layer (output).

The biggest difference between the convolution neural network shown in Figure 2 and other neural network models is that a convolution layer is connected directly after the input layer, so the convolution layer becomes the data input layer of the network. The convolution layer is a network structure unique to the convolution neural network and is mainly used for image feature extraction. The purpose of the pooling layer is to sparsify the feature map and reduce the amount of data computation. The fully connected layer plays the role of a "classifier" in the whole network: it maps the learned "content" to the sample label space (synthesizing the previously extracted features).

In Figure 2, $k$ represents a digital filter (convolution kernel), $b$ is an offset, $f(\cdot)$ is an activation function, $x$ represents the feature map of the convolution layer, $\beta$ represents the weight of downsampling, $w$ represents the corresponding connection weight, and $S$ represents the lower sampling layer.

It can be seen that the $C$ layer is the convolution layer used for feature extraction. Each neuron connects to a small receptive field in the previous layer; as the receptive field is moved, each new receptive field corresponds to another neuron in the $C$ layer. The activation function makes this process shift invariant. Here, as long as the sizes of the input layer and the local receptive field are determined, the size of the $C$ layer is also determined. The $S$ layer is the lower sampling layer, whose purpose is to merge multiple pixels of the $C$ layer into one pixel. Because the weights on each mapping surface are shared, that is, the weights of the neurons on a surface are the same, the parameters of the whole network are greatly reduced and the complexity is reduced.
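As a minimal illustration of the $C$ and $S$ layers just described, the following PyTorch sketch (with illustrative channel counts and kernel sizes, not values from this paper) shows how a convolution layer extracts features and a pooling layer merges multiple pixels into one:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)          # one 3-channel 32x32 input image
conv = nn.Conv2d(3, 8, kernel_size=5)  # C layer: 5x5 local receptive field, shared weights
pool = nn.MaxPool2d(kernel_size=2)     # S layer: merges each 2x2 pixel block into one pixel

c_out = torch.relu(conv(x))            # activation applied to the extracted features
s_out = pool(c_out)
print(c_out.shape)                     # torch.Size([1, 8, 28, 28])
print(s_out.shape)                     # torch.Size([1, 8, 14, 14])
```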

2.2. AlexNet Convolution Neural Network

AlexNet is the champion network of the ILSVRC 2012 (ImageNet Large Scale Visual Recognition Challenge) competition, which improved the classification accuracy from more than 70% to more than 80%. Its network structure is shown in Figure 3.

The network consists of eight layers: five convolution layers (conv) and three fully connected layers (FC). There are also three pooling layers in the AlexNet network, and the convolution layers and pooling layers are connected alternately.

In the AlexNet network, the five convolution layers use convolution kernels of $11 \times 11$, $5 \times 5$, $3 \times 3$, $3 \times 3$, and $3 \times 3$ to extract features. AlexNet abandons the sigmoid and tanh functions, which were commonly used before, and uses ReLU as the activation function. Since the range of values produced by the ReLU function is unbounded above, AlexNet proposes local response normalization (LRN) to normalize the data obtained by ReLU, suppressing small neurons and improving the generalization ability of the model.
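For reference, the following sketch reproduces this five-layer convolution stack in PyTorch; the channel widths follow the standard AlexNet configuration and are an assumption here, while the kernel sizes match the text above:

```python
import torch.nn as nn

features = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
)
```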

3. Improved AlexNet Network Model

3.1. Data Enhancement

In order to improve the generalization ability of a model, the best way is to train on a large sample data set. For small sample data sets, data enhancement can be used in deep learning to compensate for insufficient data [11]. Here, we use the torchvision.transforms image preprocessing package of PyTorch to preprocess the images in the data set, as shown in Figure 4.

Firstly, the images are randomly cropped to $224 \times 224$ pixels and flipped horizontally with probability $p$ (set to $p = 0.5$). Finally, the processed data set is normalized to achieve the purpose of data enhancement.
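A minimal sketch of this pipeline using torchvision.transforms might look as follows; the use of RandomCrop (rather than, say, RandomResizedCrop) and the per-channel mean and standard deviation of 0.5 (given in Section 4.3) are assumptions:

```python
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomCrop(224),                 # random cropping to 224x224
    transforms.RandomHorizontalFlip(p=0.5),     # horizontal flip with probability p = 0.5
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5],  # per-channel normalization
                         std=[0.5, 0.5, 0.5]),
])
```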

3.2. Activation Function

In the AlexNet network, although the ReLU activation function effectively overcomes the vanishing gradient and slow convergence of the sigmoid and tanh functions, the linear activation of ReLU in the positive region ($x > 0$) may cause the activated values to become too large, affecting the stability of the model, slowing network training, and even reducing the generalization performance of the network. In order to counteract the linear growth of ReLU, this paper uses the ReLU6 function.

Compared with the ReLU function, the model uses ReLU6 as the nonlinear layer, defined as $\mathrm{ReLU6}(x) = \min(\max(0, x), 6)$, which compresses the dynamic range under low-precision calculation and makes the algorithm more robust. The ReLU6 function is shown in Figure 5.
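In PyTorch, the two activations can be compared directly; the tensor values below are only illustrative:

```python
import torch
import torch.nn as nn

x = torch.tensor([-3.0, 0.0, 2.5, 8.0])
print(nn.ReLU()(x))    # tensor([0.0000, 0.0000, 2.5000, 8.0000]) - unbounded above
print(nn.ReLU6()(x))   # tensor([0.0000, 0.0000, 2.5000, 6.0000]) - clipped at 6
```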

3.3. Batch Normalization Algorithm

In the traditional AlexNet, LRN is used to normalize the first and second layers to enhance the generalization performance of the model. However, the actual improvement that the LRN algorithm brings to the model is limited, while it greatly increases the training time. The batch normalization (BN) algorithm proposed in Reference [12] can reduce the data offset caused by the activation function and effectively solve the problem of inconsistent data distribution during training. The BN algorithm achieves feature normalization by calculating the mean and variance of each mini-batch.

Given a mini-batch $B = \{x_1, x_2, \ldots, x_n\}$ of size $n$, the mean and variance of each batch are computed during training by Equations (2) and (3):

$$\mu_B = \frac{1}{n} \sum_{i=1}^{n} x_i, \tag{2}$$

$$\sigma_B^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_B)^2. \tag{3}$$

In this way, Equation (4) can be used to normalize the data in the actual test:

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \tag{4}$$

where $\epsilon$ is a small coefficient used to increase the numerical stability.

The key of the BN algorithm is to transform and reconstruct: a stretch parameter $\gamma$ and an offset parameter $\beta$ are introduced to correct $\hat{x}_i$ as in the following equation:

$$y_i = \gamma \hat{x}_i + \beta, \tag{5}$$

where $\gamma$ and $\beta$ are updated together with the network weights, and $y_i$ is the normalized sample value.
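The following sketch implements Equations (2)-(5) directly; the batch size and feature dimension are illustrative:

```python
import torch

def batch_norm(x, gamma, beta, eps=1e-5):
    mu = x.mean(dim=0)                        # Equation (2): batch mean
    var = x.var(dim=0, unbiased=False)        # Equation (3): batch variance
    x_hat = (x - mu) / torch.sqrt(var + eps)  # Equation (4): normalization with eps
    return gamma * x_hat + beta               # Equation (5): transform and reconstruct

x = torch.randn(32, 10)    # a mini-batch of size n = 32 with 10 features
gamma = torch.ones(10)     # stretch parameter, trained with the network weights
beta = torch.zeros(10)     # offset parameter, trained with the network weights
y = batch_norm(x, gamma, beta)
```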

3.4. Convolution Layer

The VGG (Visual Geometry Group) network is a deep convolution neural network developed by the Oxford University Computer Vision Group together with Google DeepMind. Its structure stacks small $3 \times 3$ convolution kernels and expands the number of channels to extract more complex and expressive features, and it has strong scalability and good generalization ability. Therefore, drawing on the advantages of the VGG network, this paper modifies all large convolution kernels in AlexNet to small $3 \times 3$ kernels, which greatly reduces the number of parameters and extracts more specific features.
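A sketch of the corresponding layer change is given below; the channel widths follow standard AlexNet, and the strides and padding are illustrative assumptions:

```python
import torch.nn as nn

# 3x3 kernels in place of the original large AlexNet kernels
conv1 = nn.Conv2d(3, 96, kernel_size=3, stride=2, padding=1)   # was 11x11, stride 4
conv2 = nn.Conv2d(96, 256, kernel_size=3, padding=1)           # was 5x5
# conv3-conv5 already use 3x3 kernels in the original AlexNet
```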

3.5. Improved AlexNet Convolution Neural Network

In this paper, the Adam optimizer is selected with an initial learning rate of 0.0002. Cross-entropy is selected as the loss function to avoid the learning slowdown caused by the mean square error. Because the fully connected layers of the AlexNet network introduce a large number of parameters, dropout is added after the fully connected layers, which reduces the complex coadaptation of neurons and makes model fusion more effective. The improved AlexNet network model is shown in Figure 6.
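A minimal sketch of this configuration is shown below; the layer sizes, the dropout rate of $p = 0.5$, and the placement of dropout are assumptions, while the cross-entropy loss and the initial learning rate of 0.0002 follow the text:

```python
import torch.nn as nn
import torch.optim as optim

# Classifier head of the improved network: dropout after the fully
# connected layers reduces the complex coadaptation of neurons.
classifier = nn.Sequential(
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU6(inplace=True), nn.Dropout(p=0.5),
    nn.Linear(4096, 4096), nn.ReLU6(inplace=True), nn.Dropout(p=0.5),
    nn.Linear(4096, 2),                      # two classes: graphite / nongraphite
)

criterion = nn.CrossEntropyLoss()            # cross-entropy loss
optimizer = optim.Adam(classifier.parameters(), lr=0.0002)  # initial learning rate 0.0002
```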

This paper uses the improved AlexNet network model to complete the task of automatic classification of data sets. The overall framework of the model is shown in Figure 7.

4. Experimental Parameters and Data Sets

4.1. Experimental Environment and Parameter Setting

In this paper, the experimental simulation is completed in a Python environment using version 1.5.0 of the PyTorch framework, and the image data preprocessing is realized through torchvision.transforms. The experimental environment is the Mac OS 10.14.6 operating system with a 1.6 GHz Intel Core i5 processor and no GPU acceleration; running a network consumes 9000-10000 seconds on average. The weight decay is set to 0.01 to prevent overfitting. The number of epochs is 100, and the batch size is set to 32. All the weight parameters are initialized from a Gaussian distribution with a mean of 0 and a standard deviation of 0.01. At the same time, the images of the training set are randomly shuffled before input to reduce the influence of image order on the model.
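The stated settings can be expressed as the following sketch; the torchvision AlexNet is a stand-in for the improved network, and the data set path is a hypothetical placeholder:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

model = models.alexnet(num_classes=2)        # stand-in for the improved network

def init_weights(m):
    # Gaussian initialization with mean 0 and standard deviation 0.01
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.normal_(m.weight, mean=0.0, std=0.01)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

model.apply(init_weights)

# hypothetical data set path; batch size 32 with random shuffling, as stated
train_set = datasets.ImageFolder("data/train", transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

# weight decay 0.01 to prevent overfitting, initial learning rate 0.0002
optimizer = torch.optim.Adam(model.parameters(), lr=0.0002, weight_decay=0.01)
```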

4.2. Sample Collection and Expansion

In this paper, the graphite image samples were collected from existing graphite samples in the laboratory, which were cleaned and dried; sample video was captured from different directions and angles using an iPhone rear camera, and the video was then processed frame by frame to obtain a total of 166 sample images. At the same time, 180 nongraphite ore images were collected from the internet. Finally, the images were divided into graphite and nongraphite categories, giving 346 color images in total. Figure 8 shows sample images of graphite and nongraphite.

In general, the more abundant the data set, the better the training effect of the model and the stronger its generalization ability. In order to solve the problem of insufficient training samples, the above data set is expanded: the collected images are processed by center clipping, sample rotation (180° and 90°, respectively), flipping around a coordinate axis, and mean value blur, yielding 1857 images in total; a sketch of these expansion steps follows below. Of these, 955 images are randomly selected as the training set to train the network, and the remaining 902 are used as the test set to verify the performance of the network. The training set and the test set are processed separately. The classification of the data sets is shown in Table 1.
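A sketch of these offline expansion steps using PIL is given below; the file names and the crop margins are illustrative assumptions:

```python
from PIL import Image, ImageFilter

img = Image.open("graphite_sample.jpg")      # hypothetical input image
variants = [
    img.crop((50, 50, img.width - 50, img.height - 50)),  # center clipping
    img.rotate(90, expand=True),             # sample rotation, 90 degrees
    img.rotate(180),                         # sample rotation, 180 degrees
    img.transpose(Image.FLIP_TOP_BOTTOM),    # flipping around a coordinate axis
    img.filter(ImageFilter.BoxBlur(2)),      # mean (box) blur
]
for i, v in enumerate(variants):
    v.save(f"graphite_sample_aug{i}.jpg")    # each variant becomes a new sample
```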

4.3. Data Set Preprocessing

In order to reduce sample imbalance and solve the overfitting caused by a too-small data set, this paper carries out a data augmentation (DA) operation on the data set to improve the accuracy of network classification. Common DA methods include scale transformation, image contrast adjustment, image clipping, affine transformation, and horizontal or vertical flipping; different enhancement methods can be selected for different data sets. Based on the characteristics of the ore images, the training set images are randomly shuffled before input to reduce the influence of image order on the model. Each image in the training set is randomly cropped to unify the image pixels and flipped horizontally with probability $p$ (set to $p = 0.5$). Finally, the data of each channel are normalized with a mean of 0.5 and a standard deviation of 0.5, so as to achieve data expansion and data enhancement.

5. Experimental Results and Analysis

5.1. Evaluating Index

In this paper, the training effect of a network is evaluated by two indexes: the accuracy $Acc$ and the loss value $Loss$. Test accuracy refers to the proportion of correct results output by the model on the test set; it reflects the practical effect of a network and is a very important index. It is defined by the following equation:

$$Acc = \frac{N_r}{N},$$

where $N_r$ is the number of samples correctly identified by the network in the test set and $N$ is the total number of samples in the test set.

The training process of a neural network is to minimize the loss function, and $Loss$ is the value of the loss function; here, the loss function calculates the mean square error (MSE) of the model on the test set.
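The two indexes can be computed with a standard evaluation loop such as the following sketch, where `model`, `test_loader`, and `criterion` are hypothetical placeholders (e.g., the network, loader, and loss defined in the previous sketches):

```python
import torch

model.eval()
correct, total, loss_sum = 0, 0, 0.0
with torch.no_grad():
    for images, labels in test_loader:
        outputs = model(images)
        loss_sum += criterion(outputs, labels).item()
        correct += (outputs.argmax(dim=1) == labels).sum().item()  # N_r
        total += labels.size(0)                                    # N
print(f"Acc = {correct / total:.4f}, Loss = {loss_sum / len(test_loader):.4f}")
```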

5.2. Experimental Results

In this paper, the improved AlexNet model and the traditional AlexNet model are both trained on the data set. Figure 9 shows the curves of $Loss$ and $Acc$ when using the ReLU6 function and the ReLU function. The experimental results show that using the ReLU6 function reduces the loss value from 0.0925 to 0.0772, which is relatively small.

For the normalization layer, this paper compares the two cases of using the BN algorithm and not using the BN algorithm for simulation, and the results are shown in Figure 10.

The results in Figure 10 show that the BN algorithm can effectively reduce the loss value of model testing and avoid the risk of the model falling into overfitting. In order to further observe the feature extraction of the convolution layers before and after using the BN algorithm, a picture classified as graphite is randomly extracted from the data set, as shown in Figure 11.

Then, the feature extraction maps of the first 11 channels of Figure 11 in conv1, conv2, and conv3 are printed out in sequence, as shown in Figure 12.

It can be seen from Figure 12 that the features extracted by the traditional AlexNet network in the third convolution layer are fuzzy, and its learning ability is not very good. In comparison, the model using the BN algorithm has enhanced feature extraction ability, which ensures that the network learns quickly and fully. The $Loss$ and $Acc$ of image recognition before and after using the BN algorithm are shown in Table 2.

In addition to the improvements of the activation function and normalization layer, this paper also modifies the size of the convolution kernel; Figure 13 shows the image recognition $Loss$ and $Acc$ with and without the reduced convolution kernels.

Figure 13 shows the comparison of the loss value and the accuracy rate when using the small convolution kernels on the basis of the above improvements. It is obvious that the use of the small convolution kernels improves the generalization ability of the model and reduces the loss value of the model on the test set.

In order to evaluate the recognition performance of the improved AlexNet network on the graphite classification data set, it is compared with the traditional AlexNet network and the AlexNet networks of References [9] and [13], and the feature maps of conv2 are extracted, as shown in Figure 14.

It can be seen from Figure 14 that the features in the second convolution layer of the improved AlexNet network are clearer than those of the other two models, and more features are extracted and learned. Finally, the comparison curves of the models' $Loss$ and $Acc$ are obtained, which further verifies the effectiveness of the improved AlexNet network proposed in this paper.

Through experiments on the test set, the traditional AlexNet network, Reference [9], and Reference [13] are compared with the method in this paper; the comparison of $Loss$ and $Acc$ is shown in Table 3.

It can be seen from Table 3 that the improved AlexNet network proposed in this paper is better than the other two cases, which effectively verifies the feasibility and effectiveness of the network model proposed in this paper.

6. Conclusion

In this paper, an improved AlexNet convolution network is used to identify graphite. This process does not need manual extraction of graphite image features, has a high degree of intelligence, and its training process is relatively simple. By optimizing the activation function, normalization layer, and hyperparameters of AlexNet, the robustness of the whole model is improved and the overfitting problem is avoided without increasing the computational complexity of the model. The experimental results show that the average accuracy of the improved algorithm is improved, the loss value is reduced, and the model generalizes well. Moreover, applying the improved AlexNet model to graphite image recognition increases the automation of the beneficiation process and reduces the workload of manual sorting. However, the data set in this paper is small, and the graphite samples are block graphite. Expanding the data set with more types of graphite and identifying the graphite that mainly circulates in the market are the next research directions.

Data Availability

The data sets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Acknowledgments

This work is supported by the Innovation Capability Support Program of Shaanxi Province of China (Grant No. 2020PT-023), National Natural Science Foundation of China (Grant No. 11801438), and Natural Science Basic Research Plan in Shaanxi Province of China (Grant No. 2018JQ1089).