Research Article  Open Access
Multigrades Classification Model of Magnesite Ore Based on SAE and ELM
Abstract
Magnesite is an important raw material for extracting magnesium metal and magnesium compounds, and the precision of its grade classification strongly influences the smelting process. It is therefore increasingly important to determine the grade of magnesite quickly and accurately. In this paper, a classification model for magnesite based on the stacked autoencoder (SAE) and the extreme learning machine (ELM) is established. The SAE is first used to reduce the dimension of the magnesite spectral data, and the ELM neural network model is then adopted to classify the data. Two improved ELM models, namely, the accuracy extreme learning machine (AELM) and the integrated accuracy extreme learning machine (IELM), are employed to build better classification models. The proposed approach is compared with traditional grade classification methods, namely chemical approaches, artificial (manual) methods, and the BP neural network model. The results show that the classification model of magnesite ore built with SAE and ELM is better in terms of speed and accuracy; thus, this paper provides a new way to classify the grade of magnesite ore.
1. Introduction
Magnesite is an important magnesium-bearing ore [1, 2]. It is not only an indispensable auxiliary refractory material [3–5] but also an important material for extracting magnesium compounds and magnesium metal. According to data released by the United States Geological Survey (USGS) in 2015, the magnesite explored globally has reached 12 billion tons, with reserves of 2.4 billion tons. Countries rich in the resource include Russia (650 million tons, accounting for 27% of the total) and China (500 million tons, accounting for 21% of the total). In recent years, with industrial development, the demand for high-quality magnesite has been increasing. High-grade magnesite determines the quality of magnesium products, which raises the standard required of its grade classification. Traditionally, there are two classification methods: artificial (manual) classification, which is fast but not very accurate, and chemical analysis, which is more accurate but costly and time-consuming [6]. As a result, determining the grade of magnesite quickly and accurately is a pressing problem, and solving it is of real significance for reducing cost and improving the efficiency of classification.
Since near-infrared (NIR) reflectance spectroscopy [7–9] is quick, low in cost, and efficient, modeling based on NIR data has been widely applied in fields such as rock-mineral classification, grade identification, and food testing. The contents and distribution of magnesium oxide, calcium oxide, ferric oxide, and aluminium oxide are the major determinants of the grade of magnesite ore. Yang et al. [10] studied the mineralogical properties of lateritic nickel ores from different places by X-ray diffraction and Fourier transform infrared spectroscopy. Dalm et al. [11] proposed the use of near-infrared sensors in porphyry copper deposits. Xu et al. [12] studied the flotation of chalcopyrite by infrared spectroscopy. Liu et al. [13] discussed the use of small-molecule inhibitors in the reconcentration of iron ore flotation tailings.
The collected raw NIR data of magnesite ore are high-dimensional (973 dimensions), which increases the running time of the algorithm model and reduces its accuracy. Meanwhile, the NIR data normally contain information on chemical factors that do not influence the grade classification. Consequently, it is necessary to reduce the dimension of the original NIR data before building the algorithm model.
Dimensionality reduction is one of the ways to deal with high-dimensional data [14]. In 1961, Bellman [15] put forward the concept of the “curse of dimensionality,” which means that higher-dimensional data lead to an exponential increase in the required sample size, in the complexity of analysis and data processing, and in cost. Traditional dimension reduction methods fall into linear and nonlinear ones. The former include principal component analysis (PCA) [16–18], independent component analysis (ICA) [19–22], and factor analysis; methods of this kind are satisfactory only for high-dimensional data with linear structure, not for data with nonlinear structure. The autoencoder in deep learning [23–26] is a newer nonlinear dimension reduction method. It is an unsupervised algorithm that reduces the dimension and extracts the features of data through a 3-layer neural network, seeking the internal feature structure or encoding mode of the data through self-learning.
We have previously shown that magnesite ore can be divided into two types, super grade and non-special grade [27]. In practical applications, however, magnesite often needs to be sorted into more types, and a two-class method cannot meet this requirement. Therefore, we carried out a more thorough study. First, we collected magnesite samples of more grades, carried out chemical tests, and obtained accurate test results. The samples were then divided into six categories: top grade, first grade, second grade, third grade, fourth grade, and waste. In this paper, the near-infrared (NIR) data of magnesite ore are used as the data source, and a grade classification modeling method for magnesite ore based on the original and improved ELM is proposed on the basis of SAE dimension reduction. It is a new, fast, and accurate method for grade classification of magnesite ore.
2. Experiments
2.1. NIR Apparatus
An SVC HR1024 portable field spectrometer from Spectra Vista (US) was used as the near-infrared spectrometer. Its spectral range was 350–2500 nm, its internal memory held 500 scans, its weight was 3 kg, it had 1024 channels, its spectral resolution (FWHM) was ≤ 8.5 nm in the 1000–1850 nm range, and its minimum integration time was 1 ms.
2.2. Material Source and Collection of Spectrum Data
The 633 magnesite samples were from the Dashiqiao Magnesite Mine in Liaoning Province, China. The spatial distribution of an ore body is controlled by the ore-controlling factors present when the ore body is formed (such as tectonics, magmatic activity, stratigraphy, geochemical factors, and metamorphic factors). As Figure 1 shows, the formation of the magnesite body is controlled by the ore body and the surrounding rock associated with it. Ore bodies of different thicknesses and grades formed in different areas of the contact zone. The magnesite samples in this paper were taken from the four regions in Figure 1. These samples were first sliced.
Temperature and humidity have some influence on the acquisition of spectral data, so we carried out a series of verification experiments. The same magnesite sample (third grade) was tested at different times and under different temperatures and humidity levels, with the test temperature kept within the normal working range of the instrument. The temperature, humidity, and solar elevation angle are shown in Table 1, and the spectral curves are shown in Figure 2. The results show that temperature and humidity have little effect on the spectral test results.

The measurements were taken in the open air before sunset on sunny days with little cloud. The scanning time was 1 s per scan, and the probe was held 300 mm from the surface of the magnesite ore, perpendicular to the upper surface. To reduce the influence of stray radiation on the spectral test, the experimenters were not supposed to walk about or be dressed in dark. Both surfaces of each sample were measured 3 times with the SVC HR1024, and the average value was used as the spectral data. The spectral images of some samples are shown in Figure 3.
2.3. Sample Grade Determination
Liaoning Dashiqiao magnesite is an important source of magnesium in China and is characterized by large reserves, thick layers, and shallow burial. After the spectral test of the 633 samples, the mineral contents of each sample were measured by chemical methods. The samples were decomposed with hydrochloric acid, nitric acid, hydrofluoric acid, and perchloric acid; the elements that interfere with the measurement, including aluminium, iron, copper, zinc, and manganese, were then removed with HMT-copper; finally, the contents of calcium and magnesium were measured by complexometric titration with EDTA standard solution. Industrial indicators of magnesite ore show that its grade classification is mainly determined by the contents of magnesium oxide, calcium oxide, and silicon dioxide. The physical and chemical test results of the magnesite samples are presented in Table 2.

According to the industrial indicators and chemical analysis results, the grades of 633 samples are 104 top grade, 109 first grade, 108 second grade, 110 third grade, 102 fourth grade, and 100 waste ores.
2.4. Autoencoder (AE) Network and Structure Principle
The autoencoder (AE) network is used to reduce the dimension of data and extract their features through a 3-layer neural network. It seeks the internal feature structure or encoding mode of the data through self-learning. Its structure is shown in Figure 4.
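An autoencoder tries to learn an identity function, and the pretraining stage of the encoding network can be divided into two steps: encoding and decoding. Layer 1 to Layer 2 is the encoding process, in which a function mapping $f$ transforms the input data $x \in \mathbb{R}^{n}$ into the hidden representation $h \in \mathbb{R}^{d}$:
$$h = f\left(W^{(1)} x + b^{(1)}\right), \tag{1}$$
where $W^{(1)}$ and $b^{(1)}$ are the weight and bias from the input layer to the hidden layer; $W^{(1)}$ is a $d \times n$ weight matrix while $b^{(1)}$ is a $d$-dimensional bias vector. The function $f$ is a nonlinear mapping.
From Layer 2 to Layer 3 is the decoding process, in which $h$ is reconstructed to $\hat{x}$ through
$$\hat{x} = g\left(W^{(2)} h + b^{(2)}\right). \tag{2}$$
The parameters $W^{(1)}$, $b^{(1)}$, $W^{(2)}$, and $b^{(2)}$ of the autoencoder model are optimized by minimizing the mean reconstruction error. The reconstruction error can be defined in many ways, but Formula (3) was adopted in this paper:
$$J(W, b; x) = \frac{1}{2}\left\lVert \hat{x} - x \right\rVert^{2}. \tag{3}$$
Gradient descent and the backpropagation (BP) algorithm were used to iterate $W$ and $b$ in order to learn the optimal deep autoencoder network. There are $m$ samples in the fixed set $\{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}$. If the number of hidden-layer nodes is smaller than $n$, a compressed representation of the input is obtained. The overall cost function can be defined as
$$J(W, b) = \left[\frac{1}{m}\sum_{i=1}^{m} J\left(W, b; x^{(i)}\right)\right] + \frac{\lambda}{2}\sum_{l=1}^{n_l - 1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\left(W_{ji}^{(l)}\right)^{2}. \tag{4}$$
The first term is the mean square error term while the second one is a regularization term. $\lambda$ is the weight decay parameter, which narrows the range of the weights to prevent overfitting. $n_l$ is the number of network layers and $s_l$ is the number of units in Layer $l$. $W_{ij}^{(l)}$ is the link parameter between unit $j$ in Layer $l$ and unit $i$ in Layer $l+1$; $b_i^{(l)}$ is the bias term of unit $i$ in Layer $l+1$. If the number of hidden-layer nodes is larger than $n$, a sparse coding result is obtained after adding sparsity constraints, in which case the overall cost function is
$$J_{\mathrm{sparse}}(W, b) = J(W, b) + \gamma \sum_{j=1}^{s_2} \mathrm{KL}\left(\rho \,\Vert\, \hat{\rho}_j\right), \tag{5}$$
where $\mathrm{KL}(\rho \,\Vert\, \hat{\rho}_j)$ is the relative entropy between two Bernoulli random variables with $\rho$ and $\hat{\rho}_j$ as the mean values, $\hat{\rho}_j$ is the average activation of hidden unit $j$, and $\gamma$ weights the sparsity penalty. Each iteration of gradient descent updates $W$ and $b$ according to the following formulas:
$$W_{ij}^{(l)} := W_{ij}^{(l)} - \alpha \frac{\partial}{\partial W_{ij}^{(l)}} J(W, b), \tag{6}$$
$$b_{i}^{(l)} := b_{i}^{(l)} - \alpha \frac{\partial}{\partial b_{i}^{(l)}} J(W, b), \tag{7}$$
where $\alpha$ is the learning rate.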
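The encoding, decoding, weight-decay cost, and gradient descent updates described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the sigmoid activation for both layers, the initialization scale, the values of `lam`, `alpha`, and `n_iter`, and the synthetic demo data are all assumptions chosen only to make the example run.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(X, n_hidden, lam=1e-4, alpha=0.5, n_iter=300, seed=0):
    """Batch gradient descent on the weight-decay cost of a 3-layer autoencoder.

    X is an (m, n) array with values in [0, 1]; rows are samples.
    Returns the learned parameters and the cost recorded at every iteration.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W1 = rng.normal(0.0, 0.1, (n_hidden, n))    # encoder weights (assumed init)
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n, n_hidden))    # decoder weights (assumed init)
    b2 = np.zeros(n)
    costs = []
    for _ in range(n_iter):
        H = sigmoid(X @ W1.T + b1)              # encoding step
        Y = sigmoid(H @ W2.T + b2)              # decoding step (reconstruction)
        err = Y - X
        # Mean squared reconstruction error plus weight-decay term.
        cost = 0.5 * np.sum(err ** 2) / m
        cost += 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
        costs.append(cost)
        # Backpropagation through both sigmoid layers.
        delta2 = err * Y * (1.0 - Y)            # shape (m, n)
        delta1 = (delta2 @ W2) * H * (1.0 - H)  # shape (m, n_hidden)
        grad_W2 = delta2.T @ H / m + lam * W2
        grad_b2 = delta2.mean(axis=0)
        grad_W1 = delta1.T @ X / m + lam * W1
        grad_b1 = delta1.mean(axis=0)
        # Gradient descent updates with learning rate alpha.
        W2 -= alpha * grad_W2
        b2 -= alpha * grad_b2
        W1 -= alpha * grad_W1
        b1 -= alpha * grad_b1
    return (W1, b1, W2, b2), costs


# Synthetic demo data with low-dimensional structure (not spectral data).
demo_rng = np.random.default_rng(1)
latent = demo_rng.normal(size=(100, 3))
X_demo = sigmoid(latent @ demo_rng.normal(size=(3, 20)))
params, costs = train_autoencoder(X_demo, n_hidden=5)
codes = sigmoid(X_demo @ params[0].T + params[1])  # 5-dimensional codes
```

Because the hidden layer here is smaller than the input, no sparsity penalty is included; the rows of `codes` are the compressed representation that would be passed on to a classifier.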
2.5. Basic Principle of Extreme Learning Machine (ELM)
ELM [28–30] is an algorithm that optimizes the parameters of a traditional neural network by solving a set of linear equations instead of iterating. Compared with traditional learning algorithms, ELM dispenses with repeated iterations, so it is faster and generalizes better [31].
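For $N$ arbitrary samples $(x_j, t_j)$, where $x_j = [x_{j1}, x_{j2}, \ldots, x_{jn}]^{T} \in \mathbb{R}^{n}$ and $t_j = [t_{j1}, t_{j2}, \ldots, t_{jm}]^{T} \in \mathbb{R}^{m}$, consider a single-hidden-layer feedforward neural network with $L$ hidden nodes and activation function $g(\cdot)$. As (8) shows, $g(w_i \cdot x_j + b_i)$ expresses the relationship between hidden node $i$ and input $x_j$, $j = 1, \ldots, N$, and there is
$$\sum_{i=1}^{L} \beta_i \, g\left(w_i \cdot x_j + b_i\right) = t_j, \quad j = 1, \ldots, N, \tag{8}$$
where $w_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^{T}$ is the input weight vector from the input layer to node $i$ in the hidden layer; $b_i$ is the threshold value of node $i$; $\beta_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{im}]^{T}$ is the output weight vector from node $i$ to the output layer.
For the selected samples $(x_j, t_j)$, $j = 1, \ldots, N$, Formula (8) can be simplified as
$$H \beta = T, \tag{9}$$
where $H$ is the output matrix of the hidden layer with entries $H_{ji} = g(w_i \cdot x_j + b_i)$, $\beta = [\beta_1, \ldots, \beta_L]^{T}$, and $T = [t_1, \ldots, t_N]^{T}$.
$L$ has no fixed formula, and the number of nodes in the hidden layer is selected experimentally. 20 is the best option in this paper. $w_i$ and $b_i$ are assigned random values in the training of the model.
$\beta$ can be expressed by Formula (10) as follows:
$$\beta = H^{\dagger} T, \tag{10}$$
where $H^{\dagger}$ is the generalized inverse of $H$.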
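A minimal NumPy sketch of ELM training may make the role of the generalized inverse concrete. It is an illustration under assumptions, not the authors' code: the sigmoid activation, the uniform range of the random weights, the function names, and the random demo data are all hypothetical.

```python
import numpy as np


def hidden_output(X, W, b):
    """Hidden-layer output matrix H for a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def elm_fit(X, T, n_hidden, seed=0):
    """Draw random input weights and thresholds, then solve for output weights."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, (X.shape[1], n_hidden))  # random input weights
    b = rng.uniform(-1.0, 1.0, n_hidden)                # random thresholds
    H = hidden_output(X, W, b)
    beta = np.linalg.pinv(H) @ T   # output weights via the Moore-Penrose inverse
    return W, b, beta


def elm_predict(X, W, b, beta):
    return hidden_output(X, W, b) @ beta


# Random demo data: 60 samples, 5 features, 3 one-hot encoded classes.
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.normal(size=(60, 5))
T_demo = np.eye(3)[demo_rng.integers(0, 3, size=60)]
W, b, beta = elm_fit(X_demo, T_demo, n_hidden=20)
residual = np.linalg.norm(elm_predict(X_demo, W, b, beta) - T_demo)
```

No iteration is involved: once the random hidden layer is fixed, `beta` is the least-squares solution of the linear system, which is why training is fast.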
2.6. Application of SAE and ELM in Model
From the autoencoder (AE) algorithm, we know that the model is trained so that its output equals its input. The data dimension is reduced by keeping the number of hidden-layer nodes smaller than the number of input nodes. SAE builds multiple AE models to reduce the original data. The parameters of the SAE model are the number of AEs and the number of hidden nodes in each AE. The original spectral data form a 633 × 973 high-dimensional matrix. In this paper, the SAE is constructed from two AE models to reduce the dimension of the original spectral data. The ELM is a single-hidden-layer feedforward neural network whose only parameter to be selected is the number of hidden-layer nodes; we chose this number through repeated testing, which achieved good results in this paper.
2.7. Procedures of Classifying the Grade of Magnesite
The NIR information of magnesite was collected with the spectrometer. These data need to be compressed by SAE because of their high dimension and noise. The compressed data were then classified by ELM to identify the magnesite: top grade and others. The procedures are displayed in Figure 5.
The use of ELM in the grade classification of magnesite can be summarized in 3 steps:
(1) First, set the training set $\{(x_j, t_j)\}_{j=1}^{N}$, the activation function $g(x)$, and the number of nodes in the hidden layer $L$. The parameters $(w_i, b_i)$, $i = 1, \ldots, L$, are generated randomly.
(2) Second, the output matrix $H$ of the hidden layer is calculated.
(3) Finally, the output weight $\beta$ is calculated: $\beta = H^{\dagger} T$.
3. Establishment of the Model
3.1. Dimension Reduction of Magnesite Ore NIR Data by SAE
With 633 samples in the experiment, the original spectral data collected form a 633 × 973 matrix. The redundancy of the data makes the classification neural network less accurate and more time-consuming. The industrial grade of magnesite ore falls into 6 levels: top, first, second, third, fourth, and waste. The top, first, and second grades are valuable and well worth classifying. In this paper, SAE was used for the dimension reduction of the pretreated spectral data. There are 2 hidden layers, one with 200 nodes and the other with 100 nodes, so the original NIR data are reduced to 100 dimensions. The SAE structure is displayed in Figure 6.
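The 973 → 200 → 100 reduction can be sketched as greedy layer-wise stacking, in which the codes of the first autoencoder become the input of the second. This is only a shape-level illustration under assumptions: the data are random placeholders rather than spectra, and the sigmoid activation, the learning rate, the 5 training iterations, and the function names are hypothetical choices, not the settings used in the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_encoder(X, n_hidden, alpha=0.1, n_iter=5, seed=0):
    """Train one autoencoder on X by gradient descent; return its encoder (W, b)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W1, b1 = rng.normal(0.0, 0.01, (n, n_hidden)), np.zeros(n_hidden)
    W2, b2 = rng.normal(0.0, 0.01, (n_hidden, n)), np.zeros(n)
    for _ in range(n_iter):
        H = sigmoid(X @ W1 + b1)            # encode
        Y = sigmoid(H @ W2 + b2)            # decode
        d2 = (Y - X) * Y * (1.0 - Y)        # output-layer error signal
        d1 = (d2 @ W2.T) * H * (1.0 - H)    # hidden-layer error signal
        W2 -= alpha * (H.T @ d2) / m
        b2 -= alpha * d2.mean(axis=0)
        W1 -= alpha * (X.T @ d1) / m
        b1 -= alpha * d1.mean(axis=0)
    return W1, b1


def fit_stacked_encoder(X, layer_sizes):
    """Train autoencoders one after another, feeding each the previous codes."""
    encoders, A = [], X
    for k, size in enumerate(layer_sizes):
        W, b = fit_encoder(A, size, seed=k)
        encoders.append((W, b))
        A = sigmoid(A @ W + b)              # codes become the next layer's input
    return encoders, A


# Placeholder data with the same shape as the spectral matrix (633 x 973).
X_placeholder = np.random.default_rng(0).random((633, 973))
encoders, Z = fit_stacked_encoder(X_placeholder, [200, 100])
```

The resulting `Z` has one 100-dimensional row per sample, matching the 633 × 100 input expected by the classifier.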
3.2. Establishment and Improvement of ELM Model
The spectral data treated by SAE form a 633 × 100 matrix, and the training set and test set are randomly selected from the samples. There are 437 training samples, including 72 top grade, 73 first grade, 72 second grade, 80 third grade, 72 fourth grade, and 70 waste ores. The parameters of grade classification by ELM are the activation function and the number of nodes in the hidden layer. The activation functions commonly used in ELM include the Sigmoid function, the sin function, and the hardlim function. The number of nodes in the hidden layer strongly influences the learning and information processing of the network: too many nodes make the network more complex, prolong the learning time, and lead to overfitting, while too few may constrain the learning and processing ability of the network. In practice, an empirical formula is conventionally used to determine a rough range for the number before an optimal one is selected through repeated experiments. The optimal number in this paper is 45. The output weight is worked out through Formula (10), which gives the unique optimal solution. The output results and expected scatter profiles of the training set and test set in the ELM model are presented in Figures 7 and 8, respectively.
Figures 7 and 8 reveal that the traditional ELM is not very effective in the grade classification. The input weights of ELM and the threshold values of the hidden layer are assigned randomly, so the output is unstable and likely to fall into a local minimum, and thus the accuracy rate is low. On account of this, an improved ELM, called accuracy ELM, is put forward. The traditional ELM is run 200 times, and the outputs on the training set are compared to find and retain the input weights and hidden-layer threshold values that give the highest accuracy; both are then used as the fixed parameters of the ELM model. This establishes the accuracy ELM model for grade classification of magnesite ore. The predicted and expected output scatter profiles of accuracy ELM training set are presented in Figures 9 and 10, respectively.
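The selection step of accuracy ELM can be sketched as a loop that trains many randomly initialized ELMs and keeps the one with the highest training accuracy. The sketch below is a hedged illustration: the function names, the sigmoid activation, the weight range, the tie-breaking rule (first best model wins), and the synthetic six-class demo data are assumptions, and only 20 trials are run in the demo to keep it quick, whereas the text above uses 200.

```python
import numpy as np


def hidden_output(X, W, b):
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def fit_elm(X, y, n_classes, n_hidden, rng):
    """One ELM with random hidden parameters and one-hot targets."""
    W = rng.uniform(-1.0, 1.0, (X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, n_hidden)
    T = np.eye(n_classes)[y]                        # one-hot grade labels
    beta = np.linalg.pinv(hidden_output(X, W, b)) @ T
    return W, b, beta


def predict_elm(X, model):
    W, b, beta = model
    return np.argmax(hidden_output(X, W, b) @ beta, axis=1)


def fit_accuracy_elm(X, y, n_classes, n_hidden, n_trials=200, seed=0):
    """Repeat ELM training and keep the most accurate model on the training set."""
    rng = np.random.default_rng(seed)
    best_model, best_acc, accs = None, -1.0, []
    for _ in range(n_trials):
        model = fit_elm(X, y, n_classes, n_hidden, rng)
        acc = float(np.mean(predict_elm(X, model) == y))
        accs.append(acc)
        if acc > best_acc:                          # keep the first best model
            best_model, best_acc = model, acc
    return best_model, best_acc, accs


# Synthetic six-class demo data (labels 0..5 standing in for the six grades).
demo_rng = np.random.default_rng(3)
y_demo = demo_rng.integers(0, 6, size=120)
X_demo = demo_rng.normal(size=(120, 10)) + y_demo[:, None]
model, best_acc, accs = fit_accuracy_elm(
    X_demo, y_demo, n_classes=6, n_hidden=15, n_trials=20
)
```

Selecting on training accuracy alone, as here, fixes one set of random parameters, so different runs of the selection can still yield different models.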
3.3. Integrated Accuracy ELM
The predicted results show obvious improvement. However, each accuracy ELM corresponds to one group of parameters, and the results may differ between groups, so the model is not stable. To make it more stable, an integrated accuracy ELM is proposed. The number of integrated groups is the number of accuracy ELM models selected. Eleven groups are selected in this paper, and the integrated model outputs the grade predicted most often among the 11 single models, which helps improve the prediction accuracy. The quantitative relation between the predicted and expected values in the experimental and statistical simulations reveals that the integrated accuracy ELM model is superior in stability and accuracy in the grade classification of magnesite ore. Comparisons between the simulation results of this model and the real samples are presented in Tables 3 and 4.
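The integration step amounts to a majority vote over the grade labels predicted by the single models. A minimal sketch follows, assuming that grades are encoded as integers 0 to 5 and that ties go to the lowest label; both conventions, and the function name, are hypothetical and not stated in the paper.

```python
import numpy as np


def majority_vote(predictions, n_classes):
    """Return the most frequent label per sample.

    predictions: (n_models, n_samples) integer array of predicted grade labels.
    Ties are resolved in favour of the lowest label (an assumed convention).
    """
    predictions = np.asarray(predictions, dtype=int)
    # counts[c, s] = number of models that predicted class c for sample s
    counts = np.apply_along_axis(np.bincount, 0, predictions, minlength=n_classes)
    return counts.argmax(axis=0)


# Eleven single-model predictions for one sample: six votes for grade 4,
# five votes for grade 2.
votes = np.array([[4]] * 6 + [[2]] * 5)
integrated = majority_vote(votes, n_classes=6)
```

Using an odd number of models, such as 11, rules out ties between two grades, although ties among three or more grades remain possible.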


4. Results and Comparison
ELM, accuracy ELM, and integrated accuracy ELM classification models were established for the grades of magnesite ore. The simulation results are displayed in Table 5.

This table shows the accuracy of each model in the grade classification. The conventional ELM model is not very accurate on the training set; the accuracy ELM model is more precise, reaching above 85%, but it is not very stable. Both the accuracy and the stability of the integrated accuracy ELM model are greatly improved, and its accuracy reaches 98%.
Table 6 compares the cost and time of different methods, including the ELM model, BP, the traditional artificial method, and the chemical method.

This table shows that the traditional artificial method is easy but not accurate enough. The chemical test method requires chemical reagents costing about $1744; in addition, experimental apparatus and labor cost more than $435287, so this method is much more expensive. By comparison, the investment in hardware, including the spectrometer and computer, is no more than $43500. ELM is more accurate than BP while requiring less cost and less time. ELM is thus the most economical and accurate method, with considerable benefit.
5. Conclusion
In this paper, a new method of magnesite grade classification was put forward. The spectral data of magnesite ore were collected with NIR nondestructive testing technology, the dimension of these data was reduced with SAE, and the models were then established through ELM. The traditional ELM model is not very accurate in the grade classification, so the accuracy ELM was proposed. To make up for the poor stability of the accuracy ELM, the integrated accuracy ELM was proposed, which is superior to the other two ELM models in terms of accuracy and stability. Its accuracy can reach as high as 98%. Compared with traditional methods of magnesite grade classification, the integrated accuracy ELM has advantages in economic efficiency, accuracy, and speed; in addition, this method can achieve online testing of ores in large volume. It is therefore of great practical value.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
This research is supported by the National Natural Science Foundation of China (Grant nos. 41371437 and 61203214); the Fundamental Research Funds for the Central Universities (N160404008); and the National Twelfth Five-Year Plan for Science and Technology Support (2015BAB15B01), China.
References
[1] Z. M. Wang, “Current situation and development trend of China magnesite,” China Non-Metallic Mining Industry Herald, vol. 5, no. 6–8, article 23, 2006.
[2] G. Li, Y. Yu, J. Li, Y. Wang, and H. Liu, “Experimental study on urban refuse/magnesium oxychloride cement compound floor tile,” Cement and Concrete Research, vol. 33, no. 10, pp. 1663–1668, 2003.
[3] G. Binal, “The usage of magnesite production wastes in ceramic tile bodies,” Journal of Ceramic Processing Research, vol. 15, no. 2, pp. 107–111, 2014.
[4] C. Qingming and W. Tong, “Status and prospect of China's magnesia raw materials,” Refractories, vol. 47, no. 3, pp. 210–214, 2013.
[5] J. Li, Y. Zhang, S. Shao, S. Zhang, and S. Ma, “Application of cleaner production in a Chinese magnesia refractory material plant,” Journal of Cleaner Production, vol. 113, pp. 1015–1023, 2016.
[6] SJ/T 10632-1995, “Atomic-absorption spectrophotometry for electron ceramic raw materials.”
[7] J. W. Ellis and J. Bath, “Modifications in the near infrared absorption spectra of protein and of light and heavy water molecules when water is bound to gelatin,” The Journal of Chemical Physics, vol. 6, no. 11, pp. 723–729, 1938.
[8] P. R. Mobley, B. R. Kowalski, J. J. Workman Jr., and R. Bro, “Review of chemometrics applied to spectroscopy: 1985–95, part 2,” Applied Spectroscopy Reviews, vol. 31, no. 4, pp. 347–368, 1996.
[9] J. J. Workman Jr., “Review of process and non-invasive near-infrared and infrared spectroscopy: 1993–1999,” Applied Spectroscopy Reviews, vol. 34, no. 1-2, pp. 1–89, 1999.
[10] M. L. Yang, W. Fu, B. H. Wang, Y. Q. Zhang, X. R. Huang, and H. J. Niu, “Infrared spectra characteristics of the silicate nickel ores: a comparison study on different ore samples from Indonesia and China,” Spectroscopy and Spectral Analysis, vol. 35, pp. P631–P634, 2015.
[11] M. Dalm, M. W. N. Buxton, F. J. A. Van Ruitenbeek, and J. H. L. Voncken, “Application of near-infrared spectroscopy to sensor based sorting of a porphyry copper ore,” Minerals Engineering, vol. 58, pp. 7–16, 2014.
[12] H. Xu, H. Zhong, S. Wang, Y. Niu, and G. Liu, “Synthesis of 2-ethyl-2-hexenal oxime and its flotation performance for copper ore,” Minerals Engineering, vol. 66, pp. 173–180, 2014.
[13] W. Liu, D. Wei, C. Han, and B. Cui, “Application of small molecular inhibitors in reconcentration of iron ore tailings,” Advanced Materials Research, vol. 402, pp. 552–555, 2012.
[14] B. Bohn, J. Garcke, and M. Griebel, “A sparse grid based method for generative dimensionality reduction of high-dimensional data,” Journal of Computational Physics, vol. 309, pp. 1–17, 2016.
[15] R. Bellman, Adaptive Control Processes: A Guided Tour, Princeton University Press, Princeton, NJ, USA, 1961.
[16] M. Leineweber and Y. Gao, “Examining the feasibility of applying principal component analysis to detecting localized changes in mechanical properties,” Journal of Biomechanics, vol. 48, no. 2, pp. 262–268, 2015.
[17] J. E. Jackson, A User's Guide to Principal Components, John Wiley and Sons, New York, NY, USA, 1991.
[18] H. Abdi and L. J. Williams, “Principal component analysis,” Wiley Interdisciplinary Reviews: Computational Statistics, vol. 2, no. 4, pp. 433–459, 2010.
[19] J. Q. Zhong and R. S. Wang, “Multitemporal remote sensing images change detection based on ICA,” Journal of Electronics Information Technology, vol. 28, no. 6, pp. 994–998, 2006.
[20] J. M. Zhang, Y. P. Lin, H. B. Wu, and G. L. Yang, “Advances of research in independent component analysis,” Journal of System Simulation, vol. 18, no. 4, pp. 992–997, 2006.
[21] G. M. Wang and Z. Y. Liu, “Separation method of mixed pigments based on spectrum expression and independent component analysis,” Application Research of Computers, vol. 32, no. 2, pp. 593–597, 2015.
[22] P. Comon and C. Jutten, Handbook of Blind Source Separation: Independent Component Analysis and Applications, Academic Press, Cambridge, Mass, USA, 2010.
[23] G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504–507, 2006.
[24] G. E. Hinton, S. Osindero, and Y. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.
[25] Y. Bengio, “Learning deep architectures for AI,” Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1–27, 2009.
[26] P. Vincent, H. Larochelle, I. Lajoie, and P. Manzagol, “Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion,” Journal of Machine Learning Research, vol. 11, pp. 3371–3408, 2010.
[27] Y. Mao, D. Xiao, J. Cheng, J. Jiang, T. Ba, and S. Liu, “Research in magnesite grade classification based on near infrared spectroscopy and ELM algorithm,” Spectroscopy and Spectral Analysis, vol. 37, no. 1, pp. 89–94, 2017.
[28] G. B. Huang, Q. Y. Zhu, and C. K. Siew, “Extreme learning machine: theory and applications,” Neurocomputing, vol. 70, no. 1–3, pp. 489–501, 2006.
[29] G.-B. Huang, “What are extreme learning machines? Filling the gap between Frank Rosenblatt's dream and John Von Neumann's puzzle,” Cognitive Computation, vol. 7, no. 3, pp. 263–278, 2015.
[30] G. Huang and L. Chen, “Enhanced random search based incremental extreme learning machine,” Neurocomputing, vol. 71, no. 16–18, pp. 3460–3468, 2008.
[31] G. Huang, “An insight into extreme learning machines: random neurons, random features and kernels,” Cognitive Computation, 2014.
Copyright
Copyright © 2017 Yachun Mao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.