Abstract

An improved classification approach based on extreme learning machine (ELM) is proposed to address the difficult problem of classifying complex multiclass samples. ELM was developed from the single-hidden layer feed-forward neural network (SLFNN) and is characterized by simple parameter selection rules, fast convergence, and little human intervention. To further improve the classification precision of ELM, an improved method for generating the network structure of ELM is developed by dynamically adjusting the number of hidden nodes, where the change in the number of hidden nodes serves as the update step length of the algorithm. The improved algorithm is therefore called the variable step incremental extreme learning machine (VSI-ELM). To verify the effect of the hidden layer nodes on the performance of ELM, performance test data sets are taken from the open-source machine learning repository of the University of California, Irvine (UCI). Regression and classification experiments are used to study the performance of the VSI-ELM model, and the experimental results show that the VSI-ELM algorithm is valid. The classification of different degrees of broken wires is still an open problem in the nondestructive testing of hoisting wire rope, and the magnetic flux leakage (MFL) method is an efficient nondestructive technique that plays an important role in safety evaluation. To verify that the proposed VSI-ELM model is effective and reliable on real application data, it is applied to the classification of different types of samples from MFL signals. The final experimental results show that the VSI-ELM algorithm achieves faster classification and higher accuracy for different degrees of broken wires.

1. Introduction

Extreme learning machine (ELM) was proposed on the basis of the single-hidden layer feed-forward neural network (SLFNN) [1]. Unlike conventional network learning algorithms, which must see the training samples before generating the hidden node parameters, ELM randomly generates the hidden node parameters before examining the training samples. ELM is characterized by simple parameter selection rules, fast convergence, and little human intervention. However, because of the random generation mechanism of hidden nodes, there are still some problems in ELM that urgently need improvement. The network structure is crucial to the learning results and generalization ability of ELM, and it is determined by the number of hidden nodes. In recent years, the growth mechanism of hidden nodes has been extensively studied, and many improved ELM methods have been proposed to obtain better generalization. The incremental extreme learning machine (I-ELM) was proposed by Huang et al. [1]; it randomly adds hidden nodes one by one until the convergence requirement is reached, but it does not recalculate the output weights of all existing nodes when a new node is added. To overcome this disadvantage of I-ELM, the convex incremental extreme learning machine (CI-ELM) [2] and its improved version (ICI-ELM) [3] were proposed. To decrease the computation time of ELM, two different growth structures of hidden nodes (increasing and decreasing) have been designed. Increasing structures of hidden nodes include the enhanced random search based I-ELM (EI-ELM) [4], EM-ELM [5], and so on; decreasing structures include P-ELM [6], OP-ELM [7], EM-ELM [8], and so on.
The error-minimized extreme learning machine for single-hidden layer feed-forward neural networks was proposed for the problem of simultaneous learning. The optimum values of the parameters and the number of hidden neurons of ELM have been obtained by using a genetic algorithm (GA), wavelets, or particle swarm optimization (PSO). In addition, some new adaptive growth methods for hidden nodes have been proposed, including AG-ELM [9] and D-ELM [10]. Apart from work on optimizing its structure, ELM has a wide range of applications in data classification [11], nonlinear dynamic system identification [12], pattern recognition [13–15], expert diagnosis [16], medical diagnosis [17], permeability prediction modelling [18], expert target recognition [19], human face recognition [20], and prediction interval estimation in electricity markets [21]. However, some problems still need to be studied: the existing methods face a trade-off between efficiency and accuracy. Building on a thorough study of the improved ELM methods, this paper proposes a new growth mechanism for the ELM network structure to gain better generalization. Because the structure of hidden nodes is dynamically adjusted with a variable step length during updating, the method is referred to as the variable step incremental extreme learning machine (VSI-ELM). VSI-ELM is thus characterized by a compact network structure, fast running speed, and better generalization ability.

Wire rope, the key component of a mine hoister, is widely used in coal mines and is characterized by high strength, light weight, favorable flexibility, high reliability, good bending performance, and so on [22], so it plays an increasingly important role in coal mining. Under alternating load, fatigue, wear, and corrosion of wire rope tend to occur and may even result in serious damage such as rope breakage [23]. Broken wire is not only the beginning of serious rope damage but also difficult to detect early; it cumulatively decreases the strength of the rope and may even lead to fracture, putting hoisted persons at risk [24, 25]. Therefore, it is important to study nondestructive testing techniques for wire rope.

The rest of this paper is organized as follows. Section 2 presents the ELM algorithm theory and the improved VSI-ELM model. Section 3 analyzes the performance of ELM, I-ELM, and VSI-ELM using the UCI data sets. Section 4 introduces an automatic MFL detection system and applies VSI-ELM to the diagnosis of different broken wires. Section 5 concludes the paper, summarizing the major achievements and the future scope of this work.

2. ELM Theory

2.1. Traditional SLFNN Theory

Extreme learning machine (ELM) was proposed on the basis of the single-hidden layer feed-forward neural network (SLFNN). ELM is characterized by simple parameter selection rules, fast convergence, and little human intervention, and it has been widely used in image processing, machine vision, pattern recognition, decision and control, and other areas. A typical SLFNN is composed of an input layer, a hidden layer, and an output layer. ELM is a unified SLFNN with randomly generated input weights, biases, and hidden nodes. Consider $N$ arbitrary distinct samples $(x_j, t_j)$, where $x_j = [x_{j1}, x_{j2}, \ldots, x_{jn}]^T \in \mathbb{R}^n$ and $t_j = [t_{j1}, t_{j2}, \ldots, t_{jm}]^T \in \mathbb{R}^m$.

Assume the input layer of the SLFNN has $n$ nodes, the hidden layer has $L$ nodes, and the output layer has $m$ nodes. A typical SLFNN model can be represented by

$$\sum_{i=1}^{L} \beta_i g(w_i \cdot x_j + b_i) = o_j, \quad j = 1, 2, \ldots, N, \tag{1}$$

where $w_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^T$ is the connection weight between the input layer nodes and the $i$th hidden node and $b_i$ is the bias of the $i$th hidden node. The two parameters $w_i$ and $b_i$ are independent not only of the training sample set but also of each other. $\beta_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{im}]^T$ is the connecting weight between the $i$th hidden node and the output nodes. $g(\cdot)$ is the activation function of the hidden nodes. $o_j$ is the output vector.

Unlike traditional gradient-descent-based learning algorithms, which work only for differentiable activation functions, the ELM algorithm can also work with all bounded nonconstant piecewise continuous activation functions. The hidden nodes of ELM include additive or RBF-type nodes, fully complex nodes, and wavelet nodes. The common activation functions of the hidden layer are shown in Table 1. For the traditional hidden layer activation functions, the activation function parameters are set to 1, and different parameter values will affect the performance of the ELM algorithm.
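As a concrete illustration, the sketch below implements a few activation functions of the kind listed in Table 1 (sigmoid, Gaussian/RBF-type, and hard-limit). The exact parameterizations of Table 1 are not reproduced here, so these should be read as representative examples rather than the paper's definitions.

```python
import numpy as np

# Representative ELM hidden-node activation functions (a sketch of the
# kinds of functions listed in Table 1, not its exact entries).

def sigmoid(x):
    """Sigmoid: bounded, nonconstant, piecewise continuous, differentiable."""
    return 1.0 / (1.0 + np.exp(-x))

def gaussian(x):
    """Gaussian (RBF-type) activation."""
    return np.exp(-x ** 2)

def hardlim(x):
    """Hard-limit: non-differentiable, yet usable by ELM since it is
    bounded, nonconstant, and piecewise continuous."""
    return (x >= 0).astype(float)
```

Note that `hardlim` would break a gradient-descent learner, which illustrates the point made above: ELM does not require differentiability of the activation.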

For a given standard set of training samples $\{(x_j, t_j)\}_{j=1}^{N}$, if the outputs of the network are equal to the targets, there exist $\beta_i$, $w_i$, and $b_i$ such that

$$\sum_{i=1}^{L} \beta_i g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, 2, \ldots, N. \tag{2}$$

Equation (2) can be written compactly as

$$H\beta = T, \tag{3}$$

where

$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}_{N \times L}, \quad \beta = \begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_L^T \end{bmatrix}_{L \times m}, \quad T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}. \tag{4}$$

Here $H$ is called the output matrix of the hidden layer in ELM, the $i$th column of $H$ is the output vector of the $i$th hidden node with respect to the inputs $x_1, x_2, \ldots, x_N$, and $\beta_i^T$ is the transpose of the vector $\beta_i$.

In practical applications, the number of training samples $N$ is usually much larger than the number of hidden nodes $L$. In order to reduce the computation of ELM, the number of hidden nodes $L$ is therefore generally chosen to be smaller than the number of training samples $N$.

For any given small positive value $\varepsilon$, ELM has the universal approximation capability, as represented by the following equation:

$$\|H\beta - T\| < \varepsilon. \tag{5}$$

Under the constraint of the minimum norm least squares, the weight $\beta$ between the hidden nodes and the output nodes can be calculated as

$$\hat{\beta} = H^{\dagger} T, \tag{6}$$

where $H^{\dagger}$ is the Moore–Penrose generalized inverse of the hidden layer output matrix $H$.

ELM therefore has a three-step learning model, summarized as follows. Given a training sample set $\{(x_j, t_j)\}_{j=1}^{N}$, the number of hidden nodes $L$, and the activation function $g(\cdot)$:

Step 1: randomly assign the input weights $w_i$ and the biases $b_i$ of the $L$ hidden layer nodes.
Step 2: calculate the output matrix $H$ of the hidden layer.
Step 3: calculate the output weight $\hat{\beta} = H^{\dagger} T$.
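The three steps above fit in a few lines of code. The sketch below is a minimal illustration, assuming a sigmoid activation and input weights and biases drawn uniformly from $[-1, 1]$ (common choices, but not fixed by the text); it is not the authors' implementation, and the function names are chosen here for illustration.

```python
import numpy as np

def elm_train(X, T, L, seed=0):
    """Three-step ELM training sketch.
    X: (N, n) inputs, T: (N, m) targets, L: number of hidden nodes."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    # Step 1: randomly assign input weights w_i and biases b_i.
    W = rng.uniform(-1.0, 1.0, size=(n, L))
    b = rng.uniform(-1.0, 1.0, size=L)
    # Step 2: compute the hidden-layer output matrix H (sigmoid activation).
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    # Step 3: output weights via the Moore-Penrose pseudo-inverse, beta = H^+ T.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Forward pass of the trained SLFNN."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

With $L$ close to $N$, the network can interpolate the training set almost exactly, which is consistent with the approximation capability expressed by equation (5); in practice $L$ is kept well below $N$, as noted above.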

2.2. VSI-ELM Algorithm

Based on a deep study of the improved ELM methods, a new growth mechanism for the ELM network structure is proposed to gain better generalization. Because the structure of hidden nodes is dynamically adjusted with a variable step length during updating, the method is referred to as the variable step incremental extreme learning machine (VSI-ELM). VSI-ELM is characterized by a compact network structure, fast running speed, and better generalization ability.

However, owing to the lack of a selection standard for hidden nodes, the initial number of hidden nodes is particularly important. If this number is far greater than the optimal value, the training time increases and the generalization ability decreases. If the number of hidden nodes is too small, the result is not only a lack of fault tolerance but also an increase of the training error. According to the relationship between the number of hidden nodes and the problem to be solved, together with selection experience from other neural networks, the initial number of hidden nodes in ELM is given by equation (7), where $L_0$ is the initial number of hidden nodes, $n$ is the number of input layer nodes, $m$ is the number of output layer nodes, $\mu(k)$ is the variable step length function, and $k$ is the number of iterations. When $k = 0$, $L_0$ is the initial number of hidden nodes.

Next, the update of the hidden nodes is adjusted by equation (8). When the number of hidden nodes is close to the objective, VSI-ELM uses a smaller step to increase or decrease the number of hidden nodes; when the number of hidden nodes is far from the objective, VSI-ELM uses a larger step. VSI-ELM reduces the computational complexity by updating only the output weights incrementally at each step, and the output weights are calculated by the least squares criterion. The computing process of VSI-ELM is as follows.

Given a set of training samples $\{(x_j, t_j)\}_{j=1}^{N}$, the expected learning accuracy $\varepsilon$, and the maximum number of iterations $k_{\max}$, the VSI-ELM algorithm proceeds in three phases.

Phase 1: the initialization phase:
(1) Initialize the parameters of the SLFNN by randomly generating the input weights $w_i$, the biases $b_i$, and the activation functions $g(\cdot)$. There exists a positive integer $L_0$: the initial number of hidden nodes is $L_0$, and its error is $E_0$ when $k = 0$.
(2) Calculate the output matrix $H$ of the hidden layer.
(3) Calculate the corresponding output error $E_0$.

Phase 2: the recursive learning phase. While $k < k_{\max}$ and $E_k > \varepsilon$:

There are two search directions in the growing mechanism of hidden nodes of ELM: the increasing growth and the decreasing growth of the network structure, as represented by formula (10) and formula (11).

At iteration $k$, the number of hidden nodes can be changed by the step $\mu(k)$; that is, $L_k - \mu(k)$ and $L_k + \mu(k)$ hidden nodes are tried on the existing SLFNN, respectively, and the corresponding output errors $E(L_k - \mu(k))$ and $E(L_k + \mu(k))$ are calculated.
(a) For the negative growth of the network structure, the number of hidden nodes is $L_{k+1} = L_k - \mu(k)$.
(b) For the positive growth of the network structure, the number of hidden nodes is $L_{k+1} = L_k + \mu(k)$.

Firstly, compare $E(L_k)$, $E(L_k - \mu(k))$, and $E(L_k + \mu(k))$; the number of hidden nodes with the smallest of these three errors is kept. If the SLFNN with $L_k$ hidden nodes has the smallest error and $E(L_k) \le \varepsilon$, the growing procedure is finished. If $E(L_k - \mu(k))$ is the minimum, VSI-ELM chooses the negative growth of hidden nodes; if $E(L_k + \mu(k))$ is the minimum, VSI-ELM chooses the positive growth of hidden nodes. For example, if $E(L_k + \mu(k))$ is the minimum, the next number of hidden nodes is $L_{k+1} = L_k + \mu(k)$ and the corresponding output error is $E_{k+1} = E(L_k + \mu(k))$.

Secondly, compare $E_{k+1}$ and $E_k$. If $E_{k+1} < E_k$ and $E_{k+1} > \varepsilon$, the number of hidden nodes is updated again in the same direction and the corresponding output error is recalculated. If instead $E_{k+1} \ge E_k$ and $E_{k+1} > \varepsilon$, the number of hidden nodes stops growing in the positive direction, and the current number is taken as the new initial number of hidden nodes before the next update. Using this method, the best number of hidden nodes can be found until $E_k \le \varepsilon$.

Phase 3: if the corresponding output error satisfies $E_k \le \varepsilon$ or $k \ge k_{\max}$, the growing procedure is finished.

End while.
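The three phases above can be sketched as a bidirectional search over the number of hidden nodes. The sketch below is an interpretation under stated assumptions, not the authors' exact procedure: the step schedule $\mu(k)$ is assumed to halve whenever neither direction improves the error (the text only requires the step to shrink near the optimum), and `hidden_error` is a hypothetical helper that trains a sigmoid ELM with $L$ nodes and returns its training RMSE.

```python
import numpy as np

def hidden_error(X, T, L, seed=0):
    """Train an ELM with L hidden nodes and return its training RMSE
    (hypothetical helper; see the three ELM steps in Section 2.1)."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1, 1, size=(X.shape[1], L))
    b = rng.uniform(-1, 1, size=L)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    beta = np.linalg.pinv(H) @ T
    return float(np.sqrt(np.mean((H @ beta - T) ** 2)))

def vsi_elm_search(X, T, L0, step0, eps, k_max, seed=0):
    """Variable-step search over the number of hidden nodes.
    Each iteration tries L - step (negative growth) and L + step
    (positive growth), keeps the best of the three candidates, and
    halves the step when neither direction improves (assumed schedule)."""
    L, step = L0, step0
    err = hidden_error(X, T, L, seed)
    for _ in range(k_max):
        if err <= eps or step == 0:
            break  # Phase 3: accuracy reached or step exhausted
        candidates = {L: err}
        if L - step >= 1:
            candidates[L - step] = hidden_error(X, T, L - step, seed)
        candidates[L + step] = hidden_error(X, T, L + step, seed)
        best_L, best_err = min(candidates.items(), key=lambda kv: kv[1])
        if best_L == L:
            step //= 2  # no improvement: refine the search step
        L, err = best_L, best_err
    return L, err
```

Because the current candidate is always retained, the error is non-increasing over iterations, which mirrors the stopping logic of Phase 2.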

3. Data Analysis and Research

3.1. The Performance Analysis of ELM

The above analysis of ELM theory and its improved methods shows that the performance of ELM is directly related to its network structure. The input weight matrix of ELM is randomly generated once the number of input neurons and the number of hidden layer neurons are fixed. The number of input neurons is determined by the size of the sample matrix (training or testing), while the number of hidden layer neurons is set manually; the size of the sample matrix therefore also affects the performance of ELM. Without changing the sample size, however, adjusting the hidden layer neuron nodes is the main way to improve the performance of ELM: the accuracy of ELM in regression or classification improves when the randomly generated input weight matrix best matches the training samples. It is therefore of great significance to study the number of hidden layer neurons in ELM and to optimize the parameters of the input weight matrix.

To investigate the effect of hidden layer neuron nodes on the performance of ELM, a performance test of the ELM algorithm was conducted using sample sets provided by the University of California, Irvine (UCI). The regression and classification sets selected from the UCI repository are shown in Tables 2 and 3, respectively. The determination coefficient $R^2$ and the root-mean-square error (RMSE) are selected as evaluation indexes. The smaller the RMSE, the better the performance of the algorithm model. The determination coefficient lies in the range $[0, 1]$: the closer it is to 1, the better the performance of the model; conversely, the closer it is to 0, the worse the performance. The two indicators are calculated as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2}, \qquad R^2 = 1 - \frac{\sum_{i=1}^{N} (\hat{y}_i - y_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2},$$

where $\hat{y}_i$ is the predicted value of the $i$th sample, $y_i$ is the true value of the $i$th sample, $\bar{y}$ is the mean of the true values, and $N$ is the number of samples.
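Both indexes are straightforward to compute. A minimal sketch, assuming the standard $1 - SS_{\mathrm{res}}/SS_{\mathrm{tot}}$ form of the determination coefficient:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error: the closer to 0, the better the model."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def r_squared(y_true, y_pred):
    """Determination coefficient R^2: the closer to 1, the better.
    Standard 1 - SS_res / SS_tot form is assumed here."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)   # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)
```

Note that in this standard form $R^2$ can be negative for a model worse than predicting the mean; well-behaved models land in the $[0, 1]$ range discussed above.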

In order to verify the impact of the hidden layer nodes of ELM on regression prediction, four regression data sets (spectra, concrete, fertility, and sinc) selected from the UCI repository, listed in Table 2, were tested. Based on the performance indicators RMSE and determination coefficient $R^2$, the change of the performance indexes with the number of hidden layer nodes is shown in Figure 1.

The analysis shows that the sample features must effectively describe the characteristics of the samples, which is the prerequisite for regression prediction; inadequate features are the main reason for the unsatisfactory performance shown in Figure 1(b). For both the training samples and the testing samples, the closer the RMSE is to 0, the closer the determination coefficient is to 1, and vice versa.

For simply characterized samples such as the sinc set, as the number of hidden layer nodes of ELM increases, the RMSE approaches 0, the determination coefficient approaches 1, and the regression prediction accuracy of the testing samples is even higher than that of the training samples; moreover, no abnormal fluctuation of the RMSE or determination coefficient appears. However, as the number of hidden layer nodes increases, abnormal fluctuations of the RMSE and determination coefficient appear in the regression predictions of the spectra, concrete, and fertility sets, at about 30, 50, and 50 hidden layer nodes, respectively. Therefore, it is not appropriate to improve the fitting accuracy of ELM regression only by adding hidden layer nodes without considering the overlearning problem of ELM. In addition, for the spectra, concrete, and sinc sets, the indexes no longer change once the number of hidden layer nodes reaches 90, 100, and 10, respectively; beyond that point, continuing to increase the hidden layer nodes only increases the computing time of ELM, as shown in Figure 1.

In order to verify the effect of the hidden layer nodes of ELM on classification, 11 classification data sets (abalone, statlog (heart), diabetes, parkinsons, wdbc, iris, wine, breast tissue, glass, seeds, and waveform (version 2)) are used to test the classification predictions of ELM. The classification accuracy varies with the number of hidden layer nodes, as shown in Figure 2.

Through the analysis and comparison of the above data sets, the conclusions are as follows:
(1) With the increase of hidden layer neuron nodes, the classification accuracy of the training samples rises sharply at first and then approaches the target value relatively slowly. If the hidden layer neurons continue to increase, the classification accuracy of the training samples can reach 100%.
(2) With the increase of hidden layer neurons, the classification accuracy of the testing samples also has a sharp increase phase, but afterwards it does not approach the target value slowly as the training samples do. Instead, the following behaviors occur: (a) after the sharp increase reaches the maximum value, the classification accuracy decreases gradually, as shown in Figures 2(a), 2(e), and 2(k); (b) after the sharp increase reaches the maximum, the classification accuracy first decreases and then stabilizes, as shown in Figures 2(b), 2(c), 2(d), and 2(i); (c) after the sharp increase reaches the maximum value, the classification accuracy first decreases and then rises again to a new stationary value, as shown in Figures 2(f), 2(h), and 2(j).
(3) The classification accuracy of ELM can differ greatly even when the numbers of hidden layer neurons differ little; the difference in classification accuracy even exceeds 20%. As shown in Figure 2(g), the classification accuracy of wine presents a banded distribution. The immediate reason is that the ELM input weights are generated by a stochastic mechanism; the root cause is that every time the number of hidden layer nodes of ELM is updated, the input weight matrix is regenerated, which makes ELM lose its self-optimizing ability and greatly increases the search time for the optimal ELM structure. This is also the reason why the I-ELM algorithm adds hidden layer neuron nodes layer by layer.

3.2. The Different Growth Structure of Hidden Layer Nodes of ELM

Two different growth modes of hidden layer nodes are considered in this article. I-ELM algorithm 1 needs to recalculate all the input weights according to the updated number of hidden layer neurons, whereas I-ELM algorithm 2 only needs to calculate the connection weights between the newly added hidden layer neurons and the original input and output neurons. I-ELM algorithm 2 makes full use of the previously calculated input weight matrix to reduce its calculation time and improves the network structure only by adding hidden layer nodes, as shown in Figure 3.

To verify the difference between the two methods, the number of hidden layer neurons is updated in different ways, with and without recalculation of the input weights. The two methods (I-ELM algorithm 1 and I-ELM algorithm 2) were tested on the iris set, and the results are shown in Figure 4. As can be seen, I-ELM algorithm 2 converges significantly faster than I-ELM algorithm 1, making the ELM structure more compact and avoiding unnecessary training time.
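The essential difference can be sketched as follows: I-ELM algorithm 2 keeps the columns of the hidden layer output matrix $H$ that were already computed and only appends columns for the newly added hidden nodes. This is an illustrative reimplementation with a sigmoid activation, not the authors' code; `grow_hidden` is a hypothetical helper name.

```python
import numpy as np

def grow_hidden(X, H, W, b, n_new, rng):
    """Append n_new randomly generated hidden nodes (I-ELM algorithm 2):
    only the new columns of H are computed; existing columns are reused."""
    W_new = rng.uniform(-1, 1, size=(X.shape[1], n_new))
    b_new = rng.uniform(-1, 1, size=n_new)
    H_new = 1.0 / (1.0 + np.exp(-(X @ W_new + b_new)))  # new columns only
    return (np.hstack([H, H_new]),   # grown hidden output matrix
            np.hstack([W, W_new]),   # grown input weight matrix
            np.hstack([b, b_new]))   # grown bias vector
```

In contrast, I-ELM algorithm 1 would regenerate `W` and `b` entirely and recompute all of `H` at every update, which is the extra cost that Figure 4 makes visible.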

3.3. The Performance Analysis of the VSI-ELM Algorithm

In order to compare the performance of the VSI-ELM and I-ELM algorithms, the UCI classification data sets (statlog (heart), diabetes, parkinsons, and iris) listed in Table 3 were used to test the two algorithms. The update curves of the sample classification accuracy are shown in Figure 5, and a detailed comparison of the training time and the number of hidden layer neuron nodes for the two algorithms is given in Table 4. For iris, the original algorithm (I-ELM) takes roughly twice the training time of the modified algorithm (VSI-ELM). The VSI-ELM algorithm therefore trains faster on multiclassification samples.

Numerical analysis shows that the VSI-ELM algorithm can find the optimal number of hidden layer neurons, converges faster than the I-ELM algorithm, makes the ELM network structure more compact, and has stronger generalization ability.

4. Practical Application Based on the VSI-ELM Algorithm

4.1. Research Background of Broken Wire Detection

Mine hoisting wire rope is one of the most critical components of the coal mine transportation system. It is responsible for the transportation of personnel, coal, and equipment, and its working condition is directly related to the safe and orderly production of the coal mine. Because hoisting wire rope is subject to long-term friction, humidity, corrosion, and other harsh production conditions, and bears repeated tensile and bending loads, structural damage such as broken wires, abrasion, and corrosion will inevitably appear, which reduces the strength of the rope and endangers its safe operation. With increasing mining depth, the requirements on wire rope operating under high-speed, long-term, heavy-load conditions have become exigent. However, the complex structure of wire rope and the uncertainty of damage type and location bring many technical problems to nondestructive detection of wire rope, especially with the magnetic flux leakage method: the relation between the magnetic field change and the structure, movement mode, and stress change of the rope becomes more complex, which complicates MFL signal detection. Magnetic flux leakage (MFL) testing is an efficient nondestructive technique for defective wire rope and plays an important role in its dynamic monitoring [26–28]. Because of the intricate structure of wire rope, the relation between the diverse damages and the MFL signals is complicated. Permanent magnets are characterized by small volume, low cost, light weight, high magnetic field, no power requirement, and easy deployment and installation. The MFL signals are gathered by arrays of Hall effect sensors arranged around the circumference, clinging to the outer surface of the wire rope [29], so the MFL signals are influenced by the lift-off distance, velocity effect, shaking, and the various properties of the defects.
In a multistage ring MFL detection device, the MFL signal of each channel differs from that of the other channels [30], and all of these influencing factors are important for the design of the subsequent signal processing. In recent years, a large number of defect detection methods have made great progress in the monitoring of wire rope; meanwhile, some issues still need to be resolved. It is therefore an important and urgent research task to find the simplest and fastest method for fault feature extraction of broken wires. The effect of variable tensile stress on the MFL signal response of defective wire ropes has been analyzed [31]. A filtering system consisting of the Hilbert–Huang transform and compressed sensing has been used to obtain the defect MFL image characteristics of wire rope, and the extracted characteristics were used as the input of a radial basis function neural network to identify the defects [32].

To remove the effects of channel-to-channel mismatch, an adaptive method for MFL channel equalization based on PCA and ELM has been proposed [33]. For the classification of MFL signals of different broken wires, neural networks are very popular methods: the BP neural network was employed for the quantitative identification of broken wires [34], an improved radial basis function neural network was applied to the quantitative identification of defective wire rope [35], and the wavelet neural network was used for the prediction and diagnosis of hoisting wire rope [36]. Therefore, the choice of an identification method for broken wires in wire rope remains to be solved.

4.2. Experimental Study of the Classification of Different Degrees of Broken Wires

In this paper, a new MFL detection device is used to obtain the MFL signal; the device is shown in Figure 6. Twenty-four Hall sensors are distributed over the space of the MFL detection device, grouped in threes: the acquisition board includes Hall sensor arrays in three different directions, and each direction comprises 8 channels of Hall sensors uniformly arranged on the annular circuit board. The 24 Hall sensors measure the magnetic flux leakage of defective wire rope, and the necessary amplification and filtering are applied before the MFL signal is recorded. The multichannel MFL signals are transmitted to the acquisition system, and the time-domain and time-frequency-domain characteristics of the MFL signals of the diverse wire ropes are analyzed. In order to train the VSI-ELM algorithm, some normal samples are also needed in this experiment. When broken wires appear in the wire rope, the mixed-feature vector can be used as the effective characteristic input of the quantitative identification. To avoid the training sample set becoming too large, the length of each sample is set to a fixed length (2048 data points). Table 5 shows the characteristic samples of the MFL signals of broken wires, including the sequence number, the peak of the MFL wave, the width of the MFL wave, the area under the MFL wave, the diameter of the wire rope, the lift-off distance, and the damage type. In this section, VSI-ELM is used to classify the characteristics of the MFL signals of different broken wires. The characteristic samples are divided into 80 training samples and 80 testing samples. The classification accuracy for broken wires reaches a best value of 97.5% with VSI-ELM. Compared with the I-ELM algorithm, VSI-ELM not only obtains the optimal number of hidden nodes but also achieves a fast convergence rate. The experimental results show that the VSI-ELM algorithm achieves faster classification and higher classification accuracy for different degrees of broken wires.
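The Table 5 characteristics map naturally onto an ELM input layer. The sketch below shows one plausible way to assemble them, with entirely synthetic placeholder values (the paper's measured samples are not reproduced) and a hypothetical `make_targets` helper that one-hot encodes the damage type for the network's output layer.

```python
import numpy as np

# Each input row collects the Table 5 characteristics:
# [peak of MFL wave, width of MFL wave, area under MFL wave,
#  rope diameter, lift-off distance]. Values below are synthetic
# placeholders, not measured MFL data.

def make_targets(labels, n_classes):
    """One-hot encode damage-type labels as ELM output targets."""
    T = np.zeros((len(labels), n_classes))
    T[np.arange(len(labels)), labels] = 1.0
    return T

features = np.array([
    [0.82, 12.0,  6.4, 28.0, 2.0],   # hypothetical sample, damage type 0
    [1.57, 15.0, 14.1, 28.0, 2.0],   # hypothetical sample, damage type 1
])
targets = make_targets([0, 1], n_classes=4)
```

Rows like these would form the 80 training and 80 testing samples fed to VSI-ELM, with the predicted class taken as the output node of largest activation.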

5. Conclusions

In this paper, the theory of ELM based on the single-hidden layer feed-forward neural network is reanalyzed. The classification model of ELM is theoretically deduced, and the existing improved ELM methods are compared. The number of hidden layer neurons of ELM is analyzed with emphasis, since the key issue is the hidden layer neuron growth strategy; this article therefore focuses on analyzing the influence of the number of hidden layer nodes on the performance of ELM.

The numerical simulation analysis on the UCI data sets is used to test the effect of the number of hidden layer neuron nodes of ELM. Through comparative analysis, it is found that I-ELM algorithm 2 has better performance; it is verified that I-ELM algorithm 2 is more conducive to completing sample training by stacking hidden layer neurons. Based on the above analysis, a novel adjustment strategy for the hidden layer neuron nodes of ELM, the VSI-ELM algorithm, is proposed in this paper. The feasibility of VSI-ELM is verified on the UCI classification data sets (statlog (heart), diabetes, parkinsons, and iris). The training-time ratios of I-ELM to VSI-ELM on statlog (heart), diabetes, parkinsons, and iris are 50.64, 18.08, 28.89, and 2.86, respectively. The experimental results show that the VSI-ELM algorithm can find the best number of hidden layer neuron nodes faster than the I-ELM algorithm. Finally, the VSI-ELM algorithm is applied to identify the characteristics of the MFL signals of different broken wires, and the classification accuracy for broken wires reaches 97.5%.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was financially supported by the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions, the University-Industry Cooperation Research Project in Jiangsu Province under Grant no. BY2016026-02, and the State Key Laboratory of Integrated Services Networks under Grant no. ISN10-10.