Abstract

Railway transportation is a commonly used mode of transportation in people’s daily lives, so its normal operation is crucial. The track circuit, a key component of the railway transportation system, is prone to malfunctions caused by environmental factors. However, track circuit faults are still diagnosed mainly on the basis of the experience of on-site personnel. To improve the efficiency and accuracy of fault diagnosis, we propose an intelligent fault diagnosis system. Because the fault data are one-dimensional time series, this paper presents a fault diagnosis method based on the UNet-LSTM network (ULN). An LSTM network is first built on the fault data and used for ZPW-2000A track circuit fault diagnosis. However, a single LSTM network has a high error rate when diagnosing common track circuit faults. Therefore, this paper proposes a feature extraction method based on the UNet network, which extracts features from the original data and then inputs them into the LSTM network for fault diagnosis. Experiments with on-site fault data verify that this method can accurately classify seven common track circuit faults. Finally, the superiority of the method is verified by comparing it with other commonly used fault classification methods.

1. Introduction

In recent years, China’s railways have been developing continuously. With the progress of technology and the needs of people’s lives, the frequency of train operation in stations is increasing. Therefore, the requirements for the safety and reliability of railway signal systems are increasingly strict. The track circuit is a key part of the railway signal system, which directly affects the operating efficiency of the whole traffic system. When a track section is occupied by a train, the voltage received by the main rail and the small rail of the track circuit is lower than the voltage required for normal operation. This results in the activation of red light indications, prohibiting following trains from entering the section. When the track section is unoccupied but there is a fault in the track circuit equipment, the railway signal system will also activate red light indications, prohibiting rear vehicles from entering the section to ensure safety. At this time, the personnel need to quickly determine the location of the fault and carry out repairs to restore train operations promptly. In order to rapidly diagnose the fault location, it is necessary to monitor the voltage data in the track circuit in real time, and the personnel need to be proficient in recognizing the voltage changes caused by different equipment faults.

However, because a large number of track circuit devices are installed outdoors, they are susceptible to environmental influences and can experience various types of failures. Currently, the diagnosis of track circuit failures in practical work mainly relies on the experience of personnel. However, humans cannot sustain an optimal working condition the way machines can, and manual fault diagnosis is prone to interference, which limits its efficiency and accuracy. Therefore, an intelligent fault diagnosis method is needed to improve fault diagnosis efficiency. Traditional fault diagnosis uses a support vector machine (SVM), fuzzy clustering, and other methods. However, these methods handle large-scale data poorly and have difficulty classifying new problems. Fuzzy clustering has high computational complexity because it involves fuzzy set calculations and iterative optimization, so large datasets require more time and computational resources. Similarly, the computational complexity of an SVM grows with the number of training samples, so both training time and storage requirements increase significantly on very large datasets. Moreover, the original SVM algorithm was designed for binary classification problems. Applying it to multiclass problems requires extension techniques, which increase computational overhead, especially when there are many classes.

In traditional fault diagnosis, people usually rely on expert knowledge and rules to judge and locate problems. Intelligent fault diagnosis, on the other hand, utilizes the learning ability and pattern recognition capability of neural networks to automatically judge and diagnose faults by analyzing and processing input and output data from systems or devices.

At present, methods based on deep learning are widely used in the fault diagnosis field. Jiao et al. proposed a method to improve model performance by adding batch normalization and dropout layers to the convolution and pooling layers [1]. Lu et al. proposed a method to store new faults in the database for analysis [2]. Zhang et al. proposed a method for fault diagnosis after converting one-dimensional data into two-dimensional graphs [3]. Huang et al. proposed the CNN + LSTM model [4–6], in which a CNN is used to extract features and an LSTM is used to classify faults. Huang et al. proposed a method to obtain data samples by establishing a simulation model and then using a CNN for classification [7].

In the track circuit fault diagnosis field, Zhao et al. established a tuning zone simulation model according to the working principle of the track circuit tuning zone [8] and built a backpropagation neural network for tuning unit fault diagnosis. Chen et al. used a neurofuzzy system [9], which combines the advantages of fuzzy logic and neural networks and can be trained through the neural network training process. Bruin et al. proposed processing data with a long short-term memory (LSTM) network [10] that considers the temporal and spatial correlation of the data. By comparing the fault diagnosis results with those of a convolutional network, they concluded that the LSTM was better than the CNN. Chen et al. established a fault diagnosis model based on ZPW-2000A track circuit experimental data and combined it with kernel principal component analysis and a stacked autoencoder network (KPCA-SAD) [11], which can locate faults in the track circuit. Hu et al. proposed a fault diagnosis method that combines gray theory with an expert system [12]. Sun and Zhao proposed a track circuit fault diagnosis method based on an SVM [13]. Short-circuit current signals were obtained by establishing a simulation model and classified with the SVM, whose classification accuracy was 96%, which is relatively low. Zheng et al. proposed the use of an optimized particle swarm algorithm to optimize deep belief networks [14], which improved the robustness and accuracy of the network. Lin et al. proposed a fault diagnosis method based on rough set and graph theory for ZPW-2000A uninsulated track circuits [15] and introduced the new concept of a fault decision chart for fault diagnosis. The detailed information of the references used in the above content is shown in Table 1.

This paper presents a track circuit fault diagnosis method based on the UNet-LSTM network (ULN). The network is composed of a feature extraction module and a fault classification module. The feature extraction module is a UNet network, which extracts features from the original data and reduces the sample feature size from 600 to 38. This approach not only alleviates the curse of dimensionality and improves training efficiency but also allows the model to better learn key features from the data, thereby enhancing generalization performance. The track circuit fault data are one-dimensional time series. Because the LSTM network has advantages in dealing with time series problems, it is used as the classifier in the fault classification module, which takes the extracted features as input. The dataset used in this paper consists of the main and small track voltages of three adjacent track circuits, which combines temporal and spatial information to improve the classification performance of the model.

2.1. The Basic Principle of ZPW-2000A Track Circuits

In railway transportation systems, the ZPW-2000A track circuit is commonly used to ensure safety and the efficiency of train transport. It detects whether there is a train occupying a section of track and transmits relevant information [16]. The ZPW-2000A track circuit consists of two rails and several devices, as shown in Figure 1. When the devices are functioning normally, the transmitter provides power, which is delivered to the surface of the rails through the SPT cable and matching transformer. At this point, the signal is transmitted simultaneously to the main rail and the small rail (the small rail connects to the adjacent track) and eventually reaches the receiver. When the received main track and small track voltages do not meet the required voltage for the operation of the device, the track will generate a red light strip to sound an alarm.

This article primarily focuses on the important components of the ZPW-2000A track circuit, including the matching transformer, tuning unit, rail, compensating capacitor, and attenuator. When the ZPW-2000A track circuit is functioning properly, the voltage waveforms received by the main rail and small rail are continuous and stable. However, if any of the devices experience a malfunction, the received voltage on the main rail and small rail will change. Different equipment failures will result in different voltage waveforms on the main rail and small rail. Therefore, the voltage changes on the main rail and small rail in the ZPW-2000A track circuit can be used as a basis for fault diagnosis.

In practical work, when a device in the track circuit malfunctions, it will send an alarm to the personnel. The staff determine the type of fault by examining real-time monitored track voltage data. However, manual fault diagnosis has certain limitations. This method relies too heavily on the experience of the staff. When the staff are fatigued or not fully focused, the efficiency and accuracy of fault diagnosis can be affected. Establishing an intelligent fault diagnosis model can reduce the workload of personnel and improve the efficiency and accuracy of fault diagnosis.

In actual railway systems, the route for train operation consists of a series of interconnected track circuits. When one track circuit experiences a malfunction, its adjacent track circuits will also be affected to varying degrees. Therefore, we took the voltage data of the main rail and small rail from the faulty track circuit and its two adjacent track circuits as experimental data.

3. Method

3.1. Long Short-Term Memory Network (LSTM)

A long short-term memory network (LSTM) [17] is a kind of recurrent neural network that is specifically designed to solve the long-term dependence problem of an ordinary recurrent neural network (RNN). An LSTM cell consists of a memory cell Ct and three gate structures (input gate it, forget gate ft, output gate ot), as shown in Figure 2. At moment t, Xt denotes the input data, Ht denotes the hidden state, “X” denotes the element-wise product of vectors, and “+” denotes the superposition (addition) operation. Forward propagation through the LSTM is computed from these quantities. Each gate uses the sigmoid function as its activation function, which filters the input data to keep the useful information and discard the useless information. The memory cell Ct has a memory function: the filtered data are saved in it and passed on to later time steps. Therefore, the LSTM network can solve the problem of long-term dependency.
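For reference, the forward pass of a standard LSTM cell can be written as below in the notation of this section. This is the common textbook form, given as a sketch: the weight matrices $W_f, W_i, W_c, W_o$, the biases $b_f, b_i, b_c, b_o$, the candidate state $\tilde{C}_t$, the sigmoid $\sigma$, and the element-wise product $\odot$ are notation assumed here, and the authors' exact formulation may differ in detail.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[H_{t-1}, X_t] + b_f\right) \\
i_t &= \sigma\left(W_i\,[H_{t-1}, X_t] + b_i\right) \\
o_t &= \sigma\left(W_o\,[H_{t-1}, X_t] + b_o\right) \\
\tilde{C}_t &= \tanh\left(W_c\,[H_{t-1}, X_t] + b_c\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
H_t &= o_t \odot \tanh\left(C_t\right)
\end{aligned}
```

The forget gate $f_t$ scales the previous memory $C_{t-1}$, the input gate $i_t$ scales the new candidate $\tilde{C}_t$, and the output gate $o_t$ decides how much of the updated memory is exposed as the hidden state $H_t$.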

3.2. Framework

The data-driven [18] UNet-LSTM network framework proposed in this paper is shown in Figure 3. First, the collected dataset is input into the feature extraction module, and feature extraction is performed through the UNet network [19]. Then, the extracted features are input into the fault classification module, and the LSTM network is used for classification.

3.3. Feature Extraction Module

The data used in this paper are one-dimensional time series. Time series have both local and global characteristics, and a model must consider both to process time series with high accuracy. Adjacent values in a time series are highly correlated, so global features are extracted by exploiting the strength of the LSTM network in processing data with long-term dependencies. For local time series features, this paper adopts a convolutional neural network model based on the UNet network. The feature extraction model is shown in Figure 4.

The structure of the feature extraction method is divided into left and right sections. The left section performs convolutional downsampling, which reduces the sequence feature size of the original data from 600 to 75. The right section performs upsampling. First, the features from the last downsampling are deconvolved and spliced (concatenated) with the features extracted from the second downsampling. Then, the spliced data are upsampled once more and spliced with the features from the first downsampling. Finally, a 1 × 3 convolution with a stride of 8 reduces the number of feature channels to 1 and the sequence feature size to 38, and the result is input to the LSTM network. In our experience, setting the convolutional kernel size to 3 in the convolution operations during upsampling and downsampling tends to yield better performance. In addition, a feature extraction block is added at the second downsampling to extract deep features once. Its structure is shown in Figure 5.
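The sequence lengths quoted above can be checked with the usual output-length formulas for 1-D convolution and transposed convolution. The sketch below is only a consistency check under stated assumptions: three stride-2 downsampling convolutions with padding 1, and two transposed convolutions with kernel 2 and stride 2. The text gives only the kernel size 3, the final stride 8, and the sizes 600, 75, and 38, so the other settings and the helper names are hypothetical.

```python
def conv1d_out_len(n, kernel, stride, padding):
    """Output length of a 1-D convolution (standard floor formula)."""
    return (n + 2 * padding - kernel) // stride + 1


def deconv1d_out_len(n, kernel, stride, padding=0):
    """Output length of a 1-D transposed convolution without output padding."""
    return (n - 1) * stride - 2 * padding + kernel


# Left section: three downsampling convolutions (stride 2 and padding 1 are assumed).
length = 600
encoder_lengths = []
for _ in range(3):
    length = conv1d_out_len(length, kernel=3, stride=2, padding=1)
    encoder_lengths.append(length)

# Right section: two upsampling steps (kernel 2, stride 2 are assumed).
# Splicing along the channel axis does not change the sequence length.
up1 = deconv1d_out_len(encoder_lengths[2], kernel=2, stride=2)
up2 = deconv1d_out_len(up1, kernel=2, stride=2)

# Final 1 x 3 convolution with stride 8, as described in the text.
final_length = conv1d_out_len(up2, kernel=3, stride=8, padding=1)

print(encoder_lengths, up1, up2, final_length)  # [300, 150, 75] 150 300 38
```

Under these assumptions the lengths of the upsampled features match those of the second and first downsampling outputs, which is what makes the splicing possible, and the final stride-8 convolution yields the 38 features passed to the LSTM.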

The principle of the feature extraction block is as follows. First, the features after the second downsampling are taken as the input, and deep features are extracted through three convolution layers. To extract features at different scales, the convolutional kernel sizes of these layers are set to 3, 5, and 7. Then, the three convolved features are spliced together. Finally, a further convolution adjusts the number of channels so that it meets the requirements for splicing with the right (upsampling) section.
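The idea of the block can be illustrated with a small pure-Python sketch: three convolutions with kernel sizes 3, 5, and 7 are applied to the same input with zero padding so that their outputs have equal length, the results are stacked as channels, and a pointwise (kernel size 1) convolution fuses them. The averaging kernels, the fusion weights, whether the three convolutions run in parallel, and the function names are all assumptions for illustration; in the real network these weights are learned and each branch has many channels.

```python
def conv1d_same(signal, kernel):
    """1-D cross-correlation with zero padding so the output length equals the input length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def multi_scale_block(signal, kernel_sizes=(3, 5, 7)):
    """Apply one convolution per kernel size and return the outputs as channels."""
    # Placeholder averaging kernels; a trained network would learn these weights.
    return [conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes]


def fuse_channels(channels, weights):
    """Pointwise convolution: weighted sum across channels at each position."""
    return [
        sum(w * channel[t] for w, channel in zip(weights, channels))
        for t in range(len(channels[0]))
    ]


signal = [float(t % 10) for t in range(150)]  # toy input of length 150
channels = multi_scale_block(signal)
fused = fuse_channels(channels, weights=[1 / 3, 1 / 3, 1 / 3])
print(len(channels), [len(c) for c in channels], len(fused))  # 3 [150, 150, 150] 150
```

Because every branch preserves the sequence length, the branch outputs can be spliced along the channel axis, and the pointwise convolution then sets the channel count expected by the upsampling path.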

3.4. Fault Classification Module

The fault classification module includes an LSTM layer and a fully connected (FC) layer and constructs a complex nonlinear model between the input and output. As shown in Figure 6, in the model proposed in this paper, features are extracted from the original time series and then used as the input to the LSTM network. The LSTM layer is trained on the input data. The softmax function is used in the fully connected layer, and the output of the last neuron in LSTM_1 is transformed into a probability distribution over the seven fault types. There are seven types of track circuit faults, so the number of neurons in the fully connected layer is 7.
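The last step of the classifier, turning the seven outputs of the fully connected layer into a probability distribution, can be sketched as follows. The logit values and the class names (and their order) are hypothetical; only the number of classes, seven, comes from the text.

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical class order: the normal state followed by six fault types.
CLASS_NAMES = ["normal"] + [f"fault {k}" for k in range(1, 7)]

# Hypothetical outputs of the 7-neuron fully connected layer for one sample.
logits = [0.2, -1.0, 0.5, 0.1, -0.3, 3.1, 2.4]
probabilities = softmax(logits)
predicted = CLASS_NAMES[probabilities.index(max(probabilities))]
print(predicted)
```

The predicted class is the one with the highest probability. Two classes with close logits, as in the last two entries here, receive comparable probabilities, which is how similar fault waveforms can lead to confusion between classes.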

4. Experiments

In this section, we evaluate the proposed method on an actual ZPW-2000A track circuit fault dataset. The network is trained and tested with a dataset containing six different faults and one healthy state.

4.1. Datasets

The dataset used in this paper was collected from real field measurements of the ZPW-2000A track circuit. The dataset includes the main and small track voltages of three adjacent track circuits. The small track voltage of the rear track circuit is hardly affected, so voltage data are not collected for it. The collected data therefore include the main and small track voltages of the faulty track circuit, the main and small track voltages of the track circuit in front of the faulty track circuit, and the main track voltage of the track circuit behind the faulty track circuit. When a failure occurs, the voltage changes of each track circuit typically stabilize within two minutes. Therefore, voltage data are collected every second within a two-minute period as experimental data. The dataset consists of 17,500 training samples and 10,500 test samples. The article studies seven track circuit states (six fault types and the normal state), and the amount of data for each state is the same. The specific information is shown in Table 2.
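One plausible reading of how a sample is assembled is sketched below: five voltage channels, each sampled once per second for two minutes (120 readings), are concatenated into one vector of length 600, which matches the input feature size quoted for the feature extraction module. The concatenation itself, the channel order, and the channel names are assumptions; the text does not state how the five series are arranged.

```python
SECONDS_PER_WINDOW = 120  # one reading per second over two minutes

# Hypothetical channel names and order for the five collected voltages.
CHANNELS = [
    "front_main",
    "front_small",
    "faulty_main",
    "faulty_small",
    "rear_main",
]


def build_sample(recordings):
    """Concatenate the five voltage series into one flat feature vector."""
    sample = []
    for name in CHANNELS:
        series = recordings[name]
        if len(series) != SECONDS_PER_WINDOW:
            raise ValueError(f"{name}: expected {SECONDS_PER_WINDOW} readings")
        sample.extend(series)
    return sample


# Placeholder readings (all zeros), used only to check the resulting shape.
recordings = {name: [0.0] * SECONDS_PER_WINDOW for name in CHANNELS}
sample = build_sample(recordings)
print(len(sample))  # 600

# Sample counts stated in the text: 17,500 training + 10,500 test over 7 classes.
total_samples = 17_500 + 10_500
samples_per_class = total_samples // 7
print(total_samples, samples_per_class)  # 28000 4000
```

With equal class sizes, the stated split corresponds to 2,500 training and 1,500 test samples per class.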

4.2. Model Setting

The parameters used in the model are set according to the feature extraction method and fault identification network proposed in this paper. Specific parameter information is shown in Table 3.

4.3. Model Training

The cross-entropy function is used as the loss function of the LSTM network to evaluate network performance. After the model produces its output, the cross entropy between the output value and the sample label is calculated. The cross entropy is an effective measure of the similarity between the actual output and the expected output. The function formula is as follows:
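A standard form of the cross-entropy loss that is consistent with the symbols used in this section (a label $y \in \{0, 1\}$ and a predicted probability $\hat{y} \in (0, 1)$) is given below as a sketch. The averaging over $N$ samples and the index $n$ are assumptions, and the authors' exact expression may differ, for example by summing over the seven classes.

```latex
L = -\frac{1}{N} \sum_{n=1}^{N} \left[ y_n \log \hat{y}_n + \left(1 - y_n\right) \log\left(1 - \hat{y}_n\right) \right]
```

The loss is close to zero when $\hat{y}_n$ is close to $y_n$ and grows without bound as the predicted probability of the true class approaches zero.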

Here, y is the real label value (1 for the positive class, 0 for the negative class), and ŷ is the predicted probability value (ŷ ∈ (0, 1)). The loss represents the difference between the real sample label and the predicted probability, and it is used to update the network parameters through backpropagation. The optimizer is stochastic gradient descent (SGD), with the initial learning rate set to 0.1 and gradually reduced over the iterations. The work was implemented in Python with the PyTorch library and executed on a system with a GeForce RTX 2080 Ti GPU running Windows 10.
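The optimizer setting can be illustrated with a minimal sketch of SGD with a decaying learning rate. Only the initial learning rate of 0.1 and the fact that it is gradually reduced come from the text. The exponential schedule, the decay rate of 0.95, and the toy quadratic loss are assumptions chosen for illustration.

```python
def decayed_lr(initial_lr, epoch, decay_rate=0.95):
    """Exponential decay (assumed schedule): lr = initial_lr * decay_rate ** epoch."""
    return initial_lr * decay_rate ** epoch


def sgd_step(params, grads, lr):
    """Plain SGD update: move each parameter against its gradient."""
    return [p - lr * g for p, g in zip(params, grads)]


# Toy problem: minimize sum(p ** 2), whose gradient is 2 * p.
params = [1.0, -2.0]
for epoch in range(20):
    grads = [2.0 * p for p in params]
    params = sgd_step(params, grads, decayed_lr(0.1, epoch))

print(params)
```

In the actual model the gradients come from backpropagating the cross-entropy loss through the LSTM and UNet layers rather than from a closed-form expression.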

4.4. Experimental Results

The dataset contains 4,000 samples for each of the six fault states and the normal state, totaling 28,000 samples. The data are classified after feature extraction according to the proposed method. Among the LSTM network parameters, the number of layers and the number of neurons have a significant impact on the experimental results. Therefore, this paper studies these two parameters and selects the best combination, evaluating the classification ability of UNet-LSTM when the number of LSTM neurons is 4, 8, 16, and 32 and the number of LSTM layers is 1–3. Figure 7 shows the accuracy of UNet-LSTM under different parameters. It can be seen from the figure that the accuracy is highest, at 99.85%, when the number of LSTM layers is 1 and the number of LSTM neurons is 16. As the number of network layers increases, the model overfits and the accuracy decreases.
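The parameter study amounts to a grid search over 4 × 3 = 12 configurations, which can be sketched as follows. The function `grid_search` and the `evaluate` callback are hypothetical names. In practice `evaluate` would train UNet-LSTM with the given settings and return its test accuracy; here a stand-in scoring function is used that trains nothing and reports no real accuracies.

```python
from itertools import product


def grid_search(evaluate, neuron_options=(4, 8, 16, 32), layer_options=(1, 2, 3)):
    """Evaluate every (layers, neurons) pair and return the best one with all scores."""
    results = {
        (layers, neurons): evaluate(layers, neurons)
        for layers, neurons in product(layer_options, neuron_options)
    }
    best = max(results, key=results.get)
    return best, results


def placeholder_score(num_layers, num_neurons):
    """Stand-in for training and testing; chosen only so the search has a unique maximum."""
    return -abs(num_neurons - 16) - 10 * (num_layers - 1)


best, results = grid_search(placeholder_score)
print(len(results), best)  # 12 (1, 16)
```

Replacing `placeholder_score` with a real training routine turns the sketch into the experiment described above, at the cost of twelve full training runs.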

Figure 8 shows the UNet-LSTM accuracy and loss under the optimal parameter combination. The results are recorded every 400 iterations. The model starts to converge from the 40th epoch and maintains an accuracy of over 95% thereafter, demonstrating good convergence speed and stability. Its precision, recall, and F1 score are shown in Table 4.

The LSTM network without the feature extraction method proposed in this paper is tested with the same dataset. To eliminate the randomness of the experiment, each experiment was repeated ten times. Table 5 displays the highest, lowest, and average accuracies of the two methods in ten experiments. According to the data in the table, the maximum accuracy of using the LSTM network to diagnose track circuit faults is 85.71%, and the average accuracy is 84.28%. The maximum accuracy using the ULN method proposed in this article to diagnose track circuit faults is 99.85%, and the average accuracy is 99.75%. Thus, the UNet-LSTM network greatly improved the accuracy of track circuit fault diagnosis.

Figure 9 shows the highest accuracy of track circuit fault diagnosis obtained with the two methods. According to the figure, the LSTM network cannot identify faults 5 and 6 well, with an accuracy of only 50%. The reason is that the voltage change waveforms of these two faults are similar. With the UNet-LSTM network proposed in this paper, the fault identification accuracy is more than 99%, so the fault types in the track circuit are identified well.

In Figure 10, (a) is the confusion matrix of UNet-LSTM network classification, and (b) is the confusion matrix of LSTM network classification. In the test, 10,500 samples were classified. The LSTM network misidentified 1,500 of them, with fault 5 and fault 6 being confused because of the similarity between the two fault data. The UNet-LSTM network misdiagnosed fifteen samples because of the similarity between the original data of fault 1, fault 5, and fault 6. Overall, the fault diagnosis results verify that the proposed UNet-LSTM framework can effectively detect various track circuit faults.
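The relation between these error counts and the reported accuracies, and the way per-class precision, recall, and F1 are read off a confusion matrix, can be sketched as follows. The counts 10,500, 1,500, and 15 are taken from the text. The two-class confusion matrix is a hypothetical toy example and does not reproduce Figure 10.

```python
def accuracy(total, errors):
    """Fraction of correctly classified samples."""
    return (total - errors) / total


def per_class_metrics(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[r][k] for r in range(n)) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


lstm_accuracy = accuracy(10_500, 1_500)  # about 0.8571
uln_accuracy = accuracy(10_500, 15)      # about 0.9986

# Toy matrix: class 0 is always recognized, half of class 1 is mistaken for class 0.
toy_confusion = [[1500, 0], [750, 750]]
toy_metrics = per_class_metrics(toy_confusion)
print(lstm_accuracy, uln_accuracy, toy_metrics)
```

The first value corresponds to the 85.71% maximum accuracy reported for the plain LSTM network, and the second is consistent with the 99.85% reported for UNet-LSTM.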

4.5. Experimental Comparison

Experiments are conducted with commonly used fault classification methods on the same dataset, and the results are compared with those of the method used in this paper. The comparison results are shown in Table 6. The accuracy of the method used in this paper is 99.85%, which is higher than that of the other three fault diagnosis methods (the highest of which is 86.23%).

5. Conclusion

We propose an effective track circuit fault diagnosis method (UNet-LSTM). The method uses voltage data from three adjacent track circuits as experimental data, whose temporal and spatial features improve the accuracy of fault diagnosis. We introduce a UNet feature extraction module to extract features at different scales. In the UNet module, we add a deep feature extraction block to extract deeper features. Finally, we build an LSTM network for fault diagnosis. The experimental results show that the average accuracy of this method for track circuit fault diagnosis is 99.75% (maximum 99.85%), and the superiority of this method is verified by experimental comparison with other methods.

Data Availability

Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request. The data include the track circuit fault dataset, and the code includes some .py files written in Python.

Conflicts of Interest

The authors declare no conflicts of interest with respect to the research, authorship, and/or publication of this article.