Abstract

For pipes connected by pipe joints, leaks in the pipeline system are more likely to occur at the joints than in the tube itself, so early detection is critical to the safety of the pipeline system. Based on acoustic emission (AE) techniques, this paper presents an experimental study of small leak detection in gas distribution pipelines where the leak is caused by loosening of the pipe joint connection. First, the acoustic characteristics of the leak signals are studied; then, features of the signals are extracted. Finally, a classifier based on the support vector machine (SVM) is established, and the qualified features are selected to detect the leak. It is verified that the main frequency content of the AE small leak signal due to pipe joint failure is concentrated in the range of 33–45 kHz, and that the SVM-based algorithms can all reach an estimation accuracy of 98%, whichever kernel function is used, with the feature “envelope area” or the feature set {standard deviation (STD), root mean square (RMS), energy, average frequency}.

1. Introduction

Pipes are used to transport gas in many applications. A serious issue in this process is leakage in the pipeline system, which reduces the efficiency of pipeline transportation and threatens the safety of personnel working in the environment. Efficient and convenient pipeline leak detection and location is therefore very important for the maintenance and management of pipeline systems. In civil engineering in particular, a variety of pipes, such as gas drainage pipes in coal mines, urban water supply pipes, and oil conveying pipes, are used either as distribution networks (DNs) or as transmission mains (TMs). As described in [1], the methods for addressing leakage in transmission mains differ enough from those for distribution networks to warrant separate consideration. In general, transmission mains are buried deeper underground and in less accessible locations than distribution lines, which often makes leak detection impractical. Many methods have been proposed for leak detection in transmission lines, including fiber optic methods [2], thermography methods [3, 4], balance methods [5], and tracer methods [6]. These methods rely on a substantial hardware and software foundation, and their false alarm rate is relatively high. When a pipeline leaks, elastic stress waves are induced and generate AE signals that characterize the leak; in recent years, acoustic emission technology has been used to detect leaks on this basis. For identifying the leak location, Han et al. [7] proposed a combination of the wavelet packet algorithm and a radial basis function network (RBFN). Lee et al. [8] presented a method based on theoretical analysis to detect a leak and confirmed that the wavelet transform (WT) is an effective tool for determining the source of the leak. Wu et al. [9] presented an improved leak localization method that combines two location methods. Xu et al. [10] presented a multi-level location method for regional location and precise location.

In recent years, transient test based techniques (TTBTs), a newly introduced approach, have been gaining interest as an economical and effective way to diagnose the condition of a pipe network [11]. Meniconi and Brunone et al. [12, 13] proposed the possibility of using TTBTs for the location and sizing of branches. Branches are of interest because their characteristics are intermediate between those of the classical DN and TM systems mentioned above.

Although there are many methods for leak location identification, they were proposed for cases in which the pipeline has a large leak located in the tube itself. In reality, a piping system is made up of many single tubes, and leaks tend to occur not in the tube itself but at the pipe joints. For reasons of security and tightness, pipe joints are widely used as connectors in high-precision piping systems. For a gas distribution pipe under pressure, the leak is most often caused by loosening of the pipe joint connection, so it is necessary to identify such leaks effectively.

A common method for leak detection using AE signals is parameter analysis. Characteristic indices that can discriminate leak signals from environmental signals are selected, such as time domain parameters (peak, average, root mean square, energy, and others) and frequency domain parameters (peak frequency, average frequency, and others). Li et al. [14] conducted an experimental study to extract effective parameters for detecting leaks in a water distribution system subject to failure of a socket joint. Yu and Li [15] conducted an experimental investigation to extract effective parameters for detecting small leaks in a galvanized steel pipe due to screw thread loosening. Quy et al. [16] constructed a new signal containing information about leak symptoms and extracted effective features from it for leak detection. However, the limitation of this approach is that the selected parameters are not universal across different leak conditions. At the same time, some sophisticated methods are widely used for signal denoising and feature extraction, such as modal decomposition, empirical mode decomposition, wavelet theory, and the Hilbert–Huang transform (HHT) [17–20]. However, these complex signal processing methods have so far been applied to large pipeline leaks.

Another major problem in pipeline leak analysis is to identify leaks promptly with few false alarms. At present, modern pattern recognition techniques are widely used in engineering applications and image analysis [21–24]. Popular algorithms for leak classification include k-nearest neighbor (KNN), logistic regression (LR), artificial neural network (ANN), and support vector machine (SVM), but there is no consensus on which algorithm to use for diagnosis and recognition problems. Generally, neural networks and support vector machines perform better on recognition than the others [25]. Hence, this paper focuses on SVM, one of the most powerful classifiers in the literature. For leak evaluation, SVM-based two-class pattern classification has been proven to be an effective method for determining whether a leak-induced AE pattern exists in the analyzed AE signal. This paper studies the problem of gas pipe leaks. Two pipes isolated from the pipe system are connected by a pipe joint, and an experimental apparatus is set up to simulate the leak in the gas pipeline due to loosening of the threaded pipe joint. The acoustic characteristics of the pipeline leak signal are studied, and representative features of the leak signal are extracted. Then, a classifier based on a support vector machine is constructed, and features are selected according to classification performance to form the classification model for gas pipeline leak detection.

2. Experimental Setup

To simulate a leak due to failure of a pipe joint in the propulsion pipeline of a sounding rocket, an experimental setup was specifically designed. Two pipe segments of 20 mm diameter, 500 mm length, and 3 mm thickness were connected by a pipe joint. The pipes are made of aluminum alloy (AlCu6Mn). A plastic blind case is installed at the end of one of the pipes with a screw thread connection. As shown in Figure 1(a), gas is pumped into the pipeline by an air compressor, which also holds the internal pressure of the pipe at 5 bar. The leak is formed by loosening the pipe joint connection. To make the leak visible, the bubble leak detection method was used in the experiment: the leaking pipe is completely immersed in water, and bubbles are gradually generated at the leak location. As shown in Figure 1(b), an AE sensor was placed on the side of the pipe without tape, 5 mm away from the pipe joint. The AE signal was amplified by a preamplifier (PAC, MISTRAS, 2/4/6) operating at 60 dB gain and collected by an 8-channel AE data acquisition card (PAC, Micro-II Express, 1 MS/s) driven by the accompanying data processing software (PAC, AE Win). This AE sensing system was used to measure and collect the leak AE signal.

3. Procedure Correlating with Leak Detection

3.1. Extraction of Characteristic Indices

In this specific engineering case, the time and frequency characteristics of the leak AE signal generated by pipe joint loosening are not known in advance. When the pipe leaks, many features in the time domain and in the frequency domain of the signal each carry leak information, so characteristic indices extracted from both domains of the AE leak signal can be used to evaluate the experimental case [10]. Therefore, the most typical characteristics of the AE signal are listed: peak, mean, STD, RMS, and energy in the time domain, and skewness, kurtosis, peak frequency, and average frequency in the frequency domain. In addition, when the pipe leaks, the leak signal occupies a frequency band in the frequency domain. This leads to a bold hypothesis: if the frequency band of the leak signal can be enveloped, the envelope area of this characteristic frequency band may be an important characteristic index for evaluating the leak. So, the nine features and the envelope area are extracted to analyze the leak AE signal.
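As a minimal sketch of how the five time domain indices could be computed for one AE sample, the following Python function uses only the standard library. The function name, the population form of the standard deviation, and the definition of energy as the sum of squared amplitudes are assumptions made for illustration; the paper does not specify these details.

```python
import math


def time_domain_features(x):
    """Return peak, mean, STD, RMS and energy of one AE sample (a list of floats)."""
    n = len(x)
    mean = sum(x) / n
    # Population standard deviation (divides by n). Assumption: the paper
    # does not state whether the population or sample form is used.
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    # Assumption: energy is taken as the sum of squared amplitudes.
    energy = sum(v * v for v in x)
    rms = math.sqrt(energy / n)
    # Peak is taken as the largest absolute amplitude.
    peak = max(abs(v) for v in x)
    return {"peak": peak, "mean": mean, "std": std, "rms": rms, "energy": energy}
```

For a zero-mean signal the STD and RMS values coincide, so the two indices carry nearly the same information whenever the DC offset of the record is small.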

3.2. SVM-Based Leak Detection Algorithm

After the features are extracted, both the selection of features and the final accuracy of pattern recognition depend on a powerful classifier. Many intelligent classification algorithms have been developed for this kind of problem, and the SVM-based classification algorithm is a relatively mature one for two-class classification. For leak detection using acoustic signals in this work, SVM with the OVA (One-vs-All) algorithm is employed. The basic idea of SVM is to seek an optimal separating hyperplane as the decision surface, which classifies the two classes of data while maximizing the margin, that is, the distance between the hyperplane and the nearest points of each class. The corresponding learning process has two main stages, training and testing. Suppose there is a set of training samples $(x_i, y_i)$, $i = 1, 2, \ldots, n$, belonging to two classes, with $x_i \in \mathbb{R}^d$ and $y_i = \pm 1$. For the linear hyperplane

$$w \cdot x + b = 0,$$

where $w$ and $b$ are undetermined coefficients, the SVM solution can be transformed into a convex quadratic programming problem:

$$\min_{w,\, b} \; \frac{1}{2} \lVert w \rVert^2$$

subject to

$$y_i (w \cdot x_i + b) \ge 1, \quad i = 1, 2, \ldots, n.$$

The above SVM model only works for linearly separable samples. For practical classification problems, most of which are nonlinear, SVM maps the samples to a higher-dimensional (possibly infinite-dimensional) feature space through a nonlinear mapping and then solves the nonlinear classification problem with the methods used for linear problems. To avoid computing the nonlinear mapping explicitly, kernel functions are introduced to replace it. Finally, by applying the linear SVM method in feature space, the classification decision function can be obtained as follows:

$$f(x) = \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b,$$

where $n$ is the total number of training data points; $\alpha_i$ are the Lagrange coefficients, which are nonzero only for the support vectors (SVs); $K(x_i, x)$ is the predefined kernel function; and $b$ is the threshold constant. $f(x) > 0$ indicates one class, while $f(x) < 0$ represents the other.
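The decision function described above can be illustrated with a short Python sketch that evaluates f(x) from a given set of support vectors, labels, Lagrange coefficients, and threshold. The three kernels correspond to the linear, polynomial, and Gaussian RBF kernels in their common textbook forms; the function names and the default hyperparameters (`degree`, `coef0`, `gamma`) are hypothetical, since the paper does not report the values it used.

```python
import math


def k_linear(u, v):
    """Linear kernel: inner product of u and v."""
    return sum(a * b for a, b in zip(u, v))


def k_poly(u, v, degree=3, coef0=1.0):
    """Polynomial kernel; degree and coef0 are illustrative defaults."""
    return (k_linear(u, v) + coef0) ** degree


def k_rbf(u, v, gamma=1.0):
    """Gaussian RBF kernel; gamma is an illustrative default."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def decision(x, support_vectors, labels, alphas, b, kernel):
    """Evaluate f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(a * y * kernel(sv, x)
               for sv, y, a in zip(support_vectors, labels, alphas)) + b


# Toy 1-D example: support vectors at +1 and -1 with alpha = 0.5 each
# give w = 1 and b = 0, so f(x) = x under the linear kernel.
svs, ys, alphas = [[1.0], [-1.0]], [1, -1], [0.5, 0.5]
f_pos = decision([2.0], svs, ys, alphas, 0.0, k_linear)   # positive side
f_neg = decision([-0.5], svs, ys, alphas, 0.0, k_linear)  # negative side
```

Only the kernel argument changes between the three model variants; the training step that produces the coefficients is not shown here.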

It is well known that the choice of kernel function plays a decisive role in building the decision boundary. Three common kernel functions, the linear kernel (KL), the polynomial kernel (KP), and the Gaussian RBF kernel (KG), are employed in the training phase to find the optimal SVM solution. At the same time, considering the final classification accuracy, this paper uses an OVA (One-vs-All) classifier constructed with the features and trained with k-fold cross-validation, a widely used validation method. The training samples are divided into k subsets of equal size; each subset in turn is held out as the test subset while the other k − 1 subsets are used for training, and the classification accuracy is defined as the average accuracy over the k test results, which favors a model that generalizes well. This article uses 5-fold cross-validation.
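The k-fold procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the `fit` and `predict` callables stand in for SVM training and prediction, and the midpoint-threshold classifier below is a deliberately simple placeholder used only to make the example runnable.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Fold j takes every k-th shuffled index, so fold sizes differ by at most one.
    return [idx[j::k] for j in range(k)]


def cross_val_accuracy(samples, labels, fit, predict, k=5, seed=0):
    """Average test accuracy over k folds, each held out once."""
    folds = k_fold_indices(len(samples), k, seed)
    accuracies = []
    for j, test_idx in enumerate(folds):
        train_idx = [i for m, fold in enumerate(folds) if m != j for i in fold]
        model = fit([samples[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


def fit_midpoint(xs, ys):
    """Placeholder classifier (not an SVM): threshold midway between class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == -1]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_midpoint(threshold, x):
    return 1 if x > threshold else -1
```

Passing the classifier as a pair of callables keeps the fold logic independent of the model, so the same routine could be reused for each kernel and feature set.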

3.3. Selection of Characteristic Indices

After feature extraction, a total of 10 features are available from the raw AE signal data. Feature selection plays a significant role in filtering out the feature indicators that are suitable for intelligent recognition. In brief, if the classification ability of the features is weak, the classification result will be poor even if the classification algorithm is excellent, and if the feature set contains many redundant features, they will also interfere with the result of the classifier. Feature selection is a procedure for finding the features associated with the leak and discarding redundant, correlated ones. Because the method proposed in this paper is based on SVM classification, the indicators are selected by constructing the classifier: the better signal features are those that give higher accuracy with the three classifiers.

4. The Experimental Results and Performance Evaluation

4.1. Data Acquisition of Raw AE Signals

To better distinguish the leak signal from background noise, a broadband AE sensor (Physical Acoustics Corporation (PAC), S9208) is first used to capture the AE signal generated in the experiment. Furthermore, the leak is set relatively large so that the effect of a known leak source on the induced AE signal can be fully understood. Figure 2 gives a preliminary comparison of the time and frequency characteristics of the raw AE signals collected by the broadband sensor for environmental noise and for a small leak of 0.5 ml/s. It can be observed that (1) there is no obvious difference between the two AE signals in the time domain or the frequency domain; (2) the acoustic energy is concentrated in the frequency range of 20–50 kHz; (3) in Figure 2(b), one spike is located at 20 kHz regardless of whether a leak source exists, and its peak value is almost unchanged; (4) when the leak occurs, the peak values in the frequency range of 33–45 kHz change slightly. This change can be regarded as the effect of the known leak on the AE signal in the frequency domain.

It can be seen from Figure 2 that, for the rocket pipeline connected by the pipe joint, the characteristic frequency band of the leak AE signals caused by looseness of the pipe joint thread is about 33–45 kHz. Hence, a narrowband AE sensor (Physical Acoustics Corporation (PAC), R6a) with a center frequency of around 50 kHz is then used to receive the AE signal generated in the experiment. In addition, to gain a deeper understanding of the characteristics of the micro-leak AE signal, a small leak of 0.2 ml/s is set at the pipe joint as the leak source. The induced leak AE waveform is similar to the time domain waveform shown in Figure 2(a), and almost the same acoustic energy distribution appears in the frequency range of 20–50 kHz in Figure 3. There is still one spike located around 20 kHz. Two new findings are worth noting: (1) the peak values around 33–45 kHz do not change significantly in the presence of the leak, so this frequency domain feature cannot be used to distinguish the leak clearly; (2) the frequency band around 33–45 kHz becomes denser in the presence of the leak. Therefore, the authors consider that the envelope area of the frequency band 33–45 kHz may be an important characteristic index for evaluating the real leak rate in this study.
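One possible way to compute an envelope area for a frequency band is sketched below in Python. The paper does not give a formula, so this sketch makes its own assumptions: the spectrum is the DFT magnitude divided by the record length, the envelope is a piecewise-linear curve through the band edges and the local maxima of that magnitude, and the area is obtained with the trapezoidal rule. The function names are hypothetical.

```python
import cmath
import math


def band_magnitudes(x, fs, f_lo, f_hi):
    """Direct DFT magnitudes for the bins whose frequency lies in [f_lo, f_hi]."""
    n = len(x)
    k_lo = math.ceil(f_lo * n / fs)
    k_hi = math.floor(f_hi * n / fs)
    freqs, mags = [], []
    for k in range(k_lo, k_hi + 1):
        w = -2j * math.pi * k / n
        xk = sum(v * cmath.exp(w * t) for t, v in enumerate(x))
        freqs.append(k * fs / n)
        mags.append(abs(xk) / n)
    return freqs, mags


def envelope_area(freqs, mags):
    """Area under a piecewise-linear upper envelope through the local maxima.

    Assumption: the envelope passes through the first and last bins of the
    band and through every local maximum in between.
    """
    last = len(mags) - 1
    knots = [i for i in range(len(mags))
             if i in (0, last)
             or (mags[i] >= mags[i - 1] and mags[i] >= mags[i + 1])]
    # Trapezoidal rule between consecutive envelope knots.
    return sum((freqs[b] - freqs[a]) * (mags[a] + mags[b]) / 2
               for a, b in zip(knots, knots[1:]))
```

With a 100 ms record sampled at 1 MS/s the frequency resolution is 10 Hz, so the band 33–45 kHz spans 1201 bins. The direct DFT above is kept for self-containment and is slow at that size; an FFT routine from a numerical library would normally replace it.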

For the classification problem in this paper, the leak recognition algorithm can be described in terms of the following six steps:
Step 1: Extracting samples and calculating the features of training samples.
Step 2: Building the SVM models with the kernel functions KL, KP, KG and combinations of envelope area and the other nine characteristic indices.
Step 3: Selecting the appropriate characteristic indices. In testing, the classification accuracies of the aforementioned SVM models built in Step 2 with the same kernel function were compared to determine the best characteristic indices.
Step 4: Selecting the appropriate kernel function. In testing, in contrast to Step 3, the classification accuracies of the aforementioned SVM models built in Step 2 with the same characteristic indices were compared to determine the best kernel function.
Step 5: Determining the optimal SVM model and then detecting the leak. The comprehensive comparison is made on the classification accuracy of the SVM models in Steps 2 and 3 to obtain the optimal SVM model, which is then used to detect the leak signal.
Step 6: Verifying the optimal SVM models built in Step 5. A group of new samples are chosen to detect leak by Steps 1–5.

4.2. Extraction of Characteristic Indices

Based on the analysis of the original data, firstly 200 samples are randomly selected and the duration of each sample is 100 ms; then the envelope area of 33–45 kHz and the other nine characteristic indices are extracted. In order to better compare the AE signal, the feature values need to be processed as follows: (1) each feature indicator is averaged; (2) the features of background noise are normalized. The relative values of the ten features regarding the two test cases (C1-C2) are given in Table 1.

In contrast with the background noise, the envelope area of the characteristic frequency band 33–45 kHz increases monotonically in the presence of the leak and differs far more than the other characteristic indices. This variation confirms the authors’ conjecture. The variations of the other nine features are also correlated with the leak, but the features respond to it differently. For instance, STD, RMS, and energy in the time domain increase with the presence of the leak more clearly than the other time domain features, whereas average frequency responds more weakly than the other frequency domain features. So, it is necessary to select the qualified indicators and the best feature indicators.

4.3. Selection of Characteristic Indices

Because the nine time and frequency domain features respond inconsistently to the leak, different SVM models were constructed from the characteristic indices (mean, STD, RMS, peak, energy, skewness, kurtosis, peak frequency, and average frequency) and the kernel functions KL, KP, and KG. Seventy percent of the raw AE data, including leak signal and environmental noise, are randomly selected as the training set, and the remaining 30 percent are used as the testing set.
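A random 70/30 partition of this kind can be sketched as follows. The function name, the fixed seed, and the rounding of the cut point are assumptions for illustration; the paper does not describe how the random selection was implemented or whether it was stratified by class.

```python
import random


def split_train_test(samples, labels, train_fraction=0.7, seed=0):
    """Randomly split samples and labels into training and testing sets."""
    idx = list(range(len(samples)))
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(idx)
    cut = round(train_fraction * len(idx))
    train, test = idx[:cut], idx[cut:]
    return ([samples[i] for i in train], [labels[i] for i in train],
            [samples[i] for i in test], [labels[i] for i in test])
```

Because the split is not stratified in this sketch, the proportion of leak and noise samples in each set can differ slightly from the proportion in the full data set.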

Figure 4 shows the test results for the 9 features. A total of 4 characteristic indices are qualified, each with a classification accuracy of more than 90%. Among the qualified indices, the time domain features are dominant: the qualified time domain parameters respond to the leak more strongly than the other time domain features in Table 1, whereas the qualified frequency domain feature responds to the leak only weakly. The time domain features may therefore be more stable than the frequency domain features in analyzing leaks. In summary, the test results indicate that STD, RMS, energy, and average frequency may be employed as candidates for SVM-based pattern recognition.

4.4. Detection for Leak

With the envelope area of the leak frequency band and the candidate feature sets extracted, the SVM-based classifier is constructed following the steps given in Section 4.3. Different feature combinations are formed by taking one or more features from the candidates, namely {envelope area}, {STD, RMS, energy}, and {STD, RMS, energy, average frequency}. The training results based on 5-fold cross-validation are presented in Table 2, and the test results of the trained SVM models for the different feature sets are shown in Figure 5.

Different characteristic indices clearly have very different classification accuracies, as follows:

(1) For the training results shown in Table 2, regardless of the kernel function, the false alarm rate obtained with the index set {STD, RMS, energy, average frequency} is generally lower than that obtained with the index set {STD, RMS, energy}. This illustrates that combining time and frequency domain features can effectively reduce the false alarm rate compared with combining time domain features only. Although the accuracy of the set {STD, RMS, energy} is only 98.88%, the lowest of all cases, it is higher than the maximum accuracy of 95.45% in Figure 4. This indicates that a combination of several qualified features, whether in the time domain or the frequency domain, increases the accuracy of pipe leak detection compared with the single feature indices in Figure 4.

(2) It is also worth noting that, for the kernel functions KL, KP, and KG, the false alarm rate of the proposed feature index envelope area is 0%, 0%, and 0.75%, respectively, which is much lower than that of the feature index sets {STD, RMS, energy} and {STD, RMS, energy, average frequency}. The testing results in Figure 5 also show that the accuracy of the index envelope area is the highest regardless of the kernel function. Hence, the feature index envelope area proposed in this article can significantly reduce the false alarm rate.
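For readers who wish to reproduce this kind of comparison, the two quantities discussed above can be computed from true and predicted labels as in the sketch below. The definition used here, false alarm rate as the fraction of no-leak samples that are classified as leak, is an assumption; the paper does not define the term explicitly, and the label convention (+1 for leak, −1 for noise) is likewise illustrative.

```python
def accuracy_and_false_alarm(y_true, y_pred, leak=1, noise=-1):
    """Return (accuracy, false alarm rate) for a two-class leak detector.

    Assumption: false alarm rate = noise samples predicted as leak,
    divided by the total number of noise samples.
    """
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    noise_total = sum(t == noise for t in y_true)
    false_alarms = sum(t == noise and p == leak
                       for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    false_alarm_rate = false_alarms / noise_total if noise_total else 0.0
    return accuracy, false_alarm_rate
```

Under this definition a missed leak lowers the accuracy but does not raise the false alarm rate, so the two numbers should be read together.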

For the selection of kernel function, through the training results of the SVM-based classifier in Table 2, it can be seen that the SVM models with kernel function KG or KL are more accurate and efficient with a ≤ 0.75% false alarm rate overall, compared with the kernel function KP. Similarly, as described in Figure 5 about testing results, the average accuracy of the kernel function KP is lower than that of KG and KL. In particular, when the feature index envelope area is selected, the SVM model with the kernel function KL is of the highest accuracy with 100% in both training and testing results. Therefore, it can be concluded that the kernel function KL is the most appropriate for the pattern recognition problem considered in this study.

In summary, based on an overall analysis of training and testing of experimental data, the characteristic index envelope area and the kernel function KL can be regarded as the optimal combination to construct the SVM model for detecting the small leak generated in the propulsion system pipeline of the sounding rocket.

4.5. Validation for Detection

To verify the optimal kernel function and characteristic indicators for leak detection, new leak samples are acquired within the leak range of Table 1, and their features are extracted. The verification results based on the original trained model are shown in Figure 6. At the same time, to verify that the features rejected in Section 4.3 were correctly identified as unqualified, the unqualified features are combined with the effective features to construct SVM classification feature sets, and the test results are shown in Table 3.

It is quite clear in Figure 6(b) that feature set {STD, RMS, energy, average frequency} has a classification accuracy higher than 98% in contrast to that of feature set {STD, RMS, energy} which is less than 98%. Thus, whatever kernel function is chosen, it can be verified that the combination of time-frequency domain features can reduce false alarm rate.

As shown in Figure 6(a), the SVM model with the feature index “envelope area” has the highest accuracies with 99.24%, 99.24%, and 98.48% corresponding to kernel functions KG, KL, and KP. Their accuracies, in general, are higher than those of feature combination set {STD, RMS, energy, average frequency} in Figure 6(b). Therefore, the feature index “envelope area” can be validated as the best one to achieve high estimation accuracy for the detection.

In addition, the verification results for the unqualified feature indices are as follows. Even when these features are combined, as in the sets {mean, STD, RMS, peak, energy} and {skewness, kurtosis, peak frequency, average frequency} in Table 3, the classification accuracy reaches at most 96.21%, which is lower than that of the qualified feature combinations in Figure 6(b). Therefore, the feature indices filtered out in Section 4.3 do not increase the accuracy of the SVM classifier and cannot be used as candidates for leak identification.

It is interesting to note that, in Table 3, the unqualified feature indices sets {envelope, mean, peak} and {envelope, skewness, kurtosis, peak frequency} reach higher classification accuracy from 97.73% to 99.24% due to addition of feature index “envelope area”. It further demonstrates that index “envelope area” plays an important role in decreasing false alarm rate of pipe leak detection.

According to Figure 6 and Table 3, the SVM models with kernel functions KG and KL both perform more precisely overall in contrast with KP. When adopting the feature index “envelope area,” the conclusion is the same. Considering the similar results and discussion in Section 4.4, it can be validated that the SVM model constructed with kernel function KL is more suitable for reducing false alarms of failure in pipe joints.

5. Conclusion

An experimental study based on acoustic emission is carried out on small leaks at the pipe joint of a rocket propulsion system. Using support vector machine technology, intelligent identification of the pipe leak is realized with a lower false alarm rate. The main conclusions are as follows:

(1) Experiments with a broadband sensor and a narrowband sensor on small leaks at the pipe joint were analyzed in the time domain and the frequency domain. There is no significant difference between the small leak signal and the background noise in the time domain. The frequency band of the pipe joint leak signal is 33–45 kHz. Accordingly, the envelope area of the leak AE signal in the frequency band 33–45 kHz is proposed as a new feature index.

(2) The SVM test results show that STD, RMS, energy, and average frequency all have higher classification accuracy than the other five features, so they can be employed as candidates for pattern recognition.

(3) The proposed SVM models with the characteristic index set {STD, RMS, energy, average frequency} generally have higher accuracy than those using the set {STD, RMS, energy}. Compared with a single characteristic index, the combination of time and frequency domain indices increases the classification accuracy. Besides, SVM models with kernel function KL are the most accurate and efficient, and KG ranks next to KL.

(4) The feature “envelope area” proposed in this paper can identify leaks at the pipe joint of a rocket propulsion system with good estimation accuracy, and its accuracy is even higher than that of the combined time and frequency domain feature set {STD, RMS, energy, average frequency}. This indicates that the feature index “envelope area” can reduce false alarms significantly.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors have declared that no conflicts of interest exist.

Acknowledgments

This work was funded by the National Natural Science Foundation of China (NSFC), grant nos. 51565047 and 51635010; Inner Mongolia University of Science and Technology Innovation Fund, grant no. 2017YQL04; and Natural Science Foundation of Inner Mongolia, grant nos. 2017MS(LH)0531, 2018MS05007, and 2019MS05041.