Abstract

The acoustic emission (AE) technique is often used to inspect inaccessible areas of large storage tank floors, with AE sensors placed outside the tank. For tanks with fixed roofs, drop-back signals caused by condensation mix with corrosion signals from the tank floor and interfere with online AE inspection. These drop-back signals are very difficult to filter out using conventional methods. To solve this problem, a novel AE inner detector, which works inside the storage tank, is adopted and a pattern recognition algorithm based on the CRF (Conditional Random Field) model is presented. The algorithm is applied to differentiate corrosion signals from interference signals, especially drop-back signals caused by condensation. Q235 steel corrosion signals and drop-back signals were collected both in the laboratory and in the field, and seven typical AE features based on hits and frequency were extracted and then selected by mRMR (Minimum Redundancy Maximum Relevance) for pattern recognition. To validate the proposed algorithm, the recognition results of the CRF model were compared with those of BP (Back Propagation), SVM (Support Vector Machine), and HMM (Hidden Markov Model) classifiers. The results show that the CRF model outperforms the other methods in training speed, accuracy, and ROC (Receiver Operating Characteristic) performance.

1. Introduction

Acoustic emission (AE) testing is an effective method for assessing tank floor corrosion without opening the storage tank [1–5]. In conventional online tank floor tests [6], sensors are fixed by magnets to the outside of the tank wall to collect signals. However, this AE test is susceptible to external interference, such as sand impacts and external vibration. To solve this problem, a newly developed AE detector is adopted for tank floor inspection; it works inside the tank to collect AE signals and thereby avoids external disturbance [7]. However, the acoustic field inside a storage tank is complicated. Many noise signals inside the tank have characteristics quite similar to those of tank floor corrosion signals, which can seriously affect the evaluation of the tank floor.

For tanks with fixed roofs, warm gas in the tank condenses into droplets when it meets the cold roof. The droplets fall from the roof to the water/oil surface and generate interfering AE signals [8]. These signals should be filtered out to ensure accurate corrosion source location and efficient tank floor evaluation. Guard sensors are usually employed to shield droplet noise during AE tests of tank bottoms. However, the space inside the inner AE detector is small and the hardware for guard sensors is complicated, so guard sensors are not suitable for the inner AE detector. For this reason, a dedicated pattern recognition algorithm is proposed to filter out the interference signals.

Pattern recognition is often applied to identify AE signals caused by different sources. In 2008, Riahi et al. [9] used an artificial neural network to differentiate between leakage and corrosion signals in AE testing of aboveground storage tank floors. Zhang et al. [10] proposed a method to detect gas pipeline valve leakage using the AE technique, in which an SVM (Support Vector Machine) was applied to recognize the leak level of the valve. In the field of tool wear monitoring, Zhu et al. [11], Varma and Baras [12], Zhang et al. [13], and Chen et al. [14] all used the HMM (Hidden Markov Model) to recognize different tool wear states.

In this study, an algorithm based on the CRF (Conditional Random Field) model is proposed to differentiate drop-back noise from corrosion AE signals. Seven typical AE parameters, namely, amplitude, counts, duration time, rise time, true energy, average frequency, and peak frequency, are extracted to build classifier models using CRF, BP (Back Propagation), SVM, and HMM. The results show that the CRF model is better than the other three models in training speed, accuracy, and ROC (Receiver Operating Characteristic) performance.

This paper is organized as follows. Section 2 introduces the basic principles of the CRF model. Section 3 describes the experimental setup and procedure and presents the feature extraction method for the AE signals. Section 4 describes how the CRF model is established and compares its results with those of the other three classifiers. Section 5 presents the application of the CRF model in the field experiment. Section 6 summarizes the paper.

2. CRF Model

The CRF model is a typical discriminative model proposed by Lafferty et al. in 2001 [15]. A CRF may be viewed as an undirected graphical model, or Markov random field, which defines a single log-linear distribution over output variable sequences given a particular input random variable [16].

The linear chain conditional random field (LC-CRF), shown in Figure 1, is one of the most commonly used forms of the CRF model. The input random variable $X$ and the output random variable $Y$ denote the observation sequence and the state sequence, respectively. If the conditional probability $P(Y \mid X)$ of $Y$ given $X$ is known, the predicted state sequence $Y^{*}$ is the one that maximizes the global conditional probability; that is,

$$Y^{*} = \arg\max_{Y} P(Y \mid X).$$

In this model, for the observation data $x$, the probability of the state sequence $y$ can be represented as

$$P(y \mid x) = \frac{1}{Z(x)} \exp\left( \sum_{i,k} \lambda_k t_k(y_{i-1}, y_i, x, i) + \sum_{i,l} \mu_l s_l(y_i, x, i) \right),$$

where $Z(x)$ is a normalization factor which can be described as

$$Z(x) = \sum_{y} \exp\left( \sum_{i,k} \lambda_k t_k(y_{i-1}, y_i, x, i) + \sum_{i,l} \mu_l s_l(y_i, x, i) \right),$$

where $t_k$ is a transition feature function of the entire observation sequence and the states at positions $i-1$ and $i$ in the state sequence; $s_l$ is a state feature function of the state at position $i$ and the observation sequence; and $\lambda_k$ and $\mu_l$, which need to be estimated from training data, denote the weights of the transition feature function and the state feature function, respectively.

For the AE testing on tank floors, the features extracted from the AE signals can be viewed as the observation sequence and the signal types can be viewed as the state sequence. Then, the CRF model can be created and the signals can be classified.
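As a minimal sketch of the model described in this section, the following Python code computes the conditional probability of a label sequence for a toy linear-chain CRF by enumerating every possible label sequence. The weight tables `TRANS_W` and `STATE_W`, the observation symbols "high" and "low", and the indicator-style feature functions are hypothetical and are not taken from the paper; only the labels 1 (corrosion) and 2 (drop-back) follow the convention of Section 4.1.

```python
import itertools
import math

LABELS = (1, 2)  # 1 = corrosion, 2 = drop-back interference

# Hypothetical weights of indicator feature functions (not from the paper).
TRANS_W = {(1, 1): 0.8, (2, 2): 0.8, (1, 2): -0.2, (2, 1): -0.2}
STATE_W = {(1, "high"): 1.5, (2, "low"): 1.5, (1, "low"): -0.5, (2, "high"): -0.5}


def sequence_score(y, x):
    """Weighted sum of state and transition features for label sequence y."""
    total = 0.0
    for i in range(len(x)):
        total += STATE_W.get((y[i], x[i]), 0.0)
        if i > 0:
            total += TRANS_W.get((y[i - 1], y[i]), 0.0)
    return total


def partition(x):
    """Normalization factor: sum of exp(score) over every label sequence."""
    return sum(math.exp(sequence_score(y, x))
               for y in itertools.product(LABELS, repeat=len(x)))


def conditional_probability(y, x):
    """Probability of label sequence y given observation sequence x."""
    return math.exp(sequence_score(y, x)) / partition(x)


def most_probable_sequence(x):
    """Label sequence with the highest conditional probability."""
    return max(itertools.product(LABELS, repeat=len(x)),
               key=lambda y: sequence_score(y, x))
```

Enumeration grows exponentially with the sequence length, so it is only suitable for illustration; practical implementations compute the normalization factor and the best sequence with dynamic programming.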

3. Experimental Preparation and Feature Extraction

3.1. Experimental Setup

The experimental system consists of a water tank, the inner AE detector, and a specimen for corrosion experiments, as shown in Figure 2. The water tank in Figure 3, with dimensions of 1.4 m × 1.4 m × 1.5 m (length × width × height), is used to simulate a storage tank in the laboratory. The inner AE detector, which collects the AE signals, includes AE sensors, an amplifier, a data acquisition system, and batteries (see Figure 4(a)) [17]. The detector can move itself close to the tank floor to collect AE signals, so it reduces the interference caused by external disturbance and improves the signal-to-noise ratio (SNR) compared with the conventional AE testing method for tank floors. Four AE sensors are mounted in holes in the bottom of the detector. The data acquisition system, which includes processing circuits, an AD sampling card, and a PC104 computer, is placed inside the shell to sample and save the collected signals.

The specimen, shown in Figure 4(b), is corroded by acid to simulate corrosion of the tank floor. The specimen is a Q235 carbon steel sheet, the same material as the storage tank floors. It is machined to dimensions of 180 mm × 180 mm × 5 mm (length × width × thickness) with a surface roughness of 0.02 mm. A round, hollow vessel with an inner diameter of 50 mm is fixed on the specimen with epoxy. The surface of the specimen was ground with abrasive papers from 400 grade to 2000 grade, rinsed with acetone, degreased with deionized water, and dried in air. Before each experiment, the acid was poured into the vessel, which was then sealed with a lid wrapped with a matching ribbon.

3.2. Experimental Procedure
3.2.1. Collection of Corrosion Signals

To collect the corrosion signals, 5 mol/L H3PO4 was used as the test solution to react with the specimen and simulate corrosion in tanks. R15 piezoelectric AE sensors produced by Physical Acoustics Corporation (PAC), with an operating frequency range of 50–400 kHz, were used in the experiment. The gain of the charge preamplifier was set to 60 dB, and the cut-off frequencies of the analog band-pass filter were 100 kHz and 400 kHz [18, 19]. The sampling rate was 3 MHz and the sampling precision was 10 bits. During the experiment, the threshold level was fixed at 35 mV, slightly above the previously measured background noise.

A series of experiments was conducted in the laboratory. The specimen, prepared according to the procedure described above, was placed on the tank floor 15 cm below the inner detector in the water tank, and corrosion signals were collected for about 1 hour.

3.2.2. Collection of Drop-Back Signals

The field experiment was conducted in a new fire-resistant water tank in good condition. The diameter of the tank is 6 m and the height is 10 m. The experimental settings were the same as those in the lab test. The temperature inside the tank during the experiment was 23°C, while the outside temperature was −15°C. Drop-back signals were abundant because of the temperature difference between the warm gas in the tank and the cold fixed roof. After the environmental noise level was measured, the threshold was set higher than the background noise. During the experiment, drop-back signals were collected without the corroded specimen (see Figures 5 and 6).

3.2.3. Collection of Mixed Signals

After the drop-back signals were collected, the corroded specimen was placed in the field water tank at the same position as in the lab test, and both corrosion signals and interference signals were acquired. During the one-hour experiment, 7475 groups of AE signals were collected for further analysis.

3.3. AE Feature Extraction and Sample Set

AE feature parameters represent the characteristics of the corrosion signals, and seven typical feature parameters of AE signals are extracted to build the classification model [20, 21]. The features consist of five hit-based features, one comprehensive feature, and one frequency feature, as shown in Table 1.
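To make the seven features more concrete, the following Python sketch extracts them from a single digitized hit. The definitions used here are common textbook conventions and are assumptions, since Table 1 is not reproduced in this text: amplitude is the peak absolute voltage (AE systems often report it in dB instead), counts are upward threshold crossings, and the energy is approximated by the integral of the squared voltage rather than the instrument's "true energy". The function name `extract_features` and the synthetic test burst are hypothetical; only the 3 MHz sampling rate and the 35 mV threshold come from Section 3.2.1.

```python
import cmath
import math


def extract_features(signal, fs, threshold):
    """Return a dict of seven hit features, or None if the threshold is never reached."""
    above = [i for i, v in enumerate(signal) if abs(v) >= threshold]
    if not above:
        return None
    first, last = above[0], above[-1]
    peak = max(range(first, last + 1), key=lambda i: abs(signal[i]))
    # Counts: upward crossings of the positive threshold.
    counts = sum(1 for i in range(1, len(signal))
                 if signal[i - 1] < threshold <= signal[i])
    duration = (last - first) / fs
    rise_time = (peak - first) / fs
    # Energy proxy: integral of the squared voltage over the hit.
    energy = sum(v * v for v in signal[first:last + 1]) / fs
    average_frequency = counts / duration if duration > 0 else 0.0
    # Peak frequency from a naive DFT (adequate for short records only).
    n = len(signal)
    best_bin, best_mag = 0, -1.0
    for k in range(1, n // 2):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(signal))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return {
        "amplitude": abs(signal[peak]),
        "counts": counts,
        "duration": duration,
        "rise_time": rise_time,
        "energy": energy,
        "average_frequency": average_frequency,
        "peak_frequency": best_bin * fs / n,
    }


# Synthetic decaying 150 kHz burst sampled at 3 MHz (illustrative only).
FS = 3.0e6
SIGNAL = [math.exp(-t / 60.0) * math.sin(2 * math.pi * 150e3 * t / FS)
          for t in range(300)]
FEATURES = extract_features(SIGNAL, FS, threshold=0.035)
```

For the synthetic burst, both the average frequency and the peak frequency should land near 150 kHz, which gives a quick sanity check of the definitions.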

To perform the classification by pattern recognition, 260 groups of corrosion signals and 260 groups of drop-back interference signals were selected as samples to establish the classification model. The signals of each type were randomly divided into two sets: 200 groups were used as the training set and the other 60 groups as the test set. The composition of the training set and the test set is listed in Table 2.
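The random division can be sketched as follows in Python. The helper names and the fixed seed are hypothetical; the sketch only assumes that each signal type is shuffled and split separately into 200 training and 60 test groups, with corrosion labeled 1 and drop-back labeled 2 as in Section 4.1.

```python
import random


def split_class(samples, n_train, rng):
    """Shuffle one class and cut it into a training part and a test part."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def build_sets(corrosion, drop_back, n_train=200, seed=0):
    """Return (train, test) lists of (sample, label) pairs."""
    rng = random.Random(seed)  # fixed seed only to make the split repeatable
    train, test = [], []
    for label, samples in ((1, corrosion), (2, drop_back)):
        train_part, test_part = split_class(samples, n_train, rng)
        train += [(s, label) for s in train_part]
        test += [(s, label) for s in test_part]
    return train, test
```

Splitting each class separately keeps the training and test sets balanced between the two signal types.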

3.4. Feature Selection

As stated, seven AE features, namely, amplitude, counts, duration time, rise time, true energy, average frequency, and peak frequency, are extracted and denoted F1–F7. To avoid the influence of the different magnitudes of the seven features, the characteristic parameters are normalized to a common range. A feature selection algorithm named mRMR (Minimum Redundancy Maximum Relevance) is then used to determine the optimal feature set.

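mRMR is a feature selection method proposed by Peng et al. [22]. It seeks the features with the highest relevance to the target class that also have low redundancy with the other features. mRMR can be defined as

$$\max D(S, c), \quad D = \frac{1}{|S|} \sum_{x_i \in S} I(x_i; c),$$

$$\min R(S), \quad R = \frac{1}{|S|^2} \sum_{x_i, x_j \in S} I(x_i; x_j),$$

where $S$ is the initial feature set and $c$ is the target class, $I(x_i; c)$ is the mutual information of feature $x_i$ and class $c$, $D$ is the mean value of all mutual information values between individual feature $x_i$ and class $c$, $I(x_i; x_j)$ is the mutual information of features $x_i$ and $x_j$, and $R$ is the mean mutual information between different features.

Given two random variables $x$ and $y$, their mutual information is defined in terms of their probability density functions $p(x)$, $p(y)$, and $p(x, y)$:

$$I(x; y) = \iint p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} \, dx \, dy.$$

The operator $\Phi(D, R)$ is defined to combine $D$ and $R$, and the following simplest form is considered to optimize $D$ and $R$ simultaneously:

$$\max \Phi(D, R), \quad \Phi = D - R. \tag{6}$$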

The result of formula (6) is called the Mutual Information Difference (MID) and is used to rank features. In practice, the seven features are ranked by mRMR as follows: F7, F4, F1, F3, F2, F6, F5. The first 4 features (peak frequency, rise time, amplitude, and duration time) are taken as the optimal feature set to train and recognize the samples.
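A rough sketch of MID-based ranking is given below in Python. It assumes that the features have already been discretized, so that mutual information can be estimated from value counts instead of density functions, and it uses the usual greedy scheme in which each step picks the feature with the largest difference between relevance and mean redundancy with the features already chosen. The function names and the toy data in the usage note are hypothetical.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Mutual information (in nats) of two equally long discrete sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((n_ab / n) * math.log(n_ab * n / (count_a[u] * count_b[v]))
               for (u, v), n_ab in count_ab.items())


def mrmr_mid_rank(features, labels):
    """Rank feature names greedily by relevance minus mean redundancy (MID)."""
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    selected, remaining = [], list(features)

    def mid(name):
        if not selected:
            return relevance[name]
        redundancy = sum(mutual_information(features[name], features[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining:
        best = max(remaining, key=mid)
        selected.append(best)
        remaining.remove(best)
    return selected
```

For example, with `labels = [1, 1, 1, 1, 2, 2, 2, 2]` and a feature whose values split the samples exactly like the labels, that feature is ranked first because its relevance equals the label entropy.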

4. Classification Results and Discussions

In this section, the CRF, BP, SVM, and HMM classifier models are each used to distinguish corrosion signals from interference signals based on the extracted features, and the results of the different models are compared and discussed.

4.1. Establishment of CRF Classification Model

In the LC-CRF model, the feature vectors of the input sequences are required to be positive integers. The extracted features are therefore normalized to the integer range 1–101 and used as the observation sequences. The states of the samples for corrosion and drop-back interference are labeled 1 and 2, respectively. The application of the LC-CRF model includes two steps: training and recognition. In training, the features of the training samples are used to calculate the model parameters (the weights of the feature functions). The conditional probability model is obtained by maximum likelihood estimation, and the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) algorithm is used to find the optimal parameters. Here, the initial model parameters are set to 0 and the convergence tolerance is 0.0001. In recognition, the features of the test samples are taken as the input, and the most probable state sequences under the trained model are calculated by the Viterbi algorithm. The method is summarized in the flowchart in Figure 7.
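The two mechanical parts of this procedure, mapping features to integers and decoding with the Viterbi algorithm, can be sketched in Python as follows. The min-max mapping in `quantize` is an assumption about how the range 1–101 is reached, and the scoring functions in the usage note are hypothetical stand-ins for a trained model; the training step with L-BFGS is not shown.

```python
def quantize(values, levels=101):
    """Map real values to integers 1..levels by min-max scaling (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1 for _ in values]
    return [1 + int(round((levels - 1) * (v - lo) / (hi - lo))) for v in values]


def viterbi(obs, labels, state_score, trans_score):
    """Return the label sequence with the highest total log-score."""
    best = {y: state_score(y, obs[0]) for y in labels}
    back = []
    for o in obs[1:]:
        prev, best, pointers = best, {}, {}
        for y in labels:
            # Best predecessor label for reaching label y at this position.
            p = max(labels, key=lambda q: prev[q] + trans_score(q, y))
            best[y] = prev[p] + trans_score(p, y) + state_score(y, o)
            pointers[y] = p
        back.append(pointers)
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

As a usage example, with a hypothetical state score that favors label 1 for observations above 50 and label 2 otherwise, and a transition score that rewards staying in the same label, `viterbi([101, 90, 5, 10], (1, 2), ...)` returns `[1, 1, 2, 2]`.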

4.2. Establishment of BP, SVM, and HMM

As stated, BP, SVM, and HMM are commonly used methods for classifying AE signals. So that the recognition results can be compared, the three classifiers use the same training and test data as the LC-CRF. The BP, SVM, and HMM models are established as follows.

BP is a common method of training artificial neural networks. The structure of a typical BP classifier is shown in Figure 8. To design a BP model, the following parameters should be determined: the function of the output layer, the function of the hidden layer, the learning rate, and the number of hidden layer nodes. In the tests, the “” function is selected as the activation function of the hidden layer, and the “” function is chosen as the transfer function of the output layer. The weights are tuned by gradient descent. The number of hidden layer nodes, set to 14 in this test, is twice the number of input nodes. In addition, the learning rate is 0.01.
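The following Python sketch illustrates one gradient descent update of a network with 7 inputs, 14 hidden nodes, and a learning rate of 0.01, the sizes given above. The hyperbolic tangent hidden activation, the linear output, the squared-error loss, and the weight initialization are all assumptions made for illustration, because the function names used in the paper are not recoverable from this text.

```python
import math
import random


def init_network(n_in=7, n_hidden=14, seed=0):
    """Small random weights; the initialization range is an arbitrary choice."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Assumed tanh hidden layer followed by a linear output node."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output


def train_step(net, x, target, lr=0.01):
    """One back-propagation update on a single sample; returns the loss before it."""
    hidden, output = forward(net, x)
    err = output - target  # derivative of 0.5 * err**2 with respect to the output
    for j, h in enumerate(hidden):
        grad_h = err * net["w2"][j] * (1.0 - h * h)  # uses the weight before its update
        net["w2"][j] -= lr * err * h
        for i, xi in enumerate(x):
            net["w1"][j][i] -= lr * grad_h * xi
        net["b1"][j] -= lr * grad_h
    net["b2"] -= lr * err
    return 0.5 * err * err
```

Repeating `train_step` on the same sample should lower the returned loss, which is a simple check that the gradients have the right sign.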

The SVM is built on the central concept of a kernel. Kernel machines provide a modular framework that can be adapted to different tasks and domains by using different kernel functions and base algorithms. The structure of the SVM is shown in Figure 9, where $K$ is the kernel function. Three parameters need to be determined to design an SVM model: the kernel function, the cost ($C$), and the gamma ($\gamma$). In this paper, the classical RBF kernel function is chosen, and the parameters $C$ and $\gamma$ are determined by fivefold cross-validation. The optimal solution is $C = 2$ and $\gamma = 22.627$.
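Two ingredients of this setup can be sketched in Python: the RBF kernel in its common form exp(−gamma · ‖x − z‖²), and the index bookkeeping for fivefold cross-validation. The function names are hypothetical, the interleaved fold assignment is just one possible choice, and the SVM training itself is not shown.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel value for two feature vectors of equal length."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation
```

In a parameter search, each candidate pair of cost and gamma would be scored by its mean validation accuracy over the five folds, and the best pair kept. With the 400 training samples used here, every fold holds out 80 samples.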

An HMM is composed of a Markov chain and a stochastic process. The Markov chain corresponds to the state sequence, which is described by $\pi$ and $A$. The stochastic process corresponds to the observation sequence, which is described by $B$. So an HMM can be described as

$$\lambda = (N, M, \pi, A, B),$$

where $N$ is the number of states of the Markov chain and $M$ is the number of possible observed values in each state. $A$ is the state transition probability matrix of size $N \times N$, and $B$ is the probability matrix of the observed values, of size $N \times M$. $\pi$ is the initial probability distribution vector of length $N$. So $N$ and $M$ must be determined to establish an HMM classifier. The values of $N$ and $M$ are set to 6 and 8, respectively, by trial and error. The model parameters are calculated by the Baum-Welch algorithm, with a convergence tolerance of 0.0001.
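As an illustration of how such a model can be used for recognition, the Python sketch below evaluates the likelihood of a discrete observation sequence with the forward algorithm and assigns the sequence to whichever class model gives the higher likelihood. Training one HMM per class and comparing likelihoods is a common scheme and is assumed here; the paper does not spell it out. The toy models with 6 states and 8 symbols use hypothetical probabilities, and Baum-Welch training is not shown.

```python
def forward_likelihood(obs, pi, A, B):
    """Probability of the observation sequence under the model (pi, A, B)."""
    n_states = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    for symbol in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][symbol]
                 for j in range(n_states)]
    return sum(alpha)


def classify(obs, models):
    """Return the label whose model gives the highest likelihood."""
    return max(models, key=lambda label: forward_likelihood(obs, *models[label]))


N_STATES, N_SYMBOLS = 6, 8
UNIFORM_PI = [1.0 / N_STATES] * N_STATES
UNIFORM_A = [[1.0 / N_STATES] * N_STATES for _ in range(N_STATES)]


def biased_emissions(favoured, weight=0.65):
    """Hypothetical emission matrix that prefers one symbol in every state."""
    rest = (1.0 - weight) / (N_SYMBOLS - 1)
    return [[weight if m == favoured else rest for m in range(N_SYMBOLS)]
            for _ in range(N_STATES)]


# Label 1 = corrosion, label 2 = drop-back; both models are made up.
MODELS = {
    1: (UNIFORM_PI, UNIFORM_A, biased_emissions(0)),
    2: (UNIFORM_PI, UNIFORM_A, biased_emissions(7)),
}
```

For long sequences the forward variables underflow, so practical implementations rescale them or work with logarithms.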

4.3. Results and Discussions

To check whether the first 4 features are the optimal feature set, samples are also trained and tested by the CRF model using the first 3, the first 5, and all seven features. The accuracy rates are shown in Table 3. Using the first 4 or the first 5 features gives an accuracy rate of 100%, higher than the other two feature sets. So the first 4 features (peak frequency, rise time, amplitude, and duration time) are retained as the optimal feature set to train and recognize the samples.

Using the same training and test sets, the recognition results of the CRF model and the other three algorithms (BP, SVM, and HMM) are compared on the same PC (Core 2 Duo E6300 with 3.2 GB of memory). The results are compared in terms of training time, accuracy, and the ROC (Receiver Operating Characteristic) curve. The maximum training time and accuracy rate are shown in Table 4.

Table 4 shows that the accuracy rate of the CRF is higher than those of the BP, SVM, and HMM models and that the training time of the CRF is the shortest. The parameters of BP are adjusted by gradient descent, so many iterations are needed to reach the optimal parameters. Moreover, the maximum number of iterations, the learning rate, and the number of hidden layer nodes are often determined by experience or by trial and error. As a result, the training speed and accuracy rate of BP are lower, and it is difficult to obtain the optimum network. The training speed and accuracy of SVM are higher than those of BP and HMM, but SVM is better suited to small-sample problems than to AE testing, which produces large amounts of sample data. HMM is widely employed in many fields, but one of its disadvantages is the assumption that the observation at a given time depends only on the state of the Markov chain at that time and that the observations are independent of each other, whereas the features of AE signals are not independent of each other. So the accuracy of HMM is the lowest of the four models. The CRF model can make full use of the feature information and accommodate dependences between the features. Moreover, because its training objective is convex, training can reach the global optimum.

Furthermore, the ROC curve is used to test the performance of the CRF, BP, SVM, and HMM models [23–25]. The curve is created by plotting the true positive rate (sensitivity) against the false positive rate (1 − specificity) at various threshold settings. The area under the curve (AUC) can be used as a criterion for judging the models: a larger AUC represents better performance. Figures 10 and 11 show the ROC curves of the recognition results for corrosion signals and drop-back noise, respectively.
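The construction of the ROC curve and its AUC can be sketched in Python as follows. The sketch assumes that each classifier outputs a score for the positive class and that labels are coded as 1 (positive) and 0 (negative); the function names are hypothetical, and the area is obtained with the trapezoidal rule.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points for all thresholds."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Samples with equal scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

A classifier that ranks every positive sample above every negative one gives an AUC of 1, while random scores give an AUC of about 0.5.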

Figures 10 and 11 show that the CRF model has the greatest AUC, followed by the SVM, BP, and HMM models.

Therefore, the recognition result of CRF model outperforms SVM, BP, and HMM in training speed, accuracy rate, and the AUC of the ROC curves.

5. Application of CRF Model in Field Experiments

In the last step of the field experiment, the inner detector collected both corrosion signals and drop-back noise. During the one-hour experiment, 7475 groups of AE signals were collected and classified using the CRF model: 1105 groups were classified as corrosion signals and the other 6370 groups were identified as drop-back interference. The quantity of corrosion signals is approximately equal to the quantity collected in the laboratory over the same duration. To test the performance of the CRF model in the field environment, statistical analysis was used to compare the results. Curves of the cumulative quantity of signals versus time were obtained and are shown in Figures 12 and 13.
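Curves of this kind can be produced from a list of classified hits, as in the Python sketch below. The data layout, a list of (arrival time in seconds, predicted label) pairs, and the function names are assumptions made for illustration; labels follow the convention 1 = corrosion, 2 = drop-back.

```python
from bisect import bisect_right


def cumulative_hits(hit_times, grid):
    """Number of hits with arrival time <= t, for each t in grid."""
    times = sorted(hit_times)
    return [bisect_right(times, t) for t in grid]


def corrosion_curve(classified_hits, grid, corrosion_label=1):
    """Cumulative count of hits that the classifier labeled as corrosion."""
    times = [t for t, label in classified_hits if label == corrosion_label]
    return cumulative_hits(times, grid)
```

Calling `cumulative_hits` on all arrival times gives the curve before classification, and `corrosion_curve` gives the curve after the drop-back hits have been removed.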

Figure 12 shows how the quantity of AE hits (corrosion signals) collected in the laboratory varies with time. The corrosion process can be divided into 4 zones. At the beginning (Zone 1), the phosphoric acid began to react with the steel plate. Because of the large contact area and high hydrogen ion concentration, the quantity of AE hits increased quickly. Then, the hydrogen created during the reaction accumulated on the surface of the plate and formed bubbles, so the contact area decreased (Zone 2). As the reaction progressed, the bubbles merged into large bubbles and then burst. The acid was in full contact with the steel plate again, and the reaction rate and the growth rate of AE hits increased dramatically (Zone 3). As the hydrogen ion concentration fell, the acid reacted with the steel plate more slowly and the quantity of AE hits grew slowly (Zone 4).

Figure 13(a) shows the relation between the quantity of AE hits and time before classification by the CRF model; it is almost linear and does not reflect the statistical law of the corrosion tests. Figure 13(b) shows the same relation for the signals collected in the field after classification by the CRF model. The curve in Figure 13(b) also has 4 zones with the same characteristics as in Figure 12. The turning points of the zones on the time axis differ slightly between Figure 13(b) and Figure 12 because the set-up time before the inner detector started collecting signals was a little longer in the field test than in the laboratory test. The result shows that the data processed by the CRF model reflect the statistical law of the corrosion test and that the CRF model performs well in the field application.

6. Conclusions

Drop-back signals caused by condensation in storage tanks with fixed roofs are a major problem in online AE inspection of storage tank floors. In this paper, a new inner AE detector and a recognition algorithm based on the CRF model are applied to differentiate corrosion AE signals from drop-back interference. Seven AE parameters (amplitude, counts, duration time, rise time, true energy, average frequency, and peak frequency) were selected as feature parameters for recognition.

Experiments were carried out in water tanks both in the laboratory and in the field to collect corrosion AE signals and drop-back signals. The recognition results of the CRF model were compared with those of three other models: BP, SVM, and HMM. The comparisons of accuracy, training speed, and the AUC of the ROC curve show that the CRF model outperforms the other three models in recognizing corrosion signals and drop-back interference signals.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by Tianjin Science Foundation under the Grant of 13JCYBJC18000, Tianjin Technical-Support Foundation under the Grant of 14ZCZDGX00003, and Tianjin Marine Economy Innovation and Development Region Demonstration Project under the Grant of 2015120024000473.