Abstract

Oil recovery with pumping units is widely used in oilfield development at present. How well pumping units are managed affects the overall economic benefit of an oilfield. To control the production management of pumping wells effectively, it is necessary to use all the production data to formulate a reasonable working system and to analyze the daily management level of pumping wells continuously. Analyzing and interpreting the indicator diagram is a direct way to understand the condition of deep well pumps in petroleum engineering. However, complicated practical oilfield conditions often distort regular indicator diagrams, resulting in inefficient fault-type diagnosis and wrong recognition. A new method of identifying the fault type of indicator diagrams is proposed: a two-dimensional indicator diagram is transformed into a one-dimensional differential curve, and the fault type is then identified using the dynamic time warping (DTW) distance. The experimental results on real indicator diagrams show that the method diagnoses the type of indicator diagrams with high accuracy and fast speed.

1. Introduction

Rod pumping is a commonly used method of mechanical oil extraction. To understand the real underground working condition of deep pumping wells, the indicator diagram records in real time how the load curve changes with the state of the polished rod. One complete up-and-down stroke draws a closed curve, called an indicator diagram, which contains rich information about the pump and the rod and their working conditions. Indicator diagrams are viewed as one of the important basic sources in the fault diagnosis of pumping wells and the evaluation of oil production [1–3]. Therefore, the intelligent identification and analysis of indicator diagrams are quite significant for petroleum engineering and production management [4, 5].

The actual indicator diagrams of pumping wells truly reflect the complex behavior of the deep pumping wells and the sucker rods. The curves in the indicator diagrams may fluctuate and change frequently, so the diagnosis result is easily influenced by subjective and objective factors such as the experience and technical level of the interpreter, and the efficiency and accuracy of diagnosis for the pumping system may be restricted to some extent [6, 7]. Meanwhile, with the rapid development of computer science, many indicator diagrams and their corresponding types have been digitized and defined in databases [8], which can be readily applied to fault diagnosis in real oilfields. These existing digital resources help to identify the shape and characteristics of indicator diagrams automatically, which has become a hot spot in oil development research [9, 10].

Currently, some methods, e.g., back-propagation (BP) neural networks [11, 12], the radial basis function (RBF) [13], the extreme learning machine [14], and convolutional neural networks (CNNs) [15], have been applied to the fault diagnosis of pumping units and are gradually replacing traditional manual analysis. However, due to their internal mechanisms, these methods may suffer from the following disadvantages: (1) when extracting features, the input of numerous displacement-load data of indicator diagrams may seriously affect the complex internal mapping structure of neural networks, lowering the diagnostic accuracy of the model; (2) some models, e.g., CNNs, are computationally intensive and have a large number of parameters to set, requiring large storage space and memory, which conflicts with the low-end hardware available in the real field; (3) fault diagnoses are based on the detection of contour features of indicator diagrams, but some models sometimes cannot extract these features effectively.

To address the abovementioned issues, the key to fault diagnosis of pumping units can be summarized as extracting and recognizing the contour features of indicator diagrams effectively. In this paper, based on the dynamic time warping (DTW) distance of time series [16], a new method for improving the identification of indicator diagrams is proposed. This method first uses differential curves to convert the two-dimensional closed curves of the indicator diagrams into one-dimensional curves and then identifies the type of each indicator diagram through the DTW distance [17, 18]. The experimental results show that our method can effectively distinguish similar curves of indicator diagrams and that fault recognition for unknown indicator diagrams is accurate and fast.

2. The Main Idea of the Proposed Method

2.1. Basic Concepts About Differential Curves
2.1.1. The Turning Point

A theoretical indicator diagram includes two parts: the up stroke and the down stroke. A turning point Q is viewed as the end of the up stroke and the start of the down stroke in the whole indicator diagram. Suppose there is an indicator diagram that contains only a quadrangle ABCD composed of n points, $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, where $(x_i, y_i)$ (i = 1, 2, …, n) stands for the coordinates of each point on the sides of ABCD. Note that these points $(x_i, y_i)$ do not necessarily represent the four vertexes A, B, C, or D, but may be any points on the sides of the quadrangle. The turning point Q is defined in terms of $x_{\max}$ and $y_{\max}$, which are, respectively, the maximum values in the horizontal and vertical directions in ABCD. Since an indicator diagram with no fault is actually a parallelogram, a fault-free indicator diagram is illustrated using a parallelogram ABCD in Figure 1, and the turning point Q just happens to be C. The up stroke is ABC and the down stroke is CDA. The horizontal axis stands for the displacement of the polished rod and the vertical axis is the load of the polished rod.

2.1.2. The Reverse-Translational Curve of the Down-Stroke Curve

A fault in the pumping units prevents the indicator diagram from being a parallelogram, meaning the up stroke and the down stroke cannot form a parallelogram. In this case, the specific fault should be recognized from the indicator diagram. One possible way to identify the fault type is to calculate the difference between the down stroke and the up stroke and then determine the fault type based on that difference, which makes obtaining the reverse-translational curve of the down-stroke curve a precondition. A reverse curve is defined as a curve like the letter “S,” formed by two curves bending in opposite directions or produced by the joining of two curves that turn in opposite directions. For a down-stroke curve CDA that is composed of the line segments CD and DA, its reverse curve about A, called AD’’C’’, is calculated by a symmetrical transformation of CDA about the point A. For convenience, suppose A coincides with the origin O. If the coordinate of D is $(x_D, y_D)$, then the coordinate of the transformed point D’’ is $(-x_D, -y_D)$. Similarly, the transformed point C’’ is $(-x_C, -y_C)$ if the coordinate of C is $(x_C, y_C)$. Therefore, for any point (x, y) in CDA, the symmetric point $(x'', y'')$ in AD’’C’’ is given by the following equation:

$$x'' = -x, \qquad y'' = -y. \tag{2}$$

The symmetric transformation from CDA to its reverse curve AD’’C’’ is shown in Figure 2.

Then, each point $(x'', y'')$ on AD’’C’’ is translated by a distance of $x_C$ in the positive horizontal direction and a distance of $y_C$ in the positive vertical direction, where $(x_C, y_C)$ is the coordinate of C, to obtain a new translational curve C’D’A’. The coordinate $(x', y')$ of each point on C’D’A’ is

$$x' = x'' + x_C, \qquad y' = y'' + y_C. \tag{3}$$

Substituting equation (2) into equation (3) gives the coordinate of each point on C’D’A’ in terms of the original point (x, y) on CDA:

$$x' = x_C - x, \qquad y' = y_C - y. \tag{4}$$

Finally, the reverse-translational curve (i.e., C’D’A’) of the original down-stroke curve is shown in Figure 3. Since the theoretical ABCD in Figure 1 is a parallelogram, B coincides with D’, O coincides with C’, and C coincides with A’, meaning the two curves OBC and C’D’A’ are actually identical. However, in real situations, the indicator diagram is not always a parallelogram because of faults in the pumping units, so these two curves do not always coincide.
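The reverse-translational transformation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reverse_translate` is hypothetical, the down stroke is assumed to be given as a list of (x, y) points ordered from C to A, and the parallelogram coordinates are made up for the example. The code reflects each point about A and then shifts the result so that C’ lands on A, which reduces to x' = x_C - x and y' = y_C - y when A is the origin.

```python
def reverse_translate(down_stroke):
    """Reflect the down stroke about A, then shift it so that C' lands on A.

    Assumption: down_stroke lists (x, y) points from C (first) to A (last).
    """
    x_c, y_c = down_stroke[0]   # C, start of the down stroke
    x_a, y_a = down_stroke[-1]  # A, end of the down stroke
    # Reflection about A gives (2*x_a - x, 2*y_a - y); translating by
    # (x_c - x_a, y_c - y_a) yields (x_a + x_c - x, y_a + y_c - y).
    # With A at the origin this is x' = x_C - x, y' = y_C - y.
    return [(x_a + x_c - x, y_a + y_c - y) for x, y in down_stroke]


# Made-up fault-free parallelogram: A=(0,0), B=(1,4), C=(5,4), D=(4,0).
down = [(5, 4), (4, 0), (0, 0)]      # C -> D -> A
print(reverse_translate(down))       # [(0, 0), (1, 4), (5, 4)], i.e. A -> B -> C
```

For this fault-free example the transformed down stroke reproduces the up stroke A, B, C exactly, which is the coincidence described for Figure 3.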

2.1.3. Introduction to the Differential Curve

The difference function is widely used in mathematics, physics, and informatics; it is a tool of discrete mathematics that reflects the change between discrete quantities. The differential curve is used to characterize the difference function. The theoretical indicator diagrams drawn from fault-free pumping units are parallelograms. Since the probability that two or more equipment failures lead to the same symmetrical changes in the up stroke and down stroke of the indicator diagrams is quite small, the different shapes of differential curves can be used to describe the states or the fault types of pumping units.

For a two-dimensional diagram ABCD carrying a specific kind of fault information from pumping units, it is difficult to measure the difference between the up stroke and the down stroke directly, which makes the fault type hard to recognize. However, if the difference between the up-stroke curve and the reverse-translational curve of the down stroke is transformed into a one-dimensional curve, which can be easily plotted as a differential curve, it becomes much easier for interpreters to judge the fault type, since the two-dimensional shape of the indicator diagram is then expressed by a one-dimensional differential curve. The differential curve is computed as the reverse-translational curve of the down stroke minus the up-stroke curve of the indicator diagram, which greatly simplifies the recognition of the fault type for pumping units.

Suppose the up-stroke curve is defined as $y_i^{u} = f_u(x_i)$ (i = 1, 2, …, n) and the reverse-translational curve of the down stroke is $y_i^{d} = f_d(x_i)$, where $f_u$ and $f_d$, respectively, are the functions defining the relations between the horizontal axis $x_i$ and the vertical axis ($y_i^{u}$ and $y_i^{d}$), meaning each coordinate of the up-stroke curve is $(x_i, y_i^{u})$ and that of the reverse-translational curve of the down stroke is $(x_i, y_i^{d})$. The values $y_i^{d}$ can be obtained by equation (4), and $y_i^{u}$ is directly obtained from the original data. Then, the differential curve is formulated as

$$\mathrm{diff}(x_i) = f_d(x_i) - f_u(x_i) = y_i^{d} - y_i^{u}, \qquad i = 1, 2, \ldots, n. \tag{5}$$

For example, Figure 4 shows the realization of the differential curve abcd using the indicator diagram corresponding to the fault of missing fixed valves, i.e., the quadrangle ABCD in Figure 4 is used as the typical fault of missing fixed valves in indicator diagrams, where ABC represents the up stroke, CDA stands for the down stroke, and AD’ is parallel to CD. As mentioned previously, the quadrangle ABCD should be a parallelogram when the pumping units have no faults. However, since AD’ rather than AB is parallel to CD, the quadrangle ABCD fails to be a parallelogram, and the differential curve abcd is then obtained by equation (5).
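As a rough sketch of how a differential curve could be computed, the Python snippet below subtracts the up-stroke curve from the reverse-translational curve of the down stroke at a common set of horizontal positions. The helper names `interp` and `differential_curve`, the use of linear interpolation between recorded points, and the two example quadrangles are all assumptions made for illustration; they are not taken from the paper or from Figure 4.

```python
def interp(curve, x):
    """Linearly interpolate a polyline of (x, y) points with non-decreasing x."""
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                return float(y0)
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the curve")


def differential_curve(up_stroke, rt_down_stroke, xs):
    """Reverse-translational down stroke minus up stroke, sampled at each x in xs."""
    return [interp(rt_down_stroke, x) - interp(up_stroke, x) for x in xs]


xs = [0, 1, 2, 3, 4, 5]
up = [(0, 0), (1, 4), (5, 4)]            # made-up up stroke A -> B -> C

# Fault-free case: the transformed down stroke equals the up stroke.
print(differential_curve(up, up, xs))    # all zeros

# Made-up distorted case: the transformed down stroke rises more slowly.
rt_down = [(0, 0), (2, 4), (5, 4)]
print(differential_curve(up, rt_down, xs))  # [0.0, -2.0, 0.0, 0.0, 0.0, 0.0]
```

A flat zero curve corresponds to the parallelogram case, while any deviation between the two strokes shows up as a nonzero segment of the differential curve.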

Three theoretical differential curves corresponding to three types of indicator diagrams and their fault types are shown in Figure 5. It is easily inferred that if the differential curve abcd in Figure 4 came from fault-free pumping units, point b would coincide with point c, as shown by the differential curve of “Working properly” in Figure 5. The theoretical differential curves of different faults of pumping units are clearly different. However, because of the complex underground conditions, actual differential curves are quite unstable and changeable and may differ considerably from the theoretical differential curves of each fault type, which makes it difficult to recognize the type of a differential curve.

2.2. DTW Distance

To address the abovementioned issue, the DTW distance, originally used in speech recognition for similarity research of time series, is introduced to recognize the differential curves in complicated conditions [19]. With DTW, the distance between two time series of different lengths can be calculated accurately. To allow scaling along the time axis, similar waveforms of the two time series are aligned on the time axis. The DTW distance is the minimum cost over all time warping paths, and it measures the similarity of two time series of different lengths without requiring a one-to-one mapping between their points.

For two time series X and Y, some parameters are introduced first. Suppose |X| and |Y| are the lengths of these two time series; X[1 : i] (1 ≤ i ≤ |X|) is the subseries from the first element of X to the i-th element; Y[1 : j] (1 ≤ j ≤ |Y|) is the subseries from the first element of Y to the j-th element; $d(x_i, y_j)$ represents the Euclidean distance between the two points $x_i$ and $y_j$; and D(X[1 : i], Y[1 : j]) represents the DTW distance between the subseries X[1 : i] and the subseries Y[1 : j]. The DTW distance can be calculated by dynamic programming with a time complexity of O(|X||Y|) and is defined as follows:

$$D(X[1:i], Y[1:j]) = d(x_i, y_j) + D_{\min}, \tag{6}$$

where $D_{\min}$ is

$$D_{\min} = \min\{D(X[1:i-1], Y[1:j]),\ D(X[1:i], Y[1:j-1]),\ D(X[1:i-1], Y[1:j-1])\}. \tag{7}$$

Note that if i = 1, $D_{\min} = D(X[1:1], Y[1:j-1])$; if j = 1, $D_{\min} = D(X[1:i-1], Y[1:1])$; and if i = j = 1, $D_{\min} = 0$. For example, as shown in Figure 6, take the two time series X = (3, 6, 5, 3, 4) and Y = (3, 4, 5, 6, 2, 3) to illustrate the calculation of the DTW distance. In Figure 6, X is the row vector and Y is the column vector, so i ranges from 1 to 5 and j ranges from 1 to 6. For a one-dimensional time series, $d(x_i, y_j) = |x_i - y_j|$. Therefore, according to equations (6) and (7), Row 1 (1 ≤ i ≤ 5, j = 1) of Figure 6 is calculated:
D(X[1 : 1], Y[1 : 1]) = d(3,3) = 0.
D(X[1 : 2], Y[1 : 1]) = d(6,3) + D(X[1 : 1], Y[1 : 1]) = 3 + 0 = 3.
D(X[1 : 3], Y[1 : 1]) = d(5,3) + D(X[1 : 2], Y[1 : 1]) = 2 + 3 = 5.
D(X[1 : 4], Y[1 : 1]) = d(3,3) + D(X[1 : 3], Y[1 : 1]) = 0 + 5 = 5.
D(X[1 : 5], Y[1 : 1]) = d(4,3) + D(X[1 : 4], Y[1 : 1]) = 1 + 5 = 6.

Likewise, Row 2 (1 ≤ i ≤ 5, j = 2) is calculated:
D(X[1 : 1], Y[1 : 2]) = d(3,4) + D(X[1 : 1], Y[1 : 1]) = 1 + 0 = 1.
D(X[1 : 2], Y[1 : 2]) = d(6,4) + D(X[1 : 1], Y[1 : 1]) = 2 + 0 = 2.
……

Repeat the abovementioned procedure until Row 6 (1 ≤ i ≤ 5, j = 6) is calculated:
……
D(X[1 : 5], Y[1 : 6]) = d(4,3) + D(X[1 : 4], Y[1 : 5]) = 1 + 4 = 5.

D(X[1 : 5], Y[1 : 6]) is the final DTW distance, shown with a gray background in Figure 6. Generally, if the DTW distance between two time series is small, they can be considered similar even when they have different lengths. Suppose there are two differential curves of indicator diagrams, one of which has been recognized as a known fault type of pumping units while the other is unknown. If the DTW distance between the two differential curves is small enough, they can be considered the same fault type; if not, the DTW distance between the unknown curve and other differential curves with known fault types should be calculated for similarity evaluation. Thus, the DTW distance can be applied to evaluate whether two differential curves of indicator diagrams share a fault type.
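The recurrence and the worked example above can be checked with a short dynamic-programming sketch in Python. The function name `dtw_distance` and the list-of-lists table are illustrative choices rather than the authors' implementation; the point distance is the absolute difference used for one-dimensional series, and both inputs are assumed to be non-empty.

```python
def dtw_distance(x, y):
    """DTW distance between two non-empty 1-D sequences via dynamic programming."""
    n, m = len(x), len(y)
    # D[i][j] holds the DTW distance between x[:i+1] and y[:j+1].
    D = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = abs(x[i] - y[j])          # point distance for 1-D series
            if i == 0 and j == 0:
                best = 0.0                   # nothing precedes the first cell
            elif i == 0:
                best = D[0][j - 1]           # only one neighbor along the edge
            elif j == 0:
                best = D[i - 1][0]
            else:
                best = min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
            D[i][j] = cost + best
    return D[n - 1][m - 1]


X = [3, 6, 5, 3, 4]
Y = [3, 4, 5, 6, 2, 3]
print(dtw_distance(X, Y))  # 5.0, matching the final cell of the worked example
```

The two nested loops fill one table cell per pair (i, j), which is where the O(|X||Y|) time complexity comes from.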

2.3. The Procedure of the Proposed Method

The procedure of the proposed method is described as follows:
Step 1: input the original indicator diagram and determine the turning point, the up stroke, and the down stroke, respectively.
Step 2: perform the reverse-translational transformation upon the down stroke and then obtain the differential curve, called diff, of the indicator diagram according to equation (5).
Step 3: using some differential curves with already-known fault types as the control group, calculate the DTW distance between diff and each curve in the control group according to equation (6).
Step 4: if the DTW distance between diff and a curve in the control group is small enough or smaller than a threshold, they are considered as the same type. If not, try another curve in the control group until the fault type of diff can be determined.
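Steps 3 and 4 can be sketched as a nearest-curve lookup over a labeled control group. The Python snippet below is a minimal illustration under several assumptions: the names `dtw` and `classify`, the choice to return the closest control curve and then apply the threshold, and the tiny control group with labels "type A" and "type B" are all hypothetical and are not data or code from the paper.

```python
def dtw(x, y):
    """Compact DTW distance for non-empty 1-D sequences (absolute difference)."""
    inf = float("inf")
    prev = [inf] * (len(y) + 1)
    prev[0] = 0.0                      # virtual cell before the first elements
    for xi in x:
        cur = [inf] * (len(y) + 1)
        for j, yj in enumerate(y, start=1):
            cur[j] = abs(xi - yj) + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[-1]


def classify(diff, control_group, threshold):
    """Return (label, distance) of the closest control curve.

    The label is None when even the closest curve is not within the threshold.
    """
    label, dist = min(
        ((name, dtw(diff, curve)) for name, curve in control_group.items()),
        key=lambda pair: pair[1],
    )
    return (label if dist <= threshold else None, dist)


# Made-up control group of differential curves with known types.
control_group = {"type A": [0, 0, 0, 0], "type B": [0, -2, -2, 0]}
print(classify([0, -2, -1, 0], control_group, threshold=2))    # ('type B', 1.0)
print(classify([0, -2, -1, 0], control_group, threshold=0.5))  # (None, 1.0)
```

Taking the minimum over the whole control group is one way to read Step 4; stopping at the first curve that falls below the threshold would be an equally plausible variant, and the appropriate threshold would have to be chosen from real data.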

3. Experimental Results and Analyses

To test the efficiency of the proposed method, experiments were performed on an actual indicator diagram dataset, which contains 150 actual indicator diagrams of 5 types: gas influence (sample size = 45), insufficient supply of fluids (sample size = 39), not-filled pump (sample size = 22), oil well paraffinication (sample size = 19), and working properly (sample size = 25). The first four are fault types, and the last one means the pumping units have no faults. The experimental environment is an Intel Core i5-2450M CPU, 4 GB of memory, and Windows 10. Twelve randomly selected actual indicator diagrams and their types are shown in Figure 7, together with their corresponding differential curves plotted according to equation (5). The indicator diagrams are named #1 to #12 successively.

Then, based on the abovementioned differential curves, the DTW distances are obtained using our method, as shown in Table 1. The values in the grid are the DTW distances between the different differential curves, and the smallest distances are highlighted in bold. The two curves with the smallest distance can be considered the same type of indicator diagram. Comparison with the types already shown in Figure 7 confirms that most indicator diagrams paired by smallest distance are indeed of the same type. For example, the closest curve for #4 is #9, whose type is “working properly.” The closest curve for #2 is #12, and for #12, the closest curve is #10. The curve types of #2, #10, and #12 are all “gas influence.” There is only one mistake: according to Table 1, the closest curve for #11 (insufficient supply of fluids) is #5 (gas influence). The reason is that the curves of “insufficient supply of fluids” and those of “gas influence” are sometimes quite similar. Therefore, the accuracy of fault recognition for these 12 samples is 91.7%.
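The 91.7% figure follows directly from the counts stated above, assuming each of the 12 samples counts as one recognition trial and #11 is the only mismatch:

```latex
\text{accuracy} = \frac{12 - 1}{12} = \frac{11}{12} \approx 0.917 = 91.7\%
```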

Then, to further verify its effectiveness, our method was compared with some other typical methods for fault recognition of indicator diagrams, including BP networks [12], the extreme learning machine [14], and RBF [20]. The tests were performed using all 150 samples in the actual dataset of indicator diagrams. Table 2 shows the training time and fault diagnosis accuracy of the different methods. As can be seen from Table 2, our method is the most accurate and the second fastest, only slightly slower than the extreme learning machine.

4. Conclusion

Based on the characteristics of indicator diagrams, this paper establishes a method of recognizing the fault type of pumping units using the DTW distance of differential curves, which are obtained as the difference between the up-stroke curve and the reverse-translational curve of the down-stroke curve. The experimental results show that the proposed method is accurate and fast and is suitable for the complex real conditions of oilfields.

Data Availability

The data used to support this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 41702148 and 41672114).