#### Abstract

Seismic data denoising and interpolation are generally essential steps for reflection processing and imaging workflow especially for the complex surface geologic conditions and the irregular acquisition field area. The rank-reduction method is a valid way for the attenuation of random noise and data interpolation by selecting the suitable threshold, i.e., the rank of the useful signals. However, it is difficult for the traditional rank-reduction method to select an appropriate threshold. In this paper, we propose an adaptive rank-reduction method based on the energy entropy to automatically estimate the rank as the threshold for seismic data processing and interpolation. This method considers the energy entropy into the traditional rank-reduction method. The energy entropy of signals can be used to indicate the energy intensity of a signal component in the total energy. The difference of the energy entropy between the useful signals and random noise is perceived as a measurement for selecting the appropriate threshold. Synthetic and field examples indicate that the proposed method can well achieve the attenuation of random noise and interpolation automatically without the estimation of the ranks and demonstrate the feasibility of the new adaptive method in seismic data denoising and interpolation.

#### 1. Introduction

Seismic data denoising and interpolation play a vital role in seismic processing and have an influence on seismic interpretation particularly. Because of the presence of the complex surface geologic conditions and the irregular acquisition field area [1, 2], random noise and missing data are common cases in seismic exploration. In general, acquired data from field should be preprocessed before data processing [3, 4]. Missing data can severely affect the following processing steps, such as, static correction, waveform inversion, and high-resolution image processing [5–7]. However, the random background noise and the irregular sampled data are commonly present in field data because of geologic conditions and economic factors, which lead to the poor quality of seismic imaging. Seismic interpolation can effectively deal with the data acquired from severe field environment and obtain regular data. Therefore, seismic data denoising and interpolation are essential for seismic data preprocessing (e.g., [8, 9]).

Many different attempts have been proposed to achieve the attenuation of random noise and handle missing traces in seismic data. Among them, the transformed-domain methods based on compressive sensing theory (CS) or sparity have attracted much attention in seismic data interpolation [10–12]. These methods utilize the sparity properties of signals under some basis in some transformed domains. The signals and random noise can be mapped into different subspaces due to the correlation of signals and thereby the signals can be separated. Several methods based on transformed domains have been presented for seismic data denoising and interpolation, such as Fourier domain, wavelet domain, Radon domain, seislet domain, and curvelet domain (e.g., [13–17]). Wavelet-based denoising technique is a simple and effective method for seismic data denoising [18, 19]. He et al. [20] present a kind wavelet thresholding function based on hyperbolic tangent function. Lu et al. [21] propose a new wavelet threshold function and denoising application. Sparse dictionary learning is a similar method to recover signals by approximating seismic data with some bases in a dictionary [22, 23]. The dictionary is constructed by details found in the training data, thereby the missing traces can be constructed with the sparse coefficients by the dictionary. Wave equation is also a kind of considerable way to interpolate seismic wavefields using the propagation of seismic wave; however, the method depends on the known velocity model (e.g., [24–26]). The rank-reduction method has been widely used for seismic denoising and interpolation (e.g., [27, 28]). In addition, there are also other popular kinds of methods for denoising and interpolation. Canales [29] proposes signal enhancement in the f-x domain based on signal predictability, and this method is widely used in the industry. Median filtering is a type of nonlinear filter method and is widely accepted in the oil and gas [30]. It chooses the middle value of a sequence of numbers ordered. Unfortunately, these two methods have the tendency to retain some background noise and possess the limitation of random noise attenuation. Singular value decomposition is a kind of laterally coherent signal detection in multitrace recordings by an eigenvalue decomposition [31, 32]. The useful coherent signals are encoded into the large singular values while the background noise is encoded into the small values. However, this method performs less well in extracting weak signals or conflicting-dip events [33]. Deep learning is a relatively new method for noise suppression and a lot of literatures have been published [34, 35]. This kind of method has a great potential for perfect attenuating random noise in seismic data; however, one requires plenty of different known samples for training data processing.

The complete signals without any noise have a fixed rank in time or frequency domain, but the rank of seismic data raises in the presence of the random noise and missing traces. Thus, we can reduce the rank to suppress the random noise and reconstruct seismic data. Many attempts using the rank-reduction method have been developed to deal with seismic interpolation problems in publications (e.g., [36–38]). Multichannel singular spectrum analysis (MSSA) [28, 39, 40] is a popular way to attenuate the random noise. The MSSA method is extended from different aspects to overcome the existing shortcomings [41–43]. This method considers the Hankel matrix constructed with frequency-domain seismic data at a given frequency. Then, the singular value decomposition (SVD) approach is used to decompose the Hankel matrix. The original data can be recovered with a fewer number of singular values representing the useful signals by the truncated SVD (TSVD). Due to the presence of the residual random noise after the application of the traditional MSSA method, Huang et al. [44] propose an improved MSSA algorithm named damped MSSA (DMSSA) to further remove the remaining noise. For the TSVD stage, the suitable selection of threshold as the rank of the signals is crucial for the attenuation of random noise. The larger and smaller ranks may lead to the increment of artificial noise and the damage of signals, respectively. However, it is difficult for the conventional rank-reduction method to select an appropriate rank.

In this study, in order to overcome the problem of rank estimation, we develop an adaptive rank-reduction method based on energy entropy to automatically estimate the rank as the threshold. The energy entropy of signals can be used to indicate the energy intensity of a signal component in the total energy. The difference of the energy entropy between the useful signals and random noise is perceived as a measure for selecting the optimal threshold. The optimal rank is used to suppress the random noise and recover the original signals via the DMSSA algorithm. The structure of the paper is organized as follows. First, we review the basic theory of the rank-reduction method using the DMSSA algorithm. Then, the adaptive rank-reduction method based on the energy entropy is introduced. Finally, synthetic and field data examples are provided to demonstrate the performances of the adaptive rank-reduction method for data denoising and interpolation.

#### 2. Theory and Method

##### 2.1. Review of Rank-Reduction Method Using DMSSA

Rank-reduction methods have been widely used in seismic denoising and interpolation, such as the singular value decomposition method [45], the local singular value decomposition method [33], and the Cadzow methods [46]. These methods can map seismic data onto the signal subspace and the random noise subspace using the SVD method and separate the signals from the background noise. The DMSSA algorithm is a popular way to attenuate the random noise and obtain the signals.

As we know, the frequency-domain seismic data at a given frequency with *N* traces can be represented as . The corresponding Hankel matrix can be written as **T:**where and denotes the integer part of the element, and [][28]. As the seismic data **D** consists of the signals and the random noise, we can assume the seismic data and the random noise have the full rank size, while the signals have deficient rank. Due to the presence of the random noise and missing traces, the ranks of data are raised, as well as the corresponding Hankel matrix. To suppress the random noise, the SVD algorithm is applied to decompose the corresponding Hankel matrix:where **L** and **U** are matrices with orthogonal singular vectors, **Λ** denotes the diagonal matrix with singular values, *k* is the rank of the useful signals, and *H* denotes the conjugate transpose. The diagonal matrix **Λ** consists of decreasing singular values , . The stronger the energy of the component, the larger the singular value. Because the energy of the signals is much stronger than that of the random noise, the singular values of the signals are distinctly larger than those of the random noise. Thus, the signals can be reconstructed via choosing a few large singular values to recover the signals, i.e., let **Λ**_{N-k} be 0 to attenuate the random noise. The reconstruction process of the signals can be represented by the TSVD method as follows:

However, Huang et al. [44] indicate that the reconstructed signals using the traditional TSVD method still have some residual random noise because of the signal-plus-noise subspace and propose the DMSSA algorithm to overcome the defect of the residual noise mentioned above. The new TSVD considering a damping factor can be expressed as follows [44]:where **W** denotes the damping matrix, **I** is the identity matrix, *σ*_{m} denotes the first element of **Λ**_{N-k}, and *p* is the damping factor.

##### 2.2. Data Interpolation

In fact, the incomplete observed data **D**_{obs} can be expressed with the sampling operator **S** and the completed data **D**, i.e., where represents the element product of two matrices with the same row and column size. In order to handle with the missing traces, we apply the popular weighted POCS-like algorithm to achieve seismic data interpolation [28]. The algorithm can be formulated as follows [39]:where *a*_{n} denotes an iterative parameter and **F**_{d} denotes the DMSSA filter. We can set the max number of iterations to stop this algorithm.

##### 2.3. New Rank-Reduction Method

As mentioned above, the background noise and incomplete traces will raise the rank, thereby we can decrease the rank of the signals for the attenuation of the noise. The rank of the signals has a close relationship with the useful signals, so a suitable selection of the rank is crucial. The larger value of the rank will raise the residual random noise and the small value of the rank will lead to damage the signals. However, for the traditional rank-reduction method with a constant rank, it is difficult to select a suitable threshold as the rank of the useful signals. To estimate the rank of seismic data, we consider the energy entropy of signals to be applied in the DMSSA algorithm [47, 48]. Actually, the energy entropy of the signals can be used to indicate the energy intensity of a signal component in the total energy. The difference of the energy entropy between the useful signals and random noise is perceived as a measure for selecting the appropriate threshold. The singular values of the Hankel matrix characterize the energy of the corresponding signal component and thus the total energy of the seismic data can be written as follows [49, 50]:

The corresponding energy entropy of the singular value *σ*_{i} for the seismic data denoising and interpolation can be written as follows (e.g., [51, 52]):

As the energy of the random noise is much weaker than that of the signals, the singular values of the signals are much larger than that of the random noise. Thus, the energy entropy of the useful signals should be larger than that of the random noise. We can adaptively estimate the value of the rank *λ* via

Thus, the energy entropy of the signals can be used to achieve the adaptive DMSSA algorithm for seismic data denoising and interpolation, which helps to the following data processing steps. Figure 1 shows the schematic workflow for seismic data denoising and reconstruction using the traditional and proposed rank-reduction method.

##### 2.4. Examples and Application

In this section, we provide synthetic and field applications to test the feasibility of the proposed new rank-reduction approach based on the energy entropy. The first two synthetic applications are provided to indicate the validity of the proposed rank-reduction method for seismic data denoising and interpolation. The field data application is intended to better demonstrate the proposed method in dealing with complicated seismic wavefields.

##### 2.5. Synthetic Data Applications

To test the capability of the proposed method, we provide two synthetic models: a simple model and a complex model, to achieve seismic data denoising and interpolation.

The first synthetic data is a linear-event example with crossing events, as shown in Figure 2(a). The Kirchhoff forward modeling algorithm is applied to generate this seismic record with 45 Hz Ricker wavelet. There are two crossing events in the depth area and one horizontal event near the surface in this section. The data has 100 traces and 500 sample points with a sampling interval of 1 ms. Figure 2(b) indicates the noisy data with a low SNR. The reflection events are encoded in the background noise. The new adaptive rank-reduction method is used to suppress the random noise, and Figure 2(c) shows the local processing denoised data. It is evident that the background noise is attenuated and the signals are well enhanced. The differences between the noiseless and the denoised data are very small, and the results are very close to the noiseless records. Figure 2(d) displays the denoising error between the denoised and noiseless data. The signals are well protected because of the suitable rank selection of the useful signals. Figure 2 clarifies the validity of the adaptive approach in seismic data denoising.

**(a)**

**(b)**

**(c)**

**(d)**

Besides seismic data denoising, data reconstruction is a common problem in seismic data processing. To test the interpolation effectiveness of the adaptive approach, we provide the observed data examples with 30%, 40%, and 50% missing traces, respectively. The missing traces are randomly removed. The incomplete data and interpolated results are shown in Figure 3.

**(a)**

**(b)**

**(c)**

**(d)**

**(e)**

**(f)**

Figures 3(a) and 3(b) are the incomplete data with 30% missing traces and the interpolated data, respectively. Figures 3(c) and 3(d) are the examples with 40% missing traces, and Figures 3(e) and 3(f) are the decimated and reconstructed data with 50% missing traces. There is no doubt that the missing traces in seismic data will lead to the increment of the rank. We can reconstruct the seismic data using the rank-reduction method by reducing the rank. Figures 3(b), 3(d), and 3(f) show that the random noise is well suppressed and simultaneously the data can be reconstructed with the proposed method for three different kinds of missing traces. The random noise in the shallow area is perfectly suppressed but the SNR of the reconstructed results becomes low with the increment of the recording time. In addition, note that the random noise is gradually increasing with the increment of the missing traces. Thus, the first example indicates that the proposed method can provide acceptable results and is a feasible way to achieve the goal of seismic data denoising and interpolation.

To evaluate the performance of the reconstructed results, the SNR is used for quantitative analysis, which is defined below:where **D**_{0} denotes the original clean data and **D** is the interpolated results. This will lead to a different value of SNR for the incomplete data with different missing traces. The SNRs of the reconstructed data in Figure 3 are 9.23 dB, 9.06 dB, and 8.65 dB, respectively. This indicates that the less the missing data, the higher the SNRs of the recovered data.

The second synthetic example used to test the proposed method is a complex shot record. The shot record is generated by the Kirchhoff-based forwarding modeling algorithm with the dominant frequency of 45 Hz Ricker wavelet. This record contains some reflection events distributed from the shallow to the deep area. There are 80 traces and 801 time-sample points with 1 ms sampling interval. The noiseless seismic record is shown in Figure 4(a). The noisy data are caused by adding Gaussian noise, as indicated in Figure 4(b). The incomplete data with 30% missing traces are shown in Figure 4(c), in which the signals are severely damaged. The mask operator is displayed in Figure 4(d), where 1 indicates keeping a trace and 0 denotes removing a trace. Note that the removing traces are randomly selected in accordance with the mask operator [53].

**(a)**

**(b)**

**(c)**

**(d)**

We first apply the conventional rank-reduction method with a preset rank of seismic data as the threshold. As the rank selection has a close relationship with the number of linear events (and there are 10 events in this record), we select three ranks R: 8, 10, and 12 as the threshold to denoise and interpolate the seismic data. Figure 5 shows the denoised and interpolated results and the corresponding interpolation error for the different rank selection. As displayed in Figure 5, the denoised and interpolated results are not satisfying. The residual random noise is obvious and the useful signals are removed, which leads to a low SNR. Thus, it is difficult for the traditional rank-reduction method to select a suitable rank for complex and various seismic data. Although we can try a lot different ranks for the rank-reduction method, the computational cost is much expensive.

**(a)**

**(b)**

**(c)**

**(d)**

**(e)**

**(f)**

The results after the applications of the presented method are shown in Figure 6. Figure 6(a) displays the denoised and interpolated data, which is correctly recovered compared to the clean data in Figure 4(a). The locations of the missing traces are well interpolated and the corresponding amplitudes are relatively balanced compared to the neighboring traces. In addition, the SNRs of the reconstructed data in Figure 5 are 9.87 dB, 12.80 dB, and 15.50 dB, respectively. The SNRs of the recovered data become high with the increment of the selected rank. The SNR of the result after the application of the adaptive method in Figure 6 is 21.52 dB, and the SNR is well improved. Though there is still a little residual noise near the boundary, the SNR of the processing results have been greatly improved, compared to the results in Figure 5. The corresponding interpolation error also suggests that the useful signals are well protected, which indicates the great potential of the presented adaptive method.

**(a)**

**(b)**

The synthetic examples verify the feasibility of the presented adaptive method in seismic data denoising and interpolation. The adaptive method can well attenuate the background noise to improve the SNR and interpolate the missing traces, which helps to the following data processing.

##### 2.6. Field Data Example

Seismic data denoising and interpolation play a vital part in seismic data processing, which are the key steps for the following data processing. We provide three examples to test the performance of the proposed method.

##### 2.7. 2D Poststack Domain

To further verify the proposed method, an example of 2D poststack field data is given to perform data denoising and interpolation. This data example is a stacked profile acquired from east of China, as shown in Figure 7(a). Due to the complex geologic condition, the data have a very low SNR and some useful signals are masked by the random noise, especially for the shallow area, where the continuity of the seismic events is damaged. The corresponding incomplete data with 30% missing traces are shown in Figure 7(b). We can see that the events have been severely damaged and its SNR is very low. Figure 7(c) shows the mask operator, where 1 denotes keeping a trace and 0 indicates removing a trace.

**(a)**

**(b)**

**(c)**

For the conventional rank-reduction method, a constant rank should be set to achieve data denoising and interpolation. Because there are a lot of the linear events in the stacked section, we are supposed to select a slightly larger value of the rank of the useful signals. In addition, the missing traces will further result in the increment of the rank. It is clear that the larger rank will lead to the increment of the random noise and the smaller rank will damage the useful signals. Thus, we select different values of the rank to suppress the random noise and interpolate the field data. Figure 8 displays four kinds of process results after the application of the conventional rank-reduction method, *R* = 10, *R* = 15, *R* = 20, and *R* = 25, respectively. The denoised and interpolated results in Figure 8 are really improved in comparison to the incomplete traces in Figure 7(b). However, the background residual noise is obvious in the whole sections, especially for the shallow area. The reflection events are smeared because of the residual noise, as shown in black ellipses. Moreover, there are still some spatial discontinuities in the sections because of inappropriately selected ranks.

**(a)**

**(b)**

**(c)**

**(d)**

Figure 9 shows the adaptive results after the application of the adaptive method. It is obvious that the interpolated section has a high SNR, whether in the shallow or the deep area, and the events are relatively more continuous and spatially coherent compared with those in Figure 8. The suitable selection of the threshold, as the rank, is shown in black ellipses. Since there is no absolute clean data for field data, we only consider the interpolation error as the measure to evaluate the SNRs of the reconstructed data.

To further verify the quality of the recovered data, the differences between the field data (Figure 7(a)) and the processing results (Figure 8) are displayed in Figure 10. Figures 10(a)–10(d) are corresponding to the processing data in Figure 8, respectively. Figure 10(e) shows the errors between the original data (Figure 7(a)) and the adaptive processing results (Figure 9). We know that the removed noise and interpolation errors are evident for the rank of 10, 15, 20, and 25 because of the unsuitable selection of the ranks. In Figure 10(e), the removed noise is much more than other recovered results and the interpolation errors are relatively less than others in Figure 10, especially for the shallow area, as shown in black ellipses. The differences indicate the effectiveness of the adaptive method.

**(a)**

**(b)**

**(c)**

**(d)**

**(e)**

##### 2.8. 2D Prestack Domain

A set of marine 2D seismic data is used to test the validity the proposed method. The original data are shown in Figure 11(a) and the corresponding incomplete data with 30% missing traces are displayed in Figure 11(b). The data consist of reflections, multiples, and some diffractions generated from the boundaries of the salt body. The continuity of the reflection events is damaged and the amplitudes are in nonuniform distribution. We apply the traditional rank-reduction method with constant rank to reconstruct the incomplete data. Figures 12(a)–12(c) show the corresponding processing results with the ranks of 20, 30, and 40, respectively. Figure 12(d) displays the adaptive processing results after the application of the proposed method.

**(a)**

**(b)**

**(a)**

**(b)**

**(c)**

**(d)**

These processing results succeed in reconstructing the original data, and it is hard to distinguish the original and recovered data. To further compare the results with different rank-selection, Figure 13 provides the differences between the original and processing data in Figure 12 with different ranks. It is evident that the interpolation errors are obvious, as shown in red arrows, when the ranks are a constant directly and it is difficult to select a suitable rank for different data using the traditional rank-reduction methods. As shown in Figure 13(d), the red arrows indicate that the interpolation error is obviously smaller than the errors of Figures 13(a) and 13(b). Notice that Figures 13(c) and 13(d) have a similar interpolation error. For the near-offset traces, the interpolation error of the adaptive results is less than that of the constant rank of 40. However, for the far-offset traces, the results with a constant rank of 40 are better than that of the adaptive results, which indicate the suitable rank of the data is close to 40. These tests verify the proposed method can recover the missing traces in the prestack domain.

**(a)**

**(b)**

**(c)**

**(d)**

##### 2.9. 3D Prestack Domain

This method can be extended to 3D seismic data and a set of 3D marine data is used to verify the performance of the proposed method in 3D prestack domain. The 3D shot gathers used for testing the method are shown in Figure 14(a) and the corresponding incomplete data with 30% missing traces after random removal are displayed in Figure 14(b). To compare the processing results, the traditional rank-reduction methods are used to reconstruct the incomplete data. Figure 15 shows the processing results using the traditional rank-reduction methods with some constant ranks and the proposed method with the adaptive rank selection.

**(a)**

**(b)**

**(a)**

**(b)**

**(c)**

**(d)**

It seems that both the traditional rank-reduction methods and the proposed adaptive method can produce reasonable results though the unsuitable ranks may be used. To evaluate the quality of the recovered results in detail, we provide the differences between the original and recovered data in Figure 16, corresponding to the results in Figure 15. It is evident that the interpolation errors shown in Figures 16(a)–16(c) are slightly larger than that of the recovered data obtained by the adaptive method (Figure 16(d)). The suitable rank for this field data should be larger than the ranks used. It will cost too much to select the suitable rank for different seismic data by many tests. The proposed method can lead to a promising result without any tests and have a great potential for seismic data denoising and interpolation.

**(a)**

**(b)**

**(c)**

**(d)**

The field applications illustrate the good performance of the proposed method in field data denoising and interpolation. The new method can enhance the useful signals and improve the spatial coherence of the events, which contributes to seismic data processing and inversion.

#### 3. Conclusions

In this paper, we present an adaptive rank-reduction method based on energy entropy to automatically estimate the rank as the threshold for seismic data denoising and interpolation. Some conclusions can be expressed as follows. Firstly, this new rank-reduction method based on the energy entropy of signals utilizes the energy intensity of a signal component in the total energy. The difference of the energy entropy between the useful signals and the random noise can be perceived as a measure for selecting the appropriate threshold. Secondly, the synthetic examples indicate the principle and feasibility of the proposed method in dealing with random noise attenuation and useful signals enhancement. Finally, the field examples demonstrate the good performance of the proposed method for data denoising and interpolation in handling the complex wavefields.

In addition, the proposed method may confront the limitation of computational time, especially for a big dataset. One should consider the algorithm based on CPU or GPU parallel processing techniques to accelerate time-consuming work.

#### Data Availability

The data, models, and code generated or used during the study are available from the corresponding author by request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This research was supported by the Earthquake Prevention and Mitigation Planning project of Shandong Province (Grant no. SD13523). The authors would like to thank Shandong Earthquake Agency for supporting this research and Xiaowei Zhao, Chao Sun, and Peng Lin for their meaningful discussion and helpful suggestions.