Abstract

Given only the signs of signal plus noise recorded repetitively (i.e., sign data), signal amplitudes can be recovered with minimal variance. However, discrete derivatives of the signal are recovered from sign data with a variance that approaches infinity with decreasing step size and increasing order. For industries such as the seismic industry, which exploits amplitude recovery from sign data, these results place constraints on processing that includes differentiation of the data. While methods for smoothing noisy data for finite difference calculations are known, sign data recovery requires the noise to be present. In this paper, we derive the expectation values of continuous and discrete sign data derivatives, and we explicitly characterize the variance of discrete sign data derivatives.

1. Introduction

Sign-bit recording systems discard all information on the detailed motion of the geophone and ask only whether its output is positive or negative, whether it is going up or coming down. In a sign-bit system, therefore, the signal waveform is converted into a square wave. All amplitude information is lost [1].

It is well known that, for a range of signal-to-noise ratios between about 0.1 and 1, the final result of sign-bit recording, after stacking, correlating, and other processing, looks no worse, to the eye, than the result from full-fidelity recording. This is considered as intriguing as it is surprising [1]. In contrast, what we present in this paper is evidence that the processing of sign-bit data (i.e., sign data) can be limited in certain cases relative to the processing of the full-bandwidth data.

We model the signal as a one-dimensional function, $f(v)$, and the noise as a random variable, $X$. In industries like the seismic industry, measurements of signal, $f(v)\colon\mathbb{R}\to\mathbb{R}$, plus noise, $X\colon\Omega\to\mathbb{R}$, that is, $f(v)+X$, are recorded for multiple iterations of the noise. The average of the measurements (i.e., the expectation $E$) recovers the signal:
\[
E(f(v)+X) = f(v). \tag{1.1}
\]
If the noise is chosen to be uniform, with density function
\[
\rho(x) = \begin{cases} \dfrac{1}{2a}, & -a \le x \le a, \\ 0, & \text{else}, \end{cases} \tag{1.2}
\]
then the variance, $E\bigl((f(v)+X)^2\bigr) - \bigl(E(f(v)+X)\bigr)^2$, reduces to
\[
\operatorname{Var}(f(v)+X) = \frac{a^2}{3}. \tag{1.3}
\]
As reported by O'Brien et al. [2], it was empirically discovered that the average of the signs of signal plus noise recovers the signal if the signal-to-noise ratio is less than or equal to one. This can be shown mathematically [3] using the signum function [4], $\operatorname{sgn}(x) = +1$ for $x > 0$, $\operatorname{sgn}(x) = -1$ for $x < 0$, and $\operatorname{sgn}(0) = 0$:
\[
E(\operatorname{sgn}(f(v)+X)) = \int_{-\infty}^{\infty}\operatorname{sgn}(f(v)+x)\,\rho(x)\,dx = \int_{-f}^{\infty}\rho(x)\,dx - \int_{-\infty}^{-f}\rho(x)\,dx. \tag{1.4}
\]
Because $\rho(x)$ is even, this equals
\[
\int_{-f}^{f}\rho(x)\,dx, \tag{1.5}
\]
so that
\[
E(\operatorname{sgn}(f(v)+X)) = \frac{f(v)}{a}, \qquad f \in [-a,a]. \tag{1.6}
\]
The variance, $E\bigl(\operatorname{sgn}(f(v)+X)^2\bigr) - \bigl(E(\operatorname{sgn}(f(v)+X))\bigr)^2$, reduces to
\[
\operatorname{Var}(\operatorname{sgn}(f(v)+X)) = 1 - \left(\frac{f(v)}{a}\right)^2. \tag{1.7}
\]
Consequently, the error is minimal when the signal-to-noise ratio is near unity.
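As a quick numerical illustration of (1.6) and (1.7), the following Python sketch (ours, not from the paper) averages the signs of a fixed signal value plus uniform noise; the sample count and the test values of $f(v)$ and $a$ are arbitrary illustrative choices.

```python
# Minimal Monte Carlo sketch of sign data recovery with uniform noise:
# the sample mean of sgn(f(v) + X) should approach f(v)/a, per (1.6),
# and the sample variance should approach 1 - (f(v)/a)^2, per (1.7).
import numpy as np

rng = np.random.default_rng(0)

def sign_data_moments(f_v, a, N=100_000):
    """Return sample mean and variance of sgn(f(v) + X), X ~ Uniform(-a, a)."""
    x = rng.uniform(-a, a, size=N)      # uniform noise with density 1/(2a)
    s = np.sign(f_v + x)                # the recorded sign data
    return s.mean(), s.var()

f_v, a = 0.4, 1.0                       # illustrative values; requires |f(v)| <= a
mean_est, var_est = sign_data_moments(f_v, a)
print(mean_est, f_v / a)                # sample mean vs. f(v)/a          (1.6)
print(var_est, 1.0 - (f_v / a) ** 2)    # sample variance vs. (1.7)
```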

The advantage of retaining only the signs of signal plus noise is that only about 1 bit is required to record the information, as opposed to the 16 to 20 bits required to record full-amplitude data [2].

The goal of this paper is to examine the recovery of derivatives from sign data in uniform noise. The key point is that the recovery of the signal from sign data extends, through finite differences, to the recovery of derivatives of the signal, and that this recovery is constrained by the size of the variance. We first examine sign data derivatives for both the discrete and the continuous case. We follow with a derivation of the variance. We conclude with a computational test that compares the true variance with a statistically derived variance estimate for a test function at selected step sizes.

2. Sign Data Derivatives

Let the signal $f(v)$ be an $n$th-order differentiable function. Based on signal recovery from sign data, it can be shown that derivatives of the signal are also recoverable. Using the linearity of the expectation value,
\[
E\left(\frac{\Delta^n_v}{(\Delta v)^n}\operatorname{sgn}(f(v)+X)\right) = \frac{\Delta^n_v}{(\Delta v)^n}\,E(\operatorname{sgn}(f(v)+X)), \tag{2.1}
\]
where $\Delta^n_v$ is the $n$th-order finite difference operator with respect to the variable $v$ [5]. In this case, a nonunit step size, $\Delta v$, is used (e.g., [6]).

In detail, we can write
\[
\frac{\Delta^n_v}{(\Delta v)^n}\operatorname{sgn}(f(v)+X) = \frac{1}{(\Delta v)^n}\sum_{i=0}^{n}(-1)^i\binom{n}{i}\operatorname{sgn}\bigl(f(v+(n-i)\Delta v)+X_i\bigr), \tag{2.2}
\]
where $\binom{n}{i}$ denotes the binomial coefficient $n!/\bigl(i!\,(n-i)!\bigr)$ and $X_0, X_1, \ldots, X_n$ are independent copies of the random variable $X$.

Substituting from (1.6) into (2.1) yields
\[
E\left(\frac{\Delta^n_v}{(\Delta v)^n}\operatorname{sgn}(f(v)+X)\right) = \frac{1}{a}\,\frac{\Delta^n_v f(v)}{(\Delta v)^n}. \tag{2.3}
\]
In the limit of infinitesimal step size, this becomes a continuous derivative,
\[
\lim_{\Delta v \to 0} E\left(\frac{\Delta^n_v}{(\Delta v)^n}\operatorname{sgn}(f(v)+X)\right) = \frac{1}{a}\,\frac{d^n f(v)}{dv^n}, \tag{2.4}
\]
or
\[
E\left(\frac{d^n}{dv^n}\operatorname{sgn}(f(v)+X)\right) = \frac{1}{a}\,\frac{d^n f(v)}{dv^n}. \tag{2.5}
\]
The result (2.5) can also be obtained by direct integration. For example, using the rule $\int f(x)\,\delta^{(n)}(x)\,dx = -\int (\partial f/\partial x)\,\delta^{(n-1)}(x)\,dx$ [7] and writing $u \equiv f(v)+x$, the integral
\[
E\left(\frac{d^3}{dv^3}\operatorname{sgn}(f(v)+X)\right) = \int_{-\infty}^{\infty}\left(2\frac{d^2\delta}{du^2}\left(\frac{df}{dv}\right)^3 + 6\frac{d\delta}{du}\frac{df}{dv}\frac{d^2 f}{dv^2} + 2\delta\frac{d^3 f}{dv^3}\right)\rho(x)\,dx \tag{2.6}
\]
loses all terms containing derivatives of the delta functional, reducing to
\[
2\rho(-f)\left.\frac{d^3 f}{dv^3}\right|_{f=-x}. \tag{2.7}
\]
In general,
\[
E\left(\frac{d^n}{dv^n}\operatorname{sgn}(f(v)+X)\right) = 2\rho(-f)\left.\frac{d^n f}{dv^n}\right|_{f=-x} = \frac{1}{a}\,\frac{d^n f}{dv^n}. \tag{2.8}
\]
It follows that the noise is restricted such that $a \ge |f|$.
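The discrete sign data derivative in (2.2) is straightforward to simulate. The sketch below is our own illustration, not the authors' code: it draws independent noise samples $X_i$, forms the finite difference of the signs, and checks that the sample mean approaches $(1/a)\,\Delta^n_v f/(\Delta v)^n$ as in (2.3); the test function $f=\sin(v)$, the point $v=3$, and the step size are illustrative choices borrowed from the computational test of Section 4.

```python
# Monte Carlo sketch of S_n = Delta^n_v sgn(f(v)+X) / (Dv)^n from (2.2),
# averaged over many noise realizations to verify (2.3) and its limit (2.4).
import numpy as np
from math import comb

rng = np.random.default_rng(1)

def sign_derivative_samples(f, v, n, dv, a, N):
    """Return N samples of the n-th order discrete sign data derivative."""
    i = np.arange(n + 1)
    coeffs = (-1.0) ** i * np.array([comb(n, k) for k in i])  # b_i = (-1)^i C(n, i)
    f_vals = f(v + (n - i) * dv)                              # f_i = f(v + (n - i) dv)
    x = rng.uniform(-a, a, size=(N, n + 1))                   # independent X_i per sample
    return np.sign(f_vals + x) @ coeffs / dv ** n

f, v, n, dv, a = np.sin, 3.0, 1, 0.5, 1.0
samples = sign_derivative_samples(f, v, n, dv, a, N=200_000)
fd = (f(v + dv) - f(v)) / dv                                  # Delta^1_v f / Dv
print(samples.mean(), fd / a, np.cos(v) / a)                  # (2.3) and its limit (2.4)
```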

3. The Variance of Sign Data Derivatives

Letting $S_n \equiv (\Delta^n_v/(\Delta v)^n)\operatorname{sgn}(f(v)+X)$, we compute the variance, $E(S_n^2) - (E(S_n))^2$. From (2.3), it follows that $(E(S_n))^2 = \bigl(\Delta^n_v f/(a(\Delta v)^n)\bigr)^2$. $E(S_n^2)$ can be found by inductively generalizing from $n=2$:
\[
\begin{aligned}
E(S_2^2) &= \frac{1}{(\Delta v)^4}\,E\Bigl(\bigl(b_0\operatorname{sgn}(f_0+X_0) + b_1\operatorname{sgn}(f_1+X_1) + b_2\operatorname{sgn}(f_2+X_2)\bigr)^2\Bigr)\\
&= \frac{1}{(\Delta v)^4}\left(b_0^2 + b_1^2 + b_2^2 + 2b_0 b_1\frac{f_0 f_1}{a^2} + 2b_0 b_2\frac{f_0 f_2}{a^2} + 2b_1 b_2\frac{f_1 f_2}{a^2}\right),
\end{aligned}
\tag{3.1}
\]
where $f_i = f(v+(n-i)\Delta v)$, $f_k = f(v+(n-k)\Delta v)$, and $b_i = (-1)^i\binom{n}{i}$ (see the Appendix for details).

These results generalize to
\[
\operatorname{Var}(S_n) = \frac{1}{(\Delta v)^{2n}}\sum_{i=0}^{n}\binom{n}{i}^2 + \frac{2}{(\Delta v)^{2n}}\sum_{i<k}(-1)^{i+k}\binom{n}{i}\binom{n}{k}\frac{f_i f_k}{a^2} - \left(\frac{\Delta^n_v f}{a(\Delta v)^n}\right)^2. \tag{3.2}
\]
Since $f$ is differentiable, $\bigl|\Delta^n_v f/(\Delta v)^n - d^n f/dv^n\bigr| < \varepsilon$ for sufficiently small $\Delta v$, and thus $\Delta^n_v f/(\Delta v)^n$ is finite. By definition, $\operatorname{Var}(S_n) > 0$.
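For reference, a direct implementation of (3.2) can be written as follows; the helper name and its arguments are our own conventions, not part of the paper.

```python
# Closed-form Var(S_n) from (3.2) for uniform noise on [-a, a].
import numpy as np
from math import comb

def var_sign_derivative(f, v, n, dv, a):
    """Evaluate (3.2) for the n-th order discrete sign data derivative at v."""
    i = np.arange(n + 1)
    b = (-1.0) ** i * np.array([comb(n, k) for k in i])    # b_i = (-1)^i C(n, i)
    f_i = f(v + (n - i) * dv)                              # f_i = f(v + (n - i) dv)
    sum_sq = np.sum(b ** 2)                                # sum of C(n, i)^2 terms
    cross = np.outer(b, b) * np.outer(f_i, f_i) / a ** 2   # b_i b_k f_i f_k / a^2
    cross_sum = cross.sum() - np.trace(cross)              # = 2 * sum over i < k
    fd = b @ f_i                                           # Delta^n_v f
    return (sum_sq + cross_sum) / dv ** (2 * n) - (fd / (a * dv ** n)) ** 2
```

Setting $n=1$ and $n=2$ in this expression reproduces the special cases (3.3) and (3.4) below.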

Consequently, $\lim_{\Delta v \to 0}\operatorname{Var}(S_n) = +\infty$. Similarly, $\lim_{n\to\infty}\operatorname{Var}(S_n) = +\infty$ for $0 < \Delta v < 1$. The variance of a discrete sign derivative therefore approaches infinity with decreasing step size and increasing order. In addition, since $\lim_{\Delta v\to 0} S_n = (d^n/dv^n)\operatorname{sgn}(f(v)+X)$, we have $\operatorname{Var}\bigl((d^n/dv^n)\operatorname{sgn}(f(v)+X)\bigr) = +\infty$, so in the case of the continuous derivatives (2.5) the variance is infinite.

Using (3.2), the variance of the first discrete sign derivative ($n=1$) is
\[
\operatorname{Var}(S_1) = \frac{1}{(\Delta v)^2}\left(2 - \frac{f_1^2 + f_0^2}{a^2}\right). \tag{3.3}
\]
The variance of the second discrete sign derivative ($n=2$) is similarly computed as
\[
\operatorname{Var}(S_2) = \frac{1}{(\Delta v)^4}\left(6 - \frac{1}{a^2}\bigl(f_0^2 + 4f_1^2 + f_2^2\bigr)\right). \tag{3.4}
\]

4. Computational Tests

These results can be tested computationally. The variance can be estimated from $N$ iterations with
\[
\operatorname{Var}_N(S_n) = \frac{1}{N}\sum_{m=1}^{N}\bigl(S_n(m) - E(S_n)\bigr)^2, \tag{4.1}
\]
where the index $m$ designates the sample number.

Consider the test function $f = \sin(v)$. Using the first-order sign data derivative ($n=1$), we compare $\operatorname{Var}(S_1)$ to $\operatorname{Var}_N(S_1)$, and using the second-order sign data derivative ($n=2$), we compare $\operatorname{Var}(S_2)$ to $\operatorname{Var}_N(S_2)$, for $N=1000$, $a=1$, and $v=3$. The results are shown in Tables 1 and 2.
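A sketch of such a test, reusing the two helper functions from the earlier sketches, is given below; the particular step sizes are illustrative assumptions, since the paper's selected values appear in Tables 1 and 2.

```python
# Compare the closed-form variance (3.3)/(3.4) with the estimator Var_N of (4.1)
# for f = sin(v), N = 1000, a = 1, v = 3.  Assumes sign_derivative_samples and
# var_sign_derivative from the earlier sketches are already defined.
import numpy as np
from math import comb

f, v, a, N = np.sin, 3.0, 1.0, 1000

for n in (1, 2):
    for dv in (1.0, 0.5, 0.1):                               # illustrative step sizes
        i = np.arange(n + 1)
        b = (-1.0) ** i * np.array([comb(n, k) for k in i])
        e_sn = b @ f(v + (n - i) * dv) / (a * dv ** n)       # exact E(S_n) from (2.3)
        samples = sign_derivative_samples(f, v, n, dv, a, N)
        var_N = np.mean((samples - e_sn) ** 2)               # estimator (4.1)
        true_var = var_sign_derivative(f, v, n, dv, a)       # closed form (3.2)
        print(f"n={n}  dv={dv}:  Var={true_var:.4f}  Var_N={var_N:.4f}")
```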

We illustrate the change in variance in Figure 1, which shows three curves, each obtained from $N=1000$ iterations. The first curve, in blue, shows the sign data recovery of the function $f = \sin(v)$, or $E(S_0)$, for $a=1$ and $\Delta v = 0.5$. The second curve, in green, shows the sign data recovery $E(S_1)$, which approximates $f'$, for $a=1$ and $\Delta v = 0.5$. The third curve, in red, shows the sign data recovery $E(S_2)$, which approximates $f''$, for $a=1$ and $\Delta v = 0.5$.
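A plotting sketch along these lines is shown below; it is our reconstruction of the figure's content, reusing sign_derivative_samples from the earlier sketch, and the grid of $v$ values is an assumption.

```python
# Pointwise sign data recovery of f = sin(v) and of its first and second
# derivatives, each estimated from N = 1000 noise realizations (a = 1, dv = 0.5).
import numpy as np
import matplotlib.pyplot as plt

f, a, dv, N = np.sin, 1.0, 0.5, 1000
v_grid = np.linspace(0.0, 2 * np.pi, 200)      # assumed plotting range
colors = {0: "blue", 1: "green", 2: "red"}

for n in (0, 1, 2):
    # E(S_n) estimated pointwise by averaging the N sign data samples
    recovered = np.array([sign_derivative_samples(f, vi, n, dv, a, N).mean()
                          for vi in v_grid])
    plt.plot(v_grid, recovered, color=colors[n], label=f"E(S_{n})")
plt.xlabel("v")
plt.legend()
plt.show()
```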

5. Conclusions

Recovery of the signal from the signs of signal plus noise incurs a variance that remains bounded and depends only on the signal-to-noise ratio $f(v)/a$, while recovery of discrete derivatives from the signs of signal plus noise (i.e., sign data) incurs a variance that grows without bound as the step size becomes infinitesimal and the order increases.

The practical problem is that sign data may be used in the seismic industry in processes that differentiate the data. In such cases, if the step size or the order of the finite difference is not constrained, the process will incur a large variance and its convergence will be degraded. While methods for smoothing noisy data for finite difference calculations are known, sign data recovery requires the noise to be present. In this paper, we have characterized the problem by explicitly evaluating the variance of discrete sign data derivatives.

Appendix

Clarification of $E(S_2^2)$

𝐸𝑆22ξ€Έ=1Δ𝑣4𝑏0𝑓sgn0+𝑋0ξ€Έ+𝑏1𝑓sgn1+𝑋1ξ€Έ+𝑏2𝑓sgn2+𝑋2ξ€Έξ€Έ2=1Δ𝑣4𝑏20sgn2𝑓0+𝑋0ξ€Έ+2𝑏0𝑓sgn0+𝑋0𝑏1𝑓sgn1+𝑋1ξ€Έ+𝑏21sgn2𝑓1+𝑋1ξ€Έ+2𝑏2𝑓sgn2+𝑋2𝑏0𝑓sgn0+𝑋0ξ€Έ+2𝑏2𝑓sgn2+𝑋2𝑏1𝑓sgn1+𝑋1ξ€Έ+𝑏22sgn2𝑓2+𝑋2.ξ€Έξ€Έ(A.1) This simply reduces to𝐸𝑆22ξ€Έ=1Δ𝑣4𝑏20+2𝑏0𝑏1𝐸𝑓sgn0+𝑋0𝑓sgn1+𝑋1ξ€Έξ€Έ+𝑏21+2𝑏2𝑏0𝐸𝑓sgn2+𝑋2𝑓sgn0+𝑋0ξ€Έξ€Έ+2𝑏2𝑏1𝐸𝑓sgn2+𝑋2𝑓sgn1+𝑋1ξ€Έξ€Έ+𝑏22ξ€Έ.(A.2) In order to compute (A.2), we must compute an integral of the form𝐸𝑓sgn𝑖𝑓+𝑋sgnπ‘˜=+π‘‹ξ€Έξ€Έβˆžβˆ’βˆžξ€·π‘“sgn𝑖+π‘₯𝑖𝑓sgnπ‘˜+π‘₯π‘˜ξ€ΈπœŒξ€·π‘₯π‘–ξ€ΈπœŒξ€·π‘₯π‘˜ξ€Έπ‘‘π‘₯𝑖𝑑π‘₯π‘˜.(A.3) The probability densities are both uniform:πœŒξ€·π‘₯𝑖π‘₯=πœŒπ‘˜ξ€Έ=ξƒ―12π‘Ž,βˆ’π‘Žβ‰€π‘₯β‰€π‘Ž,0,else(A.4) and using the results of (1.6),𝐸𝑓sgn𝑖𝑓+𝑋sgnπ‘˜=𝑓+π‘‹ξ€Έξ€Έπ‘–π‘“π‘˜π‘Ž2.(A.5) Consequently, (A.2) reduces to𝐸𝑆22ξ€Έ=1Δ𝑣4𝑏20+𝑏21+𝑏22+2𝑏0𝑏1𝑓0𝑓1π‘Ž2ξ‚Ά+2𝑏0𝑏2𝑓0𝑓2π‘Ž2ξ‚Ά+2𝑏1𝑏2𝑓1𝑓2π‘Ž2ξ‚Άξ‚Ά.(A.6)

Acknowledgment

Thanks are due to Gwendolyn Houston for advice and proofreading.