Abstract

This paper evaluates a mutual-information integral that appears in EXIT-chart analysis and the capacity equation for the binary-input AWGN channel. Rapidly-converging series representations are derived and shown to be computationally efficient. Over a wide range of channel signal-to-noise ratios, the series are more accurate than the Gauss-Hermite quadrature and comparable to Monte Carlo integration with a very large number of trials.

1. Introduction

The mutual information between a binary-valued random variable and a consistent, conditionally Gaussian random variable with variance $\sigma^2$ is
$$J(\sigma) = 1 - \int_{-\infty}^{\infty} \frac{\exp\left[-\left(y-\sigma^2/2\right)^2/\left(2\sigma^2\right)\right]}{\sqrt{2\pi}\,\sigma}\,\log_2\left(1+e^{-y}\right)dy. \tag{1}$$
This function and its inverse play a central role in EXIT-chart analysis, which may be used to design and predict the performance of turbo codes [1], low-density parity-check codes [2], and bit-interleaved coded modulation with iterative decoding (BICM-ID) [3]. A change of variable yields
$$J(\sigma) = 1 - \int_{-\infty}^{\infty} \frac{\exp\left(-y^2/2\right)}{\sqrt{2\pi}}\,\log_2\left(1+e^{-\sigma y-\sigma^2/2}\right)dy, \tag{2}$$
which has the same form as the equation for the capacity of a binary-input AWGN channel [4]. Despite its origins in the early years of information theory, the integral in (2) has not been expressed in closed form or represented as a rapidly-converging infinite series. Consequently, (2) has been evaluated by numerical integration, Monte Carlo simulation, or approximation [2].

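As will be shown subsequently, a rapidly-converging series representation for (1) and (2) is
$$J(\sigma) = 1 - \frac{1}{\ln 2}\left\{\frac{\sigma\exp\left(-\sigma^2/8\right)}{\sqrt{2\pi}} - \left(\frac{\sigma^2}{2}-1\right)Q\left(\frac{\sigma}{2}\right) - \sum_{m=1}^{\infty}\frac{(-1)^m}{m(m+1)}\exp\left[\frac{\sigma^2}{2}\left(m+m^2\right)\right] Q\left(m\sigma+\frac{\sigma}{2}\right)\right\}, \tag{3}$$
where
$$Q(x) = \frac{1}{2}\,\mathrm{erfc}\left(\frac{x}{\sqrt{2}}\right) = \int_{x}^{\infty}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{y^2}{2}\right)dy. \tag{4}$$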

Numerical evaluation of (3) requires truncation to a finite number of terms. Since an alternating series satisfying the Leibniz convergence criterion [5] appears in (3), simple upper and lower bounds are easily obtained by series truncation, and the maximum error is easily computed from the first omitted term. Only six terms in the summation over $m$ in (3) are needed to compute $J(\sigma)$ with an error less than 0.01.
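The truncation and bounding described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names `scaled_q`, `j_large`, and `j_bracket` are hypothetical, and the rewriting of each summand as $\exp(-\sigma^2/8)\,e^{x^2/2}Q(x)$ with $x=(m+1/2)\sigma$ is an implementation choice made here to avoid overflow of the exponential. For large $x$ the sketch assumes the standard asymptotic expansion of $Q(x)$.

```python
import math

SQRT2PI = math.sqrt(2.0 * math.pi)

def scaled_q(x):
    """Return exp(x^2/2) * Q(x) for x >= 0 without overflow."""
    if x < 25.0:
        return 0.5 * math.exp(0.5 * x * x) * math.erfc(x / math.sqrt(2.0))
    # Assumption: leading terms of the asymptotic expansion of Q(x) for large x.
    return (1.0 - 1.0 / x**2 + 3.0 / x**4) / (x * SQRT2PI)

def j_large(sigma, M):
    """Large-sigma series (3) with the sum over m truncated after M terms."""
    damp = math.exp(-sigma**2 / 8.0)
    q_half = damp * scaled_q(sigma / 2.0)            # equals Q(sigma/2)
    total = sigma * damp / SQRT2PI - (sigma**2 / 2.0 - 1.0) * q_half
    for m in range(1, M + 1):
        # exp[(sigma^2/2)(m + m^2)] * Q(m*sigma + sigma/2), written stably
        term = damp * scaled_q((m + 0.5) * sigma)
        total -= (-1.0) ** m / (m * (m + 1)) * term
    return 1.0 - total / math.log(2.0)

def j_bracket(sigma, M):
    """Lower and upper bounds on J(sigma) from two consecutive truncations."""
    a, b = j_large(sigma, M), j_large(sigma, M + 1)
    return min(a, b), max(a, b)
```

Because the summands alternate in sign and decrease in magnitude, truncations with odd $M$ fall below the limit and truncations with even $M$ (including $M=0$) fall above it, which is why two consecutive truncations bracket $J(\sigma)$.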

Define $J_M(\sigma)$ to be (3) with the upper limit on the summation over $m$ replaced with $M$. Let $E(M,\sigma)$ be the magnitude of the error when the series is truncated after $M$ terms, i.e., $E(M,\sigma) = |J(\sigma) - J_M(\sigma)|$. Since $Q(x) \le \exp(-x^2/2)/2$ for $x \ge 0$, the error magnitude $E(M,\sigma)$ satisfies
$$E(M,\sigma) \le \frac{\exp\left(-\sigma^2/8\right)}{2(M+1)(M+2)}, \tag{5}$$
which indicates a rapid convergence of (3).
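As a small worked example, the bound (5) can be turned into a rule for choosing $M$. The helper names `error_bound` and `terms_needed` are hypothetical, and the sketch simply evaluates the right-hand side of (5) as stated.

```python
import math

def error_bound(M, sigma):
    """Right-hand side of the truncation bound (5)."""
    return math.exp(-sigma**2 / 8.0) / (2.0 * (M + 1) * (M + 2))

def terms_needed(sigma, target):
    """Smallest M for which the bound (5) does not exceed `target`."""
    M = 0
    while error_bound(M, sigma) > target:
        M += 1
    return M
```

For instance, `terms_needed(0.0, 0.01)` returns 6, since the bound is $1/112 \approx 0.0089$ at $M=6$ but $1/84 \approx 0.0119$ at $M=5$.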

Because of the dependence on $\sigma$ in the bound given by (5), the number of terms required to achieve a very small error may become large for low $\sigma$. This motivates the use of an alternative series expression for (1) and (2) that is suitable for small $\sigma$. A rapidly-converging series representation for $0 < \sigma \le 0.25$ is
$$J(\sigma) = \frac{1}{\ln 2}\left\{-\frac{\sigma\exp\left(-b^2/2\right)}{\sqrt{2\pi}} + \left(\frac{\sigma^2}{2}+\ln 2\right)Q(b) + \sum_{m=1}^{\infty}\frac{(-1)^m}{m}\exp\left[\frac{\sigma^2}{2}\left(m+m^2\right)\right]Q(m\sigma+b) + \sum_{n=1}^{\infty}\frac{1}{n2^n}\sum_{k=0}^{n}(-1)^k\binom{n}{k}\exp\left[\frac{\sigma^2}{2}\left(k^2-k\right)\right]Q(k\sigma-b)\right\}, \tag{6}$$
where
$$b = \frac{\sigma}{2} + \frac{\ln 3}{\sigma}. \tag{7}$$
Note that (6) is valid for $\sigma = 0$ if we use $Q(-\infty) = 1$ and $Q(\infty) = 0$.
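A direct transcription of the small-$\sigma$ series with fixed truncation limits might look like the sketch below. The names `q` and `j_small` are hypothetical, $\sigma > 0$ is assumed so that $b$ in (7) is finite, and the observation that the inner factor depends only on $k$ (so it can be computed once and reused for every $n$) is an implementation choice made here rather than something stated in the text.

```python
import math

def q(x):
    """Gaussian tail probability Q(x) from (4)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def j_small(sigma, M, N):
    """Small-sigma series (6) truncated to M and N terms; requires sigma > 0."""
    h = sigma**2 / 2.0
    b = sigma / 2.0 + math.log(3.0) / sigma            # equation (7)
    total = -sigma * math.exp(-b * b / 2.0) / math.sqrt(2.0 * math.pi)
    total += (h + math.log(2.0)) * q(b)
    # Single summation over m.
    for m in range(1, M + 1):
        total += (-1.0) ** m / m * math.exp(h * (m + m * m)) * q(m * sigma + b)
    # Factors of the double summation depend only on k: compute them once.
    g = [math.exp(h * (k * k - k)) * q(k * sigma - b) for k in range(N + 1)]
    for n in range(1, N + 1):
        inner = sum((-1.0) ** k * math.comb(n, k) * g[k] for k in range(n + 1))
        total += inner / (n * 2.0**n)
    return total / math.log(2.0)
```

The binomial coefficients grow quickly with $n$, so this plain floating-point version is only intended for modest $N$.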

Both series entail the computation of the $Q$ function, which is itself an integral. However, the central importance of the $Q$ function has led to the development of very efficient approximations to compute it, for instance, by using the erfc function found in MATLAB or the approximation proposed in [6].

In EXIT-chart analysis [1], the inverse of $J(\sigma)$ is required. The truncated series can be easily inverted by using numerical algorithms such as the MATLAB function fzero, which uses a combination of bisection and inverse quadratic interpolation.
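The inversion step can be illustrated with plain bisection, used here as a simpler stand-in for fzero. The sketch below is illustrative only: `j_large` and `j_inverse` are hypothetical names, the default $M=20$ is an arbitrary choice, and the routine assumes that the target value lies between the truncated series evaluated at the two ends of the search interval.

```python
import math

def j_large(sigma, M=20):
    """Truncated large-sigma series (3), arranged to avoid overflow."""
    def scaled_q(x):  # exp(x^2/2) * Q(x) for x >= 0
        if x < 25.0:
            return 0.5 * math.exp(0.5 * x * x) * math.erfc(x / math.sqrt(2.0))
        # Assumption: asymptotic expansion of Q(x) for large x.
        return (1.0 - 1.0 / x**2 + 3.0 / x**4) / (x * math.sqrt(2.0 * math.pi))
    damp = math.exp(-sigma**2 / 8.0)
    total = sigma * damp / math.sqrt(2.0 * math.pi)
    total -= (sigma**2 / 2.0 - 1.0) * damp * scaled_q(sigma / 2.0)
    for m in range(1, M + 1):
        total -= (-1.0) ** m / (m * (m + 1)) * damp * scaled_q((m + 0.5) * sigma)
    return 1.0 - total / math.log(2.0)

def j_inverse(target, lo=0.0, hi=20.0, tol=1e-10):
    """Solve j_large(sigma) = target for sigma by bisection on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if j_large(mid) < target:
            lo = mid      # root lies to the right of mid
        else:
            hi = mid      # root lies at or to the left of mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

For targets very close to zero the truncation error of the large-$\sigma$ series dominates, so in that regime the small-$\sigma$ series would be the more natural function to invert.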

The remainder of this paper is organized as follows. The series are derived in Section 2. Computational aspects of (3) and (6) are discussed in Section 3. The series are compared against Gauss-Hermite quadrature in Section 4, and the paper concludes in Section 5.

2. Derivation of Series

To derive the desired representations, we first derive a family of series representations of a more general integral. Consider the following integral:
$$I(a,\sigma) = \int_{-\infty}^{\infty} G(x)\ln\left(1+ae^{-\sigma x}\right)dx, \quad a \ge 0,\ \sigma \ge 0, \tag{8}$$
where
$$G(x) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right). \tag{9}$$
Since $I(0,\sigma) = 0$ and $I(a,0) = \ln(1+a)$, $a > 0$ and $\sigma > 0$ are assumed henceforth. For any $\beta \ge 0$, define
$$b = \frac{1}{\sigma}\ln\left(\frac{1+2\beta}{a}\right). \tag{10}$$
Dividing the integral in (8) into two integrals, changing variables in the second one, and using the fact that $G(x)$ is an even function, we obtain $I(a,\sigma) = I_1(a,\sigma) + I_2(a,\sigma)$, where
$$I_1(a,\sigma) = \int_{-b}^{\infty} G(x)\ln\left(1+ae^{-\sigma x}\right)dx, \tag{11}$$
$$I_2(a,\sigma) = \int_{b}^{\infty} G(x)\ln\left(1+ae^{\sigma x}\right)dx. \tag{12}$$
A uniformly convergent Taylor series expansion of the logarithm over the interval $[0, 1+2\beta]$ is
$$\ln(1+y) = \ln(1+\beta) + \sum_{m=1}^{\infty}\frac{(-1)^{m+1}}{m}(1+\beta)^{-m}(y-\beta)^m, \tag{13}$$
where $0 \le y \le 1+2\beta$. The uniform convergence can be proved by application of the Leibniz criterion for alternating series [5].

Since $a\exp(-\sigma x) \le 1+2\beta$ for $x \ge -b$, (13) indicates that $\ln(1+ae^{-\sigma x})$ in (11) may be expressed as a uniformly convergent series of continuous functions in the interval of integration. Since $\int_{-b}^{\infty}G(x)\,dx$ is bounded, the infinite summation and the integration may be interchanged and
$$I_1(a,\sigma) = \ln(1+\beta)\,Q(-b) + \sum_{m=1}^{\infty}\frac{(-1)^{m+1}}{m}(1+\beta)^{-m}\int_{-b}^{\infty}G(x)\left(ae^{-\sigma x}-\beta\right)^m dx, \tag{14}$$
where
$$Q(x) = \frac{1}{2}\,\mathrm{erfc}\left(\frac{x}{\sqrt{2}}\right) = \int_{x}^{\infty}G(y)\,dy. \tag{15}$$
Substituting the binomial expansion, interchanging the finite summation and integration, completing the square of the argument of the exponential in the integrand, changing variables, and then evaluating the integral, we obtain
$$I_1(a,\sigma) = \ln(1+\beta)\,Q(-b) - \sum_{m=1}^{\infty}\frac{1}{m(1+\beta)^m}\sum_{k=0}^{m}(-1)^k\binom{m}{k}\beta^{m-k}a^{k}\exp\left(\frac{k^2\sigma^2}{2}\right)Q(k\sigma-b). \tag{16}$$

Since $\ln(1+ae^{\sigma x}) = \ln\left[ae^{\sigma x}\left(1+a^{-1}e^{-\sigma x}\right)\right] = \sigma x + \ln a + \ln\left(1+a^{-1}e^{-\sigma x}\right)$ for $a \ne 0$, (12) may be expressed as
$$I_2(a,\sigma) = \int_{b}^{\infty}G(x)(\sigma x+\ln a)\,dx + \int_{b}^{\infty}G(x)\ln\left(1+a^{-1}e^{-\sigma x}\right)dx. \tag{17}$$
Since $a^{-1}e^{-\sigma x} \le (1+2\beta)^{-1} \le 1$ for $x \ge b$, the substitution of (13) with $\beta$ set to zero into (17), calculations similar to previous ones, and the evaluation of the first integral yield
$$I_2(a,\sigma) = \frac{\sigma\exp\left(-b^2/2\right)}{\sqrt{2\pi}} + (\ln a)\,Q(b) - \sum_{m=1}^{\infty}\frac{(-1)^m}{m}a^{-m}\exp\left(\frac{m^2\sigma^2}{2}\right)Q(m\sigma+b). \tag{18}$$
The addition of (16) and (18) gives the general family of series representations for $I(a,\sigma)$ when $a > 0$, $\sigma > 0$, $b$ is defined by (10), and $\beta \ge 0$:
$$I(a,\sigma) = \frac{\sigma\exp\left(-b^2/2\right)}{\sqrt{2\pi}} + (\ln a)\,Q(b) + \ln(1+\beta)\,Q(-b) - \sum_{m=1}^{\infty}\frac{1}{m(1+\beta)^m}\sum_{k=0}^{m}(-1)^k\binom{m}{k}\beta^{m-k}a^{k}\exp\left(\frac{k^2\sigma^2}{2}\right)Q(k\sigma-b) - \sum_{m=1}^{\infty}\frac{(-1)^m}{m}a^{-m}\exp\left(\frac{m^2\sigma^2}{2}\right)Q(m\sigma+b). \tag{19}$$

If $a = \exp(-\sigma^2/2)$ and $\beta = 0$ in (19), then, for $\sigma \ge 0$, an algebraic simplification yields
$$I\left(e^{-\sigma^2/2},\sigma\right) = \frac{\sigma\exp\left(-\sigma^2/8\right)}{\sqrt{2\pi}} - \left(\frac{\sigma^2}{2}-1\right)Q\left(\frac{\sigma}{2}\right) - \sum_{m=1}^{\infty}\frac{(-1)^m}{m(m+1)}\exp\left[\frac{\sigma^2}{2}\left(m+m^2\right)\right]Q\left(m\sigma+\frac{\sigma}{2}\right), \tag{20}$$
where the validity of this equation when $\sigma = 0$ can be separately verified. The substitution of (8), (20), and $\log_2 x = \ln x/\ln 2$ into (2) proves the series representation of (3). Similarly, if $a = \exp(-\sigma^2/2)$ and $\beta = 1$ in (19), then, for $\sigma \ge 0$, an algebraic simplification yields
$$I\left(e^{-\sigma^2/2},\sigma\right) = \frac{\sigma\exp\left(-b^2/2\right)}{\sqrt{2\pi}} - \frac{\sigma^2}{2}Q(b) + \ln(2)\,Q(-b) - \sum_{m=1}^{\infty}\frac{(-1)^m}{m}\exp\left[\frac{\sigma^2}{2}\left(m+m^2\right)\right]Q(m\sigma+b) - \sum_{n=1}^{\infty}\frac{1}{n2^n}\sum_{k=0}^{n}(-1)^k\binom{n}{k}\exp\left[\frac{\sigma^2}{2}\left(k^2-k\right)\right]Q(k\sigma-b), \tag{21}$$
where
$$b = \frac{\sigma}{2} + \frac{\ln 3}{\sigma}, \tag{22}$$
and we use $Q(-\infty) = 1$ and $Q(\infty) = 0$ when $\sigma = 0$. The substitution of (8), (21), $Q(-b) = 1 - Q(b)$, and $\log_2 x = \ln x/\ln 2$ into (2) proves the series representation of (6).
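The algebraic simplification behind (20) is not spelled out; one possible route is sketched here, assuming the convention $0^0 = 1$ in the binomial term. With $\beta = 0$ only the $k = m$ term of the inner sum survives, $b = \sigma/2$, and $\ln a = -\sigma^2/2$. With $a = e^{-\sigma^2/2}$, the factor multiplying $Q(m\sigma - b)$ becomes $\exp[(\sigma^2/2)(m^2-m)]$ and the factor multiplying $Q(m\sigma + b)$ becomes $\exp[(\sigma^2/2)(m^2+m)]$, so that
$$I\left(e^{-\sigma^2/2},\sigma\right) = \frac{\sigma e^{-\sigma^2/8}}{\sqrt{2\pi}} - \frac{\sigma^2}{2}Q\left(\frac{\sigma}{2}\right) - \sum_{m=1}^{\infty}\frac{(-1)^m}{m}e^{(\sigma^2/2)(m^2-m)}Q\left(m\sigma-\frac{\sigma}{2}\right) - \sum_{m=1}^{\infty}\frac{(-1)^m}{m}e^{(\sigma^2/2)(m^2+m)}Q\left(m\sigma+\frac{\sigma}{2}\right).$$
Replacing $m$ by $m+1$ in the first sum gives
$$-\sum_{m=1}^{\infty}\frac{(-1)^m}{m}e^{(\sigma^2/2)(m^2-m)}Q\left(m\sigma-\frac{\sigma}{2}\right) = Q\left(\frac{\sigma}{2}\right) + \sum_{m=1}^{\infty}\frac{(-1)^m}{m+1}e^{(\sigma^2/2)(m^2+m)}Q\left(m\sigma+\frac{\sigma}{2}\right),$$
and the two sums now share the same exponential and $Q$ factors. Combining them with
$$\frac{1}{m+1} - \frac{1}{m} = -\frac{1}{m(m+1)}$$
produces the coefficient $(-1)^m/[m(m+1)]$ and the term $-(\sigma^2/2 - 1)\,Q(\sigma/2)$ that appear in (20). As a check at $\sigma = 0$, the right-hand side reduces to $\frac{1}{2} - \frac{1}{2}\sum_{m\ge 1}(-1)^m/[m(m+1)] = \frac{1}{2} - \frac{1}{2}(1-2\ln 2) = \ln 2$, which agrees with $I(1,0) = \ln 2$.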

3. Evaluation of Series

To numerically evaluate (3) and (6), the summations must be evaluated with a finite number of terms. Let the large-$\sigma$ series be $J_M(\sigma)$, that is, (3) with the upper limit on the summation over $m$ replaced with $M$. Similarly, define $J_{M,N}(\sigma)$ to be the small-$\sigma$ series given by (6) with the upper limits of the summations over $m$ and $n$ replaced with $M$ and $N$, respectively.

The rapid convergence of the large-$\sigma$ series is illustrated in Figure 1, which shows the value of the truncated series $J_M(\sigma)$ as a function of $M$ for $\sigma = 0.25$. The error bounds determined by (5), which are also shown, are observed to be pessimistic.

Figure 2 shows the number of terms required for the large-$\sigma$ series to attain various error magnitudes as a function of $\sigma$. Figure 2 shows that (3) is not efficiently computed when both $\sigma$ and the target error magnitude are small, which motivates the use of (6) for small $\sigma$.

Evaluation of (6) is complicated by the presence of two infinite summations. For the range of $\sigma$ of interest, the summation over $n$ is the dominant of the two infinite summations. This behavior is illustrated in Figure 3, which shows the values of the summations over $m$ and $n$ for $\sigma = 0.25$ as a function of the number of terms. When computed to 10 terms, the value of the summation over $n$ is $7.8\times10^{-3}$, while the value of the summation over $m$ is only $-8.5\times10^{-7}$. For lower values of $\sigma$, the magnitude of the summation over $m$ is even smaller and becomes negligible as $\sigma$ approaches zero. For this reason, we evaluate the summation over $m$ first and select the upper limit on the summation $M$ such that
$$M = \min\left\{m : \frac{1}{m}\exp\left[\frac{\sigma^2}{2}\left(m+m^2\right)\right]Q(m\sigma+b) < \zeta\right\}, \tag{23}$$
where $\zeta$ is a small threshold value. If the numerical accuracy requirements are modest, the summation over $m$ can be omitted.

After computing the summation over $m$ to $M$ terms, (6) is evaluated with the number of terms $N$ in the summation over $n$ chosen to satisfy
$$N = \min\left\{n : \left|1 - \frac{J_{M,n}(\sigma)}{J_{M,n-1}(\sigma)}\right| < \epsilon\right\}. \tag{24}$$
After each term is added to the summation over $n$ in (6), the absolute value in (24) is evaluated, and the process halts once the threshold $\epsilon$ is reached.
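The two stopping rules can be sketched as follows. The function name `j_small_adaptive`, the cap `n_max`, and the guard against a zero previous value are assumptions added for this illustration; $\sigma > 0$ is assumed.

```python
import math

def q(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def j_small_adaptive(sigma, zeta=1e-4, eps=1e-2, n_max=40):
    """Evaluate (6) with M chosen by rule (23) and N by rule (24)."""
    h = sigma**2 / 2.0
    b = sigma / 2.0 + math.log(3.0) / sigma
    ln2 = math.log(2.0)

    def m_term(m):  # magnitude of the m-th term of the summation over m
        return math.exp(h * (m + m * m)) * q(m * sigma + b) / m

    # Rule (23): smallest m whose term magnitude drops below zeta.
    M = 1
    while m_term(M) >= zeta:
        M += 1
    total = -sigma * math.exp(-b * b / 2.0) / math.sqrt(2.0 * math.pi)
    total += (h + ln2) * q(b)
    total += sum((-1.0) ** m * m_term(m) for m in range(1, M + 1))

    # Rule (24): add terms of the summation over n until the relative change is small.
    g = [q(-b)]                 # k = 0 factor; later factors are appended as needed
    prev = total / ln2          # J_{M,0}
    cur = prev
    for n in range(1, n_max + 1):
        g.append(math.exp(h * (n * n - n)) * q(n * sigma - b))
        inner = sum((-1.0) ** k * math.comb(n, k) * g[k] for k in range(n + 1))
        total += inner / (n * 2.0**n)
        cur = total / ln2
        if prev != 0.0 and abs(1.0 - cur / prev) < eps:
            return cur, M, n
        prev = cur
    return cur, M, n_max
```

The function returns the series value together with the selected $M$ and $N$, so the term counts can be inspected alongside the result.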

The number of terms $M$ and $N$ to achieve convergence criterion (23) with $\zeta = 10^{-4}$ and (24) with $\epsilon = 10^{-2}$ is shown in Figure 4 for $\sigma \le 0.5$. For higher values of $\sigma$, evaluation of (6) becomes unstable because the large upper limit on the summation over $n$ results in large binomial coefficients that cause numerical overflow in typical implementations. Also shown in Figure 4 is the number of terms $M$ required to compute the truncated large-$\sigma$ series (3) with convergence threshold $\epsilon = 10^{-2}$, where $M$ is related to $\epsilon$ by
$$M = \min\left\{m : \left|1 - \frac{J_{m}(\sigma)}{J_{m-1}(\sigma)}\right| < \epsilon\right\}. \tag{25}$$

From Figure 4, it might appear that the small-$\sigma$ series (6) is less complex to evaluate than the large-$\sigma$ series (3) for all $\sigma \le 0.5$. However, due to the presence of the double summation in (6), this is not necessarily true. To determine a threshold below which the small-$\sigma$ series is preferable computationally, one should consider the total number of terms containing an exponential and/or $Q$-function. To evaluate the truncated large-$\sigma$ series (3), a total of $M+2$ terms involving exp and/or $Q$ must be computed. On the other hand, to evaluate the truncated small-$\sigma$ series (6), a total of $M+2+N(N+1)/2$ terms involving exp and/or $Q$ must be computed. Figure 5 compares the total number of terms involving exp and/or $Q$ that must be computed for each of the two series representations as a function of $\sigma$. From this figure, it is seen that for $\sigma \le 0.26$, fewer terms are required for the small-$\sigma$ series. For all values of $\sigma$, the number of terms is fewer than 28, and for most $\sigma$ it is significantly smaller.

Figure 6 compares the series representations against the value of (1) found using Monte Carlo integration with one million trials per value of $\sigma$. The small-$\sigma$ series is used for $\sigma \le 0.25$, and the large-$\sigma$ series is used for $\sigma > 0.25$. As before, $\epsilon = 10^{-2}$ for both series, and $\zeta = 10^{-4}$ for the small-$\sigma$ series. There is no discernible difference between the series representations and the Monte Carlo integration, and any small differences can be attributed mainly to the finite number of Monte Carlo trials.
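A Monte Carlo estimate of (1) of the kind used for this comparison can be sketched as below. The sampling scheme (drawing $y$ from a Gaussian with mean $\sigma^2/2$ and standard deviation $\sigma$, then averaging $\log_2(1+e^{-y})$) follows directly from the form of (1); the function name, the fixed seed, and the use of Python's standard generator are assumptions of this sketch.

```python
import math
import random

def j_monte_carlo(sigma, trials=1_000_000, seed=0):
    """Monte Carlo estimate of J(sigma) from (1)."""
    rng = random.Random(seed)
    mean = sigma**2 / 2.0
    ln2 = math.log(2.0)
    acc = 0.0
    for _ in range(trials):
        y = rng.gauss(mean, sigma)
        # ln(1 + e^{-y}), arranged so that exp never overflows
        acc += max(-y, 0.0) + math.log1p(math.exp(-abs(y)))
    return 1.0 - acc / (trials * ln2)
```

The statistical error of such an estimate shrinks only as the inverse square root of the number of trials, which is why a large trial count is needed to match the series to several decimal places.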

Given the rapid rate of convergence of (3), it is interesting to see the value of the large-$\sigma$ series when only one term is maintained in the summation or if the summation is dropped completely. Figure 7 compares the value of the truncated large-$\sigma$ series for $M \in \{0,1\}$ and the $M$ required to satisfy the convergence criterion with $\epsilon = 10^{-4}$. Using $M = 0$ provides an upper bound that is tight only for large values of $\sigma$, such as $\sigma > 2$, whereas using $M = 1$ provides a tight lower bound even for relatively small values of $\sigma$. Using $M = 4$ gives two decimal places of accuracy for all $\sigma \ge 0.1$.

4. Comparison with Gauss-Hermite Quadrature

The integral given by (1) and (2) may also be evaluated using a form of numerical integration known as Gauss-Hermite quadrature [7]. After a change of variables ($z = y/\sqrt{2}$), the integral may be written as
$$J(\sigma) = 1 - \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}e^{-z^2}f(z)\,dz, \tag{26}$$
where
$$f(z) = \log_2\left(1+\exp\left\{-\sqrt{2}\,\sigma z - \frac{\sigma^2}{2}\right\}\right). \tag{27}$$

With the Gauss-Hermite quadrature, the integral in (26) is evaluated using
$$\int_{-\infty}^{\infty}e^{-z^2}f(z)\,dz \approx \sum_{i=1}^{n}w_i f(z_i), \tag{28}$$
where $f(z)$ is given by (27), $z_i$ are the roots of the $n$th-degree Hermite polynomial
$$H_n(z) = (-1)^n e^{z^2}\frac{d^n}{dz^n}e^{-z^2}, \tag{29}$$
and $w_i$ are the associated weights
$$w_i = \frac{2^{n-1}\,n!\,\sqrt{\pi}}{n^2\left[H_{n-1}(z_i)\right]^2}. \tag{30}$$
The roots $\{z_1,\ldots,z_n\}$ of $H_n(z)$ may be found using Newton's method [7].
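For comparison purposes, (26) through (28) can be sketched with NumPy, whose `numpy.polynomial.hermite.hermgauss` routine returns nodes and weights for the weight function $e^{-z^2}$. This replaces the Newton iteration mentioned above with a library call, and the function name `j_gauss_hermite` is hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def j_gauss_hermite(sigma, n):
    """n-point Gauss-Hermite approximation of J(sigma) via (26)-(28)."""
    z, w = hermgauss(n)                       # nodes z_i and weights w_i
    t = -np.sqrt(2.0) * sigma * z - sigma**2 / 2.0
    f = np.logaddexp(0.0, t) / np.log(2.0)    # log2(1 + exp(t)), overflow-safe
    return 1.0 - float(np.dot(w, f)) / np.sqrt(np.pi)
```

At $\sigma = 0$ the integrand reduces to $f(z) = 1$ and the weights sum to $\sqrt{\pi}$, so the rule returns zero up to rounding for any $n$.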

We compare five realizations of the $J(\sigma)$ function as follows:
(1) the "infinite" large-$\sigma$ series representation, computed with very large $M$ (i.e., $M = 750$);
(2) the truncated large-$\sigma$ series representation (3), truncated to $M$ terms;
(3) the truncated small-$\sigma$ series representation (6), with the first summation truncated to $M = 1$ term and the second (double) summation truncated to an upper limit on the outer summation of $N$;
(4) the Gauss-Hermite quadrature with $n = M$ terms, that is, the same number of terms as the truncated series representation;
(5) Monte Carlo integration with 1 million trials.

For each realization, we determine the error to be the magnitude of the difference between the calculated value and the value computed by the "infinite" series.

Figure 8 shows the errors of the two truncated series and the Gauss-Hermite quadrature. The Gauss-Hermite quadrature is evaluated with $M = 6$ terms, and the large-$\sigma$ series is truncated to $M = 6$ terms. In order to have a comparable number of terms in the summations in (6), $M = 1$ and $N = 2$ are used for the small-$\sigma$ series. The error of the Monte Carlo approach is also shown (dotted line).

When $\sigma > 1.6$, the truncated large-$\sigma$ series usually provides smaller error than Gauss-Hermite quadrature (except at $\sigma = 2.5$ and $\sigma = 3.9$, where the Gauss-Hermite quadrature briefly has a smaller error). Since the amount of computation required per term is roughly the same for the two methods, the large-$\sigma$ series is preferable to the Gauss-Hermite quadrature in this region. For $\sigma < 1.6$, the error of the Gauss-Hermite quadrature is smaller than that of the large-$\sigma$ series. In this region, the small-$\sigma$ series could be used, since it provides a smaller error than the large-$\sigma$ series below $\sigma = 0.38$.

5. Conclusions

Series representations have been derived for a mutual-information function that is used in EXIT-chart analysis and the evaluation of the capacity of a binary-input AWGN channel. Truncated versions of the series are computationally competitive with the Gauss-Hermite quadrature and do not require finding roots of Hermite polynomials. The series are useful for computation and to provide simple lower and upper bounds.