ISRN Communications and Networking
Volume 2011 (2011), Article ID 546205, 6 pages
http://dx.doi.org/10.5402/2011/546205
Research Article

Rapidly-Converging Series Representations of a Mutual-Information Integral

Don Torrieri1 and Matthew Valenti2

1Computational and Information Sciences Directorate, U.S. Army Research Laboratory, Adelphi, MD 20783-1197, USA
2Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA

Received 23 November 2010; Accepted 13 December 2010

Academic Editors: C. Carbonelli and K. Teh

Copyright © 2011 Don Torrieri and Matthew Valenti. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

This paper evaluates a mutual-information integral that appears in EXIT-chart analysis and the capacity equation for the binary-input AWGN channel. Rapidly-converging series representations are derived and shown to be computationally efficient. Over a wide range of channel signal-to-noise ratios, the series are more accurate than the Gauss-Hermite quadrature and comparable to Monte Carlo integration with a very large number of trials.

1. Introduction

The mutual information between a binary-valued random variable and a consistent, conditionally Gaussian random variable with variance σ² is
$$J(\sigma)=1-\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(y-\sigma^{2}/2)^{2}}{2\sigma^{2}}\right]\log_{2}\left(1+e^{-y}\right)dy.\tag{1}$$
This function and its inverse play a central role in EXIT-chart analysis, which may be used to design and predict the performance of turbo codes [1], low-density parity-check codes [2], and bit-interleaved coded modulation with iterative decoding (BICM-ID) [3]. A change of variable yields
$$J(\sigma)=1-\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{y^{2}}{2}\right)\log_{2}\left(1+e^{-\sigma y-\sigma^{2}/2}\right)dy,\tag{2}$$
which has the same form as the equation for the capacity of a binary-input AWGN channel [4]. Despite its origins in the early years of information theory, the integral in (2) has not been expressed in closed form or represented as a rapidly-converging infinite series. Consequently, (2) has been evaluated by numerical integration, Monte Carlo simulation, or approximation [2].
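
As an illustration of the brute-force route just described, the following Python sketch evaluates (2) by the trapezoidal rule. The function name `j_numeric` and the integration limits are our own choices, not part of the paper.

```python
import math

def j_numeric(sigma, lo=-12.0, hi=12.0, n=20000):
    """Brute-force evaluation of J(sigma) in (2) by the trapezoidal rule."""
    if sigma == 0.0:
        return 0.0  # zero mutual information at sigma = 0
    h = (hi - lo) / n
    acc = 0.0
    for i in range(n + 1):
        y = lo + i * h
        g = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)  # Gaussian weight
        f = g * math.log2(1.0 + math.exp(-sigma * y - 0.5 * sigma ** 2))
        acc += f if 0 < i < n else 0.5 * f  # trapezoid endpoint halving
    return 1.0 - h * acc
```

As expected of a mutual information, the result increases monotonically from 0 toward 1 as σ grows.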

As will be shown subsequently, a rapidly-converging series representation for (1) and (2) is
$$J(\sigma)=1-\frac{1}{\ln 2}\left[\frac{\sigma\exp(-\sigma^{2}/8)}{\sqrt{2\pi}}+\left(1-\frac{\sigma^{2}}{2}\right)Q\left(\frac{\sigma}{2}\right)-\sum_{m=1}^{\infty}\frac{(-1)^{m}}{m(m+1)}\exp\left(\frac{\sigma^{2}}{2}\left(m+m^{2}\right)\right)Q\left(m\sigma+\frac{\sigma}{2}\right)\right],\tag{3}$$
where
$$Q(x)=\frac{1}{2}\operatorname{erfc}\left(\frac{x}{\sqrt{2}}\right)=\int_{x}^{\infty}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{y^{2}}{2}\right)dy.\tag{4}$$

Numerical evaluation of (3) requires truncation to a finite number of terms. Since an alternating series satisfying the Leibniz convergence criterion [5] appears in (3), simple upper and lower bounds are easily obtained by series truncation, and the maximum error is easily computed from the first omitted term. Only six terms in the summation over 𝑚 in (3) are needed to compute 𝐽(𝜎) with an error less than 0.01.

Define J_M(σ) to be (3) with the upper limit on the summation over m replaced with M. Let E(M,σ) be the magnitude of the error when the series is truncated after M terms, that is, E(M,σ) = |J(σ) − J_M(σ)|. Since Q(x) ≤ exp(−x²/2)/2 for x ≥ 0, the error magnitude E(M,σ) satisfies
$$E(M,\sigma)\leq\frac{\exp(-\sigma^{2}/8)}{2\ln 2\,(M+1)(M+2)},\tag{5}$$
which indicates the rapid convergence of (3).
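
A sketch of the truncated series J_M(σ) and the bound (5) follows, assuming the form of (3) given above; the function names and the log-domain guard (which prevents overflow of the exponential factor before it is cancelled by the tiny Q value) are our additions.

```python
import math

def qfunc(x):
    """Gaussian tail probability Q(x) = (1/2) erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def log_qfunc(x):
    """log Q(x); switches to the asymptotic tail to avoid erfc underflow."""
    if x < 35.0:
        return math.log(qfunc(x))
    # Q(x) ~ exp(-x^2/2) / (x * sqrt(2*pi)) for large x
    return -0.5 * x * x - math.log(x * math.sqrt(2.0 * math.pi))

def j_large_sigma(sigma, M=10):
    """Truncated large-sigma series J_M(sigma) from (3)."""
    if sigma == 0.0:
        return 0.0  # limiting value, handled separately
    bracket = sigma * math.exp(-sigma ** 2 / 8.0) / math.sqrt(2.0 * math.pi)
    bracket += (1.0 - 0.5 * sigma ** 2) * qfunc(0.5 * sigma)
    for m in range(1, M + 1):
        # form each term in the log domain: exp((sigma^2/2)(m+m^2)) alone
        # can overflow even though the whole term is bounded
        log_mag = (0.5 * sigma ** 2 * (m + m * m)
                   + log_qfunc(m * sigma + 0.5 * sigma)
                   - math.log(m * (m + 1)))
        bracket -= (-1.0) ** m * math.exp(log_mag)
    return 1.0 - bracket / math.log(2.0)

def error_bound(M, sigma):
    """Bound (5) on |J - J_M| from the first omitted alternating term."""
    return math.exp(-sigma ** 2 / 8.0) / (2.0 * math.log(2.0) * (M + 1) * (M + 2))
```

Deepening the truncation changes the result by less than the bound on the shallower truncation, as the Leibniz criterion guarantees.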

Because of the dependence on σ in the bound given by (5), the number of terms required to achieve a very small error may become large for low σ. This motivates the use of an alternative series expression for (1) and (2) that is suitable for small σ. A rapidly-converging series representation for 0 < σ ≤ 0.25 is
$$J(\sigma)=\frac{1}{\ln 2}\left[-\frac{\sigma\exp(-b^{2}/2)}{\sqrt{2\pi}}+\left(\frac{\sigma^{2}}{2}+\ln 2\right)Q(b)+\sum_{m=1}^{\infty}\frac{(-1)^{m}}{m}\exp\left(\frac{\sigma^{2}}{2}\left(m+m^{2}\right)\right)Q(m\sigma+b)+\sum_{n=1}^{\infty}\frac{1}{n2^{n}}\sum_{k=0}^{n}(-1)^{k}\binom{n}{k}\exp\left(\frac{\sigma^{2}}{2}\left(k^{2}-k\right)\right)Q(k\sigma-b)\right],\tag{6}$$
where
$$b=\frac{\sigma}{2}+\frac{\ln 3}{\sigma}.\tag{7}$$
Note that (6) is valid for σ = 0 if we use Q(−∞) = 1 and Q(∞) = 0.
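
A sketch of the truncated small-σ series follows, assuming the form of (6)-(7) given above; the name `j_small_sigma` and the default truncation depths M = 5, N = 15 are our own choices for illustration.

```python
import math

def qfunc(x):
    """Gaussian tail probability Q(x) = (1/2) erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def j_small_sigma(sigma, M=5, N=15):
    """Truncated small-sigma series (6), intended for 0 < sigma <= 0.25."""
    if sigma == 0.0:
        return 0.0  # limiting value with Q(-inf) = 1, Q(inf) = 0
    b = 0.5 * sigma + math.log(3.0) / sigma  # definition (7)
    total = -sigma * math.exp(-0.5 * b * b) / math.sqrt(2.0 * math.pi)
    total += (0.5 * sigma ** 2 + math.log(2.0)) * qfunc(b)
    for m in range(1, M + 1):  # single summation over m
        total += ((-1.0) ** m / m
                  * math.exp(0.5 * sigma ** 2 * (m + m * m))
                  * qfunc(m * sigma + b))
    for n in range(1, N + 1):  # double summation over n and k
        inner = sum((-1.0) ** k * math.comb(n, k)
                    * math.exp(0.5 * sigma ** 2 * (k * k - k))
                    * qfunc(k * sigma - b)
                    for k in range(n + 1))
        total += inner / (n * 2.0 ** n)
    return total / math.log(2.0)
```

At σ = 0.25 the result is on the order of 10⁻², consistent with the small mutual information expected at low σ.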

Both series entail the computation of the Q function, which is itself an integral. However, the central importance of the Q function has led to the development of very efficient methods for computing it, for instance, the erfc function found in MATLAB or the approximation proposed in [6].
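
The erfc route mentioned above is available in the Python standard library as well; a minimal sketch (the name `qfunc` is ours):

```python
import math

def qfunc(x):
    """Q(x) = (1/2) erfc(x / sqrt(2)); math.erfc mirrors MATLAB's erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))
```

For example, qfunc(0.0) is 0.5 and qfunc(1.0) is approximately 0.1587, the standard normal tail beyond one standard deviation.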

In EXIT-chart analysis [1], the inverse of 𝐽(𝜎) is required. The truncated series can be easily inverted by using numerical algorithms such as the MATLAB function fzero, which uses a combination of bisection and inverse quadratic interpolation.
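
Since J(σ) is strictly increasing in σ, the inversion can also be done by plain bisection. The sketch below is ours (fzero itself is MATLAB); to stay self-contained it inverts a brute-force trapezoidal evaluation of (2) rather than the truncated series, but the same bracketing applies to either.

```python
import math

def j_numeric(sigma, lo=-12.0, hi=12.0, n=4000):
    """J(sigma) from (2) by the trapezoidal rule (stand-in for the series)."""
    if sigma == 0.0:
        return 0.0
    h = (hi - lo) / n
    acc = 0.0
    for i in range(n + 1):
        y = lo + i * h
        g = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
        f = g * math.log2(1.0 + math.exp(-sigma * y - 0.5 * sigma ** 2))
        acc += f if 0 < i < n else 0.5 * f
    return 1.0 - h * acc

def j_inverse(target, sigma_hi=20.0, tol=1e-6):
    """Solve J(sigma) = target by bisection; J is monotone in sigma."""
    lo_s, hi_s = 0.0, sigma_hi
    while hi_s - lo_s > tol:
        mid = 0.5 * (lo_s + hi_s)
        if j_numeric(mid) < target:
            lo_s = mid
        else:
            hi_s = mid
    return 0.5 * (lo_s + hi_s)
```

Bisection trades speed for robustness; fzero's inverse quadratic interpolation converges faster once the root is bracketed.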

The remainder of this paper is organized as follows. The series are derived in Section 2. Computational aspects of (3) and (6) are discussed in Section 3. The series are compared against Gauss-Hermite quadrature in Section 4, and the paper concludes in Section 5.

2. Derivation of Series

To derive the desired representations, we first derive a family of series representations of a more general integral. Consider the following integral:
$$I(a,\sigma)=\int_{-\infty}^{\infty}G(x)\ln\left(1+ae^{-\sigma x}\right)dx,\quad a\geq 0,\ \sigma\geq 0,\tag{8}$$
where
$$G(x)=\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^{2}}{2}\right).\tag{9}$$
Since I(0,σ) = 0 and I(a,0) = ln(1+a), a > 0 and σ > 0 are assumed henceforth. For any β ≥ 0, define
$$b=\frac{1}{\sigma}\ln\left(\frac{1+2\beta}{a}\right).\tag{10}$$
Dividing the integral in (8) into two integrals at x = −b, changing variables, and using the fact that G(x) is an even function, we obtain I(a,σ) = I₁(a,σ) + I₂(a,σ), where
$$I_{1}(a,\sigma)=\int_{-\infty}^{b}G(x)\ln\left(1+ae^{\sigma x}\right)dx,\tag{11}$$
$$I_{2}(a,\sigma)=\int_{b}^{\infty}G(x)\ln\left(1+ae^{\sigma x}\right)dx.\tag{12}$$
A uniformly convergent Taylor-series expansion of the logarithm over the interval [0, 1+2β] is
$$\ln(1+y)=\ln(1+\beta)+\sum_{m=1}^{\infty}\frac{(-1)^{m+1}}{m(1+\beta)^{m}}(y-\beta)^{m},\tag{13}$$
where 0 ≤ y ≤ 1+2β. The uniform convergence can be proved by application of the Leibniz criterion for alternating series [5].

Since a exp(σx) ≤ 1+2β for x ≤ b, (13) indicates that ln(1+ae^{σx}) in (11) may be expressed as a uniformly convergent series of continuous functions in the interval of integration. Since ∫_{−∞}^{b} G(x) dx is bounded, the infinite summation and the integration may be interchanged, and
$$I_{1}(a,\sigma)=\ln(1+\beta)Q(-b)+\sum_{m=1}^{\infty}\frac{(-1)^{m+1}}{m(1+\beta)^{m}}\int_{-\infty}^{b}G(x)\left(ae^{\sigma x}-\beta\right)^{m}dx,\tag{14}$$
where
$$Q(x)=\frac{1}{2}\operatorname{erfc}\left(\frac{x}{\sqrt{2}}\right)=\int_{x}^{\infty}G(y)\,dy.\tag{15}$$
Substituting the binomial expansion, interchanging the finite summation and integration, completing the square in the argument of the exponential in the integrand, changing variables, and then evaluating the integral, we obtain
$$I_{1}(a,\sigma)=\ln(1+\beta)Q(-b)-\sum_{m=1}^{\infty}\frac{1}{m(1+\beta)^{m}}\sum_{k=0}^{m}(-1)^{k}\binom{m}{k}\beta^{m-k}a^{k}e^{k^{2}\sigma^{2}/2}Q(k\sigma-b).\tag{16}$$

Since ln(1+ae^{σx}) = ln[ae^{σx}(1+a^{−1}e^{−σx})] = σx + ln a + ln(1+a^{−1}e^{−σx}) for a ≠ 0, (12) may be expressed as
$$I_{2}(a,\sigma)=\int_{b}^{\infty}G(x)(\sigma x+\ln a)\,dx+\int_{b}^{\infty}G(x)\ln\left(1+a^{-1}e^{-\sigma x}\right)dx.\tag{17}$$
Since a^{−1}e^{−σx} ≤ (1+2β)^{−1} ≤ 1 for x ≥ b, the substitution of (13) with β = 0 into (17), calculations similar to the previous ones, and the evaluation of the first integral yield
$$I_{2}(a,\sigma)=\frac{\sigma\exp(-b^{2}/2)}{\sqrt{2\pi}}+(\ln a)Q(b)-\sum_{m=1}^{\infty}\frac{(-1)^{m}}{m\,a^{m}}e^{m^{2}\sigma^{2}/2}Q(m\sigma+b).\tag{18}$$
The addition of (16) and (18) gives the general family of series representations for I(a,σ) when a > 0, σ > 0, b is defined by (10), and β ≥ 0:
$$I(a,\sigma)=\frac{\sigma\exp(-b^{2}/2)}{\sqrt{2\pi}}+(\ln a)Q(b)+\ln(1+\beta)Q(-b)-\sum_{m=1}^{\infty}\frac{1}{m(1+\beta)^{m}}\sum_{k=0}^{m}(-1)^{k}\binom{m}{k}\beta^{m-k}a^{k}e^{k^{2}\sigma^{2}/2}Q(k\sigma-b)-\sum_{m=1}^{\infty}\frac{(-1)^{m}}{m\,a^{m}}e^{m^{2}\sigma^{2}/2}Q(m\sigma+b).\tag{19}$$

If a = exp(−σ²/2) and β = 0 in (19), then, for σ ≠ 0, an algebraic simplification yields
$$I\left(e^{-\sigma^{2}/2},\sigma\right)=\frac{\sigma\exp(-\sigma^{2}/8)}{\sqrt{2\pi}}+\left(1-\frac{\sigma^{2}}{2}\right)Q\left(\frac{\sigma}{2}\right)-\sum_{m=1}^{\infty}\frac{(-1)^{m}}{m(m+1)}\exp\left(\frac{\sigma^{2}}{2}\left(m+m^{2}\right)\right)Q\left(m\sigma+\frac{\sigma}{2}\right),\tag{20}$$
where the validity of this equation when σ = 0 can be separately verified. The substitution of (8), (20), and log₂x = ln x/ln 2 into (2) proves the series representation of (3). Similarly, if a = exp(−σ²/2) and β = 1 in (19), then, for σ ≠ 0, an algebraic simplification yields
$$I\left(e^{-\sigma^{2}/2},\sigma\right)=\frac{\sigma\exp(-b^{2}/2)}{\sqrt{2\pi}}-\frac{\sigma^{2}}{2}Q(b)+(\ln 2)Q(-b)-\sum_{m=1}^{\infty}\frac{(-1)^{m}}{m}\exp\left(\frac{\sigma^{2}}{2}\left(m+m^{2}\right)\right)Q(m\sigma+b)-\sum_{n=1}^{\infty}\frac{1}{n2^{n}}\sum_{k=0}^{n}(-1)^{k}\binom{n}{k}\exp\left(\frac{\sigma^{2}}{2}\left(k^{2}-k\right)\right)Q(k\sigma-b),\tag{21}$$
where
$$b=\frac{\sigma}{2}+\frac{\ln 3}{\sigma},\tag{22}$$
and we use Q(−∞) = 1 and Q(∞) = 0 when σ = 0. The substitution of (8), (21), Q(−b) = 1 − Q(b), and log₂x = ln x/ln 2 into (2) proves the series representation of (6).
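
Since (20) and (21) represent the same integral, truncated versions of the two resulting series should agree near the boundary of their intended ranges. The sketch below checks this numerically at σ = 0.25; function names and truncation depths are our own.

```python
import math

def qfunc(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def j_from_20(sigma, M=20):
    """J(sigma) via the large-sigma form (3)/(20), M terms kept."""
    s = sigma * math.exp(-sigma ** 2 / 8.0) / math.sqrt(2.0 * math.pi)
    s += (1.0 - 0.5 * sigma ** 2) * qfunc(0.5 * sigma)
    for m in range(1, M + 1):
        s -= ((-1.0) ** m / (m * (m + 1))
              * math.exp(0.5 * sigma ** 2 * (m + m * m))
              * qfunc(m * sigma + 0.5 * sigma))
    return 1.0 - s / math.log(2.0)

def j_from_21(sigma, M=5, N=15):
    """J(sigma) via the small-sigma form (6)/(21), M and N terms kept."""
    b = 0.5 * sigma + math.log(3.0) / sigma
    s = -sigma * math.exp(-0.5 * b * b) / math.sqrt(2.0 * math.pi)
    s += (0.5 * sigma ** 2 + math.log(2.0)) * qfunc(b)
    for m in range(1, M + 1):
        s += ((-1.0) ** m / m
              * math.exp(0.5 * sigma ** 2 * (m + m * m))
              * qfunc(m * sigma + b))
    for n in range(1, N + 1):
        inner = sum((-1.0) ** k * math.comb(n, k)
                    * math.exp(0.5 * sigma ** 2 * (k * k - k))
                    * qfunc(k * sigma - b) for k in range(n + 1))
        s += inner / (n * 2.0 ** n)
    return s / math.log(2.0)
```

At σ = 0.25 both truncations land near 0.011, differing only by their respective truncation errors.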

3. Evaluation of Series

To numerically evaluate (3) and (6), the summations must be truncated to a finite number of terms. Let J_M(σ) denote the truncated large-σ series, that is, (3) with the upper limit on the summation over m replaced with M. Similarly, define J_{M,N}(σ) to be the truncated small-σ series given by (6) with the upper limits of the summations over m and n replaced with M and N, respectively.

The rapid convergence of the large-𝜎 series is illustrated in Figure 1, which shows the value of the truncated series 𝐽𝑀(𝜎) as a function of 𝑀 for 𝜎=0.25. The error bounds determined by (5), which are also shown, are observed to be pessimistic.

Figure 1: Truncated large-𝜎 series 𝐽𝑀(𝜎) for 𝜎=0.25 and a bound on the error.

Figure 2 shows the minimum number of terms required for the large-σ series to attain various error magnitudes as a function of σ. The figure shows that (3) is not computed efficiently for small values of σ and small error magnitudes, which motivates the use of (6) for small σ.

Figure 2: Minimum number of terms required to attain various error magnitudes 𝐸(𝑀,𝜎).

Evaluation of (6) is complicated by the presence of two infinite summations. For the range of σ of interest, the summation over n is the dominant of the two infinite summations. This behavior is illustrated in Figure 3, which shows the values of the summations over m and n for σ = 0.25 as a function of the number of terms. When computed to 10 terms, the value of the summation over n is 7.8 × 10^−3, while the value of the summation over m is only 8.5 × 10^−7. For lower values of σ, the magnitude of the summation over m is even smaller and becomes negligible as σ approaches zero. For this reason, we evaluate the summation over m first and select the upper limit M on the summation such that
$$M=\min\left\{m:\ \frac{1}{m}\exp\left(\frac{\sigma^{2}}{2}\left(m+m^{2}\right)\right)Q(m\sigma+b)<\zeta\right\},\tag{23}$$
where ζ is a small threshold value. If the numerical accuracy requirements are modest, the summation over m can be omitted.
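
Rule (23) can be sketched as follows, assuming the term magnitudes of the m-summation in (6); the function name `select_m` is ours.

```python
import math

def qfunc(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def select_m(sigma, zeta, m_max=100):
    """Smallest m whose term magnitude in the m-sum of (6) drops below
    zeta, per rule (23); the m-sum is truncated at that index."""
    b = 0.5 * sigma + math.log(3.0) / sigma
    for m in range(1, m_max + 1):
        mag = (math.exp(0.5 * sigma ** 2 * (m + m * m))
               * qfunc(m * sigma + b) / m)
        if mag < zeta:
            return m
    return m_max  # fallback; not expected for small sigma
```

At σ = 0.25 the very first term is already below ζ = 10⁻⁴, consistent with the remark that the m-summation can often be omitted; a much smaller ζ forces more terms.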

Figure 3: Evaluation of the two summations in (6) for 𝜎=0.25. (a) Value of the summation over 𝑚 as a function of the number of terms 𝑀; (b) Value of the summation over 𝑛 as a function of the number of terms 𝑁.

After computing the summation over m to M terms, (6) is evaluated with the number of terms N in the summation over n chosen to satisfy
$$N=\min\left\{n:\ \left|J_{M,n}(\sigma)-J_{M,n-1}(\sigma)\right|<\epsilon\right\}.\tag{24}$$
After each term is added to the summation over n in (6), the absolute value in (24) is evaluated, and the process halts once the threshold ε is reached.

The numbers of terms M and N needed to achieve convergence criterion (23) with ζ = 10^−4 and (24) with ε = 10^−2 are shown in Figure 4 for σ ≤ 0.5. For higher values of σ, evaluation of (6) becomes unstable because the large upper limit on the summation over n results in large binomial coefficients that cause numerical overflow in typical implementations. Also shown in Figure 4 is the number of terms M required to compute the truncated large-σ series (3) with convergence threshold ε = 10^−2, where M is related to ε by
$$M=\min\left\{m:\ \left|J_{m}(\sigma)-J_{m-1}(\sigma)\right|<\epsilon\right\}.\tag{25}$$
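
The stopping rule (25) for the large-σ series can be sketched as follows; the function returns both the value and the number of terms used. Names and defaults are ours, and the direct term evaluation assumes moderate σ (the rule stops well before the exponential factor can overflow).

```python
import math

def qfunc(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def j_large_adaptive(sigma, eps=1e-2, m_max=200):
    """Large-sigma series (3) truncated by rule (25): stop at the first m
    with |J_m - J_{m-1}| < eps. Returns (value, m)."""
    bracket = sigma * math.exp(-sigma ** 2 / 8.0) / math.sqrt(2.0 * math.pi)
    bracket += (1.0 - 0.5 * sigma ** 2) * qfunc(0.5 * sigma)
    prev = 1.0 - bracket / math.log(2.0)  # J_0(sigma)
    for m in range(1, m_max + 1):
        bracket -= ((-1.0) ** m / (m * (m + 1))
                    * math.exp(0.5 * sigma ** 2 * (m + m * m))
                    * qfunc(m * sigma + 0.5 * sigma))
        cur = 1.0 - bracket / math.log(2.0)
        if abs(cur - prev) < eps:
            return cur, m
        prev = cur
    return prev, m_max
```

Because the term magnitudes shrink like 1/m³ times a bounded factor, only a handful of terms are needed at ε = 10⁻².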

Figure 4: The number of terms 𝑀 and 𝑁 required to evaluate the small-𝜎 series (6) with 𝜁=104 and 𝜖=102. Also shown is the number of terms 𝑀 required to evaluate the large-𝜎 series (3) with 𝜖=102.

From Figure 4, it might appear that the small-σ series (6) is less complex to evaluate than the large-σ series (3) for all σ ≤ 0.5. However, due to the presence of the double summation in (6), this is not necessarily true. To determine a threshold below which the small-σ series is preferable computationally, one should consider the total number of terms containing an exponential and/or Q function. To evaluate the truncated large-σ series (3), a total of M+2 terms involving exp and/or Q must be computed. On the other hand, to evaluate the truncated small-σ series (6), a total of M+2+N(N+1)/2 terms involving exp and/or Q must be computed. Figure 5 compares the total number of such terms that must be computed for each of the two series representations as a function of σ. From this figure, it is seen that for σ ≤ 0.26, fewer terms are required for the small-σ series. For all values of σ, the number of terms is fewer than 28, and for most σ it is significantly smaller.

Figure 5: The total number of terms involving exp and/or 𝑄 in the small-𝜎 series (6) and the large-𝜎 series (3).

Figure 6 compares the series representations against the value of (1) found using Monte Carlo integration with one million trials per value of σ. The small-σ series is used for σ ≤ 0.25, and the large-σ series is used for σ > 0.25. As before, ε = 10^−2 for both series, and ζ = 10^−4 for the small-σ series. There is no discernible difference between the series representations and the Monte Carlo integration, and any small differences can be attributed mainly to the finite number of Monte Carlo trials.

Figure 6: The truncated series representations (6) for σ ≤ 0.25 and (3) for σ > 0.25, together with the integral evaluated using Monte Carlo integration.

Given the rapid rate of convergence of (3), it is interesting to examine the value of the large-σ series when only one term is retained in the summation or the summation is dropped completely. Figure 7 compares the value of the truncated large-σ series for M ∈ {0, 1} and for the M required to satisfy the convergence criterion with ε = 10^−4. Using M = 0 provides an upper bound that is tight only for large values of σ, such as σ > 2, whereas using M = 1 provides a tight lower bound even for relatively small values of σ. Using M = 4 gives two decimal places of accuracy for all σ ≥ 0.1.

Figure 7: The truncated large-σ series J_m(σ) with m = 0, m = 1, and the m = M required to satisfy the convergence criterion with ε = 10^−4.

4. Comparison with Gauss-Hermite Quadrature

The integral given by (1) and (2) may also be evaluated using a form of numerical integration known as Gauss-Hermite quadrature [7]. After a change of variables (z = y/√2), the integral may be written as
$$J(\sigma)=1-\frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}e^{-z^{2}}f(z)\,dz,\tag{26}$$
where
$$f(z)=\log_{2}\left[1+\exp\left(-\frac{\sigma^{2}}{2}-\sqrt{2}\,\sigma z\right)\right].\tag{27}$$

With the Gauss-Hermite quadrature, the integral in (26) is evaluated using
$$\int_{-\infty}^{\infty}e^{-z^{2}}f(z)\,dz\approx\sum_{i=1}^{n}w_{i}f(z_{i}),\tag{28}$$
where f(z) is given by (27), the z_i are the roots of the nth-degree Hermite polynomial
$$H_{n}(z)=(-1)^{n}e^{z^{2}}\frac{d^{n}}{dz^{n}}e^{-z^{2}},\tag{29}$$
and the w_i are the associated weights
$$w_{i}=\frac{2^{n-1}n!\sqrt{\pi}}{n^{2}\left[H_{n-1}(z_{i})\right]^{2}}.\tag{30}$$
The roots {z₁, …, z_n} of H_n(z) may be found using Newton's method [7].
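
A self-contained sketch of the rule (26)-(30) follows. For simplicity we locate the roots by a sign scan plus bisection over [−√(2n+1), √(2n+1)] (a derivative-free stand-in for the Newton iteration mentioned above); all function names are ours.

```python
import math

def hermite(n, z):
    """Physicists' Hermite polynomial H_n(z) via the recurrence
    H_{k+1}(z) = 2z H_k(z) - 2k H_{k-1}(z)."""
    if n == 0:
        return 1.0
    h_prev, h_cur = 1.0, 2.0 * z
    for k in range(1, n):
        h_prev, h_cur = h_cur, 2.0 * z * h_cur - 2.0 * k * h_prev
    return h_cur

def hermite_roots(n, steps=8000):
    """All n real roots of H_n, found by scanning for sign changes and
    bisecting each bracket; all roots lie inside |z| < sqrt(2n+1)."""
    hi = math.sqrt(2.0 * n + 1.0)
    lo = -hi
    roots = []
    prev_z, prev_v = lo, hermite(n, lo)
    for i in range(1, steps + 1):
        z = lo + (hi - lo) * i / steps
        v = hermite(n, z)
        if v == 0.0:                       # grid point hit a root exactly
            roots.append(z)
        elif prev_v * v < 0.0:             # sign change brackets a root
            a, b = prev_z, z
            for _ in range(100):
                mid = 0.5 * (a + b)
                if hermite(n, a) * hermite(n, mid) <= 0.0:
                    b = mid
                else:
                    a = mid
            roots.append(0.5 * (a + b))
        prev_z, prev_v = z, v
    return roots

def j_gauss_hermite(sigma, n=6):
    """J(sigma) by the n-point Gauss-Hermite rule (26)-(30)."""
    coeff = 2.0 ** (n - 1) * math.factorial(n) * math.sqrt(math.pi) / (n * n)
    acc = 0.0
    for z in hermite_roots(n):
        w = coeff / hermite(n - 1, z) ** 2                      # weight (30)
        f = math.log2(1.0 + math.exp(-0.5 * sigma ** 2
                                     - math.sqrt(2.0) * sigma * z))  # (27)
        acc += w * f
    return 1.0 - acc / math.sqrt(math.pi)
```

A quick check: the weights of any Gauss-Hermite rule sum to √π, and at σ = 0 the integrand f(z) is identically 1, so the quadrature returns J(0) = 0.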

We compare five realizations of the J(σ) function:
(1) the "infinite" large-σ series representation, computed with very large M (i.e., M = 750);
(2) the truncated large-σ series representation (3), truncated to M terms;
(3) the truncated small-σ series representation (6), with the first summation truncated to M = 1 terms and the outer summation of the double summation truncated to an upper limit of N;
(4) the Gauss-Hermite quadrature with n = M terms, that is, the same number of terms as the truncated series representation;
(5) Monte Carlo integration with one million trials.

For each realization, we determine the error to be the magnitude of the difference between the calculated value and the value computed by the “infinite” series.

Figure 8 shows the errors of the two truncated series and the Gauss-Hermite quadrature. The Gauss-Hermite quadrature is evaluated with 𝑀=6 terms, and the large-𝜎 series is truncated to 𝑀=6 terms. In order to have a comparable number of terms in the summations in (6), 𝑀=1 and 𝑁=2 are used for the small-𝜎 series. The error of the Monte Carlo approach is also shown (dotted line).

Figure 8: Error (relative to infinite series) of truncated series with 6 terms and Gauss-Hermite quadrature with 6 terms. Monte Carlo with 1 million trials shown for comparison purposes.

When 𝜎>1.6, the truncated large-𝜎 series usually provides smaller error than Gauss-Hermite quadrature (except at 𝜎=2.5 and 𝜎=3.9, where the Gauss-Hermite quadrature briefly has a smaller error). Since the amount of computation required per term of the series is roughly the same, the large-𝜎 series is preferable to the Gauss-Hermite quadrature. For 𝜎<1.6, the error of the Gauss-Hermite quadrature is smaller than that of the large-𝜎 series. In this region, the small-𝜎 series could be used, since it provides a smaller error than the large-𝜎 series below 𝜎=0.38.

5. Conclusions

Series representations have been derived for a mutual-information function that is used in EXIT-chart analysis and the evaluation of the capacity of a binary-input AWGN channel. Truncated versions of the series are computationally competitive with the Gauss-Hermite quadrature and do not require finding roots of Hermite polynomials. The series are useful for computation and to provide simple lower and upper bounds.

References

  1. S. ten Brink, “Convergence behavior of iteratively decoded parallel concatenated codes,” IEEE Transactions on Communications, vol. 49, no. 10, pp. 1727–1737, 2001.
  2. S. ten Brink, G. Kramer, and A. Ashikhmin, “Design of low-density parity-check codes for modulation and detection,” IEEE Transactions on Communications, vol. 52, no. 4, pp. 670–678, 2004.
  3. S. ten Brink, “Convergence of iterative decoding,” Electronics Letters, vol. 35, no. 10, pp. 806–808, 1999.
  4. J. G. Proakis and M. Salehi, Digital Communications, McGraw-Hill, New York, NY, USA, 5th edition, 2008.
  5. E. Kreyszig, Advanced Engineering Mathematics, Wiley, New York, NY, USA, 9th edition, 2006.
  6. G. K. Karagiannidis and A. S. Lioumpas, “An improved approximation for the Gaussian Q-function,” IEEE Communications Letters, vol. 11, no. 8, pp. 644–646, 2007.
  7. P. Moin, Fundamentals of Engineering Numerical Analysis, Cambridge University Press, Cambridge, UK, 2001.