Rapidly-Converging Series Representations of a Mutual-Information Integral

Torrieri, Don; Valenti, Matthew

doi:https://doi.org/10.5402/2011/546205

International Scholarly Research Notices

Research Article | Open Access

Volume 2011 | Article ID 546205 | https://doi.org/10.5402/2011/546205

Don Torrieri¹ and Matthew Valenti²

Academic Editors: C. Carbonelli and K. Teh

Received 23 Nov 2010

Accepted 13 Dec 2010

Published 27 Dec 2010

Abstract

This paper evaluates a mutual-information integral that appears in EXIT-chart analysis and the capacity equation for the binary-input AWGN channel. Rapidly-converging series representations are derived and shown to be computationally efficient. Over a wide range of channel signal-to-noise ratios, the series are more accurate than the Gauss-Hermite quadrature and comparable to Monte Carlo integration with a very large number of trials.

1. Introduction

The mutual information between a binary-valued random variable and a consistent, conditionally Gaussian random variable with variance $\sigma^2$ is

$$J(\sigma) = 1 - \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{\left(y - \sigma^2/2\right)^2}{2\sigma^2}\right] \log_2\left(1 + e^{-y}\right) dy. \qquad (1)$$

This function and its inverse play a central role in EXIT-chart analysis, which may be used to design and predict the performance of turbo codes [1], low-density parity-check codes [2], and bit-interleaved coded modulation with iterative decoding (BICM-ID) [3]. A change of variable yields an equivalent integral, (2), which has the same form as the equation for the capacity of a binary-input AWGN channel [4]. Despite its origins in the early years of information theory, the integral in (2) has not been expressed in closed form or represented as a rapidly-converging infinite series. Consequently, (2) has been evaluated by numerical integration, Monte Carlo simulation, or approximation [2].

As will be shown subsequently, a rapidly-converging series representation for (1) and (2) is where

Numerical evaluation of (3) requires truncation to a finite number of terms. Since an alternating series satisfying the Leibniz convergence criterion [5] appears in (3), simple upper and lower bounds are easily obtained by series truncation, and the maximum error is easily computed from the first omitted term. Only six terms in the summation over in (3) are needed to compute with an error less than 0.01.
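The bracketing property of a Leibniz-type alternating series can be illustrated with a short sketch. The series used here is the alternating harmonic series for ln 2, chosen only as a generic stand-in; it is not the series in (3), and the function names are hypothetical. The sketch shows that consecutive partial sums bracket the limit and that the truncation error is bounded by the first omitted term.

```python
import math

def alternating_partial_sum(n_terms: int) -> float:
    """Partial sum of sum_{k>=1} (-1)^(k+1) / k, which converges to ln 2."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))

def truncation_bounds(n_terms: int):
    """Return (lower, upper, error_bound) after truncating to n_terms terms.

    For an alternating series whose terms decrease monotonically to zero,
    consecutive partial sums lie on opposite sides of the limit, and the
    truncation error is at most the magnitude of the first omitted term.
    """
    s_n = alternating_partial_sum(n_terms)
    s_next = alternating_partial_sum(n_terms + 1)
    first_omitted = 1.0 / (n_terms + 1)
    return min(s_n, s_next), max(s_n, s_next), first_omitted

if __name__ == "__main__":
    for n in (1, 5, 10, 50):
        lower, upper, bound = truncation_bounds(n)
        print(n, lower, upper, bound, math.log(2.0))
```

The same reasoning applies to any series that meets the Leibniz criterion: truncating after an odd or even number of terms gives one-sided bounds without any extra computation.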

Define to be (3) with the upper limit on the summation over replaced with . Let be the magnitude of the error when the series is truncated after terms, i.e. . Since , , the error magnitude satisfies which indicates a rapid convergence of (3).

Because of the dependence on in the bound given by (5), the number of terms required to achieve a very small error may become large for low . This motivates the use of an alternative series expression for (1) and (2) that is suitable for small . A rapidly-converging series representation for is where Note that (6) is valid for if we use and .

Both series entail the computation of the Q-function, which is itself an integral. However, the central importance of the Q-function has led to the development of very efficient methods for computing it, for instance, the erfc function found in MATLAB or the approximation proposed in [6].
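As a minimal sketch of the erfc route, the Gaussian Q-function can be written in terms of the complementary error function through the standard identity Q(x) = erfc(x/√2)/2. Python's standard library is used here in place of MATLAB, and the function name is an illustrative choice.

```python
import math

def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z.

    Uses the identity Q(x) = 0.5 * erfc(x / sqrt(2)).
    """
    return 0.5 * math.erfc(x / math.sqrt(2.0))

if __name__ == "__main__":
    for x in (0.0, 1.0, 3.0):
        print(x, q_function(x))
```

Working with erfc rather than 1 - erf avoids cancellation for large arguments, where Q(x) is very small.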

In EXIT-chart analysis [1], the inverse of is required. The truncated series can be easily inverted by using numerical algorithms such as the MATLAB function fzero, which uses a combination of bisection and inverse quadratic interpolation.
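The inversion step can be sketched as follows, under several assumptions. The function is taken in the standard consistent-Gaussian form 1 − E[log2(1 + e^{−Y})] with Y ~ N(σ²/2, σ²). A composite Simpson rule stands in for the truncated series, which is not reproduced here. Plain bisection replaces fzero's combination of bisection and inverse quadratic interpolation. The names `j_numeric` and `j_inverse` are hypothetical.

```python
import math

def log1p_exp_neg(y: float) -> float:
    """Numerically stable ln(1 + exp(-y))."""
    if y >= 0.0:
        return math.log1p(math.exp(-y))
    return -y + math.log1p(math.exp(y))

def j_numeric(sigma: float, n_intervals: int = 2000, t_max: float = 10.0) -> float:
    """Mutual information for a consistent Gaussian with standard deviation sigma.

    Integrates over the standardized variable t, where y = sigma^2/2 + sigma*t,
    using the composite Simpson rule on [-t_max, t_max] (n_intervals must be even).
    """
    if sigma <= 0.0:
        return 0.0
    h = 2.0 * t_max / n_intervals
    total = 0.0
    for i in range(n_intervals + 1):
        t = -t_max + i * h
        y = 0.5 * sigma * sigma + sigma * t
        f = math.exp(-0.5 * t * t) * log1p_exp_neg(y)
        weight = 1 if i in (0, n_intervals) else (4 if i % 2 else 2)
        total += weight * f
    expectation = total * h / 3.0 / math.sqrt(2.0 * math.pi)
    return 1.0 - expectation / math.log(2.0)

def j_inverse(target: float, lo: float = 0.0, hi: float = 50.0, tol: float = 1e-9) -> float:
    """Find sigma with j_numeric(sigma) = target by bisection.

    Relies on the function increasing monotonically from 0 toward 1.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if j_numeric(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    sigma = j_inverse(0.5)
    print(sigma, j_numeric(sigma))
```

Bisection converges more slowly than a hybrid method such as fzero, but it needs only monotonicity and a bracketing interval, which makes it a safe baseline.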

The remainder of this paper is organized as follows. The series are derived in Section 2. Computational aspects of (3) and (6) are discussed in Section 3. The series are compared against Gauss-Hermite quadrature in Section 4, and the paper concludes in Section 5.

2. Derivation of Series

To derive the desired representations, we first derive a family of series representations of a more general integral. Consider the following integral: where Since and , and are assumed henceforth. For any , define Dividing the integral in (8) into two integrals, changing variables in the second one, and using the fact that is an even function, we obtain , where A uniformly convergent Taylor series expansion of the logarithm over the interval is where . The uniform convergence can be proved by application of the Leibniz criterion for alternating series [5].

Since for , (13) indicates that in (11) may be expressed as a uniformly convergent series of continuous functions in the interval of integration. Since is bounded, the infinite summation and the integration may be interchanged and where Substituting the binomial expansion, interchanging the finite summation and integration, completing the square of the argument of the exponential in the integrand, changing variables, and then evaluating the integral, we obtain

Since , , (12) may be expressed as Since for , the substitution of (13) into (17), calculations similar to previous ones, and the evaluation of the first integral yield The addition of (16) and (18) gives the general family of series representations for when , , is defined by (10), and :

If and in (19), then, for , an algebraic simplification yields where the validity of this equation when can be separately verified. The substitution of (8), (20), and into (2) proves the series representation of (3). Similarly, if and in (19), then, for , an algebraic simplification yields where and we use and when . The substitution of (8), (21), , and into (2) proves the series representation of (6).

3. Evaluation of Series

To numerically evaluate (3) and (6), the summations must be evaluated with a finite number of terms. Let the large- series be , that is, (3) with the upper limit on the summation over replaced with . Similarly, define to be the small- series given by (6) with the upper limits of the summations over and replaced with and , respectively.

The rapid convergence of the large- series is illustrated in Figure 1, which shows the value of the truncated series as a function of for . The error bounds determined by (5), which are also shown, are observed to be pessimistic.

Figure 2 shows the number of terms required for the large- series to attain various error magnitudes as a function of . Figure 2 shows that (3) is not efficiently computed for small values of and error magnitude, which motivates the use of (6) for small .

Evaluation of (6) is complicated by the presence of two infinite summations. For the range of of interest, the summation over is the dominant of the two infinite summations. This behavior is illustrated in Figure 3, which shows the values of the summations over and for as a function of the number of terms. When computed to 10 terms, the value of the summation over is , while the value of the summation over is only . For lower values of , the magnitude of the summation over is even smaller and becomes negligible as approaches zero. For this reason, we evaluate the summation over first and select the upper limit on the summation such that where is a small threshold value. If the numerical accuracy requirements are modest, the summation over can be omitted.

After computing the summation over to terms, (6) is evaluated with the number of terms in the summation over chosen to satisfy After each term is added to the summation over in (6), the absolute value in (24) is evaluated, and the process halts once the threshold is reached.

The number of terms and to achieve convergence criterion (23) with and (24) with is shown in Figure 4 for . For higher values of , evaluation of (6) becomes unstable because the large upper limit on the summation over results in large binomial coefficients that cause numerical overflow in typical implementations. Also shown in Figure 4 is the number of terms required to compute the truncated large- series (3) with convergence threshold , where is related to by

From Figure 4, it might appear that the small- series (6) is less complex to evaluate than the large- series (3) for all . However, due to the presence of the double summation in (6), this is not necessarily true. To determine a threshold below which the small- series is preferable computationally, one should consider the total number of terms containing an exponential and/or a Q-function. To evaluate the truncated large- series (3), a total of terms involving exp and/or Q must be computed. On the other hand, to evaluate the truncated small- series (6), a total of terms involving exp and/or Q must be computed. Figure 5 compares the total number of terms involving exp and/or Q that must be computed for each of the two series representations as a function of . From this figure, it is seen that for , fewer terms are required for the small- series. For all values of , the number of terms is fewer than 28, and for most it is significantly smaller.

Figure 6 compares the series representations against the value of (1) found using Monte Carlo integration with one million trials per value of . The small- series is used for , and the large- series is used for . As before, for both series, and for the small- series. There is no discernible difference between the series representations and the Monte Carlo integration, and any small differences can be attributed mainly to the finite number of Monte Carlo trials.
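A Monte Carlo baseline of this kind can be sketched as below, assuming the function has the standard consistent-Gaussian form 1 − E[log2(1 + e^{−Y})] with Y ~ N(σ²/2, σ²). The function name, the default trial count, and the fixed seed are illustrative assumptions and do not reproduce the authors' setup.

```python
import math
import random

def j_monte_carlo(sigma: float, n_trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of 1 - E[log2(1 + exp(-Y))], Y ~ N(sigma^2/2, sigma^2)."""
    rng = random.Random(seed)      # seeded generator for reproducible estimates
    mean = 0.5 * sigma * sigma     # consistency condition: mean = variance / 2
    acc = 0.0
    for _ in range(n_trials):
        y = rng.gauss(mean, sigma)
        # Stable evaluation of ln(1 + exp(-y)) for either sign of y.
        if y >= 0.0:
            acc += math.log1p(math.exp(-y))
        else:
            acc += -y + math.log1p(math.exp(y))
    return 1.0 - acc / (n_trials * math.log(2.0))

if __name__ == "__main__":
    for sigma in (0.5, 1.0, 2.0, 4.0):
        print(sigma, j_monte_carlo(sigma))
```

The standard error of such an estimate shrinks only as the inverse square root of the number of trials, which is why a very large number of trials is needed to match a rapidly converging series.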

Given the rapid rate of convergence of (3), it is interesting to see the value of the large- series when only one term is maintained in the summation or if the summation is dropped completely. Figure 7 compares the value of the truncated large- series for and the required to satisfy the convergence criterion with . Using provides an upper bound that is tight only for large values of , such as , whereas using provides a tight lower bound even for relatively small values of . Using gives two decimal places of accuracy for all .

4. Comparison with Gauss-Hermite Quadrature

The integral given by (1) and (2) may also be solved using a form of numerical integration known as Gauss-Hermite quadrature [7]. After a change of variables (), the integral may be written as where

With the Gauss-Hermite quadrature, the integral in (26) is evaluated using where is given by (27), are the roots of the th-degree Hermite polynomial and are the associated weights The roots of may be found using Newton's method [7].
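The quadrature can be sketched in pure Python under the following assumptions. The integrand is taken in the standard consistent-Gaussian form, and the substitution y = σ²/2 + √2 σ t is an assumed change of variables that may differ from the one used in (26) and (27). The roots are located by bisection on brackets formed from the interlacing roots of the next-lower-degree polynomial, a substitute for Newton's method chosen for robustness. Orthonormal Hermite polynomials are used so that the weights reduce to 1/(n h_{n-1}(x_i)²). All function names are hypothetical.

```python
import math

def hermite_orthonormal(n: int, x: float):
    """Return (h_n(x), h_{n-1}(x)) for orthonormal physicists' Hermite polynomials."""
    p_prev, p = 0.0, math.pi ** -0.25
    for j in range(1, n + 1):
        p_prev, p = p, x * math.sqrt(2.0 / j) * p - math.sqrt((j - 1.0) / j) * p_prev
    return p, p_prev

def bisect_root(n: int, lo: float, hi: float, iters: int = 100) -> float:
    """Bisection for the single root of h_n inside the bracket (lo, hi)."""
    f_lo = hermite_orthonormal(n, lo)[0]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = hermite_orthonormal(n, mid)[0]
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gauss_hermite(n: int):
    """Nodes and weights for the weight function exp(-x^2)."""
    roots = []
    for k in range(1, n + 1):
        # All roots of h_k lie inside (-bound, bound); roots of h_{k-1} interlace them.
        bound = math.sqrt(2.0 * k + 1.0) + 1.0
        edges = [-bound] + roots + [bound]
        roots = [bisect_root(k, a, b) for a, b in zip(edges, edges[1:])]
    weights = [1.0 / (n * hermite_orthonormal(n, x)[1] ** 2) for x in roots]
    return roots, weights

def j_gauss_hermite(sigma: float, n: int = 20) -> float:
    """n-point Gauss-Hermite estimate of 1 - E[log2(1 + exp(-Y))]."""
    nodes, weights = gauss_hermite(n)
    total = 0.0
    for t, w in zip(nodes, weights):
        y = 0.5 * sigma * sigma + math.sqrt(2.0) * sigma * t
        # Stable ln(1 + exp(-y)) for either sign of y.
        g = math.log1p(math.exp(-y)) if y >= 0.0 else -y + math.log1p(math.exp(y))
        total += w * g
    return 1.0 - total / (math.sqrt(math.pi) * math.log(2.0))

if __name__ == "__main__":
    print(j_gauss_hermite(1.0), j_gauss_hermite(3.0))
```

The node and weight computation is independent of σ, so in repeated evaluations it would normally be done once and cached.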

We compare five realizations of the function as follows:
(1) the “infinite” large- series representation, computed with very large (i.e., );
(2) the truncated large- series representation (3), truncated to terms;
(3) the truncated small- series representation (6), with the first summation truncated to terms and the second (double) summation truncated to an upper limit on the outer summation of ;
(4) the Gauss-Hermite quadrature with terms, that is, the same number of terms as the truncated series representation;
(5) Monte Carlo integration with 1 million trials.

For each realization, we determine the error to be the magnitude of the difference between the calculated value and the value computed by the “infinite” series.

Figure 8 shows the errors of the two truncated series and the Gauss-Hermite quadrature. The Gauss-Hermite quadrature is evaluated with terms, and the large- series is truncated to terms. In order to have a comparable number of terms in the summations in (6), and are used for the small- series. The error of the Monte Carlo approach is also shown (dotted line).

When , the truncated large- series usually provides smaller error than Gauss-Hermite quadrature (except at and , where the Gauss-Hermite quadrature briefly has a smaller error). Since the amount of computation required per term of the series is roughly the same, the large- series is preferable to the Gauss-Hermite quadrature. For , the error of the Gauss-Hermite quadrature is smaller than that of the large- series. In this region, the small- series could be used, since it provides a smaller error than the large- series below .

5. Conclusions

Series representations have been derived for a mutual-information function that is used in EXIT-chart analysis and the evaluation of the capacity of a binary-input AWGN channel. Truncated versions of the series are computationally competitive with the Gauss-Hermite quadrature and do not require finding roots of Hermite polynomials. The series are useful for computation and to provide simple lower and upper bounds.

References

[1] S. ten Brink, “Convergence behavior of iteratively decoded parallel concatenated codes,” IEEE Transactions on Communications, vol. 49, no. 10, pp. 1727–1737, 2001.
[2] S. ten Brink, G. Kramer, and A. Ashikhmin, “Design of low-density parity-check codes for modulation and detection,” IEEE Transactions on Communications, vol. 52, no. 4, pp. 670–678, 2004.
[3] S. ten Brink, “Convergence of iterative decoding,” Electronics Letters, vol. 35, no. 10, pp. 806–808, 1999.
[4] J. G. Proakis and M. Salehi, Digital Communications, McGraw-Hill, New York, NY, USA, 5th edition, 2008.
[5] E. Kreyszig, Advanced Engineering Mathematics, Wiley, New York, NY, USA, 9th edition, 2006.
[6] G. K. Karagiannidis and A. S. Lioumpas, “An improved approximation for the Gaussian Q-function,” IEEE Communications Letters, vol. 11, no. 8, pp. 644–646, 2007.
[7] P. Moin, Fundamentals of Engineering Numerical Analysis, Cambridge University Press, Cambridge, UK, 2001.

Copyright

Copyright © 2011 Don Torrieri and Matthew Valenti. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
