The straightforward application of Shannon's separation principle may entail a significant
suboptimality in practical systems with limited coding delay and complexity.
This is particularly evident when the lossy source code is based on entropy-coded quantization.
In fact, it is well known that entropy coding is not robust to residual channel errors.
In this paper, a joint source-channel coding scheme is advocated that combines the advantages and simplicity of
entropy-coded quantization with the robustness of linear codes. The idea is to combine entropy coding and channel coding into
a single linear encoding stage. If the channel is symmetric, the scheme can asymptotically achieve the optimal
rate-distortion limit. However, its advantages are more clearly evident under finite coding delay and complexity.
The sequence of quantization indices is decomposed into bitplanes, and each bitplane is independently mapped onto a sequence of channel coded symbols. The coding rate of each bitplane is chosen according to the bitplane conditional entropy rate.
The use of systematic raptor encoders is proposed, in order to obtain a continuum of coding rates with a single basic encoding
algorithm.
Simulations show that the proposed scheme can outperform the separated baseline scheme
for finite coding length and comparable complexity and, as expected, it is much more robust to channel errors in the case of channel capacity mismatch.
1. Introduction
A stationary ergodic source can be transmitted over an
information-stable channel with end-to-end average distortion $D$ and with bandwidth expansion factor $b$ not lower than $R(D)/C$ channel symbols per source sample, where $R(D)$ is the source rate-distortion function and $C$ is the channel capacity. Shannon's
source-channel separation principle [1] ensures that this
optimal performance can be approached by independently designing the
source coding and the
channel coding schemes.
(The bandwidth expansion factor is defined as the number of channel symbols
per source symbol: if a block of $K$ source symbols is transmitted through the
channel in $N$ channel uses, then $b = N/K$.)
This separation provides a definite architectural advantage in practical systems, where
typically (lossy) source coding is implemented at the application layer, while
channel coding is designed and optimized for the physical layer.
On the other hand, this separated source-channel
coding (SSCC) approach may incur substantial suboptimality, due to the nonideal
behavior of finite length, finite complexity, source and channel codes. In
fact, source codes designed without taking into account the presence of channel
decoding errors are typically very fragile and this might impose unnecessarily
restrictive constraints on the performance of channel coding. In such cases,
joint source-channel coding (JSCC) may lead to performance improvement (i.e., a
better operating point) for the same level of
complexity.
Most practical lossy source coding schemes for natural
sources (e.g., images, audio, video) are based on the idea of transform coding [2]. Source blocks are projected
onto a suitable basis by a linear transformation, such that the source is well
described by only a small number of significant transform coefficients. Then,
the coefficients are scalar-quantized, and finally the resulting sequence of
quantization indices is entropy coded. The theoretical foundation of this
approach relies on the universality of entropy-coded quantization, and dates
back to the work of Ziv [3]. In general, the linear transform is adapted to the
given class of sources (e.g., wavelet transforms for images [4]). The statistics of the
quantization indices are not known a priori. However, the memory structure of
the underlying discrete source is fixed and it is typically described as a
finite-memory tree-source (e.g., the context structure of JPEG2000 [5, 6]). Then, data compression is obtained by using an
adaptive entropy coding scheme that estimates the transition probabilities of
the source statistical model. For example, arithmetic coding [7] with Krichevsky-Trofimov
(KT) sequential probability estimation is a common choice [8].
For the sake of simplicity, this paper treats only
independent and identically distributed (i.i.d.) sources with known statistics,
that is, it neither deals with the transform coding aspect, nor with the
universal implementation of entropy coding. However, our results can be
generalized along the lines of what was done in [5, 6]. Even in the nonuniversal case, classical lossless
compression is catastrophic: a
small Hamming distortion (number of bits in error) in the entropy-coded
sequence is mapped into a large distortion in the reconstructed source
sequence. This imposes a very strict target error probability on the channel
coding stage, thus involving both complex channel coding and operating points
that may be quite far from the theoretical limits. This is even more evident in
applications where the coding delay is limited, thus preventing the use of very
large block lengths.
It was shown in [9] that fixed-to-fixed length data compression of a
discrete source with linear codes is asymptotically optimal, in the sense that
compression up to the source entropy rate can be achieved. This is strongly
related to transmission using the same linear code on a discrete additive noise
channel where the noise has the same statistics as the discrete source. This
analogy can be exploited in order to design a JSCC scheme. We wish to maintain
the simplicity of the transform coding approach while improving the robustness
of the scheme. The rationale behind the proposed design is the following: since
linear codes achieve the entropy rate of discrete sources and the capacity of
symmetric channels, we can combine the entropy coding stage and the coding
stage into a single linear encoding stage. The advantage of this approach is
that the design of noncatastrophic linear encoders is very well understood.
Therefore, the proposed scheme can approach the optimal separation limit for
large block length, while achieving better robustness to channel errors at
finite decoding delay and complexity.
In [5], this JSCC approach was applied to the transmission of JPEG2000-like encoded images (in the sense that the
wavelet transform, the quantization scheme and the tree-source memory structure
were borrowed from JPEG2000), by using a family of progressively punctured
turbo codes to map directly the redundant quantization bits into channel
symbols. As stated above, here we focus on simpler i.i.d. sources with
perfectly known statistics (i.e., the nonuniversal case) and investigate in
greater detail the performance analysis and the comparison with the baseline
SSCC approach. In this work, we use raptor codes [10] in order to map the
redundant quantization bits into channel-coded symbols.
Our scheme works as follows. A source block of length $K$ is quantized symbol by symbol. The sequence of
quantization indices, represented as binary vectors, is partitioned into
bitplanes. The bitplanes are separately encoded into channel symbols by a bank
of binary raptor encoders. Each bitplane is encoded at a rate that depends on its
conditional entropy rate given the bitplanes previously encoded. At the
decoder, the bitplanes are decoded in sequence using a multistage decoder,
where in each stage we use a belief propagation (BP) iterative decoder that
takes into account both the already decoded bits from previous planes, and the
a priori statistics of the current bitplane as well as the received channel
output.
Raptor codes are a particularly useful class of
rateless codes. The advantage of using a rateless code is clear: with a single
basic encoding machine we can generate a continuum of coding rates. Therefore,
the scheme can adapt naturally to the entropy rate of the source and to the
capacity of the channel. Although we do not pursue the universal setting in
this work, we notice here that the proposed architecture allows a very fine
rate matching between the (unknown a priori) source entropy and the channel
capacity without resorting to a library of progressively punctured codes as is
done in [5].
We express the performance of a source-channel coding
scheme in terms of its peak signal-to-noise ratio (PSNR), expressed in dB,
defined as
$$\mathrm{PSNR} = -10 \log_{10} D.$$
In particular, we will focus on
a standard Gaussian i.i.d. source and on the mean-squared distortion $d(s, \hat{s}) = |s - \hat{s}|^2$.
In this case, the distortion-rate function is $D(R) = 2^{-2R}$.
At the Shannon separation limit, that is, letting $R = bC$,
we have
$$\mathrm{PSNR} = 20\, b\, C \log_{10} 2 \approx 6.02\, bC.$$
Our aim is to design a family of practical schemes
that operate close to the curve versus .
Notice that we do not pursue here the design of embedded schemes, that is, of
single coding schemes that achieve multiple points. Nevertheless, the bitplane layered
structure of the proposed encoder and the proposed multistage decoder lend
themselves quite naturally to the design of embedded JSCC schemes. We leave
this aspect for future work and comment on it further in the concluding
section.
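As a numerical illustration of the Shannon-limit curve discussed above, the following minimal sketch evaluates the PSNR limit as a function of the bandwidth expansion factor for a unit-variance Gaussian source sent over a BSC. It assumes the definition PSNR = -10 log10(D) and the distortion-rate function D(R) = 2^(-2R); the function names and the example cross-over probability are illustrative assumptions and are not taken from the paper.

```python
import math

def binary_entropy(p):
    """Binary entropy function H2(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def bsc_capacity(p):
    """Capacity of a BSC with cross-over probability p, in bits per channel use."""
    return 1.0 - binary_entropy(p)

def shannon_limit_psnr(b, capacity):
    """PSNR (dB) at the separation limit for a unit-variance Gaussian source.

    Assumes D(R) = 2^(-2R) evaluated at R = b * capacity, and PSNR = -10 log10(D).
    """
    distortion = 2.0 ** (-2.0 * b * capacity)
    return -10.0 * math.log10(distortion)

if __name__ == "__main__":
    c = bsc_capacity(0.11)  # hypothetical cross-over probability, for illustration only
    for b in (1.0, 2.0, 4.0):
        print(f"b = {b:.1f}: PSNR limit = {shannon_limit_psnr(b, c):.2f} dB")
```

Under these assumptions the limit is a straight line through the origin in the PSNR versus $b$ plane, with slope proportional to the channel capacity.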
The rest of this paper is organized as follows. In
Section 2, we review the limits of scalar entropy-coded quantization and define
the target “operational Shannon limit” of our scheme. In Section 3 we present
a comprehensive analysis of the baseline SSCC scheme, which serves as our term of comparison. Section 4 presents the details of the proposed
scheme, its analysis and an algorithm for progressive incremental redundancy in
order to optimize the coding rates at each bitplane. Section 5 presents some
additional numerical comparisons between the baseline SSCC and the JSCC
schemes, and in Section 6 we present some concluding remarks. Raptor codes, BP
decoding, EXIT chart analysis and some ancillary results are presented in the
appendices.
2. Entropy-Coded Scalar Quantization
A source sequence $\mathbf{s} = (s_1, \ldots, s_K)$ of length $K$
is quantized by applying componentwise the scalar quantizer $Q : \mathbb{R} \to \{0,1\}^{B+1}$,
where $B$ bits are used to represent the magnitude and
one bit represents the sign. Let $\mathbf{U} = Q(\mathbf{s})$ denote the sequence of quantization indices
and let $(u_{0,k}, \ldots, u_{B,k})$ denote the bits forming the $k$th index. The sequence $\mathbf{U}$ can be thought of as a $(B+1) \times K$ binary array, where each row is called a
“bitplane.” Without loss of generality, we associate the $0$th row with the sign bit and the rows from 1
to $B$ with the magnitude bits, with the convention that the first bitplane is
the least significant and the $B$th bitplane is the most significant.
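The bitplane decomposition described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a sign-magnitude uniform quantizer with a hypothetical step size `step`, saturation at the largest magnitude index, row 0 holding the sign bit, and rows 1 to B holding the magnitude bits with the least significant bit first. The function names are illustrative.

```python
def quantize_to_bitplanes(samples, step, num_mag_bits):
    """Quantize real samples and return the bitplane array as a list of rows.

    Row 0 holds the sign bits (1 for negative samples); rows 1..B hold the
    magnitude bits, least significant first. Magnitudes saturate at 2^B - 1.
    """
    max_mag = (1 << num_mag_bits) - 1
    planes = [[0] * len(samples) for _ in range(num_mag_bits + 1)]
    for k, s in enumerate(samples):
        mag = min(int(abs(s) / step), max_mag)
        planes[0][k] = 1 if s < 0 else 0
        for j in range(1, num_mag_bits + 1):
            planes[j][k] = (mag >> (j - 1)) & 1
    return planes

def reconstruct_midpoints(planes, step):
    """Map each column of the bitplane array to the mid-point of its interval."""
    num_mag_bits = len(planes) - 1
    out = []
    for k in range(len(planes[0])):
        mag = sum(planes[j][k] << (j - 1) for j in range(1, num_mag_bits + 1))
        sign = -1.0 if planes[0][k] else 1.0
        out.append(sign * (mag + 0.5) * step)
    return out
```

For example, with `step = 0.5` and two magnitude bits, the sample `-1.7` falls in magnitude interval 3, so its column carries sign bit 1 and magnitude bits (1, 1), and it is reconstructed at `-1.75`.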
As anticipated in the Introduction, we fix the
quantizer and compare the performance of an SSCC approach based on the
concatenation of a conventional entropy coding stage with a conventional
channel code, with the performance of a JSCC that merges the two operations
into a single linear encoding map. Therefore, in the absence of channel
residual errors, both schemes achieve the same minimum distortion due to the
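quantizer, denoted by $D_Q$.
Letting $H$ denote the entropy rate of the sequence of quantization indices $\mathbf{U}$,
measured in bits per quantization index, we have that the
point with bandwidth expansion $b = H/C$ and distortion $D_Q$ is the best achievable point for
any scheme based on the fixed quantizer $Q$.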
Following [2], we
refer to this point as the “operational Shannon limit” for schemes with fixed
quantizers.
We consider uniform scalar quantizers where the
interval size is chosen in order to minimize the mean-squared distortion of the
Gaussian unit variance i.i.d. source, for a fixed number of intervals. In [3], Ziv showed that a coding
scheme formed by a scalar uniform quantizer followed by entropy coding yields a
rate penalty of no more than $0.754$ bits per sample with respect to the $R(D)$ limit. Thus, constraining the quantizer to be
a uniform scalar quantizer should cost no more than $0.754/C$ channel symbols per source symbol.
In Figure 2 we compare the PSNR versus curves for the Shannon limit, the Ziv bound
and the operational Shannon limit for a family of optimized uniform scalar
quantizers with and ,
and for channel capacity .
All results in this paper make use of this family of quantizers.
3. Analysis of the Baseline SSCC Scheme
In this section, we study the performance of nonideal
SSCC. First, we consider the performance degradation due to nonideal source and
channel codes that operate at source coding rate and channel coding rate ,
respectively, where and are positive rate gaps. This analysis assumes
no errors at the output of the channel decoder.
Then, we introduce the channel decoding error
probability, and obtain a distortion upper bound as a function of and ,
closely following the analysis of [11]. This analysis is based on the random coding exponent
for channel codes, and essentially validates the error-free rate-gap analysis
even for moderately large block length .
Finally, we consider a very practical scheme, based on
the concatenation of arithmetic entropy coding and a conventional binary raptor
code. We provide a very accurate semi-analytic approximation for the achievable
PSNR of this scheme and show that the achieved results follow closely the error
free rate-gap analysis by matching the parameters and .
We also notice that for finite block length the practical scheme suffers from an
additional performance degradation, especially visible at high resolution
(large PSNR). We quantify this additional performance degradation by looking at
the finite-length versus infinite-length error performance of raptor codes.
3.1. Rate-Gap Analysis
Consider a separated scheme that makes use of channel
coding at rate and source coding at rate ,
where are rate gaps, and where the residual
bit-error rate (BER) at the output of the channel decoder is (essentially)
zero. Using and we obtain We notice that the slope of the
straight line characterizing PSNR versus decreases with the channel coding gap ,
while the source coding rate gap involves only a horizontal shift. As a result,
an SSCC whose channel coding stage achieves negligible BER works further and
further away from the Shannon limit as PSNR increases (high resolution).
3.2. SSCC with Codes Achieving Positive Error Exponent
In order to take into account channel decoding errors,
we modify slightly the approach of [11] and obtain the achievable PSNR lower bound (we omit
the details since the derivation follows trivially from [11]): where denotes the random coding error exponent for a
given coding ensemble over the considered transmission channel. Notice that for all ,
and therefore, the error exponent is positive for all rate gaps .
For values of such that ,
(5) essentially coincides with (4).
Figure 3 compares (4) and (5) for different values of and for block length (which is the finite source block length that
we will use throughout this paper), and .
This value of is chosen in order to match the rate gap
attained by the quantizers (see Figure 5). In these results, we considered a
binary symmetric channel (BSC) with capacity (cross-over probability ). The exponent for the BSC can be found, for example, in
[12]. For the
parameters of Figure 3 we notice that (4) and (5) do not coincide for too small (e.g., in the figure) while they coincide for large
enough (in this case, ). For finite but large block lengths, as in
this case, the threshold is given by the minimum value of the channel
coding rate gap above which the exponent becomes negative. In Figure 4, we plot and versus ,
when the parameters of Figure 3 are considered. It
has been observed that for different values of given in the range of Figure 3, is constant. Thus in Figure 4, we use .
3.3. SSCC with Arithmetic Coding and Raptor Codes
We provide an accurate approximated analysis of the
performance of a practical SSCC scheme that can be regarded as our baseline
scheme, since its encoding and decoding complexity is very similar to that of
the JSCC scheme examined in the next section. With reference to the block
diagram of Figure 1, we consider the concatenation of an optimized uniform
scalar quantizer with quantization bits with an arithmetic encoder.
The resulting entropy-coded bits are then channel encoded using a raptor code
of suitable rate.
Figure 1: Conceptual block diagram of the
conventional SSCC and the proposed JSCC schemes. The two schemes coincide but
for the fact that the concatenation of entropy coding and channel coding (SSCC)
is replaced by a single linear encoding block (JSCC).
Figure 2: PSNR versus for the Shannon Limit, the Ziv bound and the
operational Shannon limit for the considered family of scalar quantizers for and for channel capacity .
Figure 3: The SSCC rate-gap approximation (4) is compared with the PSNR lower bound (5) for and .
Figure 4: and versus for , , , and .
Figure 5: Performance of the concatenation of arithmetic coding and raptor code for infinite channel coding length, source length , and a BSC with . , and are the values of rate-gaps for the operational Shannon limit (4). For the baseline SSCC, these rate-gap values are empirically found to be and .
A sufficiently large interleaver is placed between the
entropy coding and the channel coding stages, such that the bit decoding errors
at the input of the arithmetic decoder can be considered to be i.i.d. Since
the arithmetic encoder has perfect knowledge of the probability distribution of
the discrete source at the quantizer output, it can approach very
closely the source entropy rate even for moderate source block length $K$.
We approximate the performance of such a scheme by
assuming that the arithmetic decoder produces random data after the first bit
error at its input. Let $L$ denote the number of entropy-coded bits
produced by the arithmetic encoder. These bits are channel encoded and decoded.
Let $T$ denote the position of the last correctly
decoded bit before the first bit error at the arithmetic decoder input. Under
the assumption of i.i.d. bit errors, $T$ is a truncated geometric random variable with
probability mass function $P(T = t) = P_b (1 - P_b)^t$ for $t = 0, \ldots, L-1$,
and $P(T = L) = (1 - P_b)^L$,
where $P_b$ denotes the BER at the output of the channel
decoder. We approximate the number of correctly decoded quantization indices by $KT/L$ (neglecting integer effects). After the first
bit error, the arithmetic decoder produces random symbols distributed as the
quantization indices (i.e., according to the given discrete-source probability
distribution) but essentially statistically independent of the source sequence.
Therefore, the average distortion in this case is given by
$$D_e = \mathbb{E}\big[|S - V|^2\big] = 1 + \sigma_V^2,$$
where $V$ denotes a random variable distributed as the
quantizer reconstruction points, independent of the source sample $S$, and $\sigma_V^2$ denotes its variance. On the other hand, before
the first bit error, the system reconstructs the quantization points correctly; therefore the average distortion in
this case coincides with the quantization distortion $D_Q$.
Eventually, the total average distortion of the system
is approximated by
$$D \approx \frac{\mathbb{E}[T]}{L}\, D_Q + \Big(1 - \frac{\mathbb{E}[T]}{L}\Big) D_e. \quad (8)$$
The approximate analysis
requires the evaluation of the residual BER at the channel decoder output. This
can be obtained by simulation of the stand-alone raptor code with given finite
length, or by using any suitable approximation or semi-analytic technique, such
as Density Evolution or EXIT chart methods [13–17]. In particular, we make use of the EXIT chart
approximation, reviewed in Appendix B.
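The semi-analytic approximation described above can be evaluated with a few lines of code. The sketch below is written under stated assumptions: the position of the last correct bit is a truncated geometric random variable, whose mean has the closed form $(1 - P_b)(1 - (1 - P_b)^L)/P_b$; the post-error distortion is the sum of the source variance and the variance of the reconstruction points; and the total distortion is the corresponding weighted average. Function and parameter names are illustrative, and the residual BER is supplied by the caller (for instance from a simulation or an EXIT chart estimate).

```python
def expected_correct_bits(num_bits, ber):
    """Mean number of bits decoded correctly before the first bit error.

    Truncated geometric model: P(T = t) = ber * (1 - ber)^t for t < num_bits,
    and P(T = num_bits) = (1 - ber)^num_bits.
    """
    if ber == 0.0:
        return float(num_bits)
    q = 1.0 - ber
    return q * (1.0 - q ** num_bits) / ber

def sscc_average_distortion(num_bits, ber, d_quant, recon_variance,
                            source_variance=1.0):
    """Approximate end-to-end distortion of the separated scheme.

    Before the first error the distortion is d_quant; afterwards the decoder
    output is modeled as independent of the source, giving a distortion of
    source_variance + recon_variance.
    """
    d_err = source_variance + recon_variance
    frac_ok = expected_correct_bits(num_bits, ber) / num_bits
    return frac_ok * d_quant + (1.0 - frac_ok) * d_err
```

With a BER of zero the function returns the quantization distortion, and as the BER grows the result moves toward the post-error distortion, which illustrates why the arithmetic decoder imposes a very small target BER on the channel code.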
In Figure 5 we report the PSNR versus (obtained by using (8)) for different values
of when ,
a BSC with capacity and where the raptor code output BER is
approximated via the EXIT chart method. These results assume implicitly
infinite channel coding block length. In order to validate the approximated
distortion analysis of (8), we run simulations of the arithmetic decoder and
quantization reconstruction, fed with the quantization bits corrupted by
independent bit errors at a rate equal to the raptor code output BER. As seen
in Figure 5, the results of the simulated arithmetic decoder match remarkably
well the approximation (8), thus showing that the arithmetic decoder indeed
produces approximately random data after the first bit error.
The results in Figure 5 show that for the case of very
large channel coding block length the performance of the baseline SSCC scheme
is remarkably close to the operational Shannon limit and therefore the scheme
is hard to beat by any scheme using the same set of quantizers. However, the
picture changes when we consider a finite channel coding block length. In
particular, we consider independent encoding of each source block, so that the
system latency is dictated by the source block length and not by the channel coding block length.
This corresponds to choosing the raptor code input bits block length equal to .
For the system parameters as before, the PSNR results in this case are shown in
Figure 6. We notice that the finite channel coding block length yields an
additional degradation that increases with PSNR.
Figure 6: Performance of the
concatenation of arithmetic coding and a raptor code for finite channel coding
length, source length ,
and a BSC with .
We can explain and quantify the increasing bandwidth
expansion gap shown in Figure 6 as follows. Let and denote the channel coding rates needed by the
raptor code to reach a small BER such that the effective distortion is
virtually identical to the quantization distortion. For example, Figure 7 plots
the PSNR corresponding to the distortion (8) as a function of .
We notice that for the quantization distortion (corresponding to
the maximum achievable PSNR) is essentially reached. Then, Figure 8 plots the
raptor code BER for the BSC with capacity ,
as a function of the reciprocal of the channel coding rate .
Notice that the raptor code is a rateless code, and therefore we can generate
as many coded symbols as we like. In order to generate Figure 8 we keep the
channel parameter fixed (corresponding to ) and run encoding and decoding for smaller
and smaller coding rates. The infinite block length case is obtained by using
the EXIT chart approximation.
Figure 7: Output PSNR as a function of the
channel decoding residual BER .
Figure 8: Raptor code output BER for the
infinite block length case (EXIT approximation) and for the finite length case,
obtained by simulation, as a function of the reciprocal of the coding rate for .
The finite length is taken to be which is the approximate number of bits at the
output of arithmetic encoder, since the Gaussian block has entropy rate when quantized with .
Figure 8 shows that the target BER of is reached at certain rates and for the cases of infinite and finite block
length, respectively, and allows us to find the difference ,
shown in the figure.
Finally, we can quantify the bandwidth expansion gaps
shown in Figure 6 by noticing that, since ,
we have It is clear that the gap is increasing with the quantizer resolution ,
and therefore with PSNR. This is a further confirmation of the fact that, in
practice, it becomes more and more difficult to approach the Shannon limit as
the resolution increases.
4. Joint Source-Channel Coding Scheme
In this section we describe the encoder and decoder of
the proposed JSCC scheme. Then, we discuss an incremental redundancy rate
allocation procedure that allows the optimization of the scheme. We hasten to
say that this rate allocation procedure is run off-line, and serves to design
the coding scheme for given source and channel statistics. More generally, an
adaptive scheme that allocates coded bits to the bitplanes on the fly,
depending on the empirical entropy rate of the source and on the capacity of
the channel may be envisaged in a universal JSCC setting, where the source
statistics are not known a priori and are learned instead from the source
sequence itself. However, we do not pursue this approach here.
Figure 9 shows the encoder block diagram. Each
bitplane (row of the binary array of quantization indices produced by the
quantizer) is mapped into a sequence of coded symbols. Here we consider binary
coding and a BSC. Letting $\mathbf{u}_j$ denote the $j$th row of $\mathbf{U}$,
the corresponding block of coded symbols is given by $\mathbf{x}_j = \mathbf{u}_j \mathbf{G}_j$, where $\mathbf{G}_j$ is a suitable encoding matrix of size $K \times n_j$.
Then, the encoded blocks are transmitted in sequence over the BSC. The
resulting bandwidth expansion factor is
$$b = \frac{1}{K} \sum_{j=0}^{B} n_j.$$
Given the source symmetry, it is
clear that the sign bit is equiprobable and has entropy $H_0 = 1$.
Furthermore, it is independent of the magnitude bits. Hence, the target nominal
rate for the encoder of the sign bit is $n_0/K = 1/C$.
As for the $j$th magnitude bit, we allocate a nominal target
rate equal to $n_j/K = H_j/C$,
where $H_j$ denotes the conditional entropy rate of the $j$th bitplane, conditioned on the previously encoded bitplanes $j+1, \ldots, B$.
It follows that the nominal bandwidth expansion is given by
$$b = \frac{1}{C} \sum_{j=0}^{B} H_j = \frac{H}{C},$$
where $H$ is the entropy rate of the quantization indices, which is optimal.
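The nominal allocation can be illustrated numerically. The sketch below assumes that the magnitude index of each sample is drawn i.i.d. from a known probability mass function, that magnitude bitplane $j$ is conditioned on the more significant planes $j+1, \ldots, B$, and that the sign plane carries exactly one bit of entropy. The conditional entropies are obtained from the chain rule as differences of joint entropies. All names are illustrative and the ordering of the conditioning is an assumption of this example.

```python
import math

def entropy(pmf):
    """Entropy in bits of a probability mass function given as an iterable."""
    return -sum(p * math.log2(p) for p in pmf if p > 0.0)

def bitplane_conditional_entropies(mag_pmf, num_mag_bits):
    """Return [H_1, ..., H_B] with H_j = H(u_j | u_{j+1}, ..., u_B)."""
    def joint_entropy_from(j):
        # Entropy of bits j..B, i.e. of the index with its j-1 LSBs dropped.
        merged = {}
        for idx, p in enumerate(mag_pmf):
            key = idx >> (j - 1)
            merged[key] = merged.get(key, 0.0) + p
        return entropy(merged.values())
    return [joint_entropy_from(j) - joint_entropy_from(j + 1)
            for j in range(1, num_mag_bits + 1)]

def nominal_lengths(mag_pmf, num_mag_bits, block_len, capacity):
    """Nominal coded lengths [n_0, ..., n_B] = block_len * H_j / capacity."""
    h = [1.0] + bitplane_conditional_entropies(mag_pmf, num_mag_bits)
    return [block_len * hj / capacity for hj in h]
```

By the chain rule, the conditional entropies returned by `bitplane_conditional_entropies` sum to the entropy of the magnitude index, so the total nominal length matches the entropy of the quantization index divided by the capacity.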
Figure 9: Diagram of the proposed JSCC encoder.
In order to be able to decode at these rates, we
consider a multistage decoder as shown in Figure 10, that considers the
bitplanes in sequence. The sign bit is independently decoded. The magnitude
bits are decoded in sequence, starting from the $B$th plane. At each decoding stage $j$,
the hard decisions of the already decoded planes are used by the BP decoder to
compute the conditional a priori probabilities of the $j$th bitplane, as explained in Appendix A.
Assuming that at each level $j$
the previous levels are correctly decoded, then the nominal rates are achievable.
Figure 10: Multistage decoder for successive bitplane decoding
and source reconstruction.
In practice, due to the fact that the raptor codes do
not achieve sufficiently low BER if their rate is too close to the nominal rate
limit, we must allocate the rates allowing for some gap. The rate allocation
problem is made more complicated by the fact that in the multistage decoder the
decoding of the different planes is not independent. In particular, if the $j$th plane fails with many bits in error, then
it is very likely that all the subsequently decoded planes will also fail, since their decoders are fed
with incorrect a priori conditional probabilities. We will address the problem
of rate allocation for the multistage decoder at the end of this section.
Next, let us examine in more detail how encoding of
the th plane is implemented with raptor codes
[10]. Raptor codes can
essentially be viewed as an extension of Luby Transform codes (LT codes)
[18], since they are
based on the concatenation of an outer linear code (in our case we consider
low-density parity-check (LDPC) codes) with an inner LT code (see Appendix A
for details). We use raptor codes in systematic form. In particular, let be a full-rank binary matrix given by , where is a submatrix of the LT code generator matrix
at encoding level and is the generator matrix of the LDPC code (see
[10] for details). The
encoder produces a vector of intermediate symbols, denoted by .
Then, the intermediate symbols are expanded by high-rate LDPC encoding, into .
Finally, the encoded symbols are obtained from ,
by applying nonsystematic rateless encoding, that is, the symbols are produced in sequence, and each is given as the sum of elements of selected at random according to the LT degree
distribution .
Notice that .
Therefore, in the Tanner graph representing the code [19, 20] the nodes corresponding to
source symbols have a degree distribution identical to that
of a standard nonsystematic raptor code. Furthermore, although is sparse, its inverse is sufficiently dense
such that the symbols are close to being uniform and random i.i.d.
Notice that this is essential to the scheme, since in order to drive the
channel with the correct input distribution we need to send the nonsystematic
symbols through the channel, and their distribution
should be as close as possible to i.i.d. and equiprobable.
A key component in the systematic raptor code design
consists of finding a suitable nonsingular matrix ,
with given column weight distribution, and such that its inverse looks as much
as possible like a random binary matrix. As for the LDPC code (often referred
to as the “precode” in the raptor coding literature), we used a regular code
with parameters .
Let us focus now on decoding and source
reconstruction. The multistage decoder of Figure 10 is based on BP at each
stage in order to approximately compute the
symbol-by-symbol posterior marginal Log-Likelihood Ratios (LLRs) defined as where denotes the channel output corresponding to
the input ,
and the conditioning is with respect to the already decoded bitplanes. This is
obtained by feeding the hard decisions from the planes to the BP decoder at level .
An iterative version of the multistage decoder where soft messages in the form
of a posteriori LLRs are exchanged instead of hard decisions was also
considered, but it was observed that this does not provide any significant
improvement and therefore was not pursued further, given its much greater
complexity.
The information about the already decoded bitplanes is
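incorporated into the BP decoder for bitplane $j$ in the following way. As explained in Appendix
A, the BP algorithm is initialized with input messages at all the source and
coded nodes in the Tanner graph of the code. The coded nodes (corresponding to
the coded symbols $x_{j,i}$) receive their input message from the
corresponding channel observation $y_{j,i}$. In the case of a BSC with cross-over probability $p$, this is given
by
$$(-1)^{y_{j,i}} \log \frac{1-p}{p}.$$
The source nodes (corresponding
to the source bits $u_{j,k}$) are associated with the input
messages
$$\log \frac{P(u_{j,k} = 0 \mid \hat{u}_{j+1,k}, \ldots, \hat{u}_{B,k})}{P(u_{j,k} = 1 \mid \hat{u}_{j+1,k}, \ldots, \hat{u}_{B,k})},$$
where $\hat{u}_{j+1,k}, \ldots, \hat{u}_{B,k}$ are the hard decisions obtained from the
previous stages.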
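The initialization of the BP decoder described above can be sketched as follows. This example assumes the LLR convention log P(0)/P(1), a BSC with known cross-over probability, and that the conditional probabilities of the current bitplane given the hard decisions from previous stages have already been computed by the caller. The function names are illustrative, and the sketch covers only the input messages, not the BP iterations themselves.

```python
import math

def bsc_channel_llr(y, crossover):
    """Channel LLR log P(y | x=0) / P(y | x=1) for a received bit y over a BSC."""
    magnitude = math.log((1.0 - crossover) / crossover)
    return magnitude if y == 0 else -magnitude

def source_prior_llr(prob_one):
    """A priori LLR log P(u=0) / P(u=1) for a source bit with P(u=1) = prob_one."""
    return math.log((1.0 - prob_one) / prob_one)

def init_messages(received, crossover, cond_prob_one):
    """Return (source_llrs, coded_llrs) used to initialize the BP decoder.

    received      : channel outputs for the coded symbols of this bitplane
    cond_prob_one : P(u_k = 1 | hard decisions of the already decoded planes)
    """
    coded = [bsc_channel_llr(y, crossover) for y in received]
    source = [source_prior_llr(p) for p in cond_prob_one]
    return source, coded
```

A source bit that is equiprobable given the previous planes receives a zero input message, so it contributes no prior information, whereas a strongly biased bit enters the decoder with a large-magnitude LLR.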
The BP decoder at each stage runs for a given desired
number of iterations, and eventually outputs both hard decisions to be passed
to the next stage and soft outputs in the form of the posterior LLRs given by
(12). Once all bitplanes have been decoded, the source is reconstructed as
follows. Consider without loss of generality the inverse quantization
mapping that yields the mid-point of
each quantization interval given the set of quantization bits.
Then, we can either consider hard reconstruction,
which consists of using the hard decisions in (15), or soft reconstruction, which makes
use of the (approximate) posterior LLRs in order to compute the
minimum-mean-square-error (MMSE) estimator of the source samples given the
channel output, that is, the conditional mean estimator .
Treating the decoder estimated posterior LLRs as if they were the true
posterior LLRs, we obtain
In Appendix A, we prove an interesting isomorphism
between the BP decoder of the joint source-channel problem as described above
and a related standard channel coding problem. Let us focus on a single binary
independent source sequence of length ,
with probabilities for .
This is encoded into a binary codeword ,
of length ,
where is a raptor encoder as previously described. Let us
transmit through a BSC with cross-over probability ,
and let denote the corresponding output. The result holds for any binary input symmetric
output channel, but here we focus on the BSC for simplicity of
exposition. Then, the BP decoder for this problem
is isomorphic to a decoder for the following related channel coding problem:
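consider transmission of the all-zero codeword from the systematic code with generator
matrix $[\mathbf{I} \mid \mathbf{G}]$,
of size $K \times (K + n)$, with $n$ the number of coded symbols, over a channel that for the first $K$ components operates as $y_k = x_k \oplus u_k$, where $u_k$ is the $k$th source symbol, and for the remaining $n$ components operates as the BSC of the original problem, with the same noise realization. In other words, there exists a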
one-to-one mapping of the messages of the BP decoder for the first problem
(joint source channel) and the messages of the BP decoder for the second
problem (channel only), for every edge of the decoder graph and every decoder
iteration.
This means that the source-channel BP decoding can be
analyzed, for example, by using the EXIT chart method, by considering the
associated “virtual” channel, where the all-zero codeword from the associated
systematic code is transmitted partly on a binary additive noise channel with
noise realization identical to the source realization of the source-channel
problem, and partly on the same BSC (with the same noise realization) of the source-channel problem. We use
this BP isomorphism result in order to derive a simple EXIT chart analysis of
the BP decoder at each stage of the multistage decoder, under the assumption
that the hard decisions from previous stages are correct.
4.1. Rate Allocation Algorithm
The rate allocation of each bitplane encoder is
established offline by using the greedy algorithm described below. Again, we
notice that we do not consider adaptive rate allocation: given the source and
channel statistics, we run the greedy allocation algorithm in order to design
the JSCC coding scheme.
Allocating the number of coded symbols according to
the optimal limits, that is, yields very bad performance even
at very large block length, since it is known that raptor codes converge to
very small BER at a fixed (small) gap from capacity on general binary-input
symmetric output channels [14].
Therefore, we have to allow for some increment in the coded block lengths,
normally referred to as “overhead” in the raptor coding literature. The
problem is how to allocate a total overhead among the stages. In order to do so, we propose the
following greedy overhead allocation algorithm.
We initialize the lengths according to their nominal value given by
(19). At each iteration of the allocation algorithm, we allocate a given number of additional coded symbols to one of the codes. Let denote the achieved average distortion of the
JSCC scheme when coding lengths are used and let .
Then, for iterations ,
do the following.
(i) For all ,
compute
(ii) Find .
(iii) Let for all ,
and .
(iv) If ,
exit. Otherwise, let and go back to step (i).
The quantity is the tolerance within which we wish to
achieve the target quantization distortion.
In essence, the above algorithm allocates at each
iteration a packet of coded bits to the bitplane raptor encoder
that yields the largest decrease in the overall average distortion. The
distortion can be computed either by Monte Carlo simulation, or by using the
EXIT chart approximation. The latter method is much faster, but cannot take
into account the effect of finite block length and the error propagation
between the stages of the multistage decoder.
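The greedy procedure above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, its parameters, and the `distortion` callback (standing in for either the Monte Carlo estimate or the EXIT chart approximation) are assumptions, and the stopping test is evaluated at the start of each iteration for simplicity.

```python
from typing import Callable, List, Sequence


def greedy_overhead_allocation(
    nominal_lengths: Sequence[int],
    distortion: Callable[[Sequence[int]], float],
    target_distortion: float,
    delta: int,
    tolerance: float,
    max_iters: int = 10_000,
) -> List[int]:
    """Give `delta` extra coded symbols per iteration to the bitplane code
    whose increment lowers the average distortion the most.

    `distortion` is a hypothetical callback returning the average distortion
    of the whole scheme for a given list of coding lengths.
    """
    lengths = list(nominal_lengths)
    for _ in range(max_iters):
        # Stop once the target quantization distortion is met within tolerance.
        if distortion(lengths) - target_distortion <= tolerance:
            break
        # Evaluate the distortion obtained by incrementing each code in turn.
        candidates = []
        for i in range(len(lengths)):
            trial = lengths.copy()
            trial[i] += delta
            candidates.append(distortion(trial))
        # Commit the increment for the code giving the smallest distortion.
        best = min(range(len(lengths)), key=candidates.__getitem__)
        lengths[best] += delta
    return lengths


def toy_distortion(lengths: Sequence[int]) -> float:
    """Purely illustrative distortion model (an assumption, not from the paper):
    each layer adds a residual that halves for every 4 extra coded symbols."""
    nominal = (100, 80, 60)
    weights = (4.0, 2.0, 1.0)
    residual = sum(
        w * 2.0 ** (-(n - n0) / 4) for w, n, n0 in zip(weights, lengths, nominal)
    )
    return 0.1 + residual
```

With the toy model, `greedy_overhead_allocation([100, 80, 60], toy_distortion, 0.1, 4, 0.05)` spends more overhead on the layers with the largest weights, which mirrors the unequal error protection behavior discussed in the text.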
In Figure 11,
we report the comparison between the finite length simulation and the infinite
length EXIT approximation for the same setting of ranging from 1 to 6, the BSC with capacity ,
and source block length used throughout the paper. As we can see, the
two cases yield almost identical results. This allows us to use the infinite
length EXIT approximation to estimate (with very good approximation) a suitable
rate allocation among the stages for the finite length case. Finally, in Figure 12, for the case of ,
we report the relative overhead versus produced by the greedy allocation algorithm. As one might expect, the
greedy algorithm starts increasing the overhead of the sign bitplane and then
continues from the most significant to least significant magnitude bitplanes.
Eventually, each bitplane is allocated a coding length between 12% and 18%
larger than the nominal length (19), in line with standard raptor coding
reported results. Furthermore, notice that this scheme tends to give larger
overheads to most significant bitplanes, that is, it implicitly implements
unequal error protection across the layers, which is a very well-known design
approach for multilevel coded modulation with multistage decoding [21].
Figure 11: PSNR versus comparison of JSCC
scheme for finite block length simulation and infinite block length EXIT
approximation for .
Figure 12: Relative overhead versus produced by the greedy allocation algorithm for the case .
We notice that the bitplane coding overheads are incremented one at a time, in
sequence.
5. Numerical Results
In this section, we provide both finite length and infinite
length results. We considered source block length for the finite length results. In all the
numerical results of this paper, we considered raptor codes with the “LT”
degree distribution [14] As outer code
we used a regular high-rate LDPC code with degrees and rate 0.98. The source symbols are
estimated after running 100 iterations of the decoding algorithm.
We would like to stress the fact that the LT and LDPC
degree distribution polynomials have been chosen without considering any
optimization method and that we have averaged over the ensemble of randomly
generated raptor codes with the given parameters. In practice, one would
carefully design an LDPC graph with good properties for the desired length and degree distributions.
This section is subdivided into two parts. In the first
part we describe the results obtained by varying the bandwidth expansion
factor, when the capacity of the BSC is fixed to ,
corresponding to crossover probability .
The aim of this section is to compare the performance of families of SSCC and
JSCC codes for different values of ,
and to see how they approach the operational Shannon limit.
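The correspondence between the BSC capacity and its crossover probability follows from the standard formula C = 1 - H2(p), where H2 is the binary entropy function. The sketch below evaluates it and inverts it numerically by bisection; the function names are illustrative, and the capacity value used in the usage note is only an example.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H2(p) in bits, with H2(0) = H2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(p: float) -> float:
    """Capacity of a BSC with crossover probability p, in bits per channel use."""
    return 1.0 - binary_entropy(p)


def bsc_crossover_for_capacity(capacity: float, tol: float = 1e-12) -> float:
    """Crossover probability in [0, 1/2] that yields the given capacity.

    The capacity is strictly decreasing in p on [0, 1/2], so bisection applies.
    """
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bsc_capacity(mid) > capacity:
            lo = mid  # capacity still too large: move to a noisier channel
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, `bsc_crossover_for_capacity(0.5)` returns a crossover probability close to 0.11.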
In the second part we examine the behavior of a single fixed code, designed for a nominal
channel crossover probability and target PSNR, when we vary the channel
crossover probability. This set of results illustrates the robustness of a
given coding scheme to nonideal channel conditions.
In both subsections we provide results for infinite
and finite codeword length cases. The infinite case results have been generated
by using the EXIT chart approximation of Appendix B.
5.1. Approaching the Operational Shannon Limit
In Figure 13, we plot the performance comparison
between the proposed JSCC scheme and the SSCC scheme, when infinite codeword
length is considered. In this case, the SSCC scheme outperforms the proposed
scheme in the sense that it reaches the quantization distortion at slightly
lower values of ,
for all .
The SSCC schemes show a very sharp transition (“all or nothing” behavior). In
contrast, the JSCC schemes reach their quantization PSNR more gradually: as we
increase the overhead, the performance gradually improves.
Figure 13: JSCC and SSCC
infinite block length comparison for and .
The situation radically changes when we consider
finite codeword length. In Figure 14 we plot the performance of JSCC and SSCC
schemes for finite block length. In this case, the JSCC schemes outperform
their SSCC counterpart. In particular, as we have already remarked, the JSCC
performance is almost identical to that for infinite block length, while the
SSCC suffers much more evidently from the residual BER due to finite length
practical codes. This also hints that the EXIT approximated analysis yields very
faithful results for the JSCC scheme, while it provides optimistic results for
the SSCC scheme. This can be explained by the fact that the BER performance of
infinite length codes exhibits a very sharp “waterfall” threshold, beyond
which the BER is zero, while for finite length the waterfall is smoother.
Figure 14: JSCC and SSCC finite
block length comparison for and .
An important advantage of the JSCC is that the PSNR
value gradually increases as increases, while a sharp threshold effect can
be seen in the case of SSCC. In [5] it was shown that, with natural sources such as
images, PSNR values lower than peak value were still perceptually acceptable
for the JSCC scheme, while the SSCC scheme degrades abruptly also from the
perceptual viewpoint.
5.2. Robustness
In the previous set of results, we have fixed the
channel capacity and the (quantized) source entropy rate and we have examined
families of codes operating at different points. Now, we take a complementary view and
fix the channel code while letting the channel capacity vary. This setting is
relevant when a given code, designed for some nominal channel conditions, is
used on a channel of variable quality, and therefore we are interested in the
robustness of the performance with respect to the channel parameters. Also,
this setting is more akin to the standard way of studying the performance of
channel coding, where the BER is plotted as a function of the channel
parameters ( in the case of a BSC), for a given channel
code.
In order to have a fair comparison between the two
schemes, the bandwidth expansion factor (i.e., the code used) has been fixed in
the following way: we keep the minimum value of such that both schemes reach the quantization
PSNR on the previous set of results (see Figure 14). In particular we keep and for and ,
respectively. Since the JSCC scheme needs lower values of to reach the quantization PSNR in both cases,
we add some extra bits to reach the same values of .
We have examined the two extreme cases of low
resolution () and high resolution (). In Figures 15 and 16 we notice that in both
cases the JSCC scheme outperforms the SSCC scheme in terms of PSNR. Moreover,
as expected, the PSNR of the SSCC scheme degrades sharply, while the PSNR of
the JSCC scheme degrades gradually as the channel crossover probability
increases. For example, considering ,
if increases from its nominal value to a higher value the JSCC scheme loses about 6 dB in PSNR,
while the SSCC loses 24 dB. We interpret this sharp degradation as an effect of
the catastrophic behavior of the entropy coding stage in SSCC, which is greatly
mitigated by the linear coding stage in the proposed JSCC scheme.
Figure 15: Comparison of performance degradation of JSCC and SSCC
as the cross-over probability of the BSC increases for and .
Figure 16: Comparison of performance degradation of JSCC and SSCC
as the cross-over probability of the BSC increases for and .
6. Conclusions
Unlike most JSCC schemes
presented in the literature, which are carefully targeted for specific source
and channel pairs, the scheme proposed here can closely approach the
rate-distortion separation limit for virtually any well-behaved source under
quadratic distortion and any symmetric channel, owing to the universality of
entropy-coded quantization and the optimality of linear codes for both data
compression and channel coding. Furthermore, we have demonstrated that beyond
operating close to optimal, the proposed scheme is better and more robust than
a separated approach, especially in the practical case of finite block length.
We wish to conclude this paper with some
considerations for future work. Following [5], the JSCC scheme can be applied to any class of
sources for which efficient transform coding has been designed. In particular,
images, audio and video are natural and relevant candidates. The scheme takes
advantage of the know-how and careful source statistical characterization
developed in designing lossy coding standards, and preserves the structure of
the transform coder. This makes it easy to introduce the JSCC scheme into
practical applications, for example, by introducing a trans-coding stage at the
physical layer, while preserving the network architecture and the source coding
standards developed at the application layer.
Although we have not pursued this aspect here, the
bitplane layered encoding and multistage successive decoding architectures of
the proposed scheme lend themselves quite naturally to a multiresolution, or
“embedded,” implementation. For example, it is sufficient to use an embedded
scalar quantizer in order to obtain such a scheme: bitplanes will be
transmitted in sequence, and the resolution of the reconstructed source
improves at each additional layer received.
A different route for future investigation involves
the use of nonbinary linear codes. Also for the proposed JSCC scheme, the gap
from the Shannon limit increases with the PSNR (resolution). This is due to the
fact that each layer needs to be encoded with a fixed overhead, such that the
overall overhead increases with the number of layers. As an alternative, we may
wish to use a nonbinary raptor code operating over symbols of bits, and mapping directly the quantization
indices over the channel symbols. The hope is that the overhead of such
nonbinary codes does not depend (or at least depends in a sublinear way) on .
This may lead to better bandwidth expansion gaps at high resolution.
Appendices
A. Raptor Codes and BP Decoding
Raptor codes [10] are a class of rateless codes designed for
transmission over erasure channels with unknown capacity. They are an extension
of Luby Transform codes (LT codes) [18], since they are based on the concatenation of an
outer linear code (precode) with an inner LT code. To be compliant with the
raptor codes terminology, let us define the input symbols as the symbols
generated from the source symbols by the linear precode encoder, and output
symbols as the symbols generated from the input symbols by the LT encoder.
Formally a raptor code is defined by the triplet ,
where is the source symbols length, is a linear encoder and represents the generating function of the
probability distribution on that generates the LT codewords.
The LT code ensemble corresponds to the ensemble
of binary matrices, for all ,
with columns randomly generated according to ,
where each matrix yields an encoding mapping.
The operations to generate a generic column of an LT
encoding matrix can be summarized in two steps:
(1) sample the distribution to obtain a weight between 1 and ;
(2) generate the column uniformly at random from all binary vectors of weight and length .
As shown in [14], it is possible
to adapt raptor codes for transmission over memoryless symmetric channels. The
decoding is performed by using the classical belief propagation algorithm (see
[14] for details).
In this paper, we use a high-rate LDPC code as the
precoder, so the input nodes can also be seen as the bitnodes of the LDPC code.
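The two-step column generation of the LT encoding matrix can be sketched as follows. This is a minimal illustration under stated assumptions: the degree distribution is passed as a dictionary mapping each weight to its probability, the function names are hypothetical, and the precode is omitted.

```python
import random
from typing import Dict, List


def sample_lt_column(
    num_inputs: int, degree_pmf: Dict[int, float], rng: random.Random
) -> List[int]:
    """Draw one column of an LT encoding matrix.

    Step 1: sample a weight from the degree distribution.
    Step 2: pick a binary vector of that weight uniformly at random.
    """
    degrees = list(degree_pmf)
    weights = [degree_pmf[d] for d in degrees]
    weight = rng.choices(degrees, weights=weights, k=1)[0]
    support = rng.sample(range(num_inputs), weight)
    column = [0] * num_inputs
    for position in support:
        column[position] = 1
    return column


def lt_encode(
    input_bits: List[int], num_outputs: int, degree_pmf: Dict[int, float],
    rng: random.Random,
) -> List[int]:
    """Generate LT output symbols as modulo-2 sums of the selected input symbols."""
    outputs = []
    for _ in range(num_outputs):
        column = sample_lt_column(len(input_bits), degree_pmf, rng)
        outputs.append(sum(b & c for b, c in zip(input_bits, column)) % 2)
    return outputs
```

Because every call draws a fresh column, the same routine produces as many output symbols as needed, which is what gives the continuum of coding rates.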
A.1. BP Decoder Isomorphism
As anticipated in Section 4, there is an interesting
isomorphism between the standard channel coding problem when an all-zero
codeword is transmitted (we refer to this as Scheme A) and the joint
source-channel coding problem as defined at each
stage of the multistage decoder (we refer to this as Scheme B).
Consider the following unified scheme. Let the vector be the output block when a vector of length is channel coded with a systematic raptor code
and where has length (i.e., the raptor code rate is equal to ()). Let us assume that the output block is
transmitted over a hybrid channel such that the first output symbols are distorted by noise vector where for and the remaining output symbols are distorted by the BSC
channel noise vector where .
Then, the hybrid channel is characterized by many BSCs with crossover
probabilities and .
The channel observation block is then composed of Notice that when then ,
and .
In this case the unified scheme becomes Scheme A. On the other hand, when ,
then ,
and the unified scheme becomes Scheme B. Let us consider the th iteration of the BP decoder. We use the
following notation (see Figure 17):
Figure 17: Raptor code factor graph for the
application of belief propagation.
(i) are the messages passed from the th input node to the th output node and from the th output node to the th input node, respectively, of the LT-decoder;
(ii) are the messages passed from the th input node (the so called variable node in
classical LDPC notations) to the th check node and from the th check node to the th input node, respectively, of the LDPC decoder;
(iii) is the message generated from the th LDPC input node and passed to the
corresponding input node of the LT-decoder;
(iv) is the message generated from the th LT input node and passed to the corresponding
input node of the LDPC decoder; and
(v) is the LLR of the th output symbol received from noisy channel;
notice for while for .
Using the notation above, we can define the updating
rules for the LT and the LDPC decoders separately.
For the LT decoder, at the th iteration, we have where the product is taken over
all input nodes adjacent to other than and the summation is taken over all output
nodes adjacent to other than .
For ,
we set for .
For the LDPC decoder, at the th iteration, we have The messages and passed from the LT to the LDPC decoder and
from the LDPC to the LT decoder, respectively, are defined by where the summation is taken
over all output nodes adjacent to or over all checknodes adjacent to .
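As an illustration of the LT part of these rules, the sketch below uses the standard LLR-domain BP updates: the tanh rule at output nodes and a plain sum at input nodes. It assumes the update rules take this usual form; the function names and the clipping constant are illustrative choices.

```python
import math
from typing import Sequence


def output_to_input(channel_llr: float, incoming: Sequence[float]) -> float:
    """Message from an LT output node to one adjacent input node.

    `incoming` holds the messages from all other adjacent input nodes.
    """
    product = math.tanh(channel_llr / 2.0)
    for message in incoming:
        product *= math.tanh(message / 2.0)
    # Clip to keep atanh finite when messages saturate.
    limit = 1.0 - 1e-15
    product = max(-limit, min(limit, product))
    return 2.0 * math.atanh(product)


def input_to_output(incoming: Sequence[float], ldpc_message: float = 0.0) -> float:
    """Message from an LT input node to one adjacent output node.

    `incoming` holds the messages from all other adjacent output nodes and
    `ldpc_message` is the message received from the LDPC decoder.
    """
    return ldpc_message + sum(incoming)
```

Note that flipping the sign of `channel_llr` flips the sign of the output-node message while leaving its magnitude unchanged, which is the elementary property behind the relationship between the messages of the two schemes.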
The overall
factor graph (FG) of the proposed decoding algorithm is displayed in Figure 17
for the case of JSCC .
We use Wiberg's notation (see [20]), that is, the FG is a bipartite graph with variable
nodes (circles) and function nodes (boxes). A variable node is connected to a
function node if the corresponding variable is an argument of the corresponding
factor [20]. In our
case, the variable nodes correspond to the input symbols of the LT code and to
the input symbols of the LDPC code. The function nodes correspond to the output
symbols of the LT code and to the check nodes of the LDPC code. To explicitly
represent the messages passed between the two decoders at each stage, we split
the graph into two parts connected to each other by “equality constraints.”
Finally, to distinguish between channel outputs received from the equivalent
channel and channel outputs received from the noiseless channel, we explicitly
represent the source symbols , and the output of the noisy channel with input .
Let us also denote the input block by .
As we can see from the updating rules described above
and from the factor graph, the decoder can be modeled as two independent factor
graphs that exchange information between themselves after each
iteration.
Theorem 1. The magnitude of the BP messages exchanged between input and output
symbols for the same Tanner graph is the same for both Schemes A and B. In
particular, at BP round ,
the relationship between the messages passed in Schemes A and B
is (where is used to denote messages for Scheme A and is used for Scheme B).
Belief propagation equations (A.2)–(A.4) can
be also written in an explicit form by using a map from the real numbers to defined by .
Clearly is bijective and there exists an inverse .
Moreover, where addition is component-wise in and in .
Another important property is as follows:
Rewriting (A.2), (A.4) in terms of the mapping and using (A.8), we
have where and .
Similarly, we have
Proof. To prove the
theorem, the BP equations for each scheme will be given explicitly and then
starting with the round, the relationship between the messages
corresponding to different schemes will be verified. The proof follows by
induction, after showing that, given the rule holds for round , it also holds for round .
BP for Scheme A: In this case we have
BP for Scheme B: In this case, we have
Note that in the above equations, we have provided two
different versions of equations for for both Scheme A and Scheme B for values of and for .
We call these ranges of the first block and the second block,
respectively.
By applying the BP rules at round zero, we have the
following relationships between Scheme A and Scheme B: Then for round zero (A.7) are
satisfied.
Now let us
assume that the theorem holds for the th round. Then we have the
following equations for Round
Consequently, the equations for Round can be written as follows. Letting and denote any output symbols, from the first and the second output blocks, respectively, and letting and denote any adjacent input nodes, we can write:
Using the assumption, we can write
In order to find a relationship similar to that obtained
before, we need to apply (A.8). By applying (A.8), summation coefficient terms
such as or can be separated from the other summands. By
(A.8), it is known that the number of terms in the summation is important. For
any ,
denote by the adjacent input node set for the output
node .
Then since the summation of all of the input nodes should
give the value of the corresponding output node without additive noise. Similarly, for any ,
define as the set of
adjacent input nodes for the output node . Then
It is worth noting that by applying the round hypotheses to (A.5), we obtain ; that is, when we only consider the LDPC iterations, Scheme A and
Scheme B differ only in the signs of the channel observations. It can easily be
shown that, with such an input relationship between two schemes, the messages
will be also closely related as follows: for any .
Then by (A.6) we obtain so that we can
write Applying (A.23) for round ,
Equations (A.20), (A.21), and (A.25) are identical to the
ones assumed in (A.15) of the round. This completes the proof by induction.
Due to Theorem 1, the BER of the pure channel coding
scheme (assuming the all zero codeword) is equal to the BER of the source bits
in the JSCC scheme. Based on this result, we can obtain an EXIT chart by
considering the associated channel coding problem.
B. EXIT Chart Approximation
The standard analysis tool for graph-based codes under
BP iterative decoding, in the limit of infinite block length, is density
evolution (DE) [22, 23]. DE is typically computationally heavy, and
numerically not very well conditioned. A much simpler approximation of DE
consists of the so-called EXIT chart, which corresponds to DE by imposing the
restriction that message densities are of some particular form. In particular,
the EXIT with Gaussian approximation (GA) assumes that at every iteration the
BP message distribution is Gaussian having a particular symmetry condition,
which imposes that the variance is equal to 2 times the mean [13]. At this point, densities
are uniquely identified by a single parameter, and the approximate DE tracks
the evolution of this single parameter across the decoding rounds.
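Under the symmetric Gaussian approximation, a message is fully described by its mean, and the mutual information it carries about the associated bit can be computed by a one-dimensional integral. The sketch below evaluates this quantity numerically with the trapezoidal rule; the function name, the integration range, and the number of steps are assumptions made for illustration.

```python
import math


def mutual_information_from_mean(mu: float, steps: int = 2000) -> float:
    """Mutual information between a bit and a symmetric Gaussian LLR.

    The LLR is modeled as Gaussian with mean `mu` and variance 2 * mu
    (conditioned on the bit being zero). The result is
    1 - E[log2(1 + exp(-L))], evaluated by the trapezoidal rule.
    """
    if mu <= 0.0:
        return 0.0
    sigma = math.sqrt(2.0 * mu)
    lo = mu - 10.0 * sigma
    h = 20.0 * sigma / steps
    total = 0.0
    for i in range(steps + 1):
        llr = lo + i * h
        pdf = math.exp(-((llr - mu) ** 2) / (4.0 * mu)) / math.sqrt(4.0 * math.pi * mu)
        # Numerically stable evaluation of log2(1 + exp(-llr)).
        loss = (max(-llr, 0.0) + math.log1p(math.exp(-abs(llr)))) / math.log(2.0)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * pdf * loss
    return 1.0 - total * h
```

The function increases from 0 (mean 0, uninformative message) toward 1 (large mean, reliable message), so the single parameter tracked by the approximate DE can be taken to be either the mean or the mutual information.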
In particular, the EXIT chart tracks the mutual
information between the message on a random edge of the graph and the
associated binary variable node connected to the edge. By the isomorphism
proved before, we know that the JSCC scheme and the “two-channel” scheme have
the same performance. For the sake of completeness, in this section we apply
the EXIT chart analysis to the “two-channel” case. The resulting EXIT
chart applies directly to the JSCC EXIT chart for a binary source. Finally, we
briefly discuss how to apply the EXIT chart method to the multistage decoder
used by our JSCC scheme. The resulting EXIT chart analysis provides very
accurate approximations of the actual JSCC scheme performance, also in the
finite (moderately large) block length case (see Figure 11).
For the graph induced by the raptor (LT) distribution,
we define the input nodes (also called information bitnodes), the output nodes
(also called coded bitnodes) and the checknodes. For LDPC codes, we define just
the bitnodes and the checknodes, since any set of bitnodes that form an
information set, can be taken as information bitnodes (see Figure 17).
There are different ways of scheduling the raptor decoder.
Practical schedule
Activate in parallel all
input LT checknodes, then all LDPC bitnodes (corresponding to LT input nodes),
then all LDPC checknodes, then back to the LDPC bitnodes. This forms a complete
cycle of scheduling, which is repeated an arbitrarily large number of times.
This is the scheduling that was used in our finite length simulation.
Conceptually simple schedule
Activate the
LT checknodes. Then, reset the LDPC decoder and treat the messages generated by
the LT checknodes as inputs for the LDPC decoder. Perform infinite iterations
of the LDPC decoder. After reaching a fixed point of the LDPC decoder, take the
LLRs produced for the bitnodes by the LDPC decoder at the fixed-point
equilibrium and incorporate these messages as “virtual channel observations”
for the input nodes of the LT code. Then, activate all LT input nodes. This
provides a complete cycle of scheduling, which is repeated an arbitrarily large
number of times. Our EXIT chart equations are obtained assuming this
scheduling.
EXIT charts can be seen as a multidimensional dynamic
system. We are interested in studying the fixed points and the trajectories of
this system. As such, an EXIT chart has state variables. Proceeding to find an
EXIT recursion for the conceptually simple schedule, we will denote by and the state variables of the LT EXIT chart, and
by and the corresponding state variables for the LDPC
EXIT chart.
We use the following
notations.
(i) denotes the mutual information between a message sent along an edge with “left-degree” and the symbol corresponding to the bitnode ,
and denotes the average of over all edges . Following standard parlance of LDPC codes, we
refer to the degree of the bitnode connected to an edge as the left degree of
that edge, and to the degree of the checknode connected to an edge as the right
degree of that edge.
(ii) denotes the mutual information between a message sent along an edge with “right-degree” and the symbol corresponding to the bitnode and denotes the average of over all edges .
(iii) denotes the mutual information between a message sent along an edge with “left-degree” and the symbol corresponding to the bitnode ,
and denotes the average of over all edges .
(iv) denotes the mutual information between a message sent along an edge with “right-degree” and the symbol corresponding to the bitnode ,
and denotes the average of over all edges .
(v) For an LDPC code, we let and denote the generating functions of the
edge-centric left- and right-degree distributions,
and we let denote the bit-centric
left-degree distribution.
(vi) For an LT code, we let denote the edge-centric degree distribution of
the input nodes, denote the edge-centric degree distribution of
the “output nodes” or, equivalently, the edge-centric degree distribution of
the checknodes. The node-centric degree distribution of the checknodes is
given by
(vii) For the concatenation of the LT code with the
LDPC code we also have the node-centric degree distribution of the LT input
nodes. This is given by
We consider the class of EXIT functions that make use
of Gaussian approximation of the BP messages. Imposing the symmetry condition
and Gaussianity, the conditional distribution of each message in direction is Gaussian ,
for some value .
Hence, letting denote the corresponding bitnode variable, we
have where .
In BP, the message on is the sum of all messages incoming to on all other edges. The sum of Gaussian random
variables is also Gaussian, and its mean is the sum of the means of the
incoming messages. It follows that where is the mutual information (capacity) between
the bitnode variable and the corresponding LLR at the (binary-input symmetric
output) channel output. In the raptor case, the bitnodes correspond to
variables that are observed through a virtual channel by the LDPC decoder.
Averaging with respect to the edge-degree distribution, we have As far as checknodes are
concerned, we use the well-known quasiduality approximation and replace
checknodes with bitnodes by changing mutual information into entropy (i.e.,
replacing by ). Then
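The bitnode rule (means add up) and the quasiduality rule for checknodes can be sketched as follows. This is an illustration under stated assumptions: `J` maps the mean of a symmetric Gaussian LLR to mutual information by numerical integration, `J_inv` inverts it by bisection, and the function names and numerical settings are hypothetical choices rather than the paper's notation.

```python
import math


def J(mu: float, steps: int = 400) -> float:
    """Mutual information of a symmetric Gaussian LLR with mean mu, variance 2*mu."""
    if mu <= 0.0:
        return 0.0
    sigma = math.sqrt(2.0 * mu)
    lo = mu - 10.0 * sigma
    h = 20.0 * sigma / steps
    total = 0.0
    for i in range(steps + 1):
        llr = lo + i * h
        pdf = math.exp(-((llr - mu) ** 2) / (4.0 * mu)) / math.sqrt(4.0 * math.pi * mu)
        loss = (max(-llr, 0.0) + math.log1p(math.exp(-abs(llr)))) / math.log(2.0)
        total += (0.5 if i in (0, steps) else 1.0) * pdf * loss
    return 1.0 - total * h


def J_inv(info: float, upper: float = 200.0) -> float:
    """Mean of the symmetric Gaussian LLR carrying mutual information `info`."""
    if info <= 0.0:
        return 0.0
    lo, hi = 0.0, upper
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if J(mid) < info:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bitnode_exit(info_in: float, info_channel: float, degree: int) -> float:
    """Outgoing information on an edge of a degree-`degree` bitnode:
    the means of the other (degree - 1) messages and of the channel LLR add."""
    return J((degree - 1) * J_inv(info_in) + J_inv(info_channel))


def checknode_exit(info_in: float, degree: int) -> float:
    """Quasiduality approximation: treat the checknode as a bitnode
    after replacing mutual information I by 1 - I."""
    return 1.0 - J((degree - 1) * J_inv(1.0 - info_in))
```

Averaging these per-degree functions over the edge-degree distributions gives the state equations of the EXIT recursion.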
Let us consider now the “two-channel”
scenario induced by the JSCC isomorphism. Let denote the number of source bits, and denote the number of parity bits. In the
corresponding LT code, we have output nodes. The first output nodes are “observed” through a
channel with capacity (i.e., the channel corresponds to the source
statistics), while the second output nodes are observed through the actual
transmission channel, with capacity .
This channel feature is taken into account by an outer
expectation in the EXIT functions. Therefore, the LT EXIT chart can be written
in terms of the state equations as follows: where and ,
and where is the mutual information input by the LDPC
graph into the LT code graph via the node of degrees as explained in the following.
Equation (B.34) follows from the fact that a random edge is connected with probability to a source bit (i.e., to the channel with
capacity ), while with probability to a parity bit (i.e., to the channel with
capacity ).
Consider an LDPC bitnode that coincides with an input node of the LT
code. The degree of this node with respect to the LDPC graph is ,
while the degree of with respect to the LT graph is .
For a randomly generated graph, and a random choice of , and are independent random variables, with joint distribution
given by The mutual information input by
the LT graph into the LDPC graph via the node of degrees is given by Therefore, the LDPC EXIT chart
can be written in terms of the following state
equations: The mutual information input by
the LDPC graph into the LT graph via the node of degrees is given by
Equations (B.37), (B.33), and (B.34) form the state
equations of the global EXIT chart of the concatenated LT-LDPC graph, where the
state variables are ,
while the parameters are and ,
and the degree sequences ,
and .
Finally, in order to get the reconstruction distortion
we need to obtain the conditional probability density function (pdf) of the
LLRs output by BP for the source bits. Under the Gaussian approximation, the
LLR is Gaussian. Let denote the mean of the LLR of a source bitnode
connected to a checknode of degree ,
given by Then, we approximate the average
BER of the source bits as
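For a symmetric Gaussian LLR with mean μ and variance 2μ, the probability that the LLR has the wrong sign is Q(√(μ/2)), where Q is the Gaussian tail function. The sketch below averages this error probability over a degree distribution; the function names and the dictionary-based representation of the distribution and of the per-degree means are assumptions made for illustration.

```python
import math
from typing import Dict


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_from_llr_mean(mu: float) -> float:
    """Sign-error probability of an LLR distributed as N(mu, 2 * mu)."""
    return q_function(math.sqrt(mu / 2.0))


def average_ber(degree_pmf: Dict[int, float], mean_by_degree: Dict[int, float]) -> float:
    """Average BER over nodes, weighting each degree by its probability."""
    return sum(
        prob * ber_from_llr_mean(mean_by_degree[degree])
        for degree, prob in degree_pmf.items()
    )
```

A mean of zero gives a BER of 1/2 (no information), and the BER decays rapidly as the mean grows.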
B.1. Multilayer EXIT Chart Analysis
For each bitplane, at every location, the entropy of
the bit depends on the realization of the bits at the previous (more
significant) bitplanes. We are then in the presence of a “time-varying”
memoryless channel in the corresponding channel coding problem. To develop the
equations for the multilayer case, we use the same idea described in the
previous section, namely, an outer expectation. At the th most significant level, the previous
corresponding bit locations might have different combinations, with possibly
different probabilities which are denoted by for .
Let denote the conditional entropy of a bit at the th most significant plane given that the value of
the corresponding more significant bits' combination is .
At the th most significant level, the channel has
capacity with probability ,
while it has capacity for with probability .
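The per-context probabilities and conditional entropies used in this outer expectation can be computed directly from the probability mass function of the quantization magnitude indices. The sketch below is a minimal illustration: the natural binary labeling of the indices, the function names, and the example pmf in the usage note are assumptions, not taken from the paper.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def binary_entropy(p: float) -> float:
    """Binary entropy H2(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bitplane_contexts(
    index_pmf: Sequence[float], num_planes: int, plane: int
) -> Dict[int, Tuple[float, float]]:
    """For the given plane (1 = most significant), return a dictionary
    {context: (P(context), H(bit | context))}, where the context is the
    combination of the more significant bits at the same location."""
    shift = num_planes - plane
    context_prob = defaultdict(float)
    context_one = defaultdict(float)
    for index, prob in enumerate(index_pmf):
        bit = (index >> shift) & 1
        context = index >> (shift + 1)
        context_prob[context] += prob
        context_one[context] += prob * bit
    return {
        c: (pc, binary_entropy(context_one[c] / pc))
        for c, pc in context_prob.items()
        if pc > 0.0
    }


def conditional_entropy(contexts: Dict[int, Tuple[float, float]]) -> float:
    """Average conditional entropy of the bitplane given the previous planes."""
    return sum(pc * h for pc, h in contexts.values())
```

Each context then corresponds to a virtual channel of capacity one minus its conditional entropy, used with the probability of that context. As a sanity check, for the pmf `[0.5, 0.25, 0.125, 0.125]` with two planes, the conditional entropies of the two planes sum to the entropy of the index, 1.75 bits, as required by the chain rule.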
Following this approach we can modify (B.34) for the
decoding of the magnitude plane. It is worth noting that when
the bitplane is considered, we assume that the sign
plane and (from to ) magnitude planes have been processed. Since
it is known that the magnitude plane model does not depend on the sign plane,
we take into account different realizations.
That is, we have We would like to underline that
the sign bitplane and the most important bitplane do not depend on any other
bitplane, and so .
Similar to (B.41), we can update (B.40) as
follows: where
Note that a genie-aided scheme was assumed for the
EXIT analysis, where there is no error-propagation between layers. In fact, it
is not possible to take into account error propagation using the EXIT chart,
since the underlying assumption is that the message exchanged at each iteration
of the BP is a true LLR, that is, an LLR computed on the basis of the correct
conditional probabilities. Decision errors, instead, would feed the decoder at
a lower stage with “false” a priori probabilities.
As was done in the finite-length case, we will use
soft reconstruction for the infinite-length case where the conditional mean
estimator of the reconstruction points will be calculated using the fact that
the source bit LLRs have the symmetric Gaussian distribution. From the EXIT
chart, we can obtain the mean of the Gaussian approximation of the
conditional pdf of any LLR in the graph. Hence, this can be used to compute the
MMSE of the soft-reconstruction estimator (16).
Acknowledgments
This research
was supported in part by the National Science Foundation under Grants
ANI-03-38807, CNS-06-25637 and NeTS-NOSS-07-22073 and in part by the USC
Annenberg Graduate Fellowship Program.