Abstract

The straightforward application of Shannon's separation principle may entail a significant suboptimality in practical systems with limited coding delay and complexity. This is particularly evident when the lossy source code is based on entropy-coded quantization. In fact, it is well known that entropy coding is not robust to residual channel errors. In this paper, a joint source-channel coding scheme is advocated that combines the advantages and simplicity of entropy-coded quantization with the robustness of linear codes. The idea is to combine entropy coding and channel coding into a single linear encoding stage. If the channel is symmetric, the scheme can asymptotically achieve the optimal rate-distortion limit. However, its advantages are more clearly evident under finite coding delay and complexity. The sequence of quantization indices is decomposed into bitplanes, and each bitplane is independently mapped onto a sequence of channel coded symbols. The coding rate of each bitplane is chosen according to the bitplane conditional entropy rate. The use of systematic raptor encoders is proposed, in order to obtain a continuum of coding rates with a single basic encoding algorithm. Simulations show that the proposed scheme can outperform the separated baseline scheme for finite coding length and comparable complexity and, as expected, it is much more robust to channel errors in the case of channel capacity mismatch.

1. Introduction

A stationary ergodic source can be transmitted over an information-stable channel with end-to-end average distortion $D$ at a bandwidth expansion factor $b$ not lower than $R(D)/C$ channel symbols per source sample, where $R(D)$ is the source rate-distortion function and $C$ is the channel capacity. The bandwidth expansion factor $b$ is defined as the number of channel symbols per source symbol: if a block of $K$ source symbols is transmitted through the channel in $N$ channel uses, then $b = N/K$. Shannon's source-channel separation principle [1] ensures that this optimal performance can be approached by independently designing the source coding and the channel coding schemes. This provides a definite architectural advantage in practical systems, where typically (lossy) source coding is implemented at the application layer, while channel coding is designed and optimized for the physical layer.

On the other hand, this separated source-channel coding (SSCC) approach may incur substantial suboptimality due to the nonideal behavior of finite-length, finite-complexity source and channel codes. In fact, source codes designed without taking into account the presence of channel decoding errors are typically very fragile, and this might impose unnecessarily restrictive constraints on the performance of channel coding. In such cases, joint source-channel coding (JSCC) may lead to a performance improvement (i.e., a better $(b, D)$ operating point) for the same level of complexity.

Most practical lossy source coding schemes for natural sources (e.g., images, audio, video) are based on the idea of transform coding [2]. Source blocks are projected onto a suitable basis by a linear transformation, such that the source is well described by only a small number of significant transform coefficients. Then, the coefficients are scalar-quantized, and finally the resulting sequence of quantization indices is entropy coded. The theoretical foundation of this approach relies on the universality of entropy-coded quantization, and dates back to the work of Ziv [3]. In general, the linear transform is adapted to the given class of sources (e.g., wavelet transforms for images [4]). The statistics of the quantization indices are not known a priori. However, the memory structure of the underlying discrete source is fixed and is typically described as a finite-memory tree source (e.g., the context structure of JPEG2000 [5, 6]). Then, data compression is obtained by using an adaptive entropy coding scheme that estimates the transition probabilities of the source statistical model. For example, arithmetic coding [7] with Krichevsky-Trofimov (KT) sequential probability estimation is a common choice [8].

For the sake of simplicity, this paper treats only independent and identically distributed (i.i.d.) sources with known statistics; that is, it deals neither with the transform coding aspect nor with the universal implementation of entropy coding. However, our results can be generalized along the lines of what is done in [5, 6]. Even in the nonuniversal case, classical lossless compression is catastrophic: a small Hamming distortion (number of bits in error) in the entropy-coded sequence is mapped into a large distortion in the reconstructed source sequence. This imposes a very strict target error probability on the channel coding stage, thus requiring both complex channel coding and operating points that may be quite far from the theoretical limits. This is even more evident in applications where the coding delay is limited, thus preventing the use of very large block lengths.

It was shown in [9] that fixed-to-fixed length data compression of a discrete source with linear codes is asymptotically optimal, in the sense that compression up to the source entropy rate can be achieved. This is strongly related to transmission using the same linear code on a discrete additive noise channel where the noise has the same statistics as the discrete source. This analogy can be exploited in order to design a JSCC scheme. We wish to maintain the simplicity of the transform coding approach while improving the robustness of the scheme. The rationale behind the proposed design is the following: since linear codes achieve the entropy rate of discrete sources and the capacity of symmetric channels, we can combine the entropy coding stage and the coding stage into a single linear encoding stage. The advantage of this approach is that the design of noncatastrophic linear encoders is very well understood. Therefore, the proposed scheme can approach the optimal separation limit for large block length, while achieving better robustness to channel errors at finite decoding delay and complexity.

In [5], this JSCC approach was applied to the transmission of JPEG2000-like encoded images (in the sense that the wavelet transform, the quantization scheme and the tree-source memory structure were borrowed from JPEG2000), by using a family of progressively punctured turbo codes to map directly the redundant quantization bits into channel symbols. As stated above, here we focus on simpler i.i.d. sources with perfectly known statistics (i.e., the nonuniversal case) and investigate in greater detail the performance analysis and the comparison with the baseline SSCC approach. In this work, we use raptor codes [10] in order to map the redundant quantization bits into channel-coded symbols.

Our scheme works as follows. A source block of length $K$ is quantized symbol by symbol. The sequence of quantization indices, represented as binary vectors, is partitioned into bitplanes. The bitplanes are separately encoded into channel symbols by a bank of binary raptor encoders. Each bitplane is encoded at a rate that depends on its conditional entropy rate given the previously encoded bitplanes. At the decoder, the bitplanes are decoded in sequence using a multistage decoder, where each stage uses a belief propagation (BP) iterative decoder that takes into account the already decoded bits from previous planes, the a priori statistics of the current bitplane, and the received channel output.

Raptor codes are a particularly useful class of rateless codes. The advantage of using a rateless code is clear: with a single basic encoding machine we can generate a continuum of coding rates. Therefore, the scheme can adapt naturally to the entropy rate of the source and to the capacity of the channel. Although we do not pursue the universal setting in this work, we notice here that the proposed architecture allows a very fine rate matching between the (unknown a priori) source entropy and the channel capacity without resorting to a library of progressively punctured codes as is done in [5].

We express the performance of a source-channel coding scheme in terms of its peak signal-to-noise ratio (PSNR), expressed in dB and defined as
$$\mathrm{PSNR} \triangleq -10\log_{10}(D). \qquad (1)$$
In particular, we will focus on a standard Gaussian i.i.d. source $S \sim \mathcal{N}(0,1)$ and on the mean-squared distortion $D = \mathbb{E}[|S - \hat S|^2]$. In this case, the distortion-rate function is $D = 2^{-2R}$. At the Shannon separation limit, that is, letting $R = bC$, we have
$$\mathrm{PSNR}_{\mathrm{Shannon}} = 20\log_{10}(2)\, bC \approx (6\,\mathrm{dB}) \times bC. \qquad (2)$$
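As a quick numerical reference, the following minimal Python sketch evaluates (2); the helper name psnr_shannon is ours, not part of any library.

```python
import numpy as np

def psnr_shannon(b, C):
    """Shannon-limit PSNR in dB for the unit-variance Gaussian source:
    D = 2**(-2*R) with R = b*C, so PSNR = 20*log10(2)*b*C (about 6.02 dB
    per unit of b*C)."""
    return 20.0 * np.log10(2.0) * b * C

# Example: the BSC with capacity C = 0.5 used throughout the paper.
for b in (2.0, 4.0, 8.0, 16.0):
    print(f"b = {b:4.1f}  ->  PSNR_Shannon = {psnr_shannon(b, 0.5):6.2f} dB")
```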

Our aim is to design a family of practical schemes that operate close to the curve $\mathrm{PSNR}_{\mathrm{Shannon}}$ versus $b$. Notice that we do not pursue here the design of embedded schemes, that is, of single coding schemes that achieve multiple $(\mathrm{PSNR}, b)$ points. Nevertheless, the bitplane layered structure of the proposed encoder and the proposed multistage decoder lend themselves quite naturally to the design of embedded JSCC schemes. We leave this aspect for future work and comment on it further in the concluding section.

The rest of this paper is organized as follows. In Section 2, we review the limits of scalar entropy-coded quantization and define the target "operational Shannon limit" of our scheme. In Section 3, we present a comprehensive analysis of the baseline SSCC scheme, which represents our term of comparison. Section 4 presents the details of the proposed scheme, its analysis, and an algorithm for progressive incremental redundancy in order to optimize the coding rates at each bitplane. Section 5 presents some additional numerical comparisons between the baseline SSCC and JSCC schemes, and Section 6 presents some concluding remarks. Raptor codes, BP decoding, EXIT chart analysis, and some ancillary results are presented in the appendices.

2. Entropy-Coded Scalar Quantization

A source sequence of length $K$, $\mathbf{s} \in \mathbb{R}^K$, is quantized by applying componentwise the scalar quantizer $\mathcal{Q}_B : \mathbb{R} \to \mathbb{F}_2^{B+1}$, where $B$ bits are used to represent the magnitude and one bit represents the sign. Let $\mathbf{u} = \mathcal{Q}_B(\mathbf{s})$ denote the sequence of quantization indices and let $u_{p,k}$, $p = 0, \dots, B$, denote the bits forming the $k$th index. The sequence $\mathbf{u}$ can be thought of as a $(B+1) \times K$ binary array, where each row is called a "bitplane." Without loss of generality, we associate the 0th row with the sign bit and the rows from 1 to $B$ with the magnitude bits, with the convention that the first bitplane is the least significant and the $B$th bitplane is the most significant.
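To make the bitplane decomposition concrete, here is a small Python sketch. The helper name and the saturation of the magnitude index at the outermost cell are our assumptions; only the sign/magnitude bit layout is taken from the text.

```python
import numpy as np

def quantize_bitplanes(s, B, delta):
    """Return the (B+1) x K binary array u: row 0 is the sign plane and
    rows 1..B are the magnitude planes (row 1 least significant)."""
    sign = (s < 0).astype(np.uint8)                            # u_{0,k}
    mag = np.minimum(np.abs(s) // delta, 2**B - 1).astype(np.int64)
    u = np.empty((B + 1, len(s)), dtype=np.uint8)
    u[0] = sign
    for p in range(1, B + 1):
        u[p] = (mag >> (p - 1)) & 1                            # p-th magnitude plane
    return u

u = quantize_bitplanes(np.random.randn(10000), B=3, delta=0.5)
```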

As anticipated in the Introduction, we fix the quantizer and compare the performance of an SSCC approach based on the concatenation of a conventional entropy coding stage with a conventional channel code, with the performance of a JSCC that merges the two operations into a single linear encoding map. Therefore, in the absence of residual channel errors, both schemes achieve the same minimum distortion due to the quantizer, denoted by $D_{\mathcal{Q}}(B)$. Letting $H_B(U)$ denote the entropy rate of $\mathbf{u}$, measured in bits per quantization index, we have that the point
$$\left( b = \frac{H_B(U)}{C},\ \mathrm{PSNR} = -10\log_{10}\left(D_{\mathcal{Q}}(B)\right) \right) \qquad (3)$$
is the best achievable point for any scheme based on the fixed quantizer $\mathcal{Q}_B$. Following [2], we refer to this point as the "operational Shannon limit" for schemes with fixed quantizers.

We consider uniform scalar quantizers where the interval size is chosen in order to minimize the mean-squared distortion of the Gaussian unit-variance i.i.d. source, for a fixed number $2^{B+1}$ of intervals. In [3], Ziv showed that a coding scheme formed by a uniform scalar quantizer followed by entropy coding yields a rate penalty of no more than 0.754 bits per sample with respect to the $R(D)$ limit. Thus, constraining the quantizer to be a uniform scalar quantizer should cost no more than $0.754/C$ channel symbols per source symbol.

In Figure 2, we compare the PSNR versus $b$ curves for the Shannon limit, the Ziv bound, and the operational Shannon limit of a family of optimized uniform scalar quantizers with $B = 1, 2, 3, 4, 5$, and 6, for channel capacity $C = 0.5$. All results in this paper make use of this family of quantizers.

3. Analysis of the Baseline SSCC Scheme

In this section, we study the performance of nonideal SSCC. First, we consider the performance degradation due to nonideal source and channel codes that operate at source coding rate $R_s = R(D) + \delta_s$ and channel coding rate $R_c = C - \delta_c$, respectively, where $\delta_s$ and $\delta_c$ are positive rate gaps. This analysis assumes no errors at the output of the channel decoder.

Then, we introduce the channel decoding error probability and obtain a distortion upper bound as a function of $\delta_c$ and $\delta_s$, closely following the analysis of [11]. This analysis is based on the random coding error exponent for channel codes, and essentially validates the error-free rate-gap analysis even for moderately large block length $K$.

Finally, we consider a very practical scheme, based on the concatenation of arithmetic entropy coding and a conventional binary raptor code. We provide a very accurate semianalytic approximation for the achievable PSNR of this scheme and show that the achieved results follow the error-free rate-gap analysis closely upon matching the parameters $\delta_c$ and $\delta_s$. We also notice that for finite block length $K$ the practical scheme suffers from an additional performance degradation, especially visible at high resolution (large PSNR). We quantify this additional degradation by comparing the finite-length and infinite-length error performance of raptor codes.

3.1. Rate-Gap Analysis

Consider a separated scheme that makes use of channel coding at rate $R_c = C - \delta_c$ and source coding at rate $R_s = R(D) + \delta_s$, where $\delta_c, \delta_s > 0$ are rate gaps, and where the residual bit-error rate (BER) at the output of the channel decoder is (essentially) zero. Using $D = 2^{-2(R_s - \delta_s)}$ and $b = R_s / R_c$, we obtain
$$\mathrm{PSNR} = (6\,\mathrm{dB}) \times \left( b\left(C - \delta_c\right) - \delta_s \right). \qquad (4)$$
We notice that the slope of the straight line characterizing PSNR versus $b$ decreases with the channel coding gap $\delta_c$, while the source coding rate gap involves only a horizontal shift. As a result, an SSCC whose channel coding stage achieves negligible BER works further and further away from the Shannon limit as PSNR increases (high resolution).

3.2. SSCC with Codes Achieving Positive Error Exponent

In order to take into account channel decoding errors, we modify slightly the approach of [11] and obtain the achievable PSNR lower bound (we omit the details since the derivation follows trivially from [11]):
$$\mathrm{PSNR} \geq (6\,\mathrm{dB}) \times \left( b\left(C - \delta_c\right) - \delta_s \right) - 10\log_{10}\left( 1 + 2^{-KbE_r(C-\delta_c) + 1 + 2b(C-\delta_c) - 2\delta_s} \right), \qquad (5)$$
where $E_r(R_c)$ denotes the random coding error exponent for a given coding ensemble over the considered transmission channel. Notice that $E_r(R_c) > 0$ for all $R_c < C$, and therefore the error exponent is positive for all rate gaps $\delta_c > 0$. For values of $K, \delta_c, \delta_s, b, C$ such that $2^{-KbE_r(C-\delta_c) + 1 + 2b(C-\delta_c) - 2\delta_s} \ll 1$, (5) essentially coincides with (4).

Figure 3 compares (4) and (5) for different values of $\delta_c$, for block length $K = 10000$ (which is the finite source block length that we will use throughout this paper), and $\delta_s = 0.3821$. This value of $\delta_s$ is chosen in order to match the rate gap attained by the quantizers (see Figure 5). In these results, we considered a binary symmetric channel (BSC) with capacity $C = 0.5$ (crossover probability $\epsilon = 0.11$). The exponent $E_r(R_c)$ for the BSC can be found, for example, in [12]. For the parameters of Figure 3, we notice that (4) and (5) do not coincide for too small $\delta_c$ (e.g., $\delta_c = 0.01$ in the figure), while they coincide for large enough $\delta_c$ (in this case, $\delta_c \geq \delta_c^* = 0.016$). For finite but large block lengths, as in this case, the threshold $\delta_c^*$ is given by the minimum value of the channel coding rate gap above which the exponent $T(C, \delta_c, \delta_s, K, b) \triangleq -KbE_r(C-\delta_c) + 1 + 2b(C-\delta_c) - 2\delta_s$ becomes negative. In Figure 4, we plot $T(C, \delta_c, \delta_s, K, b)$ and $E_r(C - \delta_c)$ versus $\delta_c$ for the parameters of Figure 3. It has been observed that, for the different values of $b$ in the range of Figure 3, the threshold $\delta_c^*$ is essentially constant; thus, in Figure 4, we use $b = 7.2036$.
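For completeness, the bound (5) and the exponent $T$ are easy to evaluate numerically. The sketch below uses Gallager's standard formula for $E_0(\rho)$ on the BSC with uniform input (rates in bits); the function names are ours.

```python
import numpy as np

def E0(rho, eps):
    """Gallager's E0(rho) for the BSC(eps) with uniform input, in bits."""
    a = 1.0 / (1.0 + rho)
    return rho - (1.0 + rho) * np.log2(eps**a + (1.0 - eps)**a)

def Er(Rc, eps):
    """Random coding exponent: max over rho in [0,1] of E0(rho) - rho*Rc."""
    rho = np.linspace(0.0, 1.0, 10001)
    return float(np.max(E0(rho, eps) - rho * Rc))

def T(C, dc, ds, K, b, eps):
    """Exponent appearing in (5); (5) collapses onto (4) once T < 0."""
    return -K * b * Er(C - dc, eps) + 1.0 + 2.0 * b * (C - dc) - 2.0 * ds

def psnr_bound(C, dc, ds, K, b, eps):
    """Right-hand side of (5), in dB."""
    return (20.0 * np.log10(2.0) * (b * (C - dc) - ds)
            - 10.0 * np.log10(1.0 + 2.0 ** T(C, dc, ds, K, b, eps)))

print(psnr_bound(C=0.5, dc=0.016, ds=0.3821, K=10000, b=7.2036, eps=0.11))
```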

3.3. SSCC with Arithmetic Coding and Raptor Codes

We provide an accurate approximate analysis of the performance of a practical SSCC scheme that can be regarded as our baseline, since its encoding and decoding complexity is very similar to that of the JSCC scheme examined in the next section. With reference to the block diagram of Figure 1, we consider the concatenation of an optimized uniform scalar quantizer using $B+1$ quantization bits with an arithmetic encoder. The resulting entropy-coded bits are then channel encoded using a raptor code of suitable rate.

A sufficiently large interleaver is placed between the entropy coding and the channel coding stages, such that the bit decoding errors at the input of the arithmetic decoder can be considered i.i.d. Since the arithmetic encoder has perfect knowledge of the probability distribution of the discrete source $\mathbf{u}$ at the quantizer output, it can approach the source entropy rate $H_B(U)$ very closely, even for moderate source block length $K$.

We approximate the performance of such a scheme by assuming that the arithmetic decoder produces random data after the first bit error at its input. Let $M = KH_B(U)$ denote the number of entropy-coded bits produced by the arithmetic encoder. These bits are channel encoded and decoded. Let $m$ denote the position of the last correctly decoded bit before the first bit error at the arithmetic decoder input. Under the assumption of i.i.d. bit errors, $m$ is a truncated geometric random variable with probability mass function
$$P(m = i) = \left(1 - P_b\right)^i P_b \qquad (6)$$
for $i = 0, 1, \dots, M-1$, and $P(m = M) = 1 - \sum_{i=0}^{M-1} (1 - P_b)^i P_b$, where $P_b$ denotes the BER at the output of the channel decoder. We approximate the number of correctly decoded quantization indices by $m / H_B(U)$ (neglecting integer effects). After the first bit error, the arithmetic decoder produces random symbols distributed as the quantization indices (i.e., according to the given discrete-source probability distribution) but essentially statistically independent of the source sequence. Therefore, the average distortion in this case is given by
$$\sigma^2 = \mathbb{E}\left[\left(S - \tilde S\right)^2\right] = \mathbb{E}\left[S^2\right] + \mathbb{E}\left[\tilde S^2\right] = 1 + \sigma^2_{\mathcal{Q}}, \qquad (7)$$
where $\tilde S$ denotes a random variable distributed as the quantizer reconstruction points, and $\sigma^2_{\mathcal{Q}}$ denotes its variance. On the other hand, before the first bit error, the system reconstructs the correct quantization points $\hat S$; therefore the average distortion in this case coincides with the quantization distortion $D_{\mathcal{Q}}(B)$.

Eventually, the total average distortion of the system is approximated by
$$D \approx P(m = M)\, D_{\mathcal{Q}}(B) + \sum_{i=0}^{M-1} \left(1 - P_b\right)^i P_b \left[ D_{\mathcal{Q}}(B)\, \frac{i / H_B(U)}{K} + \sigma^2\, \frac{K - i / H_B(U)}{K} \right]. \qquad (8)$$
The approximate analysis requires the evaluation of the residual BER at the channel decoder output. This can be obtained by simulation of the stand-alone raptor code with given finite length, or by using any suitable approximation or semianalytic technique, such as density evolution or EXIT chart methods [13–17]. In particular, we make use of the EXIT chart approximation, reviewed in Appendix B.
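The following Python sketch (argument names ours) evaluates the approximation (8) directly, given the residual BER $P_b$, the source block length, the entropy rate, and the quantizer statistics.

```python
import numpy as np

def sscc_distortion(Pb, K, H_B, D_Q, sigma2_Q):
    """Approximate SSCC distortion (8): the first bit error at the
    arithmetic decoder input occurs at a truncated-geometric position m;
    reconstruction is exact before it and random (distortion 1 + sigma2_Q,
    cf. (7)) after it."""
    M = int(round(K * H_B))              # number of entropy-coded bits
    sigma2 = 1.0 + sigma2_Q              # distortion of a random reconstruction
    i = np.arange(M)
    p_i = (1.0 - Pb) ** i * Pb           # P(m = i), (6)
    p_M = 1.0 - p_i.sum()                # P(m = M): error-free block
    frac = (i / H_B) / K                 # fraction of correctly decoded indices
    return p_M * D_Q + np.sum(p_i * (D_Q * frac + sigma2 * (1.0 - frac)))
```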

In Figure 5, we report the PSNR versus $b$ performance (obtained by using (8)) for different values of $B$, for $K = 10000$, a BSC with capacity $C = 0.5$, and with the raptor code output BER approximated via the EXIT chart method. These results implicitly assume infinite channel coding block length. In order to validate the approximate distortion analysis of (8), we ran simulations of the arithmetic decoder and quantization reconstruction, fed with the quantization bits corrupted by independent bit errors at a rate equal to the raptor code output BER. As seen in Figure 5, the results of the simulated arithmetic decoder match the approximation (8) remarkably well, thus showing that the arithmetic decoder indeed produces approximately random data after the first bit error.

The results in Figure 5 show that, for very large channel coding block length, the performance of the baseline SSCC scheme is remarkably close to the operational Shannon limit, and therefore the scheme is hard to beat by any scheme using the same set of quantizers. However, the picture changes when we consider a finite channel coding block length. In particular, we consider independent encoding of each source block, so that the system latency is dictated by the source block length $K$ and not by the channel coding block length. This corresponds to choosing the raptor code input block length equal to $M = KH_B(U)$. For the same system parameters as before, the PSNR results in this case are shown in Figure 6. We notice that the finite channel coding block length yields an additional degradation that increases with PSNR.

We can explain and quantify the increasing bandwidth expansion gap $\Delta b(B)$ shown in Figure 6 as follows. Let $R_{\mathrm{inf}}$ and $R_{\mathrm{fin}}$ denote the channel coding rates needed by the raptor code, for infinite and finite block length respectively, to reach a BER small enough that the effective distortion is virtually identical to the quantization distortion. For example, Figure 7 plots the PSNR corresponding to the distortion (8) as a function of $P_b$. We notice that for $P_b = 10^{-7}$ the quantization distortion (corresponding to the maximum achievable PSNR) is essentially reached. Then, Figure 8 plots the raptor code BER for the BSC with capacity $C = 0.5$, as a function of the reciprocal of the channel coding rate, $1/R_c$. Notice that the raptor code is rateless, and therefore we can generate as many coded symbols as we like. In order to generate Figure 8, we keep the channel parameter $\epsilon = 0.11$ fixed (corresponding to $C = 0.5$) and run encoding and decoding for smaller and smaller coding rates. The infinite block length case is obtained by using the EXIT chart approximation.

Figure 8 shows that the target BER of $10^{-7}$ is reached at certain rates $R_{\mathrm{inf}}$ and $R_{\mathrm{fin}}$ for the cases of infinite and finite block length, respectively, and allows us to find the difference $1/R_{\mathrm{fin}} - 1/R_{\mathrm{inf}}$, shown in the figure.

Finally, we can quantify the bandwidth expansion gaps shown in Figure 6 by noticing that, since $b = H_B(U)/R_c$, we have
$$\Delta b(B) = H_B(U) \left( \frac{1}{R_{\mathrm{fin}}} - \frac{1}{R_{\mathrm{inf}}} \right). \qquad (9)$$
It is clear that the gap $\Delta b(B)$ increases with the quantizer resolution $B$, and therefore with PSNR. This further confirms that, in practice, it becomes more and more difficult to approach the Shannon limit as the resolution increases.

4. Joint Source-Channel Coding Scheme

In this section we describe the encoder and decoder of the proposed JSCC scheme. Then, we discuss an incremental redundancy rate allocation procedure that allows the optimization of the scheme. We hasten to say that this rate allocation procedure is run off-line, and serves to design the coding scheme for given source and channel statistics. More generally, an adaptive scheme that allocates coded bits to the bitplanes on the fly, depending on the empirical entropy rate of the source and on the capacity of the channel may be envisaged in a universal JSCC setting, where the source statistics are not known a priori and are learned instead from the source sequence itself. However, we do not pursue this approach here.

Figure 9 shows the encoder block diagram. Each bitplane (row of the binary array $\mathbf{u}$ of quantization indices produced by the quantizer) is mapped into a sequence of coded symbols. Here we consider binary coding and a BSC. Letting $\mathbf{u}_p$ denote the $p$th row of $\mathbf{u}$, the corresponding block of coded symbols is given by $\mathbf{x}_p = \mathbf{u}_p \mathbf{G}_p$, where $\mathbf{G}_p$ is a suitable encoding matrix of size $K \times N_p$. Then, the encoded blocks $\mathbf{x}_0, \dots, \mathbf{x}_B$ are transmitted in sequence over the BSC. The resulting bandwidth expansion factor is
$$b = \frac{\sum_{p=0}^{B} N_p}{K}. \qquad (10)$$
Given the source symmetry, it is clear that the sign bit is equiprobable and has entropy $H(U_0) = 1$. Furthermore, it is independent of the magnitude bits. Hence, the target nominal rate for the encoder of the sign bit is $K/N_0 = C$. As for the $p$th magnitude bit, we allocate a nominal target rate equal to $K/N_p = C / H(U_p \mid U_{p+1}, \dots, U_B)$, where $H(U_p \mid U_{p+1}, \dots, U_B)$ denotes the conditional entropy rate of the $p$th bitplane, conditioned on the bitplanes $p+1, p+2, \dots, B$. It follows that the nominal bandwidth expansion is given by
$$b = \frac{K/C + \sum_{p=1}^{B} K H\left(U_p \mid U_{p+1}, \dots, U_B\right)/C}{K} = \frac{1 + \sum_{p=1}^{B} H\left(U_p \mid U_{p+1}, \dots, U_B\right)}{C} = \frac{H_B(U)}{C}, \qquad (11)$$
which is optimal.
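As an illustration, the nominal allocation (prior to any overhead) can be computed from the pmf of the magnitude index. The Python sketch below does this for the unit-variance Gaussian source; the step size delta is a placeholder, and absorbing the overload tail into the outermost cell is our simplifying assumption.

```python
import numpy as np
from math import erf, sqrt, log2

def magnitude_pmf(B, delta):
    """pmf of the magnitude index m = 0..2**B-1 of |S|, S ~ N(0,1), with
    uniform cells of size delta; the last cell absorbs the tail."""
    Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    cdf = np.array([2.0 * Phi(m * delta) - 1.0 for m in range(2**B + 1)])
    cdf[-1] = 1.0
    return np.diff(cdf)

def conditional_plane_entropies(B, delta):
    """H(U_p | U_{p+1},...,U_B) via the chain rule on the magnitude index:
    grouping m into bins of width 2**(p-1) marginalizes the bits below p."""
    pmf = magnitude_pmf(B, delta)
    H, prev = {}, 0.0
    for p in range(B, 0, -1):
        g = pmf.reshape(-1, 2 ** (p - 1)).sum(axis=1)   # pmf of (U_p,...,U_B)
        Hjoint = -sum(q * log2(q) for q in g if q > 0)
        H[p], prev = Hjoint - prev, Hjoint
    return H

B, delta, K, C = 3, 0.586, 10000, 0.5    # delta is a placeholder step size
H = conditional_plane_entropies(B, delta)
N = {0: K / C}                           # sign plane: H(U_0) = 1
N.update({p: K * H[p] / C for p in range(1, B + 1)})   # nominal N_p, cf. (19)
```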

In order to be able to decode at these rates, we consider a multistage decoder, as shown in Figure 10, that processes the bitplanes in sequence. The sign bit is independently decoded. The magnitude bits are decoded in sequence, starting from the $B$th plane. At each decoding stage $p$, the hard decisions of the already decoded planes are used by the BP decoder to compute the conditional a priori probabilities of the $p$th bitplane, as explained in Appendix A. Assuming that at each level $p$ the previous levels are correctly decoded, the rates $K/N_p = C / H(U_p \mid U_{p+1}, \dots, U_B)$ are achievable.

In practice, since raptor codes do not achieve a sufficiently low BER when their rate is too close to the nominal rate limit, we must allocate the rates allowing for some gap. The rate allocation problem is made more complicated by the fact that, in the multistage decoder, the decoding of the different planes is not independent. In particular, if the $p$th plane fails with many bits in error, then it is very likely that all the planes $p-1, p-2, \dots, 1$ will also fail, since their decoders are fed with incorrect a priori conditional probabilities. We will address the problem of rate allocation for the multistage decoder at the end of this section.

Next, let us examine in more detail how encoding of the $p$th plane is implemented with raptor codes [10]. Raptor codes can substantially be viewed as an extension of Luby Transform codes (LT codes) [18], since they are based on the concatenation of an outer linear code (in our case, a low-density parity-check (LDPC) code) with an inner LT code (see Appendix A for details). We use raptor codes in systematic form. In particular, let $\mathbf{S}_p$ be a $K \times K$ full-rank binary matrix given by $\mathbf{S}_p = \mathbf{G}_{\mathrm{ldpc}} \mathbf{A}_p$, where $\mathbf{A}_p$ is a submatrix of the LT code generator matrix at encoding level $p$ and $\mathbf{G}_{\mathrm{ldpc}}$ is the generator matrix of the LDPC code (see [10] for details). The encoder produces a vector of $K$ intermediate symbols, denoted by $\mathbf{u}'_p = \mathbf{u}_p \mathbf{S}_p^{-1}$. Then, the intermediate symbols are expanded by high-rate LDPC encoding into $\mathbf{u}''_p = \mathbf{u}'_p \mathbf{G}_{\mathrm{ldpc}}$. Finally, the encoded symbols $\mathbf{x}_p$ are obtained from $\mathbf{u}''_p$ by applying nonsystematic rateless encoding, that is, the symbols $x_1, x_2, \dots, x_{N_p}$ are produced in sequence, and each $x_i$ is given as the sum of elements of $\mathbf{u}''_p$ selected at random according to the LT degree distribution $\Omega$.
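The encoding chain $\mathbf{u}'_p = \mathbf{u}_p \mathbf{S}_p^{-1}$, $\mathbf{u}''_p = \mathbf{u}'_p \mathbf{G}_{\mathrm{ldpc}}$, followed by LT encoding, is plain GF(2) linear algebra. The sketch below illustrates it; gf2_inv and encode_plane are our illustrative helpers, not part of any raptor coding library.

```python
import numpy as np

def gf2_inv(S):
    """Invert a full-rank binary matrix over GF(2) by Gauss-Jordan."""
    n = S.shape[0]
    A = np.concatenate([S % 2, np.eye(n, dtype=S.dtype)], axis=1)
    for col in range(n):
        piv = col + int(np.argmax(A[col:, col]))   # first row with a 1 here
        if A[piv, col] == 0:
            raise ValueError("matrix is singular over GF(2)")
        A[[col, piv]] = A[[piv, col]]
        rows = np.nonzero(A[:, col])[0]
        A[rows[rows != col]] ^= A[col]             # eliminate the column
    return A[:, n:]

def encode_plane(u_p, S_p, G_ldpc, lt_columns):
    """Systematic raptor encoding of one bitplane: intermediate symbols,
    LDPC (precode) expansion, then one XOR per random LT column."""
    u_prime = (u_p @ gf2_inv(S_p)) % 2
    u_pp = (u_prime @ G_ldpc) % 2
    return np.array([(u_pp @ c) % 2 for c in lt_columns], dtype=np.uint8)
```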

Notice that $\mathbf{u}_p = \mathbf{u}'_p \mathbf{S}_p$. Therefore, in the Tanner graph representing the code [19, 20], the nodes corresponding to the source symbols $\mathbf{u}_p$ have a degree distribution identical to that of a standard nonsystematic raptor code. Furthermore, although $\mathbf{S}_p$ is sparse, its inverse is sufficiently dense that the symbols $\mathbf{u}'_p$ are close to uniform i.i.d. Notice that this is essential to the scheme: in order to drive the channel with the correct input distribution, we need to send the nonsystematic symbols $\mathbf{x}_p$ through the channel, and their distribution should be as close as possible to i.i.d. and equiprobable.

A key component of the systematic raptor code design consists of finding a suitable nonsingular $K \times K$ matrix $\mathbf{S}_p$, with a given column weight distribution, such that its inverse looks as much as possible like a random binary matrix. As for the LDPC code (often referred to as the "precode" in the raptor coding literature), we used a regular code with parameters (2,100).

Let us focus now on decoding and source reconstruction. The multistage decoder of Figure 10 is based on BP at each stage $p$ in order to approximately compute the symbol-by-symbol posterior marginal log-likelihood ratios (LLRs) $\{\lambda_{p,k} : k = 1, \dots, K\}$, defined as
$$\lambda_{p,k} = \log \frac{P\left(u_{p,k} = 0 \mid \mathbf{y}_p, \mathbf{u}_{p+1}, \dots, \mathbf{u}_B\right)}{P\left(u_{p,k} = 1 \mid \mathbf{y}_p, \mathbf{u}_{p+1}, \dots, \mathbf{u}_B\right)}, \qquad (12)$$
where $\mathbf{y}_p$ denotes the channel output corresponding to the input $\mathbf{x}_p$, and the conditioning is with respect to the already decoded bitplanes. This is obtained by feeding the hard decisions from the planes $p+1, \dots, B$ to the BP decoder at level $p$. An iterative version of the multistage decoder, where soft messages in the form of a posteriori LLRs are exchanged instead of hard decisions, was also considered; however, it was observed that this does not provide any significant improvement, and it was not pursued further, given its much greater complexity.

The information about the already decoded bitplanes is incorporated into the BP decoder for bitplane $p$ in the following way. As explained in Appendix A, the BP algorithm is initialized with input messages at all the source and coded nodes in the Tanner graph of the code. The coded nodes (corresponding to the coded symbols $\mathbf{x}_p$) receive their input messages from the corresponding channel observations. In the case of a BSC, this is given by
$$\mu_{p,i} = (-1)^{y_{p,i}} \log \frac{1 - \epsilon}{\epsilon}, \qquad i = 1, \dots, N_p. \qquad (13)$$
The source nodes (corresponding to the source bits $\mathbf{u}_p$) are associated with the input messages
$$\nu_{p,k} = \log \frac{P\left(u_{p,k} = 0 \mid \hat u_{p+1,k}, \dots, \hat u_{B,k}\right)}{P\left(u_{p,k} = 1 \mid \hat u_{p+1,k}, \dots, \hat u_{B,k}\right)}, \qquad (14)$$
where $\hat u_{p+1,k}, \dots, \hat u_{B,k}$ are the hard decisions obtained from the previous stages.
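A minimal sketch of these initializations is given below: channel_llrs implements (13) and prior_llrs implements (14). The lookup table cond_p0, mapping each context of already decoded bits to $P(u_{p,k} = 0 \mid \text{context})$, is a hypothetical ingredient that would be precomputed from the known source statistics.

```python
import numpy as np

def channel_llrs(y_p, eps):
    """Input messages (13) at the coded nodes of plane p, for a BSC(eps)."""
    return (-1.0) ** y_p * np.log((1.0 - eps) / eps)

def prior_llrs(u_hat_above, cond_p0):
    """Input messages (14) at the source nodes. u_hat_above is the
    (B-p) x K array of hard decisions from planes p+1..B; cond_p0 maps
    each context tuple to P(u_p = 0 | context). Degenerate contexts
    (probability 0 or 1) would need clipping in practice."""
    p0 = np.array([cond_p0[tuple(int(b) for b in col)]
                   for col in u_hat_above.T])
    return np.log(p0 / (1.0 - p0))
```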

The BP decoder at each stage runs for a given desired number of iterations, and eventually outputs both hard decisions, to be passed to the next stage, and soft outputs in the form of the posterior LLRs given by (12). Once all bitplanes have been decoded, the source is reconstructed as follows. Consider, without loss of generality, the inverse quantization mapping
$$\mathcal{Q}_B^{-1}\left(u_k\right) = (-1)^{u_{0,k}}\, \frac{\Delta_B}{2} \left( \left[ \sum_{p=1}^{B} u_{p,k} 2^p \right] + 1 \right) \qquad (15)$$
that yields the midpoint of each quantization interval given the set of quantization bits.

Then, we can either consider hard reconstruction, which consists of using the hard decisions $\hat u_{p,k}$ in (15), or soft reconstruction, which makes use of the (approximate) posterior LLRs in order to compute the minimum mean-square error (MMSE) estimate of the source samples given the channel output, that is, the conditional mean estimator $\hat s_k = \mathbb{E}[s_k \mid \mathbf{y}]$. Treating the decoder estimated posterior LLRs as if they were the true posterior LLRs, we obtain
$$\hat s_k = \frac{\Delta_B}{2} \tanh\left(\frac{\lambda_{0,k}}{2}\right) \left( \left[ \sum_{p=1}^{B} \frac{2^p}{1 + e^{\lambda_{p,k}}} \right] + 1 \right). \qquad (16)$$
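Both reconstruction rules are direct to implement; the sketch below realizes (15) for hard decisions and (16) for posterior LLRs (function names ours).

```python
import numpy as np

def hard_reconstruction(u_hat, delta_B):
    """Mid-point inverse quantizer (15); u_hat is the (B+1) x K array of
    hard bit decisions (row 0 = sign plane)."""
    B = u_hat.shape[0] - 1
    mag = sum(u_hat[p].astype(float) * 2.0**p for p in range(1, B + 1))
    return (-1.0) ** u_hat[0] * (delta_B / 2.0) * (mag + 1.0)

def soft_reconstruction(llr, delta_B):
    """MMSE-style reconstruction (16), treating the BP output LLRs in the
    (B+1) x K array llr as the true posteriors."""
    B = llr.shape[0] - 1
    mag = sum(2.0**p / (1.0 + np.exp(llr[p])) for p in range(1, B + 1))
    return (delta_B / 2.0) * np.tanh(llr[0] / 2.0) * (mag + 1.0)
```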

In Appendix A, we prove an interesting isomorphism between the BP decoder of the joint source-channel problem as described above and a related standard channel coding problem. Let us focus on a single binary independent source sequence $\mathbf{u}$ of length $K$, with probabilities $p_k \triangleq P(u_k = 1)$ for $k = 1, \dots, K$. This is encoded into a binary codeword $\mathbf{x} = \mathbf{u}\mathbf{G}$ of length $N$, where $\mathbf{G}$ is a $K \times N$ raptor encoding matrix as previously described. Let us transmit $\mathbf{x}$ through a BSC with crossover probability $\epsilon$, and let $\mathbf{y} = \mathbf{x} \oplus \mathbf{e}$ denote the corresponding output. The result holds for any binary-input symmetric-output channel, but here we focus on the BSC for simplicity of exposition. Then, the BP decoder for this problem is isomorphic to a decoder for the following related channel coding problem: consider transmission of the all-zero codeword from the systematic code with generator matrix $[\mathbf{I} \mid \mathbf{G}]$, of size $K \times (K+N)$, over a channel that for the first $K$ components operates as
$$y_k = x_k \oplus u_k, \qquad k = 1, \dots, K, \qquad (17)$$
where $u_k$ is the $k$th source symbol, and for the remaining $N$ components operates as
$$y_{K+n} = x_{K+n} \oplus e_n, \qquad n = 1, \dots, N. \qquad (18)$$
In other words, there exists a one-to-one mapping between the messages of the BP decoder for the first problem (joint source-channel) and the messages of the BP decoder for the second problem (channel only), for every edge of the decoder graph and every decoder iteration.

This means that the source-channel BP decoding can be analyzed, for example, by using the EXIT chart method, by considering the associated โ€œvirtualโ€ channel, where the all-zero codeword from the associated systematic code is transmitted partly on a binary additive noise channel with noise realization identical to the source realization of the source-channel problem, and partly on the same BSC (with the same noise realization) of the source-channel problem. We use this BP isomorphism result in order to derive a simple EXIT chart analysis of the BP decoder at each stage of the multistage decoder, under the assumption that the hard decisions from previous stages are correct.

4.1. Rate Allocation Algorithm

The rate allocation of each bitplane encoder is established offline by using the greedy algorithm described below. Again, we notice that we do not consider adaptive rate allocation: given the source and channel statistics, we run the greedy allocation algorithm in order to design the JSCC coding scheme.

Allocating the number of coded symbols according to the optimal limits, that is,
$$N_p = \frac{H\left(U_p \mid U_{p+1}, \dots, U_B\right) K}{C}, \qquad (19)$$
yields very poor performance even at very large block length, since it is known that raptor codes converge to very small BER only at a fixed (small) gap from capacity on general binary-input symmetric-output channels [14]. Therefore, we have to allow for some increment in the coded block lengths, normally referred to as "overhead" in the raptor coding literature. The problem is how to allocate a total overhead among the $B+1$ stages. To this end, we propose the following greedy overhead allocation algorithm.

We initialize the lengths $N_p^{(0)}$ according to their nominal values given by (19). At each iteration of the allocation algorithm, we allocate a given number $\Delta N$ of additional coded symbols to one of the $B+1$ codes. Let $D(N_0, \dots, N_B)$ denote the achieved average distortion of the JSCC scheme when coding lengths $N_0, \dots, N_B$ are used, and let $D^{(0)} = D(N_0^{(0)}, \dots, N_B^{(0)})$. Then, for iterations $i = 1, 2, \dots$, do the following.

(i) For all $p = 0, \dots, B$, compute
$$D_p^{(i)} = D\left(N_0^{(i-1)}, \dots, N_p^{(i-1)} + \Delta N, \dots, N_B^{(i-1)}\right). \qquad (20)$$
(ii) Find $\hat p = \arg\min_{p = 0, \dots, B} D_p^{(i)}$.
(iii) Let $N_p^{(i)} \leftarrow N_p^{(i-1)}$ for all $p \neq \hat p$, and $N_{\hat p}^{(i)} \leftarrow N_{\hat p}^{(i-1)} + \Delta N$.
(iv) If $|D_{\hat p}^{(i)} - D_{\mathcal{Q}}(B)| \leq \delta$, exit. Otherwise, let $D^{(i)} \leftarrow D_{\hat p}^{(i)}$ and go back to step (i). The quantity $\delta > 0$ is the tolerance within which we wish to achieve the target quantization distortion.

In essence, the above algorithm allocates at each iteration a packet of $\Delta N$ coded bits to the bitplane raptor encoder that yields the largest decrease in the overall average distortion. The distortion can be computed either by Monte Carlo simulation, or by using the EXIT chart approximation. The latter method is much faster, but cannot take into account the effect of finite block length and the error propagation between the stages of the multistage decoder.
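A compact rendition of the greedy loop is given below; the callable distortion stands for whichever evaluator is used (EXIT chart approximation or Monte Carlo simulation), and all names are illustrative.

```python
def greedy_allocation(N0, distortion, D_Q, dN, tol):
    """Greedy overhead allocation of Section 4.1: repeatedly grant dN extra
    coded symbols to the bitplane whose increment yields the smallest
    average distortion, until the quantization distortion D_Q is reached
    within tolerance tol. distortion(N) evaluates D(N_0,...,N_B)."""
    N = list(N0)
    while True:
        trials = [distortion(N[:p] + [N[p] + dN] + N[p+1:])
                  for p in range(len(N))]
        p_hat = min(range(len(N)), key=trials.__getitem__)
        N[p_hat] += dN
        if abs(trials[p_hat] - D_Q) <= tol:
            return N
```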

In Figure 11, we report the comparison between the finite-length simulation and the infinite-length EXIT approximation for the same setting used throughout the paper: $B$ ranging from 1 to 6, the BSC with capacity $C = 0.5$, and source block length $K = 10000$. As we can see, the two cases yield almost identical results. This allows us to use the infinite-length EXIT approximation to estimate (with very good approximation) a suitable rate allocation among the $B+1$ stages for the finite-length case. Finally, in Figure 12, for the case of $B = 3$, we report the relative overhead $N_p^{(i)} / N_p^{(0)}$ versus $\mathrm{PSNR}^{(i)} = -10\log_{10} D^{(i)}$ produced by the greedy allocation algorithm. As one might expect, the greedy algorithm starts by increasing the overhead of the sign bitplane and then continues from the most significant to the least significant magnitude bitplanes. Eventually, each bitplane is allocated a coding length between 12% and 18% larger than the nominal length (19), in line with reported results for standard raptor coding. Furthermore, notice that this scheme tends to give larger overheads to the most significant bitplanes, that is, it implicitly implements unequal error protection across the layers, which is a very well-known design approach for multilevel coded modulation with multistage decoding [21].

5. Numerical Results

In this section, we provide both finite-length and infinite-length results. We considered source block length $K = 10000$ for the finite-length results. In all the numerical results of this paper, we considered raptor codes with the LT degree distribution of [14],
$$\Omega(x) = 0.008x + 0.494x^2 + 0.166x^3 + 0.073x^4 + 0.083x^5 + 0.056x^8 + 0.037x^9 + 0.056x^{19} + 0.025x^{65} + 0.003x^{66}. \qquad (21)$$
As outer code we used a regular high-rate LDPC code with degrees (2,100) and rate 0.98. The source symbols are estimated after running 100 iterations of the decoding algorithm.

We would like to stress that the LT and LDPC degree distribution polynomials have been chosen without any optimization, and that we have averaged over the ensemble of randomly generated raptor codes with the given parameters. In practice, one would carefully design an LDPC graph with good properties for the desired length $K$ and degree distributions.

This section is subdivided into two parts. In the first part, we describe the results obtained by varying the bandwidth expansion factor, when the capacity of the BSC is fixed to $C = 0.5$, corresponding to crossover probability $\epsilon = 0.11$. The aim of this part is to compare the performance of families of SSCC and JSCC codes for different values of $b$, and to see how they approach the operational Shannon limit.

In the second part we examine the behavior of a single fixed code, designed for a nominal channel crossover probability and target PSNR, when we vary the channel crossover probability. This set of results illustrates the robustness of a given coding scheme to nonideal channel conditions.

In both subsections we provide results for infinite and finite codeword length cases. The infinite case results have been generated by using the EXIT chart approximation of Appendix B.

5.1. Approaching the Operational Shannon Limit

In Figure 13, we plot the performance comparison between the proposed JSCC scheme and the SSCC scheme when infinite codeword length is considered. In this case, the SSCC scheme outperforms the proposed scheme, in the sense that it reaches the quantization distortion at slightly lower values of $b$, for all $B = 1, \dots, 6$. The SSCC schemes show a very sharp transition ("all or nothing" behavior). In contrast, the JSCC schemes reach their quantization PSNR more gradually: as we increase the overhead, the performance gradually improves.

The situation radically changes when we consider finite codeword length. In Figure 14, we plot the performance of the JSCC and SSCC schemes for finite block length. In this case, the JSCC schemes outperform their SSCC counterparts. In particular, as we have already remarked, the JSCC performance is almost identical to that for infinite block length, while the SSCC suffers much more evidently from the residual BER of finite-length practical codes. This also hints that the EXIT approximated analysis yields very faithful results for the JSCC scheme, while it provides optimistic results for the SSCC scheme. This can be explained by the fact that the BER performance of infinite-length codes exhibits a very sharp "waterfall" threshold, beyond which the BER is zero, while for finite length the waterfall is smoother.

An important advantage of the JSCC scheme is that the PSNR value increases gradually as $b$ increases, while a sharp threshold effect can be seen in the case of SSCC. In [5] it was shown that, with natural sources such as images, PSNR values lower than the peak value were still perceptually acceptable for the JSCC scheme, while the SSCC scheme also degrades abruptly from the perceptual viewpoint.

5.2. Robustness

In the previous set of results, we fixed the channel capacity and the (quantized) source entropy rate, and we examined families of codes operating at different $(b, \mathrm{PSNR})$ points. Now, we take a complementary view and fix the channel code while letting the channel capacity vary. This setting is relevant when a given code, designed for some nominal channel conditions, is used on a channel of variable quality, and therefore we are interested in the robustness of the performance with respect to the channel parameters. Also, this setting is more akin to the standard way of studying the performance of channel coding, where the BER is plotted as a function of the channel parameters ($\epsilon$ in the case of a BSC) for a given channel code.

In order to have a fair comparison between the two schemes, the bandwidth expansion factor (i.e., the code used) has been fixed in the following way: we keep the minimum value of $b$ such that both schemes reach the quantization PSNR in the previous set of results (see Figure 14). In particular, we keep $b = 4.3565$ and $b = 14.8079$ for $B = 1$ and $B = 6$, respectively. Since the JSCC scheme needs lower values of $b$ to reach the quantization PSNR in both cases, we add some extra bits so that it operates at the same values of $b$.

We have examined the two extreme cases of low resolution ($B = 1$) and high resolution ($B = 6$). In Figures 15 and 16, we notice that in both cases the JSCC scheme outperforms the SSCC scheme in terms of PSNR. Moreover, as expected, the PSNR of the SSCC scheme degrades sharply, while the PSNR of the JSCC scheme degrades gradually as the channel crossover probability increases. For example, considering $B = 6$, if $\epsilon$ increases from its nominal value 0.11 to 0.115, the JSCC scheme loses about 6 dB in PSNR, while the SSCC scheme loses 24 dB. We interpret this sharp degradation as an effect of the catastrophic behavior of the entropy coding stage in SSCC, which is greatly mitigated by the linear coding stage in the proposed JSCC scheme.

6. Conclusions

Unlike most JSCC schemes presented in the literature, which are carefully targeted to specific source and channel pairs, the scheme proposed here can closely approach the rate-distortion separation limit for virtually any well-behaved source under quadratic distortion and any symmetric channel, owing to the universality of entropy-coded quantization and the optimality of linear codes for both data compression and channel coding. Furthermore, we have demonstrated that, besides operating close to the optimum, the proposed scheme is more robust than a separated approach, especially in the practical case of finite block length.

We wish to conclude this paper with some considerations for future work. Following [5], the JSCC scheme can be applied to any class of sources for which efficient transform coding has been designed. In particular, images, audio and video are natural and relevant candidates. The scheme takes advantage of the know-how and careful source statistical characterization developed in designing lossy coding standards, and preserves the structure of the transform coder. This makes it easy to introduce the JSCC scheme into practical applications, for example, by introducing a trans-coding stage at the physical layer, while preserving the network architecture and the source coding standards developed at the application layer.

Although we have not pursued this aspect here, the bitplane layered encoding and multistage successive decoding architectures of the proposed scheme lend themselves quite naturally to a multiresolution, or "embedded," implementation. For example, it is sufficient to use an embedded scalar quantizer in order to obtain such a scheme: bitplanes are transmitted in sequence, and the resolution of the reconstructed source improves with each additional layer received.

A different route for future investigation involves the use of nonbinary linear codes. For the proposed JSCC scheme too, the gap from the Shannon limit increases with the PSNR (resolution). This is due to the fact that each layer must be encoded with a fixed overhead, so that the overall overhead increases with the number of layers. As an alternative, we may wish to use a nonbinary raptor code operating over symbols of $B+1$ bits, mapping the quantization indices directly onto the channel symbols. The hope is that the overhead of such nonbinary codes does not depend on $B$ (or at least depends on it sublinearly). This may lead to better bandwidth expansion gaps at high resolution.

Appendices

A. Raptor Codes and BP Decoding

Raptor codes [10] are a class of rateless codes designed for transmission over erasure channels of unknown capacity. They are an extension of Luby Transform codes (LT codes) [18], since they are based on the concatenation of an outer linear code (precode) with an inner LT code. To be consistent with raptor code terminology, we define the input symbols as the symbols generated from the source symbols by the linear precode encoder, and the output symbols as the symbols generated from the input symbols by the LT encoder.

Formally, a raptor code is defined by the triplet $(K, \mathcal{C}, \Omega(x))$, where $K$ is the source block length, $\mathcal{C}$ is a linear encoder $\mathcal{C} : \mathbb{F}_2^K \to \mathbb{F}_2^n$, and $\Omega(x) = \sum_{j=1}^{n} \Omega_j x^j$ is the generating function of the probability distribution $\Omega_1, \dots, \Omega_n$ on $\{1, \dots, n\}$ that generates the LT codewords.

The $(n, \Omega(x))$ LT code ensemble corresponds to the ensemble of $n \times N$ binary matrices, for all $N = 1, 2, \dots$, with columns randomly generated according to $\Omega(x)$, where each matrix yields an encoding mapping.

The operations to generate a generic column of an LT encoding matrix can be summarized in two steps:

(1) Sample the distribution $\{\Omega_1, \dots, \Omega_n\}$ to obtain a weight $w$ between 1 and $n$.
(2) Generate the column $(v_1, \dots, v_n)$ uniformly at random from all $\binom{n}{w}$ binary vectors of weight $w$ and length $n$.
As shown in [14], it is possible to adapt raptor codes for transmission over memoryless symmetric channels. The decoding is performed by using the classical belief propagation algorithm (see [14] for details).
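These two steps can be coded directly. The sketch below generates one column, using the degree distribution (21) of Section 5 as an example; the probabilities are renormalized because the published coefficients are rounded.

```python
import numpy as np

def lt_column(n, omega, rng):
    """One random column of an LT encoding matrix: sample a degree w from
    Omega, then a uniformly random weight-w binary vector of length n."""
    degrees = sorted(omega)
    p = np.array([omega[d] for d in degrees], dtype=float)
    p /= p.sum()                                   # renormalize rounded values
    w = rng.choice(degrees, p=p)
    col = np.zeros(n, dtype=np.uint8)
    col[rng.choice(n, size=w, replace=False)] = 1
    return col

omega = {1: 0.008, 2: 0.494, 3: 0.166, 4: 0.073, 5: 0.083,
         8: 0.056, 9: 0.037, 19: 0.056, 65: 0.025, 66: 0.003}
col = lt_column(n=10200, omega=omega, rng=np.random.default_rng(0))
```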

In this paper, we use a high-rate LDPC code as the precode; the $n$ input nodes can then also be seen as the $n$ bit nodes of the LDPC code.

A.1. BP Decoder Isomorphism

As anticipated in Section 4, there is an interesting isomorphism between the standard channel coding problem in which an all-zero codeword is transmitted (we refer to this as Scheme A) and the joint source-channel coding problem as defined at each stage of the multistage decoder (we refer to this as Scheme B).

Consider the following unified scheme. Let the vector $[\mathbf{w}\ \mathbf{z}]$ be the output block when a vector $\mathbf{w}$ of length $K$ is channel coded with a systematic raptor code, where $\mathbf{z}$ has length $N$ (i.e., the raptor code rate is equal to $K/(K+N)$). Let us assume that the output block is transmitted over a hybrid channel such that the first $K$ output symbols are distorted by the noise vector $\mathbf{u}$, where $p_k = P(u_k = 1)$ for $k = 1, \dots, K$, and the remaining $N$ output symbols are distorted by the BSC noise vector $\mathbf{e}$, where $P(e_k = 1) = \epsilon$, $k = 1, \dots, N$. Then, the hybrid channel is characterized by many BSCs with crossover probabilities $p_1, \dots, p_K$ and $\epsilon$. The channel observation block is then given by
$$\mathbf{y} = \left[ \mathbf{w} \oplus \mathbf{u} \quad \mathbf{z} \oplus \mathbf{e} \right]. \qquad (\mathrm{A.1})$$
Notice that when $\mathbf{w} = \mathbf{0}$, then $\mathbf{z} = \mathbf{0}$ and $\mathbf{y} = [\mathbf{u}\ \mathbf{e}]$; in this case the unified scheme becomes Scheme A. On the other hand, when $\mathbf{w} = \mathbf{u}$, then $\mathbf{y} = [\mathbf{0}\ (\mathbf{z} \oplus \mathbf{e})]$, and the unified scheme becomes Scheme B. Let us consider the $l$th iteration of the BP decoder. We use the following notation (see Figure 17):

(i) $m^{(l)}_{v,o}$ and $m^{(l)}_{o,v}$ are the messages passed from the $v$th input node to the $o$th output node and from the $o$th output node to the $v$th input node, respectively, of the LT decoder;
(ii) $m^{(l)}_{v,c}$ and $m^{(l)}_{c,v}$ are the messages passed from the $v$th input node (the so-called variable node in classical LDPC notation) to the $c$th check node and from the $c$th check node to the $v$th input node, respectively, of the LDPC decoder;
(iii) $\delta^{(l),v}_{\mathrm{ldpc}}$ is the message generated from the $v$th LDPC input node and passed to the corresponding input node of the LT decoder;
(iv) $\delta^{(l),v}_{\mathrm{lt}}$ is the message generated from the $v$th LT input node and passed to the corresponding input node of the LDPC decoder; and
(v) $Z_o$ is the LLR of the $o$th output symbol received from the noisy channel; notice that $Z_o = (-1)^{u_o \oplus w_o} \log((1-p_o)/p_o)$ for $o = 1, \dots, K$, while $Z_o = (-1)^{e_o \oplus z_o} \log((1-\epsilon)/\epsilon)$ for $o = K+1, \dots, K+N$.

Using the notation above, we can define the updating rules for the LT and the LDPC decoders separately.

For the LT decoder, at the $l$th iteration, we have
$$\tanh\left(\frac{m^{(l)}_{o,v}}{2}\right) = \tanh\left(\frac{(-1)^{u_o \oplus w_o} \log\left((1-p_o)/p_o\right)}{2}\right) \cdot \prod_{v' \neq v} \tanh\left(\frac{m^{(l)}_{v',o}}{2}\right), \qquad o = 1, \dots, K,$$
$$\tanh\left(\frac{m^{(l)}_{o,v}}{2}\right) = \tanh\left(\frac{(-1)^{e_o \oplus z_o} \log\left((1-\epsilon)/\epsilon\right)}{2}\right) \cdot \prod_{v' \neq v} \tanh\left(\frac{m^{(l)}_{v',o}}{2}\right), \qquad o = K+1, \dots, K+N,$$
$$m^{(l+1)}_{v,o} = \delta^{(l),v}_{\mathrm{ldpc}} + \sum_{o' \neq o} m^{(l)}_{o',v}, \qquad v = 1, \dots, n, \qquad (\mathrm{A.2})$$
where the product is taken over all input nodes adjacent to $o$ other than $v$, and the summation is taken over all output nodes adjacent to $v$ other than $o$. For $l = 0$, we set $m^{(0)}_{v,o} = 0$ for $v = 1, \dots, n$.

For the LDPC decoder, at the $l$th iteration, we have
$$m^{(l)}_{v,c} = \begin{cases} 0 & \text{if } l = 0, \\ \delta^{(l),v}_{\mathrm{lt}} + \sum_{c' \neq c} m^{(l-1)}_{c',v} & \text{if } l \neq 0, \end{cases} \qquad v = 1, \dots, n, \qquad (\mathrm{A.3})$$
$$\tanh\left(\frac{m^{(l)}_{c,v}}{2}\right) = \prod_{v' \neq v} \tanh\left(\frac{m^{(l)}_{v',c}}{2}\right), \qquad c = 1, \dots, n-K. \qquad (\mathrm{A.4})$$
The messages $\delta^{(l),v}_{\mathrm{lt}}$ and $\delta^{(l),v}_{\mathrm{ldpc}}$, passed from the LT to the LDPC decoder and from the LDPC to the LT decoder, respectively, are defined by
$$\delta^{(l),v}_{\mathrm{lt}} = \sum_{o} m^{(l)}_{o,v}, \qquad v = 1, \dots, n, \qquad (\mathrm{A.5})$$
$$\delta^{(l),v}_{\mathrm{ldpc}} = \sum_{c} m^{(l)}_{c,v}, \qquad v = 1, \dots, n, \qquad (\mathrm{A.6})$$
where the summations are taken over all output nodes adjacent to $v$ and over all check nodes adjacent to $v$, respectively.

The overall factor graph (FG) of the proposed decoding algorithm is displayed in Figure 17 for the JSCC case $\mathbf{w} = \mathbf{u}$. We use Wiberg's notation (see [20]); that is, the FG is a bipartite graph with variable nodes (circles) and function nodes (boxes). A variable node is connected to a function node if the corresponding variable is an argument of the corresponding factor [20]. In our case, the variable nodes correspond to the input symbols of the LT code and to the input symbols of the LDPC code. The function nodes correspond to the output symbols of the LT code and to the check nodes of the LDPC code. To explicitly represent the messages passed between the two decoders at each stage, we split the graph into two parts connected to each other by "equality constraints." Finally, to distinguish between channel outputs received from the equivalent channel and channel outputs received from the noiseless channel, we explicitly represent the source symbols $\mathbf{u} = (u_1, \dots, u_K)$ and the output $\mathbf{y} = (y_{K+1}, \dots, y_{K+N})$ of the noisy channel with input $\mathbf{z}$. Let us also denote the input block by $\mathbf{i}$.

As we can see from the updating rules described above and from the factor graph, the decoder can be modeled as two independent factor graphs that exchange information with each other after each iteration.

Theorem 1. The magnitude of the BP messages exchanged between input and output symbols on the same Tanner graph is the same for both Schemes A and B. In particular, at BP round $l$, the relationship between the messages passed in Schemes A and B is
$${}^{B}m^{(l)}_{v,o} = (-1)^{i_v}\, {}^{A}m^{(l)}_{v,o}, \qquad {}^{B}m^{(l+1)}_{o,v} = (-1)^{i_v}\, {}^{A}m^{(l+1)}_{o,v}, \qquad (\mathrm{A.7})$$
where ${}^{A}m$ denotes messages for Scheme A and ${}^{B}m$ denotes messages for Scheme B.

Belief propagation equations (A.2)–(A.4) can also be written in an explicit form by using a map $\gamma$ from the real numbers $(-\infty, \infty)$ to $\mathbb{F}_2 \times [0, \infty)$, defined by $\gamma(x) \triangleq (\mathrm{sgn}(x), -\ln \tanh(|x|/2))$. Clearly, $\gamma$ is bijective and there exists an inverse $\gamma^{-1}$. Moreover, $\gamma(xy) = \gamma(x) + \gamma(y)$, where addition is componentwise in $\mathbb{F}_2$ and in $[0, \infty)$. Another important property is the following:
$$\gamma^{-1}\left( \sum_i \gamma\left( (-1)^{b_i} B_i \right) \right) = \prod_i (-1)^{b_i}\, \gamma^{-1}\left( \sum_i \gamma\left( B_i \right) \right). \qquad (\mathrm{A.8})$$
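The map $\gamma$ and the property (A.8) are easy to verify numerically. The following sketch (a sanity check of ours, not part of the decoder) confirms that $\gamma^{-1}$ of a sum of $\gamma$-images reproduces the usual tanh product rule of a check node.

```python
import numpy as np

def gamma(x):
    """gamma(x) = (sgn(x), -ln tanh(|x|/2)): sign in F_2, magnitude in [0, inf)."""
    return (x < 0).astype(np.uint8), -np.log(np.tanh(np.abs(x) / 2.0))

def gamma_inv(s, m):
    return (-1.0) ** s * 2.0 * np.arctanh(np.exp(-m))

x = np.array([0.7, -1.3, 2.1])           # nonzero LLRs (gamma is singular at 0)
s, m = gamma(x)
lhs = gamma_inv(s.sum() % 2, m.sum())    # gamma^{-1}(sum_i gamma(x_i))
rhs = 2.0 * np.arctanh(np.prod(np.tanh(x / 2.0)))
assert np.isclose(lhs, rhs)
```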

Rewriting (A.2) and (A.4) in terms of the $\gamma$ mapping and using (A.8), we have
$$m^{(l)}_{o,v} = \gamma^{-1}\Big( \sum_{v' \neq v} \gamma\big( m^{(l)}_{v',o} \big) + \gamma\big( (-1)^{u_o \oplus w_o} \mathcal{P}_o \big) \Big), \qquad o = 1, \dots, K,$$
$$m^{(l)}_{o,v} = (-1)^{e_o \oplus z_o}\, \gamma^{-1}\Big( \sum_{v' \neq v} \gamma\big( m^{(l)}_{v',o} \big) + \gamma(\xi) \Big), \qquad o = K+1, \dots, K+N,$$
$$m^{(l)}_{c,v} = \gamma^{-1}\Big( \sum_{v' \neq v} \gamma\big( m^{(l-1)}_{v',c} \big) \Big), \qquad (\mathrm{A.9})$$
where $\mathcal{P}_o \triangleq \log\left((1-p_o)/p_o\right)$ and $\xi \triangleq \log\left((1-\epsilon)/\epsilon\right)$.

Similarly, we have
$$m^{(l)}_{o,v} = (-1)^{e_o \oplus z_o}\, \gamma^{-1}\Big( \sum_{v' \neq v} \gamma\big( m^{(l)}_{v',o} \big) + \gamma(\xi) \Big), \qquad (\mathrm{A.10})$$
$$m^{(l)}_{c,v} = \gamma^{-1}\Big( \sum_{v' \neq v} \gamma\big( m^{(l-1)}_{v',c} \big) \Big). \qquad (\mathrm{A.11})$$

Proof. To prove the theorem, the BP equations for each scheme are given explicitly; then, starting with the 0th round, the relationship between the messages of the two schemes is verified. The proof follows by induction, after showing that, if the rule holds for round $l$, it also holds for round $l+1$.
BP for Scheme A: in this case we have
$${}^{A}m^{(l)}_{o,v} = (-1)^{u_o}\, \gamma^{-1}\Big( \sum_{v' \neq v} \gamma\big( {}^{A}m^{(l)}_{v',o} \big) + \gamma\left( \mathcal{P}_o \right) \Big),$$
$${}^{A}m^{(l)}_{o,v} = (-1)^{e_o}\, \gamma^{-1}\Big( \sum_{v' \neq v} \gamma\big( {}^{A}m^{(l)}_{v',o} \big) + \gamma(\xi) \Big),$$
$${}^{A}m^{(l+1)}_{v,o} = \sum_{o' \neq o} {}^{A}m^{(l)}_{o',v} + {}^{A}\delta^{(l),v}_{\mathrm{ldpc}}. \qquad (\mathrm{A.12})$$
BP for Scheme B: in this case, we have
$${}^{B}m^{(l)}_{o,v} = \gamma^{-1}\Big( \sum_{v' \neq v} \gamma\big( {}^{B}m^{(l)}_{v',o} \big) + \gamma\left( \mathcal{P}_o \right) \Big),$$
$${}^{B}m^{(l)}_{o,v} = (-1)^{e_o \oplus z_o}\, \gamma^{-1}\Big( \sum_{v' \neq v} \gamma\big( {}^{B}m^{(l)}_{v',o} \big) + \gamma(\xi) \Big),$$
$${}^{B}m^{(l+1)}_{v,o} = \sum_{o' \neq o} {}^{B}m^{(l)}_{o',v} + {}^{B}\delta^{(l),v}_{\mathrm{ldpc}}. \qquad (\mathrm{A.13})$$
Note that in the above equations we have provided two different versions of the equations for $m_{o,v}$, for both Scheme A and Scheme B, corresponding to $1 \leq o \leq K$ and to $K+1 \leq o \leq K+N$. We call these ranges of $o$ the first block and the second block, respectively.
By applying the BP rules at round zero, we have the following relationships between Scheme A and Scheme B:
$${}^{A}m^{(0)}_{o,v} = {}^{B}m^{(0)}_{o,v} = 0, \qquad {}^{A}m^{(1)}_{v,o} = {}^{B}m^{(1)}_{v,o} = 0. \qquad (\mathrm{A.14})$$
Then, for round zero, (A.7) is satisfied.
Now assume that the theorem holds for the $l$th round. Letting $o$ and $\tilde{o}$ denote any output symbols from the first and the second block, respectively, and letting $v$ and $\tilde{v}$ denote any adjacent input nodes, we have
$${}^{B}m_{o,v}^{(l)} = (-1)^{i_v}\,{}^{A}m_{o,v}^{(l)}, \quad {}^{B}m_{\tilde{o},\tilde{v}}^{(l)} = (-1)^{i_{\tilde{v}}}\,{}^{A}m_{\tilde{o},\tilde{v}}^{(l)}, \quad {}^{B}m_{v,o}^{(l+1)} = (-1)^{i_v}\,{}^{A}m_{v,o}^{(l+1)}. \tag{A.15}$$
Consequently, the equations for round $l+1$ can be written as
$$\begin{aligned}
{}^{A}m_{o,v}^{(l+1)} &= (-1)^{u_o}\,\gamma^{-1}\Big(\sum_{v'\neq v}\gamma\big({}^{A}m_{v',o}^{(l+1)}\big)+\gamma(\mathcal{P}_o)\Big),\\
{}^{A}m_{\tilde{o},\tilde{v}}^{(l+1)} &= (-1)^{e_{\tilde{o}}}\,\gamma^{-1}\Big(\sum_{\tilde{v}'\neq \tilde{v}}\gamma\big({}^{A}m_{\tilde{v}',\tilde{o}}^{(l+1)}\big)+\gamma(\xi)\Big),\\
{}^{B}m_{o,v}^{(l+1)} &= \gamma^{-1}\Big(\sum_{v'\neq v}\gamma\big({}^{B}m_{v',o}^{(l+1)}\big)+\gamma(\mathcal{P}_o)\Big),\\
{}^{B}m_{\tilde{o},\tilde{v}}^{(l+1)} &= (-1)^{e_{\tilde{o}}\oplus z_{\tilde{o}}}\,\gamma^{-1}\Big(\sum_{\tilde{v}'\neq \tilde{v}}\gamma\big({}^{B}m_{\tilde{v}',\tilde{o}}^{(l+1)}\big)+\gamma(\xi)\Big).
\end{aligned}\tag{A.16}$$
Using the induction hypothesis, we can write
$$\begin{aligned}
{}^{B}m_{o,v}^{(l+1)} &= \gamma^{-1}\Big(\sum_{v'\neq v}\gamma\big((-1)^{i_{v'}}\,{}^{A}m_{v',o}^{(l+1)}\big)+\gamma(\mathcal{P}_o)\Big),\\
{}^{B}m_{\tilde{o},\tilde{v}}^{(l+1)} &= (-1)^{e_{\tilde{o}}\oplus z_{\tilde{o}}}\,\gamma^{-1}\Big(\sum_{\tilde{v}'\neq \tilde{v}}\gamma\big((-1)^{i_{\tilde{v}'}}\,{}^{A}m_{\tilde{v}',\tilde{o}}^{(l+1)}\big)+\gamma(\xi)\Big).
\end{aligned}\tag{A.17}$$
In order to obtain a relationship of the same form as before, we apply (A.8), which lets us pull the sign factors $(-1)^{i_{v'}}$ and $(-1)^{i_{\tilde{v}'}}$ out of the summations; note that the product of the signs over all the summands matters. For any $o$, denote by $\mathcal{I}_o$ the set of input nodes adjacent to the output node $o$. Then
$$\prod_{v'\in\mathcal{I}_o,\,v'\neq v}(-1)^{i_{v'}} = \Big(\prod_{v'\in\mathcal{I}_o}(-1)^{i_{v'}}\Big)(-1)^{i_v} = (-1)^{u_o}(-1)^{i_v}, \tag{A.18}$$
since the $\oplus$-sum of all the input nodes adjacent to $o$ yields the value of the corresponding output node without additive noise. Similarly, for any $\tilde{v}$ and $\tilde{o}$, define $\mathcal{I}_{\tilde{o}}$ as the set of input nodes adjacent to the output node $\tilde{o}$. Then
$$\prod_{\tilde{v}'\in\mathcal{I}_{\tilde{o}},\,\tilde{v}'\neq\tilde{v}}(-1)^{i_{\tilde{v}'}} = \Big(\prod_{\tilde{v}'\in\mathcal{I}_{\tilde{o}}}(-1)^{i_{\tilde{v}'}}\Big)(-1)^{i_{\tilde{v}}} = (-1)^{z_{\tilde{o}}}(-1)^{i_{\tilde{v}}},$$
so that
$$\begin{aligned}
{}^{B}m_{o,v}^{(l+1)} &= (-1)^{u_o}(-1)^{i_v}\,\gamma^{-1}\Big(\sum_{v'\neq v}\gamma\big({}^{A}m_{v',o}^{(l+1)}\big)+\gamma(\mathcal{P}_o)\Big),\\
{}^{B}m_{\tilde{o},\tilde{v}}^{(l+1)} &= (-1)^{z_{\tilde{o}}}(-1)^{i_{\tilde{v}}}(-1)^{e_{\tilde{o}}\oplus z_{\tilde{o}}}\,\gamma^{-1}\Big(\sum_{\tilde{v}'\neq \tilde{v}}\gamma\big({}^{A}m_{\tilde{v}',\tilde{o}}^{(l+1)}\big)+\gamma(\xi)\Big),
\end{aligned}\tag{A.19}$$
and hence
$${}^{B}m_{o,v}^{(l+1)} = (-1)^{i_v}\,{}^{A}m_{o,v}^{(l+1)}, \tag{A.20}$$
$${}^{B}m_{\tilde{o},\tilde{v}}^{(l+1)} = (-1)^{i_{\tilde{v}}}\,{}^{A}m_{\tilde{o},\tilde{v}}^{(l+1)}. \tag{A.21}$$
It is worth noting that, by applying the $l$th-round hypothesis to (A.5), we obtain ${}^{B}\delta_{\mathrm{lt}}^{(l),v} = (-1)^{i_v}\,{}^{A}\delta_{\mathrm{lt}}^{(l),v}$; that is, when we consider only the LDPC iterations, Schemes A and B differ only in the signs of the channel observations. It can easily be shown that, with such a relationship between the inputs of the two schemes, the messages are related by
$${}^{B}m_{v,c}^{(l)} = (-1)^{i_v}\,{}^{A}m_{v,c}^{(l)}, \qquad {}^{B}m_{c,v}^{(l)} = (-1)^{i_v}\,{}^{A}m_{c,v}^{(l)}, \tag{A.22}$$
for any $l$.
Then, by (A.6), we obtain
$${}^{B}\delta_{\mathrm{ldpc}}^{(l),v} = (-1)^{i_v}\,{}^{A}\delta_{\mathrm{ldpc}}^{(l),v}, \tag{A.23}$$
so that we can write
$${}^{A}m_{v,o}^{(l+2)} = \sum_{o'\neq o}{}^{A}m_{o',v}^{(l+1)} + {}^{A}\delta_{\mathrm{ldpc}}^{(l+1),v}, \qquad {}^{B}m_{v,o}^{(l+2)} = \sum_{o'\neq o}{}^{B}m_{o',v}^{(l+1)} + {}^{B}\delta_{\mathrm{ldpc}}^{(l+1),v}. \tag{A.24}$$
Applying (A.23) at round $l+1$,
$${}^{B}m_{v,o}^{(l+2)} = \sum_{o'\neq o}(-1)^{i_v}\,{}^{A}m_{o',v}^{(l+1)} + (-1)^{i_v}\,{}^{A}\delta_{\mathrm{ldpc}}^{(l+1),v} = (-1)^{i_v}\Big(\sum_{o'\neq o}{}^{A}m_{o',v}^{(l+1)} + {}^{A}\delta_{\mathrm{ldpc}}^{(l+1),v}\Big) = (-1)^{i_v}\,{}^{A}m_{v,o}^{(l+2)}. \tag{A.25}$$
Equations (A.20), (A.21), and (A.25) are exactly the relations assumed in (A.15) for the $l$th round, with $l$ replaced by $l+1$. This completes the proof by induction.

By Theorem 1, the BER of the pure channel coding scheme (assuming the all-zero codeword) equals the BER of the source bits in the JSCC scheme. Based on this result, we can obtain an EXIT chart by considering the associated channel coding problem.

B. EXIT Chart Approximation

The standard analysis tool for graph-based codes under BP iterative decoding, in the limit of infinite block length, is density evolution (DE) [22, 23]. DE is computationally heavy and numerically not very well conditioned. A much simpler approximation of DE is the so-called EXIT chart, which corresponds to DE under the restriction that the message densities take a particular form. In particular, the EXIT chart with Gaussian approximation (GA) assumes that at every iteration the BP messages are Gaussian and satisfy the symmetry condition, which imposes that the variance equals twice the mean [13]. The densities are then uniquely identified by a single parameter, and the approximate DE tracks the evolution of this parameter across the decoding rounds.

In particular, the EXIT chart tracks the mutual information between the message on a randomly chosen edge of the graph and the binary variable node connected to that edge. By the isomorphism proved above, the JSCC scheme and the "two-channel" scheme have the same performance. For the sake of completeness, in this section we apply the EXIT chart analysis to the "two-channel" case; the resulting EXIT chart applies directly to the JSCC scheme for a binary source. Finally, we briefly discuss how to apply the EXIT chart method to the multistage decoder used by our JSCC scheme. The resulting analysis provides very accurate approximations of the actual JSCC scheme performance, also in the finite (moderately large) block length case (see Figure 11).

For the graph induced by the raptor (LT) degree distribution, we define the input nodes (also called information bitnodes), the output nodes (also called coded bitnodes), and the checknodes. For LDPC codes, we define just the bitnodes and the checknodes, since any set of bitnodes that forms an information set can be taken as information bitnodes (see Figure 17).

There are different ways of scheduling the raptor decoder.

Practical schedule
Activate in parallel all LT checknodes, then all LDPC bitnodes (corresponding to the LT input nodes), then all LDPC checknodes, and then the LDPC bitnodes again. This forms a complete scheduling cycle, which is repeated an arbitrarily large number of times. This is the schedule used in our finite-length simulations.

Conceptually simple schedule
Activate the LT checknodes. Then, reset the LDPC decoder and treat the messages generated by the LT checknodes as inputs to the LDPC decoder. Iterate the LDPC decoder until it reaches a fixed point; then take the LLRs produced for the bitnodes at the fixed-point equilibrium and incorporate them as "virtual channel observations" for the input nodes of the LT code. Finally, activate all LT input nodes. This forms a complete scheduling cycle, which is repeated an arbitrarily large number of times. Our EXIT chart equations are derived assuming this schedule; a sketch of one cycle follows.
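As a purely illustrative sketch, one outer cycle of this schedule could be organized as follows; the decoder objects and method names are hypothetical, not part of the paper's implementation:

```python
def conceptually_simple_cycle(lt_decoder, ldpc_decoder, n_cycles):
    """Hypothetical sketch of the 'conceptually simple' schedule:
    LT checknodes -> full LDPC decoding -> LT input nodes, repeated."""
    bit_llrs = None
    for _ in range(n_cycles):
        # 1) Activate all LT checknodes once.
        lt_to_ldpc = lt_decoder.update_checknodes()
        # 2) Reset the LDPC decoder and run it to a fixed point,
        #    using the LT messages as its inputs.
        ldpc_decoder.reset()
        bit_llrs = ldpc_decoder.decode_to_fixed_point(priors=lt_to_ldpc)
        # 3) Feed the fixed-point LLRs back as "virtual channel
        #    observations" and activate all LT input nodes.
        lt_decoder.update_input_nodes(virtual_observations=bit_llrs)
    return bit_llrs
```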

EXIT charts can be seen as a multidimensional dynamical system, and we are interested in the fixed points and trajectories of this system. As such, an EXIT chart has state variables. To derive an EXIT recursion for the conceptually simple schedule, we denote by $x$ and $y$ the state variables of the LT EXIT chart, and by $X$ and $Y$ the corresponding state variables of the LDPC EXIT chart.

We use the following notation.

(i) $x_i$ denotes the mutual information between a message sent along an edge $(v,o)$ with "left degree" $i$ and the symbol corresponding to the bitnode $v$, and $x$ denotes the average of $x_i$ over all edges $(v,o)$. Following the standard parlance of LDPC codes, we refer to the degree of the bitnode connected to an edge as the left degree of that edge, and to the degree of the checknode connected to an edge as the right degree of that edge.
(ii) $y_j$ denotes the mutual information between a message sent along an edge $(o,v)$ with "right degree" $j$ and the symbol corresponding to the bitnode $v$, and $y$ denotes the average of $y_j$ over all edges $(o,v)$.
(iii) $X_i$ denotes the mutual information between a message sent along an edge $(v,c)$ with "left degree" $i$ and the symbol corresponding to the bitnode $v$, and $X$ denotes the average of $X_i$ over all edges $(v,c)$.
(iv) $Y_j$ denotes the mutual information between a message sent along an edge $(c,v)$ with "right degree" $j$ and the symbol corresponding to the bitnode $v$, and $Y$ denotes the average of $Y_j$ over all edges $(c,v)$.
(v) For an LDPC code, we let $\lambda(x)=\sum_i \lambda_i x^{i-1}$ and $\rho(x)=\sum_j \rho_j x^{j-1}$ denote the generating functions of the edge-centric left- and right-degree distributions, and we let
$$\Lambda(x)=\sum_i \Lambda_i x^i = \frac{\int_0^x \lambda(u)\,du}{\int_0^1 \lambda(u)\,du} \tag{B.26}$$
denote the bit-centric left-degree distribution.
(vi) For an LT code, we let $\iota(x)=\sum_i \iota_i x^{i-1}$ denote the edge-centric degree distribution of the input nodes and $\omega(x)=\sum_j \omega_j x^{j-1}$ denote the edge-centric degree distribution of the "output nodes" or, equivalently, of the checknodes. The node-centric degree distribution of the checknodes is given by
$$\Omega(x)=\sum_j \Omega_j x^j = \frac{\int_0^x \omega(u)\,du}{\int_0^1 \omega(u)\,du}. \tag{B.27}$$
(vii) For the concatenation of the LT code with the LDPC code, we also need the node-centric degree distribution of the LT input nodes, given by
$$\gimel(x)=\sum_i \gimel_i x^i = \frac{\int_0^x \iota(u)\,du}{\int_0^1 \iota(u)\,du}. \tag{B.28}$$

We consider the class of EXIT functions based on the Gaussian approximation of the BP messages. Imposing the symmetry condition and Gaussianity, the conditional distribution of each message $\mathcal{L}$ in the direction $v\to c$ is Gaussian $\sim\mathcal{N}(\mu,2\mu)$, for some value $\mu\in\mathbb{R}_+$. Hence, letting $V$ denote the corresponding bitnode variable, we have
$$I(V;\mathcal{L}) = 1 - \mathbb{E}\Big[\log_2\big(1+e^{-\mathcal{L}}\big)\Big] \triangleq J(\mu), \tag{B.29}$$
where $\mathcal{L}\sim\mathcal{N}(\mu,2\mu)$.
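The function $J(\cdot)$ has no closed form but is easy to evaluate numerically. The following Python sketch, our own illustration rather than anything prescribed by the paper, computes $J$ by Gauss-Hermite quadrature and inverts it by bisection, exploiting its monotonicity:

```python
import numpy as np

_GH_NODES, _GH_WEIGHTS = np.polynomial.hermite.hermgauss(64)

def J(mu):
    """J(mu) = 1 - E[log2(1 + exp(-L))] with L ~ N(mu, 2*mu)."""
    if mu <= 0.0:
        return 0.0
    # For L ~ N(m, s^2): E[f(L)] = (1/sqrt(pi)) * sum_k w_k f(m + s*sqrt(2)*x_k)
    L = mu + np.sqrt(2.0 * mu) * np.sqrt(2.0) * _GH_NODES
    f = np.logaddexp(0.0, -L) / np.log(2.0)   # log2(1 + e^{-L}), overflow-safe
    return 1.0 - np.dot(_GH_WEIGHTS, f) / np.sqrt(np.pi)

def J_inv(I, hi=1000.0):
    """Inverse of J on [0, 1) by bisection; J is monotonically increasing."""
    lo = 0.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if J(mid) < I:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```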

In BP, the message on $(v,o)$ is the sum of all messages incoming to $v$ on all other edges. The sum of Gaussian random variables is also Gaussian, and its mean is the sum of the means of the incoming messages. It follows that
$$x_i = J\big((i-1)J^{-1}(y) + J^{-1}(C)\big), \tag{B.30}$$
where $C$ is the mutual information (capacity) between the bitnode variable and the corresponding LLR at the output of the (binary-input, symmetric-output) channel. In the raptor case, the bitnodes correspond to variables observed by the LDPC decoder through a virtual channel. Averaging with respect to the edge-degree distribution, we have
$$x = \sum_i \iota_i\, J\big((i-1)J^{-1}(y) + J^{-1}(C)\big). \tag{B.31}$$
As far as checknodes are concerned, we use the well-known quasiduality approximation and replace checknodes with bitnodes by changing mutual information into entropy (i.e., replacing $x$ by $1-x$). Then
$$y_j = 1 - J\big((j-1)J^{-1}(1-x) + J^{-1}(1-C)\big). \tag{B.32}$$
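Reusing J and J_inv from the sketch above, and representing degree distributions as {degree: edge probability} dictionaries (our assumption, not the paper's), the averaged updates (B.31)-(B.32) take one line each:

```python
def lt_bitnode_update(y, C, iota):
    """Averaged bitnode update (B.31)."""
    return sum(p * J((i - 1) * J_inv(y) + J_inv(C)) for i, p in iota.items())

def lt_checknode_update(x, C, omega):
    """Averaged checknode update (B.32), via the quasiduality approximation."""
    return 1.0 - sum(p * J((j - 1) * J_inv(1.0 - x) + J_inv(1.0 - C))
                     for j, p in omega.items())
```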

Let us now consider the "two-channel" scenario induced by the JSCC isomorphism. Let $K$ denote the number of source bits and $N$ the number of parity bits. In the corresponding LT code, we have $M=K+N$ output nodes. The first $K$ output nodes are "observed" through a channel with capacity $1-H$ (i.e., a channel that corresponds to the source statistics), while the remaining $N$ output nodes are observed through the actual transmission channel, with capacity $C$.

This two-channel feature is taken into account by an outer expectation in the EXIT functions. Therefore, the LT EXIT chart can be written in terms of the state equations
$$x = \sum_k\sum_i \Lambda_k \iota_i\, J\big((i-1)J^{-1}(y) + J^{-1}({\downarrow}c_k)\big) = \sum_k\sum_i \Lambda_k \iota_i\, J\big((i-1)J^{-1}(y) + k\,J^{-1}(Y)\big), \tag{B.33}$$
where $K/M=\beta$ and $N/M=1-\beta$, and
$$y = 1 - \sum_j \omega_j\Big[\beta\, J\big((j-1)J^{-1}(1-x) + J^{-1}(H)\big) + (1-\beta)\, J\big((j-1)J^{-1}(1-x) + J^{-1}(1-C)\big)\Big], \tag{B.34}$$
where ${\downarrow}c_k$ is the mutual information input by the LDPC graph into the LT graph via a node $v$ of degrees $(i,k)$, as explained in the following.

Equation (B.34) follows from the fact that a random edge $(o,v)$ is connected with probability $\beta$ to a source bit (i.e., to the channel with capacity $1-H$) and with probability $1-\beta$ to a parity bit (i.e., to the channel with capacity $C$).
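Continuing the illustrative sketch (again reusing J and J_inv, with degree distributions as dictionaries), one joint update of the LT state equations (B.33)-(B.34) might read:

```python
def lt_exit_update(y, Y, iota, omega, Lambda, H, C, beta):
    """One LT update: (B.33) for x, then (B.34) for y, with the outer
    expectation over the two channels (source w.p. beta, channel w.p. 1-beta)."""
    x = sum(Lk * pi * J((i - 1) * J_inv(y) + k * J_inv(Y))
            for k, Lk in Lambda.items() for i, pi in iota.items())
    y_new = 1.0 - sum(
        wj * (beta * J((j - 1) * J_inv(1.0 - x) + J_inv(H))
              + (1.0 - beta) * J((j - 1) * J_inv(1.0 - x) + J_inv(1.0 - C)))
        for j, wj in omega.items())
    return x, y_new
```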

Consider an LDPC bitnode $v$ that coincides with an input node of the LT code. Let $k$ denote the degree of this node with respect to the LDPC graph, and $i$ its degree with respect to the LT graph. For a randomly generated graph and a random choice of $v$, $k$ and $i$ are independent random variables, with joint distribution
$$\Pi_{i,k} = \gimel_i \Lambda_k. \tag{B.35}$$
The mutual information input by the LT graph into the LDPC graph via a node $v$ of degrees $(i,k)$ is given by
$${\uparrow}c_i = J\big(i\,J^{-1}(y)\big). \tag{B.36}$$
Therefore, the LDPC EXIT chart can be written in terms of the state equations
$$X = \sum_k\sum_i \lambda_k \gimel_i\, J\big((k-1)J^{-1}(Y) + J^{-1}({\uparrow}c_i)\big) = \sum_k\sum_i \lambda_k \gimel_i\, J\big((k-1)J^{-1}(Y) + i\,J^{-1}(y)\big),$$
$$Y = 1 - \sum_\ell \rho_\ell\, J\big((\ell-1)J^{-1}(1-X)\big). \tag{B.37}$$
The mutual information input by the LDPC graph into the LT graph via a node $v$ of degrees $(i,k)$ is given by
$${\downarrow}c_k = J\big(k\,J^{-1}(Y)\big). \tag{B.38}$$
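A companion sketch of the LDPC update (B.37), under the same assumptions:

```python
def ldpc_exit_update(Y, y, lam, rho, gimel):
    """One LDPC update of the state equations (B.37); the LT graph enters
    through the virtual observations up_c_i = J(i * J_inv(y))."""
    X = sum(lk * gi * J((k - 1) * J_inv(Y) + i * J_inv(y))
            for k, lk in lam.items() for i, gi in gimel.items())
    Y_new = 1.0 - sum(rd * J((d - 1) * J_inv(1.0 - X)) for d, rd in rho.items())
    return X, Y_new
```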

Equations (B.37), (B.33), and (B.34) form the state equations of the global EXIT chart of the concatenated LT-LDPC graph, where the state variables are $x$, $y$, $X$, $Y$, the parameters are $H$, $C$, and $\beta$, and the degree sequences are $\omega$, $\iota$, $\rho$, and $\lambda$.
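Putting the two updates together, the global chart can be iterated from the zero-information state. Note that the recursion can bootstrap only if the LT output degree distribution contains degree-one nodes ($\Omega_1>0$), since with $x=0$ only degree-one checknodes inject information. A minimal driver, again purely illustrative:

```python
def run_global_exit_chart(iota, omega, Lambda, lam, rho, gimel,
                          H, C, beta, n_rounds=100):
    """Track the trajectory of the state variables (x, y, X, Y) of the
    concatenated LT-LDPC EXIT chart, starting from zero information."""
    x = y = X = Y = 0.0
    trajectory = []
    for _ in range(n_rounds):
        x, y = lt_exit_update(y, Y, iota, omega, Lambda, H, C, beta)
        X, Y = ldpc_exit_update(Y, y, lam, rho, gimel)
        trajectory.append((x, y, X, Y))
    return trajectory
```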

Finally, in order to obtain the reconstruction distortion, we need the conditional probability density function (pdf) of the LLRs that BP outputs for the source bits. Under the Gaussian approximation, this LLR is Gaussian. Let $\mu_j$ denote the mean of the LLR of a source bitnode connected to a checknode of degree $j$, given by
$$\mu_j = J^{-1}\Big(1 - J\big(j\,J^{-1}(1-x)\big)\Big) + J^{-1}(1-H). \tag{B.39}$$
Then, we approximate the average BER of the source bits as
$$P_b = \sum_j \Omega_j\, Q\big(\sqrt{\mu_j/2}\big). \tag{B.40}$$
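In the running sketch, (B.39)-(B.40) translate directly; the step from $\mu_j$ to $Q(\sqrt{\mu_j/2})$ uses the fact that an $\mathcal{N}(\mu,2\mu)$ LLR is negative with probability $Q(\mu/\sqrt{2\mu})=Q(\sqrt{\mu/2})$:

```python
from math import erfc, sqrt

def Qfunc(t):
    """Gaussian tail probability Q(t)."""
    return 0.5 * erfc(t / sqrt(2.0))

def source_ber(x, H, Omega):
    """Approximate source-bit BER (B.39)-(B.40); Omega is the node-centric
    checknode degree distribution as a {degree: probability} dict."""
    pb = 0.0
    for j, Oj in Omega.items():
        mu_j = J_inv(1.0 - J(j * J_inv(1.0 - x))) + J_inv(1.0 - H)
        pb += Oj * Qfunc(sqrt(mu_j / 2.0))
    return pb
```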

B.1. Multilayer EXIT Chart Analysis

For each bitplane, at every location, the entropy of a bit depends on the realization of the bits in the previous (more significant) bitplanes. The corresponding channel coding problem therefore involves a "time-varying" memoryless channel. To develop the equations for the multilayer case, we use the same idea as in the previous section, namely an outer expectation. At the $(B-b)$th most significant level, the corresponding bit locations in the previous planes can take $2^b$ different combinations, with possibly different probabilities, denoted by $\gamma_{B-b}(m)$ for $0\le m\le 2^b-1$.

Let ๐ป(๐‘ฅ๐ตโˆ’๐‘|๐‘š) denote the conditional entropy of a bit at the (๐ตโˆ’๐‘)th most significant plane given that the value of the corresponding more significant bits' combination is ๐‘š.

At the $(B-b)$th most significant level, the channel has capacity $C$ with probability $1-\beta$, and capacity $1-H(x_{B-b}\,|\,m)$, for $0\le m\le 2^b-1$, with probability $\beta\,\gamma_{B-b}(m)$.

Following this approach, we can modify (B.34) for the decoding of the $(B-b)$th magnitude plane. It is worth noting that, when the $(B-b)$th bitplane is considered, we assume that the sign plane and $b$ magnitude planes (from $B$ down to $B-b+1$) have already been processed. Since the magnitude-plane model does not depend on the sign plane, we take into account $2^b$ different realizations.

That is, we have
$$y = 1 - \sum_j \omega_j\Big\{(1-\beta)\, J\big((j-1)J^{-1}(1-x) + J^{-1}(1-C)\big) + \beta \sum_{m=0}^{2^b-1}\gamma_{B-b}(m)\, J\big((j-1)J^{-1}(1-x) + J^{-1}\big(H(x_{B-b}\,|\,m)\big)\big)\Big\}, \quad b=1,\dots,B-1. \tag{B.41}$$
We underline that the sign bitplane and the most significant bitplane do not depend on any other bitplane, so that $\gamma_B(0)=\gamma_0(0)=1$.
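In the same illustrative style, the modified checknode update (B.41) adds an inner expectation over the $2^b$ patterns of the more significant bits; here `H_cond[m]` and `gamma_prob[m]` are hypothetical stand-ins for $H(x_{B-b}|m)$ and $\gamma_{B-b}(m)$, to be supplied by the bitplane model:

```python
def lt_checknode_update_multilayer(x, C, beta, omega, H_cond, gamma_prob):
    """Checknode update (B.41) for the (B-b)th magnitude plane."""
    y = 1.0
    for j, wj in omega.items():
        a = (j - 1) * J_inv(1.0 - x)
        term = (1.0 - beta) * J(a + J_inv(1.0 - C))
        for m, gm in gamma_prob.items():
            term += beta * gm * J(a + J_inv(H_cond[m]))
        y -= wj * term
    return y
```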

Similarly to (B.41), we can update (B.40) as
$$P_b = \sum_j \Omega_j\, Q\big(\sqrt{\mu_j/2}\big), \tag{B.42}$$
where
$$\mu_j = J^{-1}\Big(1 - J\big(j\,J^{-1}(1-x)\big)\Big) + \sum_{m=0}^{2^b-1}\gamma_{B-b}(m)\, J^{-1}\big(1 - H(x_{B-b}\,|\,m)\big). \tag{B.43}$$
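The corresponding BER estimate, under the same assumptions and reusing Qfunc, J, and J_inv from the earlier sketches:

```python
def source_ber_multilayer(x, Omega, H_cond, gamma_prob):
    """Approximate source-bit BER (B.42)-(B.43) for the (B-b)th plane."""
    pb = 0.0
    for j, Oj in Omega.items():
        mu_j = J_inv(1.0 - J(j * J_inv(1.0 - x)))
        mu_j += sum(gm * J_inv(1.0 - H_cond[m]) for m, gm in gamma_prob.items())
        pb += Oj * Qfunc(sqrt(mu_j / 2.0))
    return pb
```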

Note that a genie-aided scheme was assumed for the EXIT analysis, that is, there is no error propagation between layers. In fact, error propagation cannot be taken into account by an EXIT chart, since the underlying assumption is that the message exchanged at each BP iteration is a true LLR, that is, an LLR computed on the basis of the correct conditional probabilities. Decision errors, instead, would feed the decoder at a lower stage with "false" a priori probabilities.

As in the finite-length case, we use soft reconstruction for the infinite-length case: the conditional-mean estimator of the reconstruction points is computed using the fact that the source-bit LLRs have the symmetric Gaussian distribution. From the EXIT chart we can obtain the mean $\mu$ of the Gaussian approximation of the conditional pdf of any LLR in the graph, and hence compute the MMSE of the soft-reconstruction estimator (16).

Acknowledgments

This research was supported in part by the National Science Foundation under Grants ANI-03-38807, CNS-06-25637 and NeTS-NOSS-07-22073 and in part by the USC Annenberg Graduate Fellowship Program.