#### Abstract

We investigate iterative trellis decoding techniques for DAB, with the objective of gaining from processing 2D blocks in an OFDM scheme, that is, blocks spanning the time and frequency dimensions, and from trellis decomposition. Trellis-decomposition methods allow us to estimate the unknown channel phase since this phase relates to the sub-trellises. We determine a-posteriori sub-trellis probabilities and use them to weight the a-posteriori symbol probabilities resulting from all the sub-trellises. Alternatively, we can determine a dominant sub-trellis and use only the a-posteriori symbol probabilities corresponding to it. This dominant sub-trellis approach results in a significant complexity reduction. We investigate both iterative and non-iterative methods. The advantage of non-iterative methods is that their forward-backward procedures are extremely simple; however, their gain of 0.7 dB, relative to two-symbol differential detection (2SDD) at a BER of , is modest. Iterative procedures lead to the significantly larger gain of 3.7 dB at a BER of for five iterations, where a part of this gain comes from 2D processing. Simulations of our iterative approach applied to the TU-6 (COST207) channel show that we get an improvement of 2.4 dB at a Doppler frequency of 10 Hz.

#### 1. Introduction

##### 1.1. Problem Description

Digital audio broadcasting (DAB) systems, DAB+ systems, and terrestrial-digital multimedia broadcasting (T-DMB) systems use orthogonal frequency division multiplexing (OFDM), in which every OFDM subcarrier is modulated by π/4-differentially encoded quaternary PSK (π/4-DE-QPSK) [1].

Commonly used *classical* DAB receivers perform noncoherent 2SDD with soft-decision Viterbi decoding [2]. Noncoherent detection schemes like 2SDD are not optimal and can be improved by multisymbol differential detection (MSDD), which is a maximum likelihood procedure for finding a block of information symbols after observing a block of received symbols [3]. For very large numbers of observations, the performance of MSDD approaches the performance of ideal coherent detection of DE-QPSK, which is given in, for example, [4–6]. Noncoherent MSDD can also be used if channel coding is applied in a noniterative way, see [7, 8].

If MSDD is combined with iterative (turbo) processing, its complexity must be reduced to an acceptable level. (Parallel concatenated systems were first described by Berrou et al. [9]; serial concatenation was developed by Benedetto and coinvestigators [10–12].) We were motivated by a number of encouraging results on serial concatenation of convolutional encoding followed by differential encoding with turbo-like decoding techniques, also referred to as *Turbo-DPSK*. Turbo-DPSK was investigated for single-carrier transmission on AWGN channels in [13–18], as well as for time-varying channels in [19–23]. The main objective of these papers was to reduce the complexity of the inner decoder. Two main methods can be distinguished: first, explicit estimation of the channel phase followed by coherent detection, see [19, 20], and for the 2D case [24–26]; and second, direct calculation of the *a posteriori* probabilities of the information symbols as in [17, 18, 22], and for the 2D case [27–29].

We focus in the present paper on 2D processing, that is, processing in both the frequency and time domains. We propose methods based on iteratively demodulating and decoding blocks of received symbols in a DAB transmission stream. First, however, we summarize other 2D approaches that are relevant to our work.

The work of ten Brink et al. in [24] on 2D phase-estimate methods can be regarded as an extension of the results of Hoeher and Lodge in [20] to the multicarrier case. Park et al. in [25] improved the hard-decision approach of ten Brink et al. by considering soft decisions. Both [24, 25] rely on pilot symbols, which unfortunately are not present in DAB transmissions [1]. Blind channel estimation techniques were proposed by Sanzi and Necker in [26]. They proposed a combination of the iterative scheme of ten Brink in [24] and a fast converging blind channel estimator based on higher-order asymmetrical modulation schemes, which are not used in DAB transmissions [1].

To obtain a posteriori probabilities of the information symbols in a 2D setting, May, Rohling, and Haase in [27–29] considered iterative decoding schemes for multicarrier modulation with the soft-output Viterbi algorithm (SOVA, [30]). The SOVA was used for differential detection as well as for decoding of the convolutional code. In the coherent setting, they used an estimate of the phase based on a block of three by three received symbols that are adjacent in the time and frequency directions. For the coherent case, they proposed to use only the current received symbol to obtain a symbol metric for the SOVA inner decoder, thereby ignoring the differential encoding. For the incoherent case, they used a transition metric for the SOVA inner decoder based on the current and previously received symbols. These a posteriori detection schemes produce approximations of the a posteriori probabilities. Procedures that focus on efficient computation of exact values can be found in [5] for the coherent case and in [18] for the incoherent case.

To reduce complexity we accept a small performance loss due to channel-phase discretization (see, e.g., Peleg et al. [17] and Chen et al. [22]) in this contribution, but apart from that we determine the exact a posteriori probabilities of the information symbols in a 2D setting. Our starting point will be the techniques proposed by Peleg et al. in [17]. We discretize the channel phase into a number of equispaced values, but do not allow the “side-step” transitions that were proposed by Peleg et al. to track small channel-phase variations. Then we calculate, in an efficient way, the a posteriori probabilities of the information symbols using the BCJR algorithm [31] in a 2D setting, see also [32]. We will consider 2D blocks and trellis decomposition. Each 2D block consists of a number of adjacent subcarriers of a number of subsequent OFDM symbols. Focussing on 2D blocks was motivated by the fact that the channel coherence time is typically limited to a small number of OFDM symbols, and also by the fact that DAB transmissions use time-multiplexing of services, which limits the number of OFDM symbols in a codeword. Extension in the subcarrier direction is then required to get reliable phase estimates. The trellis-decomposition method allows us to estimate the unknown channel phase efficiently. This phase is related to subtrellises of which we can determine the a posteriori probabilities. With these probabilities we are able to choose a dominant subtrellis, which results in a significant complexity reduction.

Franceschini et al. [33] also use the idea of trellis-decomposition and subtrellises (multiple trellises), to focus on estimating channel parameters. Variation of these parameters is tackled by applying the so-called intermix intervals, in which special manipulations (mix-metric techniques) on the forward and backward metrics are performed. Since we cannot track channel variations here, we apply a 2D approach which is based on the assumption that there are independent channel realizations within distinct blocks. We will explain later, in Section 2.1, why we cannot track the channel phase.

##### 1.2. Paper Outline

In this paper, we will focus both on iterative and noniterative decoding techniques for DAB-like systems. In the next section, we will give a short outline of the DAB system. In Section 3, we will start our analysis by considering noniterative methods for the single-carrier case and introduce trellis-decomposition with a dominant subtrellis approach. In Section 4, we expand our single-carrier methods to the multicarrier case and introduce a 2D-block approach for demodulation. Iterative methods based on serial concatenation of convolutional codes (SCCC) and trellis-decomposition with two dominant subtrellis approaches are considered in Section 5. In Section 6, we generalize our iterative methods for the single-carrier case to the multicarrier case with 2D-block demodulation. Results of applying our approach to a practical case are shown in Section 7. Finally, Section 8 draws conclusions on our decoding procedures based on 2D blocks and trellis-decomposition for DAB-like systems.

#### 2. Description of a Digital Audio Broadcasting (DAB) System

##### 2.1. Overview

Terrestrial digital broadcasting systems like DAB, DAB+, and T-DMB, all members of the “DAB family,” comprise a combination of convolutional coding (CC), interleaving, and π/4-DE-QPSK modulation followed by OFDM, see Figure 1. Time multiplexing of the transmitted services allows the receiver to perform *per service symbol processing* [1], see Figure 2, where 1536 is the number of “active” OFDM subcarriers for a DAB transmission in Mode-I [1]; hence the receiver can decode a certain service without having to process the OFDM symbols that do not correspond to this service. Consequently, only a small number (usually up to four) of OFDM symbols need to be processed, at particular time instants within a DAB transmission frame. This results in “idle time” for the demodulation and decoding processes.

Note that, due to this “idle time,” the mix-metric techniques of [33] cannot be applied to DAB receivers. However, if all the transmitted services are decoded, and there is no idle time, mix-metric techniques could be a valuable extension to the 2D iterative processing methods based on trellis-decomposition that we will develop here.

In the following subsections we will describe the transmit processes (convolutional encoding, differential modulation, and OFDM) in more detail.

##### 2.2. Convolutional Coding and Interleaving

The convolutional code that is used within DAB has basic code-rate , constraint length , and generator polynomials , , , and . Larger code-rates can be obtained via puncturing of the mother code, see Hagenauer et al. [34]. The time and frequency interleavers in DAB perform bit and bit-pair interleaving, respectively. As a result, the code bits leaving the convolutional encoder are permuted and partitioned over the subcarriers of a number of subsequent OFDM symbols (in subsequent frames). The bits for each subcarrier are grouped in pairs, and each such pair is mapped onto a phase (difference) that can therefore assume four different values. The mapping that is used here is based on the Gray principle, that is, labels that correspond to adjacent phase differences differ only in a single bit position.

##### 2.3. Differential Modulation in Each Subcarrier

For each subcarrier, π/4-DE-QPSK modulation is applied. A sequence consisting of symbols (phase differences) for carries the information that is to be transmitted via this subcarrier. The symbols , assume values in the (offset) alphabet . The transmitted sequence of length follows from by applying differential phase modulation, that is, where for the first symbol .
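The differential phase modulation described above can be sketched in a few lines. This is a minimal illustration, not the standard's reference procedure: the function name `encode_pi4_deqpsk`, the integer labels `k` for the four phase differences, the offset alphabet written as odd multiples of π/4, and the choice of 1 as the initial reference symbol are all assumptions made here.

```python
import cmath
import math


def encode_pi4_deqpsk(labels, reference=1 + 0j):
    """Differentially encode phase-difference labels k in {0, 1, 2, 3}.

    Each label k is mapped to the phase difference (2k + 1) * pi / 4
    (assumed offset alphabet), and every transmitted symbol is the previous
    transmitted symbol rotated by that phase difference.
    """
    transmitted = [reference]  # assumed initial reference symbol
    for k in labels:
        phase_difference = cmath.exp(1j * (2 * k + 1) * math.pi / 4)
        transmitted.append(transmitted[-1] * phase_difference)
    return transmitted


# Four information symbols produce five transmitted symbols.
symbols = encode_pi4_deqpsk([0, 1, 2, 3])
```

Under these assumptions the transmitted symbols alternate between the QPSK constellation and its π/4-rotated copy, while always staying on the unit circle.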

##### 2.4. OFDM in DAB

OFDM in DAB is realized using a -point complex IFFT, where is 256, 512, 1024, or 2048. To compute the -th time-domain OFDM-symbol , we determine and is the -th differentially encoded symbol corresponding to the -th subcarrier, or equivalently the -th element in the -th frequency-domain OFDM-symbol, see Figure 1. Note that the IFFT is a computationally efficient inverse discrete Fourier transform (IDFT) for values of that are powers of 2. To prevent intersymbol interference (ISI) resulting from multipath reception, a cyclic prefix of length is added to the sequence . This leads to the sequence that is finally transmitted.

We assume that the channel is slowly varying with an impulse response shorter than the cyclic-prefix length. Moreover, we assume that the channel coherence bandwidth and coherence time span multiple OFDM subcarriers and multiple OFDM symbols. Therefore, the channel-phase and gain might be assumed to be fixed for a number of adjacent subcarriers and consecutive symbols. This is the assumption on which we base our investigations. The channel phase and gain are assumed constant (yet unknown to the receiver) over a *2D block of symbols*, see Figure 3.

The receiver, in the case of perfect synchronization, removes the (received version of the) cyclic prefix, and then applies a -point complex FFT on the time-domain received sequence , which results in the received symbols OFDM reception can be regarded as parallel matched-filtering corresponding to complex orthogonal waveforms, one for each subcarrier. This results in a channel model, holding for a 2D block of symbols, that is given by for some subsequent values of and , where the channel gain and phase are unknown to the receiver. It should be noted that a phase rotation proportional to , due to a time delay, is removed by linear phase correction (LPC). This technique modifies the phase of each OFDM subcarrier with an appropriate rotation based on the starting position (time delay) of the FFT window within the OFDM symbol. In practice, this delay can be determined quite accurately.
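The transmit and receive steps above (IDFT, cyclic prefix, prefix removal, DFT) can be illustrated with a small noiseless sketch. It uses a naive O(N²) transform instead of an FFT to stay dependency-free, and the 1/N scaling on the inverse transform, the function names, and the two-tap test channel are assumptions made for illustration only.

```python
import cmath
import math


def idft(freq):
    """Naive inverse DFT with an assumed 1/N scaling."""
    n = len(freq)
    return [
        sum(freq[m] * cmath.exp(2j * math.pi * m * t / n) for m in range(n)) / n
        for t in range(n)
    ]


def dft(time):
    """Naive forward DFT (no scaling)."""
    n = len(time)
    return [
        sum(time[t] * cmath.exp(-2j * math.pi * m * t / n) for t in range(n))
        for m in range(n)
    ]


def ofdm_modulate(freq_symbols, prefix_len):
    """IDFT followed by a cyclic prefix copied from the tail of the symbol."""
    time = idft(freq_symbols)
    return time[len(time) - prefix_len:] + time


def multipath(signal, taps):
    """Noiseless channel: linear convolution with a short impulse response."""
    return [
        sum(h * signal[t - d] for d, h in enumerate(taps) if t - d >= 0)
        for t in range(len(signal))
    ]


def ofdm_demodulate(received, prefix_len, n_points):
    """Drop the cyclic prefix and return the per-subcarrier received symbols."""
    return dft(received[prefix_len:prefix_len + n_points])
```

When the impulse response is shorter than the prefix, each demodulated subcarrier equals the transmitted symbol multiplied by one complex channel coefficient, which is the per-subcarrier gain-and-phase model used in the text.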

In the next subsection, we focus on a single subcarrier.

##### 2.5. Incoherent Reception, Channel Gain Known to Receiver

The sequence that is transmitted via a certain subcarrier is now observed by the receiver as sequence . Note that compared to the previous subsection we have dropped the subscript here. Since it is relatively easy to estimate the channel gain, we assume here that it is perfectly known to the receiver, and to ease our analysis we take it to be one. The received sequence now relates to the transmitted sequence as follows: where we assume that is circularly symmetric complex Gaussian with variance per component. Basically we assume that the random channel phase is real-valued and uniform over . This channel phase is fixed over all transmissions and unknown to the receiver.

Accepting a small performance loss as in, for example, Peleg et al. [17] and Chen et al. [22], we may assume that the channel-phase is discrete and uniform over 32 levels, which are uniformly spaced over , hence We will first study the situation in which we consider a uniformly chosen channel phase in a single subcarrier. Later we will also investigate the setting in which a uniformly chosen channel phase is moreover constant over a number of (adjacent) subcarriers.

##### 2.6. Equivalence between DE-QPSK and π/4-DE-QPSK

It is well known and straightforward to show that the π/4-DE-QPSK modulation, which is performed in each of the subcarriers, is equivalent to DE-QPSK. To see this, we define for and for It now follows that , and Now we may conclude that also for all , and that , just like , is circularly symmetric complex Gaussian with variance per component. Moreover, since is Gray-coded with respect to the interleaved code bits, so is . From now on, we will therefore focus on DE-QPSK.
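The equivalence can be checked numerically. The sketch below assumes that the offset phase differences are odd multiples of π/4 and that the derotation multiplies the symbol at position n by a rotation of -nπ/4; these conventions and all names are illustrative assumptions, since the defining equations are not reproduced here.

```python
import cmath
import math

ROT = cmath.exp(1j * math.pi / 4)  # rotation by pi/4


def differential_encode(differences, reference=1 + 0j):
    """Each output symbol is the previous one multiplied by a phase difference."""
    out = [reference]
    for b in differences:
        out.append(out[-1] * b)
    return out


def derotate(sequence):
    """Remove the accumulated offset: element n is rotated by -n * pi / 4."""
    return [value * ROT ** (-n) for n, value in enumerate(sequence)]


labels = [2, 0, 3, 1, 1]
# Offset alphabet (assumed): odd multiples of pi/4.
offset_differences = [cmath.exp(1j * (2 * k + 1) * math.pi / 4) for k in labels]
# Plain QPSK alphabet {1, j, -1, -j}: the same symbols with the offset removed.
plain_differences = [b / ROT for b in offset_differences]

via_offset = derotate(differential_encode(offset_differences))
direct = differential_encode(plain_differences)
```

Here `via_offset` and `direct` agree up to rounding. The same derotation applied to the received sequence only rotates the noise samples, which leaves their circularly symmetric Gaussian distribution unchanged.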

#### 3. Detection and Decoding: Single-Carrier Case, Noniterative

We will start by considering the single-carrier case. For some single subcarrier, we will discuss DE-QPSK modulation with incoherent reception. Based on trellis decoding techniques, we will determine the a posteriori symbol probabilities under the assumption that the (quantized) channel phase is uniform and unknown to the receiver. We also assume the transmitted symbols to be independent of each other and uniform.

##### 3.1. Trellis Representation, Subtrellises, Decomposition

In this section, we will focus on noniterative detection. We start our analysis by noting that if we define for , then, since and is uniform over , it follows that
and . Moreover, for ,
The variables for can now be regarded as states in a trellis, and the *independent uniformly distributed (iud)* symbols correspond to transitions between states. The resulting graphical representation of our trellis can be found in Figure 4.

If we used the standard BCJR algorithm for computing the a posteriori symbol probabilities in the trellis in Figure 4, we would have to do multiplications in the forward pass, multiplications in the backward pass, and multiplications and 4 normalizations in the combination pass, per trellis section, if the a priori probabilities are all equal. In total, this is 512 multiplications and 4 normalizations per trellis section. We focus only on multiplications and normalizations in this paper since additions have a smaller complexity than multiplications and normalizations. (In the log-domain, multiplications and normalizations are replaced by additions, and additions are typically approximated by maximizations. This would more or less suggest considering multiplications, normalizations, as well as additions, but for simplicity we neglect the additions here.)

An important observation for our investigations is that the trellis can be seen to consist of eight subtrellises , that are *not connected to each other*. A similar observation was made by Chen et al. [22]. We will discuss connections between our work on the trellis decomposition and that of [22] later.

Subtrellis consists of states , for . Figure 4 shows the entire subtrellis , and the first section of subtrellis and of subtrellis .
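The decomposition into unconnected subtrellises can be made concrete by enumerating which states are reachable from which. The sketch below assumes that state index `s` (0 to 31) stands for the point at angle 2πs/32 and that a DE-QPSK symbol advances the index by a multiple of 8; this indexing is an illustrative convention, not necessarily the paper's notation.

```python
LEVELS = 32          # discrete channel-phase levels, as in the text
STEP = LEVELS // 4   # a QPSK symbol rotates the state by a multiple of 8 levels


def next_states(state):
    """States reachable from `state` in one trellis section."""
    return [(state + STEP * k) % LEVELS for k in range(4)]


def subtrellises():
    """Group the 32 states into classes closed under all QPSK transitions."""
    unvisited = set(range(LEVELS))
    classes = []
    while unvisited:
        frontier = [min(unvisited)]
        members = set()
        while frontier:
            state = frontier.pop()
            if state in members:
                continue
            members.add(state)
            frontier.extend(next_states(state))
        classes.append(sorted(members))
        unvisited -= members
    return classes


classes = subtrellises()
```

Under this indexing the search yields eight classes of four states each, and the class containing state `s` is identified by `s % 8`.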

Note that for the likelihood corresponding to some state for in the trellis or in a subtrellis, we can write that

##### 3.2. Forward-Backward Algorithm, Subtrellises

In this subsection, we would like to focus on computing the a posteriori symbol probabilities for all and all values . It will be demonstrated that it is a relatively simple exercise to do this. We will show that the resulting a posteriori probability is a *convex combination* of the a posteriori probabilities corresponding to the eight subtrellises. Computing the a posteriori probabilities for each subtrellis is simple and can be done without performing the BCJR algorithm, as was demonstrated by Colavolpe [5]. The coefficients of the convex combination do not depend on the trellis section index and are quite easy to determine as we will show here.

###### 3.2.1. Forward Recursion

In our forward pass, we focus on subtrellis , for some . For that subtrellis we find out how to compute all the ’s in that subtrellis first. Starting from for all , we can compute the ’s recursively from for and . The notation stands for all states and symbols that lead to next state .

Lemma 1. *If for we define , then we have for and .*

*Proof. *Our proof is based on induction. Clearly for the result holds. Now assume that for , then from (13) we obtain
for all .

###### 3.2.2. Backward Recursion

Also in the backward pass we first focus only on subtrellis for some . In this subtrellis, we would like to compute the ’s. Taking for we can compute all other ’s from where again and .

Lemma 2. *Based on the definition of for all , we get for and all .*

*Proof. *Again our proof is based on induction. Note first that for the result holds. Now assume that , for . Then
for all .

##### 3.3. Combination

To determine the a posteriori symbol probability for symbol value , we compute the joint probability and density If we consider the “middle’’ term in (19), then we see that From this we may conclude that with Now observing that for and , we can write that The right-hand side of this equation can be interpreted as a convex combination of a posteriori symbol probabilities , one for each subtrellis, where the weighting-coefficients are the a posteriori subtrellis probabilities . An a posteriori subtrellis probability is the conditional probability that the discrete channel phase modulo 8 equals for some given .

The demodulator that operates according to (25) has three tasks. First, the eight weighting coefficients (23) have to be computed. Then, for each of the eight subtrellises, for all symbol values and all , the a posteriori symbol probabilities have to be computed. Finally, the weighting (25) has to be done. Computing the weighting coefficient requires for each subtrellis the computation of the factors for . These factors should then be multiplied and normalized to form . For these computations, 8 multiplications per trellis section are needed. Computing the a posteriori symbol probabilities can be done efficiently by applying the Colavolpe [5] technique to each subtrellis. As in Colavolpe, each such a posteriori symbol probability is based on only two received symbols and as is shown in (24). This avoids the use of the BCJR method in full generality and leads to significant complexity reductions, that is, only multiplications and normalizations are needed per trellis section. The weighting operation requires multiplications, and therefore in total this approach leads to multiplications and 32 normalizations, which is considerably less than what we need for full BCJR.
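The three tasks of the demodulator can be sketched as follows, in the spirit of (23) to (25). The state indexing (state `s` is the point at angle 2πs/32, subtrellis `m` holds states `m + 8l`), the unnormalised Gaussian likelihood with per-component noise variance `sigma2`, and all function names are assumptions of this sketch. It works in the probability domain, so it can underflow at very high signal-to-noise ratios or for very long trellises.

```python
import cmath
import math

LEVELS = 32  # assumed number of discrete channel-phase levels


def gamma(r, state, sigma2):
    """Unnormalised likelihood of r given the point exp(j*2*pi*state/32)."""
    point = cmath.exp(2j * math.pi * state / LEVELS)
    return math.exp(-abs(r - point) ** 2 / (2.0 * sigma2))


def subtrellis_posteriors(received, sigma2):
    """Weighting coefficients: P(subtrellis m | received) for m = 0..7."""
    weights = []
    for m in range(8):
        w = 1.0
        for r in received:
            # One factor per received symbol: sum over the 4 states of subtrellis m.
            w *= sum(gamma(r, m + 8 * l, sigma2) for l in range(4))
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def symbol_posteriors_in_subtrellis(r_prev, r_cur, m, sigma2):
    """P(symbol = j**k | subtrellis m), based on two received symbols only."""
    scores = [
        sum(gamma(r_prev, m + 8 * l, sigma2) * gamma(r_cur, m + 8 * (l + k), sigma2)
            for l in range(4))
        for k in range(4)
    ]
    total = sum(scores)
    return [s / total for s in scores]


def symbol_posteriors(received, n, sigma2):
    """Convex combination over the 8 subtrellises for the symbol between
    received[n - 1] and received[n]."""
    weights = subtrellis_posteriors(received, sigma2)
    out = [0.0] * 4
    for m, w in enumerate(weights):
        p = symbol_posteriors_in_subtrellis(received[n - 1], received[n], m, sigma2)
        for k in range(4):
            out[k] += w * p[k]
    return out
```

For short sequences the result can be compared against a brute-force sum over all 32 phase levels and all symbol sequences, which gives the same probabilities when the symbols are independent and uniform.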

##### 3.4. Dominant Subtrellis Approach

Equation (25) shows how the *exact* a posteriori symbol probabilities can be determined. If the a posteriori subtrellis probabilities are such that one of the probabilities dominates the other ones then weighting (25) can be approximated by
Observe that this approach involves the computations of the a posteriori symbol probabilities, as described in (24), only for the dominant subtrellis . This requires multiplications and 4 normalizations only per trellis section. Together with the computation of the weighting coefficients multiplications and 4 normalizations are necessary. Therefore, this reduces the number of multiplications with respect to full weighting by a factor of seven.
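A minimal sketch of the dominant subtrellis approach is given below. It selects the subtrellis with the largest a posteriori probability and evaluates the two-symbol rule in that subtrellis only. As before, the state indexing, the likelihood `gamma`, the per-component noise variance `sigma2`, and the function names are assumptions of this sketch; the selection is done with sums of logarithms, which would fail if a per-symbol sum underflowed to zero.

```python
import cmath
import math


def gamma(r, state, sigma2):
    """Unnormalised likelihood of r given the point exp(j*2*pi*state/32)."""
    point = cmath.exp(2j * math.pi * state / 32)
    return math.exp(-abs(r - point) ** 2 / (2.0 * sigma2))


def dominant_subtrellis(received, sigma2):
    """Index m in 0..7 with the largest a posteriori subtrellis probability."""
    def log_weight(m):
        return sum(
            math.log(sum(gamma(r, m + 8 * l, sigma2) for l in range(4)))
            for r in received
        )
    return max(range(8), key=log_weight)


def dominant_symbol_posteriors(received, n, sigma2):
    """Approximate P(symbol = j**k | received) using the dominant subtrellis only."""
    m = dominant_subtrellis(received, sigma2)
    scores = [
        sum(gamma(received[n - 1], m + 8 * l, sigma2)
            * gamma(received[n], m + 8 * (l + k), sigma2)
            for l in range(4))
        for k in range(4)
    ]
    total = sum(scores)
    return m, [s / total for s in scores]


# Noiseless example: phase level 11 (subtrellis 3), symbols 1, 2, 3.
states = [11, 19, 3, 27]
received = [cmath.exp(2j * math.pi * s / 32) for s in states]
m, post = dominant_symbol_posteriors(received, 1, 0.05)
```

In a real demodulator the dominant index would be computed once per block and reused for every trellis section, rather than recomputed on each call as in this sketch.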

##### 3.5. Simulations

We use in our simulations, just like Peleg et al. [17], the de facto industry standard convolutional code with generator polynomials and , which is equal to the convolutional code with puncturing index of Table 29 in [1, Section 11.1.2, page 131]. The DAB, DAB+, and T-DMB bit-reversal time interleaver and block frequency interleaver are modeled by a bitwise uniform block interleaver generated for each simulated code block of bits, hence, any permutation of the coded bits is a permissible interleaver and is selected with equal probability, as is done in [17].

The demodulator calculates, for each OFDM-subcarrier, the a posteriori probability given by (25) for , and 32. The demodulator is followed by a convolutional decoder, which needs as input soft-decision information about the coded bits. Now, it follows from Gray mapping, that is, that the desired metrics related to transmission , that is, the log-likelihood ratios (LLRs) [30], can be expressed as with symbol metric and where corresponds to bit and to bit .
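The conversion from a posteriori symbol probabilities to bit-level LLRs can be illustrated as follows. The specific Gray labelling in `GRAY_LABELS`, the sign convention (logarithm of the probability of bit 0 over that of bit 1), and the function name are assumptions made for this sketch; the actual labelling is fixed by the standard.

```python
import math

# Assumed Gray labelling of the four phase-difference indices k = 0..3;
# neighbouring indices (cyclically) differ in exactly one bit.
GRAY_LABELS = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}


def bit_llrs(symbol_probs):
    """LLRs ln(P(bit = 0) / P(bit = 1)) for the two bits carried by one symbol."""
    llrs = []
    for position in range(2):
        p_zero = sum(p for k, p in enumerate(symbol_probs)
                     if GRAY_LABELS[k][position] == 0)
        p_one = sum(p for k, p in enumerate(symbol_probs)
                    if GRAY_LABELS[k][position] == 1)
        llrs.append(math.log(p_zero / p_one))
    return llrs


llr_first, llr_second = bit_llrs([0.7, 0.1, 0.1, 0.1])
```

Both LLRs in this example are positive, since the most likely symbol carries the label (0, 0) under the assumed mapping. A zero symbol probability mass on one bit value would make the ratio undefined, so practical implementations clip the probabilities or work in the log domain.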

Figure 5 shows the Bit-Error Rate (BER) performance with the so-called ideal LLRs for a decomposed trellis for trellis-length , and 32. On the horizontal axis is the signal-to-noise ratio . The demodulator operates according to (25).

We compare the performance of this demodulator with that of two well-known procedures described in the literature. First, we compare it to “classical” DQPSK [35, Section 4.5-5, page 224], that is, two-symbol differential detection (2SDD). This leads to a posteriori symbol probabilities as in (9) in Divsalar and Simon [3], that is, to where is the zeroth-order modified Bessel function of the first kind. Second, we compare our results to coherently detected DE-QPSK. We assume that the received sequence is perfectly derotated, that is, . Then the a posteriori symbol probabilities are given by as described by Colavolpe [5]. Note that (31) is similar to (24) for .
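For the 2SDD reference, the sketch below shows one way to obtain symbol probabilities from two received symbols when the channel phase is continuous and uniform. Averaging the Gaussian likelihood over the phase gives a score proportional to the Bessel function I0 evaluated at |r_prev + r_cur · conj(b)| / σ², with σ² the per-component noise variance. This form is derived here from the channel model and may differ in normalisation from the expression in [3]; the power-series `bessel_i0` and all names are assumptions of the sketch.

```python
import math

QPSK = [1 + 0j, 1j, -1 + 0j, -1j]  # DE-QPSK symbol alphabet, index k <-> j**k


def bessel_i0(x):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    term, total, k = 1.0, 1.0, 0
    quarter_x2 = 0.25 * x * x
    while term > 1e-17 * total:
        k += 1
        term *= quarter_x2 / (k * k)
        total += term
    return total


def two_symbol_posteriors(r_prev, r_cur, sigma2):
    """P(symbol = j**k | r_prev, r_cur) for uniform a priori symbols."""
    scores = [
        bessel_i0(abs(r_prev + r_cur * b.conjugate()) / sigma2)
        for b in QPSK
    ]
    total = sum(scores)
    return [s / total for s in scores]
```

Because only the magnitude of the combined observation enters the score, the result is unchanged when both received symbols are rotated by the same unknown phase, which is what makes the rule noncoherent.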

The simulation results, which are shown in Figure 5, demonstrate that the BER performance curves of 2SDD and trellis length are practically identical, as we expect. Moreover, the coherent-detection curve and the curve for very large trellis sizes () are very close. The small performance loss is due to discretizing the channel phase with 32 levels. Furthermore, Figure 5 shows that (a) larger values of result in performance closer to the coherent-detection performance, and (b) for ideally computed LLRs for a decomposed trellis perform quite close to coherent detection, that is, the difference in signal-to-noise ratio () is less than dB at a BER of .

Next, in Figure 6, we turn to the dominant subtrellis approach, which is denoted by “Max” in the legend. We compare for trellis-length , and 32, the difference in performance between ideal LLRs based on the a posteriori probabilities given by (25) and the approximated LLRs based on the dominant-subtrellis a posteriori probabilities specified in (26). It can be seen from Figure 6 that (a) for larger , the difference between the exact and approximated LLRs becomes smaller, and (b) for , the difference between the ideal LLRs and the approximated LLRs is less than dB.

##### 3.6. Some Conclusions

Our simulations demonstrate that for trellis length , the ideal LLRs and the approximated LLRs have a performance quite close to that of coherent detection. The difference in signal-to-noise ratio is less than dB at a BER of for the dominant-trellis approach. Therefore, if we focus on a BER of for obtaining an acceptable performance, with single subcarrier transmission, we need a trellis length .

With a trellis length of received symbols, the channel coherence time needs to be in the order of , where is the OFDM symbol time. This imposes quite a strong restriction on the time-varying behavior of the channel. In practice, the channel may not be coherent for so long, and therefore focussing on trellis-length might not be realistic. We will discuss this effect in more detail in Section 7, where we study a typical urban channel. There is a second reason for arguing that large values of are undesirable. DAB systems support, for complexity reduction, *per service symbol processing*. In such services, typically, at most subsequent OFDM symbols are contained in a single convolutionally encoded word, see Figure 2, and this is incompatible with processing more than four OFDM symbols in a demodulation trellis.

After having concluded that we cannot make too large, it makes sense to investigate the possibility of using a number of (adjacent) subcarriers to jointly determine the a posteriori symbol probabilities for the corresponding DE-QPSK streams. Instead of using a single trellis with length , we could find out whether a similar performance can be obtained with a 2D block of trellises of length corresponding to adjacent subcarriers, see Figure 3. This will be the subject of the next section.

#### 4. Detection and Decoding: Multicarrier Case, Noniterative

##### 4.1. Demodulation Procedures

We have seen that the trellis-length needs to be as large as possible. For an acceptable performance, it must be larger than 32. This condition cannot always be met in practice. Therefore, we investigate jointly decoding a 2D block of received symbols. It would be convenient if we could again decompose the computation of the a posteriori probabilities as in (25) when we consider what was received over several subcarriers.

Now we assume that in each subcarrier , a sequence is conveyed using differential encoding. For the components of the transmitted sequence , we can write and . We assume that the channel phase is constant over the block of symbols; therefore, where and uniform just as before, and the noise variables are circularly symmetric complex Gaussian with variance per component. The output sequence corresponding to subcarrier is denoted by .

Just like in the single carrier case, we can determine the a posteriori subtrellis probabilities: where for and where . Note that for the likelihood corresponding to some state for in the trellis or in a subtrellis, we can write that

Now the a posteriori symbol probability for can be written as where for and .

This suggests that the demodulator first determines the a posteriori subtrellis probabilities (weighting coefficients) using (38), for which the first -factors have to be computed. Using the weighting coefficients, the convex combination in (37) then leads to the a posteriori symbol probabilities. Finding the a posteriori symbol probabilities again can be done using the Colavolpe [5] method for each subcarrier and for each subtrellis, where again such an a posteriori symbol probability is based on only the two received symbols and as is shown in (39). Again the BCJR method in full generality is not needed, and the number of required multiplications and normalizations per trellis section is the same as in the single carrier case.

Equation (37) shows how the exact a posteriori symbol probabilities can be determined. Just like in the single-carrier case, if the a posteriori subtrellis probabilities are such that one of the probabilities dominates the others, then weighting (37) can be approximated as follows: with Again this approach involves the computations of the a posteriori symbol probabilities only for the dominant subtrellis . The resulting number of multiplications and normalizations per trellis section is the same as for the single carrier case.
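The multicarrier procedure can be sketched by pooling the per-symbol factors of all subcarriers in the 2D block when computing the subtrellis weights, in the spirit of (37) to (39). The sketch assumes one common discrete channel phase per block, treats the first transmitted symbol of every subcarrier as unknown and uniform over the QPSK alphabet, and reuses the state indexing and likelihood conventions introduced for illustration earlier. The helper `noiseless_sequence` exists only to build a demonstration block and is hypothetical.

```python
import cmath
import math


def gamma(r, state, sigma2):
    """Unnormalised likelihood of r given the point exp(j*2*pi*state/32)."""
    point = cmath.exp(2j * math.pi * state / 32)
    return math.exp(-abs(r - point) ** 2 / (2.0 * sigma2))


def noiseless_sequence(theta, labels):
    """Hypothetical demo helper: noiseless received symbols for phase level theta
    (the first transmitted symbol is taken as 1 for this demo only)."""
    state = theta
    out = [cmath.exp(2j * math.pi * state / 32)]
    for k in labels:
        state = (state + 8 * k) % 32
        out.append(cmath.exp(2j * math.pi * state / 32))
    return out


def block_subtrellis_posteriors(block, sigma2):
    """P(subtrellis m | block); block[c] is the received sequence of subcarrier c."""
    log_w = [
        sum(math.log(sum(gamma(r, m + 8 * l, sigma2) for l in range(4)))
            for seq in block for r in seq)
        for m in range(8)
    ]
    peak = max(log_w)
    weights = [math.exp(v - peak) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def block_symbol_posteriors(block, carrier, n, sigma2):
    """Convex combination over subtrellises for symbol n of one subcarrier."""
    weights = block_subtrellis_posteriors(block, sigma2)
    r_prev, r_cur = block[carrier][n - 1], block[carrier][n]
    out = [0.0] * 4
    for m, w in enumerate(weights):
        scores = [
            sum(gamma(r_prev, m + 8 * l, sigma2) * gamma(r_cur, m + 8 * (l + k), sigma2)
                for l in range(4))
            for k in range(4)
        ]
        total = sum(scores)
        for k in range(4):
            out[k] += w * scores[k] / total
    return out


# Four subcarriers, two received symbols each, common phase level 5.
block = [noiseless_sequence(5, [k]) for k in (0, 1, 2, 3)]
weights_block = block_subtrellis_posteriors(block, 0.05)
weights_single = block_subtrellis_posteriors(block[:1], 0.05)
```

In this example the weight of the true subtrellis is larger for the four-subcarrier block than for a single subcarrier, which illustrates why extending the block in the subcarrier direction helps when the trellis length is short.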

##### 4.2. Simulations

In the previous section, we analyzed and simulated the single subcarrier approach. Here we will discuss the simulations corresponding to the multi-carrier method. We will again study the coded BER versus the signal-to-noise ratio . The BER performance for the ideal LLRs, based on a posteriori probabilities computed as in (37), is shown in Figure 7 with a fixed block size of . This fixed block-size is realized by the parameter pair values , and . The detector operates according to (37). The performance of 2SDD and coherently detected DE-QPSK are shown as reference curves.

From Figure 7, it can be observed that a 2D decomposition with a shortest possible trellis-length of and adjacent subcarriers performs identically to the largest trellis-length and subcarrier, that is, the single-carrier case. Intermediate cases also perform identically.

We do not show the results of the dominant subtrellis approach for the multi-carrier case here, since these results are identical to the corresponding results for the single-carrier case shown in Figure 6 in the previous section.

##### 4.3. Conclusion Noniterative Decoding

Our investigations for the noniterative 2D-case show that we are very close to the performance of coherent detection of DE-QPSK even for small values of the trellis length , by processing simultaneously several subcarriers. A next question is whether we can do better than this. In the literature, see, for example, Peleg et al. in [17] and Chen et al. [22], it is demonstrated that iterative decoding techniques lead to good results for differential encoding. Therefore, in the sequel of this paper we study iterative decoding techniques for DAB-like streams, with a special focus again on 2D blocks.

#### 5. Detection and Decoding: Single-Carrier Case, Iterative

In the following two sections, we consider iterative decoding procedures. Peleg and Shamai [13] first demonstrated that iterative techniques could increase the performance of the demodulation procedures of DE-QPSK streams significantly. We specialize their approach to DAB systems, and solve a problem connected to the length of the trellises for each subcarrier, which is quite small in practice, by turning to 2D blocks for iterative demodulation.

##### 5.1. Serial Concatenation

In the current section, we will investigate iterative decoding procedures for DAB-like systems, which are based on convolutional encoding, interleaving, and DE-QPSK modulation. If we consider DE-QPSK modulation as the inner coding method and convolutional encoding as the outer code, then it is obvious that we can apply techniques developed for serially concatenated coding systems here, see Figure 8. Serially concatenated turbo codes were proposed by Benedetto and Montorsi [10] and later investigated in more detail in Benedetto et al. [12]. Iterating between the DPSK-demodulator and convolutional decoder for the incoherent case was first suggested (for a single carrier) by Peleg and Shamai [13]. Hoeher and Lodge [20] also applied iterative techniques to the incoherent case, but focussed on channel estimation, to be able to use coherent detection. For an overview of related results, all for the single-carrier case, see Chen et al. [22].

We will start in this section by considering the single carrier case and our aim is again to find out what we can gain from decomposing the trellis used in the demodulator into a part that corresponds to the channel phase and a part that relates to differential encoding. In the section that follows, we will consider the multi-carrier setting.

##### 5.2. Peleg Approach

In this subsection, we investigate the forward-backward procedures where we drop the assumption that the symbols , are uniformly distributed. Interleaving should still guarantee the independence of the symbols, however.

Just like Peleg et al. [17], we focus on the entire trellis . Note, however, that our trellis is different from that of [17], in which tracking of small channel phase variations is made possible by adding “side-step” transitions. We do not have such transitions in our trellis and, therefore, our trellis can be decomposed into eight unconnected subtrellises. In the next subsection, we take advantage of this decomposition; however, first we will consider the undecomposed trellis.

Again starting from for all , we can compute the ’s recursively from for and . Also in the backward pass, we consider the entire trellis . Taking for , we can compute all other ’s from where again , and .

To determine the a posteriori symbol probability for symbol value , we compute the joint probability and density This expression also tells us how the resulting extrinsic information can be determined. It can be checked, see Benedetto and Montorsi [10], that multiplying by the factors in the a posteriori information (44) should be omitted for obtaining extrinsic information. The extrinsic information is now further processed by the convolutional decoder. The results of the iterative procedure are discussed in Section 5.5.
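As a rough illustration of the forward pass, the backward pass, and the combination step with non-uniform a priori probabilities, the following sketch runs them on a single four-state subtrellis rather than on the entire trellis. The state labelling (a phase index that advances by the symbol value modulo 4), the Gaussian branch metric, the known phase offset, and all function and parameter names are assumptions made for this example and are not taken from the paper. Leaving the a priori factor of the symbol under consideration out of the combination step yields extrinsic rather than a posteriori probabilities.

```python
import cmath
import math


def extrinsic_symbol_probs(y, prior, sigma2, phase=0.0):
    """Extrinsic symbol probabilities on one 4-state differential-QPSK subtrellis.

    y      : received complex samples y[0..n] (n + 1 samples for n symbols)
    prior  : prior[t][b], a priori probability that symbol t equals b (b = 0..3)
    sigma2 : noise variance used in the Gaussian branch metric (assumed known)
    phase  : channel phase assumed for this subtrellis, in radians (assumption)
    """
    n = len(y) - 1  # number of information symbols, i.e. trellis sections

    # gamma[t][s]: likelihood (up to a constant) of y[t] given phase state s.
    gamma = [
        [math.exp(-abs(y[t] - cmath.exp(1j * (phase + s * math.pi / 2))) ** 2 / sigma2)
         for s in range(4)]
        for t in range(n + 1)
    ]

    # Forward pass: uniform start state, priors weight every transition.
    alpha = [[0.0] * 4 for _ in range(n + 1)]
    alpha[0] = list(gamma[0])
    for t in range(n):
        for s in range(4):
            for b in range(4):
                nxt = (s + b) % 4  # differential encoding: state advances by b
                alpha[t + 1][nxt] += alpha[t][s] * prior[t][b] * gamma[t + 1][nxt]
        total = sum(alpha[t + 1])
        alpha[t + 1] = [a / total for a in alpha[t + 1]]  # avoid underflow

    # Backward pass over the same subtrellis.
    beta = [[1.0] * 4 for _ in range(n + 1)]
    for t in range(n - 1, -1, -1):
        for s in range(4):
            beta[t][s] = sum(
                prior[t][b] * gamma[t + 1][(s + b) % 4] * beta[t + 1][(s + b) % 4]
                for b in range(4)
            )
        total = sum(beta[t])
        beta[t] = [v / total for v in beta[t]]

    # Combination: prior[t][b] is deliberately omitted to obtain extrinsic values.
    result = []
    for t in range(n):
        unnorm = [
            sum(alpha[t][s] * gamma[t + 1][(s + b) % 4] * beta[t + 1][(s + b) % 4]
                for s in range(4))
            for b in range(4)
        ]
        total = sum(unnorm)
        result.append([u / total for u in unnorm])
    return result
```

Multiplying each returned row by the corresponding a priori probabilities and renormalizing would give a posteriori probabilities for this subtrellis.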

Using the standard BCJR algorithm for computing the extrinsic symbol probabilities in the trellis in Figure 4, since a priori symbol probabilities are non-uniform now, leads to multiplications in the forward pass, multiplications in the backward pass, and multiplications and 4 normalizations in the combination pass for computing extrinsic information, per trellis section. In total, this is 768 multiplications and 4 normalizations per trellis section per iteration. In the next subsection, we investigate the decomposition of the demodulation trellis.

##### 5.3. Trellis Decomposition

Here we investigate whether we can decompose the entire trellis for the case where the a priori probabilities are nonuniform. We are interested in decomposing (44) in such a way that we can write for all . The question now is how to compute the a posteriori subtrellis probabilities for .

It can be shown that and therefore Now for each subtrellis, we can determine the a posteriori symbol probabilities using and by omitting the factor in (48), the corresponding extrinsic information. Note that this approach requires a backward pass through the entire trellis , first to find the weighting probabilities , for . This requires multiplications per trellis section observing that in (43), can be put in front of the summation sign. Then for all subtrellises , we do a forward pass (requiring multiplications per section) and then combine the results to obtain the extrinsic symbol probabilities for that subtrellis (for which we need multiplications and normalizations per section). Finally these probabilities have to be weighted as in (45) which requires multiplications. In total, this results in 704 multiplications and 32 normalizations per iteration. It should be noted that decomposition of the trellis does not result in a significant complexity reduction with respect to the Peleg approach. In the next subsection, we will discuss an approach that gives a relevant complexity reduction, however.
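The weighting step can be pictured as a convex combination: the symbol probabilities obtained in each subtrellis are averaged with the a posteriori subtrellis probabilities as weights. The sketch below is only an illustration under that reading; the function name and the list-based data layout are hypothetical.

```python
def weighted_symbol_probs(symbol_probs, subtrellis_probs):
    """Convex combination of per-subtrellis symbol probabilities.

    symbol_probs[k][b]  : probability of symbol value b computed in subtrellis k
    subtrellis_probs[k] : a posteriori probability of subtrellis k (sums to 1)
    """
    n_values = len(symbol_probs[0])
    combined = []
    for b in range(n_values):
        total = 0.0
        for weight, probs in zip(subtrellis_probs, symbol_probs):
            total += weight * probs[b]  # one multiplication per subtrellis and value
        combined.append(total)
    return combined
```

Because the weights sum to one and every row is a probability vector, the combined vector is again a probability vector.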

##### 5.4. Dominant Subtrellis Approaches

To achieve a complexity reduction, we investigate a method that first finds the dominant subtrellis at the start of a new iteration and then performs the forward-backward processing for demodulation only in this dominant subtrellis.

Finding the dominant subtrellis for an iteration is done based on the a posteriori subtrellis probabilities that are computed before starting this iteration. Now assuming that one of the a posteriori subtrellis probabilities dominates the other ones, we can write This approach involves the computations of the a posteriori symbol probabilities (and corresponding extrinsic information) as described in (42), (43), and (44) only for the dominant subtrellis . Computing the a posteriori subtrellis probabilities for each iteration and then focussing only on the forward pass and combination computations is less complex than following the Peleg procedure. For the best subtrellis , we do a forward pass ( multiplications per trellis section) and then we combine the results to obtain the a posteriori (actually extrinsic) symbol probabilities for that subtrellis ( multiplications and 4 normalizations per section). In total, we now need 224 multiplications and 4 normalizations per trellis section per iteration.
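A minimal sketch of the dominant-subtrellis idea, assuming the a posteriori subtrellis probabilities are already available as a list: pick the index with the largest probability and use only the symbol probabilities of that subtrellis. The names are hypothetical, and in a receiver the symbol probabilities would be computed only for the selected subtrellis instead of being passed in for all of them.

```python
def dominant_subtrellis(subtrellis_probs):
    """Index of the subtrellis with the largest a posteriori probability."""
    best = 0
    for k, p in enumerate(subtrellis_probs):
        if p > subtrellis_probs[best]:
            best = k
    return best


def dominant_symbol_probs(symbol_probs, subtrellis_probs):
    """Approximate the weighted combination by the dominant subtrellis alone."""
    return symbol_probs[dominant_subtrellis(subtrellis_probs)]
```

The approximation is close to the exact weighting only when one subtrellis probability is near one; when several subtrellises carry comparable weight, the two results can differ noticeably.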

A second approach involves choosing the dominant subtrellis only once, before starting with the iterations. Since before starting the iterations the a priori probabilities , that is, are all equal, the analysis in Section 3.4 applies. The a posteriori subtrellis probabilities can be computed as in (23). Now we do the iterations only in the subtrellis that was chosen initially. This approach requires 84 multiplications and 4 normalizations per trellis section per iteration and is therefore substantially less complex than the Peleg technique. In our simulations, we will only use this last technique when we address dominant subtrellises.
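To put the multiplication counts quoted in Sections 5.2 to 5.4 side by side, the short sketch below expresses each one as a fraction of the Peleg approach. The count of 704 is quoted "per iteration" in the text and is assumed here to be per trellis section like the others; the dictionary keys and the function name are hypothetical labels.

```python
# Multiplications per trellis section per iteration, as quoted in the text.
MULTIPLICATIONS = {
    "Peleg approach (entire trellis)": 768,
    "trellis decomposition with weighting": 704,
    "dominant subtrellis, chosen every iteration": 224,
    "dominant subtrellis, chosen once": 84,
}


def relative_cost(counts, reference):
    """Return each count divided by the count of the reference method."""
    ref = counts[reference]
    return {name: value / ref for name, value in counts.items()}


if __name__ == "__main__":
    ratios = relative_cost(MULTIPLICATIONS, "Peleg approach (entire trellis)")
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.1%} of the reference")
```

Under these numbers, choosing the dominant subtrellis once needs 84/768 = 7/64 of the multiplications of the Peleg approach, whereas decomposition with full weighting still needs 11/12 of them.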

##### 5.5. Simulations

We simulated the Peleg method described in [17] and determined the BER versus the signal-to-noise ratio . This BER performance is shown in Figure 9 for trellis lengths practically infinite, that is, and ideal LLRs are based on the a posteriori probability given by (44). The BER performance is shown for iterations, where stands for no iterations. Note that since we are using ideal LLRs and infinite trellis lengths, the corresponding curves shown in Figure 9 can be regarded as target curves for the iterative (single-carrier) case. In addition, also here, 2SDD and coherently detected DE-QPSK curves are shown for reference. Not in the figure are the curves corresponding to the approach based on decomposing the trellis and using weighting as in (45). As expected, the performance of this approach shows no differences with the Peleg approach in (44). From Figure 9, it can be seen that for a the improvement in required signal-to-noise ratio is dB after iterations. Figure 9 also shows that improvement decreases with the number of iterations and that the first iteration yields the largest improvement. Similar results were obtained by Peleg et al. [17].

To see how the performance in the iterative case depends on the trellis length , we simulated the Peleg approach for , and 32, for iterations. The results are in Figure 10. It can be seen that the “iterative coding gain” increases, as expected, with and that, for , the performance is already quite close to that of .

Finally, we compared for and 32, the difference in BER between the exact LLRs based on the a posteriori (extrinsic) probability given by (44) or (45) and the approximated LLRs based on the a posteriori (extrinsic) probability given by (49). The results are shown in Figure 11. We can conclude from Figure 11 that for larger , the difference in performance between the exact and approximated LLRs becomes smaller and that for the difference between the ideal LLRs and the approximation versions, by selecting the dominant subtrellis before starting with the iteration process, is less than dB.

#### 6. Detection and Decoding: Multicarrier Case, Iterative

##### 6.1. Trellis Decomposition

Just like in the noniterative multicarrier case, we do the processing based on trellis decomposition and focus on the computation of the a posteriori subtrellis probabilities: Note that for and therefore it follows from (46) that for each subcarrier .

Now the a posteriori symbol probability for can be written as in (37), that is, where is computed as given by (48) for and . From these a posteriori probabilities, we can compute the extrinsic information that is needed by the convolutional decoder. Computing extrinsic information is actually slightly easier since it involves fewer multiplications.

This suggests that, for each iteration, the demodulator first determines the a posteriori subtrellis probabilities using (50), for which first a backward pass in each of the trellises corresponding to the subcarriers is needed.
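One way to picture how several subcarriers can sharpen the subtrellis decision is sketched below. It assumes, and this is an assumption of the example rather than a statement of (50), that all subcarriers in a 2D block share the same subtrellis, that their observations are conditionally independent given it, and that the subtrellises are a priori equally likely, so that the per-subcarrier likelihoods multiply. The computation is done in the log domain for numerical stability; all names are hypothetical.

```python
import math


def joint_subtrellis_posterior(log_likelihoods):
    """Combine per-subcarrier subtrellis likelihoods into one posterior.

    log_likelihoods[m][k] : log-likelihood of the observations on subcarrier m
                            given subtrellis k (e.g. taken from a backward pass)
    Returns a list with the a posteriori probability of each subtrellis.
    """
    n_sub = len(log_likelihoods[0])
    totals = [0.0] * n_sub
    for row in log_likelihoods:
        for k in range(n_sub):
            totals[k] += row[k]  # product of likelihoods = sum of logs
    peak = max(totals)
    weights = [math.exp(t - peak) for t in totals]  # subtract peak to avoid underflow
    norm = sum(weights)
    return [w / norm for w in weights]
```

With two subcarriers that each favour the first of two subtrellises by 0.6 to 0.4, the joint posterior becomes 0.36/(0.36 + 0.16), about 0.69, which illustrates why processing more subcarriers makes one subtrellis stand out more clearly.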

Using the weighting coefficients, the convex combination in (52) leads to the a posteriori symbol probabilities. Finding the a posteriori symbol probabilities should be done in the standard way, taking into account that the backward passes were already carried out.

##### 6.2. Dominant Subtrellis Approach

Equation (52) shows how the exact a posteriori symbol probabilities can be determined, in each iteration. Just like in the single-carrier case, if the a posteriori subtrellis probabilities are such that one of the probabilities dominates the other ones, then convex combination (52) can be approximated as follows: with Again this approach involves, in each iteration, the computations of the a posteriori symbol probabilities only for the dominant subtrellis .

If we compute the dominant subtrellis only before the start of the iteration process, we obtain a significant complexity reduction since the analysis in Section 3.4 applies. Moreover, all iterations are done in the initially chosen subtrellis. The methods described here will be evaluated in the next subsection.

##### 6.3. Simulations

We have seen before that in the noniterative multi-carrier case the performance was more or less determined by the size of the block. If the channel cannot be assumed to be constant for large values of , we can always increase the number of subcarriers if the frequency selectivity allows this. Note that keeping small also has advantages related to service symbol processing [1]. Here the situation is slightly different as is demonstrated in Figure 12. Increasing has a positive effect on the performance; however, since the trellis-length remains constant (and is quite small), the effect of iterating is limited. We see, however, that by increasing from 1 to 8 we get an improvement of roughly dB.

Finally, we compare for and 32, for , the difference between the performance of exact LLRs based on the a posteriori (extrinsic) probabilities given by (52) and the approximated LLRs based on the a posteriori (extrinsic) probabilities given by (53), see Figure 13. We can observe from Figure 13 that, as expected, the larger is, the smaller the difference between the exact and approximated LLRs becomes. For , the difference between the ideal LLRs and the approximation, by selecting the dominant subtrellis before starting to iterate, is roughly dB.

#### 7. Performance for TU-6 Channel Model

So far we have used AWGN channels with unknown channel phase and fixed (unit) gain in our analysis and simulations. To investigate the performance in practice, we have used the TU-6 (Typical Urban 6 taps) channel model defined in [36], which is commonly used to test DAB, DAB+, or T-DMB transmission. Two maximum Doppler frequencies are chosen, namely 10 and 20 Hz, which for DAB transmission in Band-III represent relative speeds between transmitter and receiver of *≈*45 and *≈*90 km/h, respectively.

We use our methods for DAB transmission in Mode-I, where the inverse subcarrier spacing ms and where the cyclic-prefix period s [1].

Now, with these settings, the *normalized Doppler rate * is 0.01 and 0.02, respectively.
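The quoted normalized Doppler rates and speeds can be checked with a few lines of arithmetic. The sketch assumes an inverse subcarrier spacing of 1 ms (the value that reproduces 0.01 and 0.02 for Doppler frequencies of 10 and 20 Hz) and a carrier frequency of 240 MHz at the upper edge of Band-III; the carrier frequency is an assumption of this example, not a value stated in the text.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def normalized_doppler(doppler_hz, symbol_time_s):
    """Maximum Doppler frequency multiplied by the inverse subcarrier spacing."""
    return doppler_hz * symbol_time_s


def speed_kmh(doppler_hz, carrier_hz):
    """Relative speed that produces the given maximum Doppler shift."""
    return doppler_hz * SPEED_OF_LIGHT / carrier_hz * 3.6


if __name__ == "__main__":
    SYMBOL_TIME = 1e-3   # s, assumed inverse subcarrier spacing
    CARRIER = 240e6      # Hz, assumed Band-III carrier frequency
    for f_d in (10.0, 20.0):
        rate = normalized_doppler(f_d, SYMBOL_TIME)
        speed = speed_kmh(f_d, CARRIER)
        print(f"{f_d:.0f} Hz: normalized rate {rate:.2f}, speed {speed:.0f} km/h")
```

A lower carrier frequency within Band-III would give proportionally higher speeds for the same Doppler frequency.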

Note that to prevent ISI in an OFDM-scheme, the delay differences on separate propagation paths need to be less than the cyclic-prefix period [2], that is, the channel impulse response length must satisfy . Within the DAB-system, [1] and therefore the coherence-bandwidth , which is at least 4 OFDM-subcarriers. For Doppler frequency Hz, the coherence-time ms, which is *≈*20 OFDM-symbols (including cyclic prefix).

The channel gain representative for a 2D block, where it is assumed to be constant, is estimated similar to (8) in Chen et al. [22], that is,

The results of our simulations with the TU-6 model are shown in Figure 14, where the solid lines show the results for and the dashed lines for . We have results for with and for with . We considered iterative procedures with iterations. In our simulations, we used the dominant subtrellis approach, where we have chosen the dominant subtrellis before starting the iterations.

The value for might be seen as a representative frame-size for services broadcast by the DAB-family in transmission Mode I. In this mode, is the maximum possible number of interleaved OFDM symbols. Note that is close to the coherence-time of our TU-6 channel for a Doppler frequency of 20 Hz.

It can be concluded from Figure 14 that for and , reliable transmission is not possible for the TU-6 channel with movement speeds of *≈*45 km/h and *≈*90 km/h. For the 2D-decomposition approach, however, with and , a considerable improvement in required signal-to-noise ratio of roughly 2.4 and 1.6 dB is possible for 10 and 20 Hz, respectively, compared to 2SDD.

#### 8. Conclusions

We have investigated decoding procedures for DAB-like systems, focussing on trellis decoding and iterative techniques, with a special focus on obtaining an advantage from considering 2D blocks and trellis decomposition. These 2D blocks consist of the intersection of a number of subsequent OFDM symbols and a number of adjacent subcarriers. The idea to focus on blocks was motivated by the fact that the channel coherence time is typically limited to a small number of OFDM symbols, but also since per service symbol processing is used which limits the number of OFDM symbols in a codeword.

We have used trellis decomposition methods that allow us to estimate the unknown channel-phase modulo . This channel phase relates to subtrellises of which we can determine the a posteriori probabilities. Using these probabilities we can weigh the contributions of all the subtrellises to compute the a posteriori symbol probabilities. We can also use these probabilities to choose a dominant subtrellis for providing us with these a posteriori symbol probabilities. Working with dominant subtrellises results in significant complexity reductions. A second important advantage of trellis-decomposition is that it allows us to process several subcarriers simultaneously in an efficient way.

We have first investigated noniterative methods. The advantage of these methods is that the forward-backward procedures turned out to be extremely simple since we could use Colavolpe processing [5]. The drawback of these noniterative methods, however, is that their gain, relative to the standard 2SDD technique, is modest. Iterative procedures result in a significantly larger gain. In this context we must emphasize that part of this gain comes from the fact that we can do 2D processing.

Simulations for the noniterative AWGN case show that (a) trellis-lengths of are required and (b) 2D dominant subtrellis processing with outperforms 2SDD by 0.7 dB at a BER of .

For the iterative AWGN case with five iterations, simulations show that 2D dominant subtrellis processing with , where and , outperforms 2SDD by 3.7 dB at a BER of . However, simulations also reveal that with , where and , the iterative coding gain is reduced to dB, which is caused by the smaller value of .

On the other hand, iterative simulations for a practical setting (i.e., the TU-6 model) show that (a) with trellis-length and (one subcarrier) no reliable communication is possible, but that (b) with a modest trellis-length and subcarriers, the iterative coding advantage is maintained and the gain is roughly 2.4 dB for a Doppler frequency of 10 Hz and 1.6 dB for 20 Hz.

#### Acknowledgments

The authors would like to acknowledge Professor J. W. M. Bergmans, the management and the technical staff of Catena Radio Design B. V., and NXP-Research Eindhoven for their support in accomplishing this work. Moreover, they thank the anonymous reviewers for their comments and valuable suggestions.