Abstract

We investigate iterative trellis decoding techniques for DAB, with the objective of gaining from processing 2D blocks in an OFDM scheme, that is, blocks based on the time and frequency dimensions, and from trellis decomposition. Trellis-decomposition methods allow us to estimate the unknown channel phase, since this phase relates to the sub-trellises. We determine a posteriori sub-trellis probabilities and use these probabilities for weighting the a posteriori symbol probabilities resulting from all the sub-trellises. Alternatively, we can determine a dominant sub-trellis and use the a posteriori symbol probabilities corresponding to this dominant sub-trellis. This dominant sub-trellis approach results in a significant complexity reduction. We investigate both iterative and non-iterative methods. The advantage of non-iterative methods is that their forward-backward procedures are extremely simple; however, their gain of 0.7 dB relative to two-symbol differential detection (2SDD) at a BER of $10^{-4}$ is modest. Iterative procedures lead to the significantly larger gain of 3.7 dB at a BER of $10^{-4}$ for five iterations, where a part of this gain comes from 2D processing. Simulations of our iterative approach applied to the TU-6 (COST207) channel show an improvement of 2.4 dB at a Doppler frequency of 10 Hz.

1. Introduction

1.1. Problem Description

Digital audio broadcasting (DAB) systems, DAB+ systems, and terrestrial-digital multimedia broadcasting (T-DMB) systems use orthogonal frequency division multiplexing (OFDM), for which every OFDM-subcarrier is modulated by 𝜋/4-Differentially Encoded-Quaternary PSK (DE-QPSK) [1].

Commonly used classical DAB receivers perform noncoherent 2SDD with soft-decision Viterbi decoding [2]. Noncoherent detection schemes like 2SDD are not optimal and can be improved by multisymbol differential detection (MSDD), which is a maximum likelihood procedure for finding a block of information symbols after observing a block of received symbols [3]. For very large numbers of observations, the performance of MSDD approaches the performance of ideal coherent detection of DE-QPSK, which is given in, for example, [4–6]. Noncoherent MSDD can also be used if channel coding is applied in a noniterative way, see [7, 8].

If MSDD is combined with iterative (turbo) processing (parallel concatenated systems were first described by Berrou et al. [9], serial concatenation was developed by Benedetto and coinvestigators [10–12]), it needs to be improved to get a more acceptable complexity. We were motivated by a number of encouraging results on serial concatenation of convolutional encoding followed by differential encoding with turbo-like decoding techniques, also referred to as Turbo-DPSK. Turbo-DPSK was investigated for single-carrier transmission on AWGN channels in [13–18], as well as for time-varying channels in [19–23]. The main objective of these papers was to reduce the complexity of the inner decoder. Two main methods can be distinguished: first, an explicit estimation of the channel phase followed by coherent detection, see [19, 20], and for the 2D case [24–26]; second, directly calculating the a posteriori probabilities of the information symbols as in [17, 18, 22], and for the 2D case [27–29].

In the present paper we focus on 2D processing, that is, processing in both the frequency and time domains. We propose methods based on iteratively demodulating and decoding blocks of received symbols in a DAB transmission stream. First, however, we summarize other 2D approaches that are relevant to our work.

The work of ten Brink et al. in [24] on 2D phase-estimate methods can be regarded as an extension of the results of Hoeher and Lodge in [20] to the multicarrier case. Park et al. in [25] improved the hard-decision approach of ten Brink et al. by considering soft-decision. Both [24, 25] rely on pilot symbols, which are not present in DAB-transmission [1] unfortunately. Blind channel estimation techniques were proposed by Sanzi and Necker in [26]. They proposed a combination of the iterative scheme of ten Brink in [24] and a fast converging blind channel estimator based on higher-order asymmetrical modulation schemes, which are not used within a DAB-transmission [1].

To obtain a posteriori probabilities of the information symbols in a 2D setting, May, Rohling, and Haase in [27–29] considered iterative decoding schemes for multicarrier modulation with the soft-output Viterbi algorithm (SOVA, [30]). The SOVA was used for differential detection as well as for decoding of the convolutional code. In the coherent setting, they used an estimate of the phase based on a block of three by three received symbols, which are adjacent in the time and frequency directions. They proposed, for the coherent case, to use only the current received symbol to obtain a symbol metric for the SOVA inner decoder, actually ignoring the differential encoding. For the incoherent case, they used a transition metric for the SOVA inner decoder based on the current and previously received symbol. These a posteriori detection schemes produce approximations of the a posteriori probabilities. Procedures that focus on efficient computation of exact values can be found in [5] for the coherent case, but also in [18] for the incoherent case.

To reduce complexity we accept a small performance loss due to channel-phase discretization (see, e.g., Peleg et al. [17] and Chen et al. [22]) in this contribution, but apart from that we determine the exact a posteriori probabilities of the information symbols in a 2D setting. Our starting point will be the techniques proposed by Peleg et al. in [17]. We discretize the channel phase into a number of equispaced values, but do not allow the "side-step" transitions that were proposed by Peleg et al. to track small channel-phase variations. Then we calculate, in an efficient way, the a posteriori probabilities of the information symbols using the BCJR algorithm [31] in a 2D setting, see also [32]. We will consider 2D blocks and trellis decomposition. Each 2D block consists of a number of adjacent subcarriers of a number of subsequent OFDM symbols. Focussing on 2D blocks was motivated by the fact that the channel coherence time is typically limited to a small number of OFDM symbols, but also by the fact that DAB transmissions use time-multiplexing of services, which limits the number of OFDM symbols in a codeword. Extension in the subcarrier direction is then required to get reliable phase estimates. The trellis-decomposition method allows us to estimate the unknown channel phase efficiently. This phase is related to subtrellises of which we can determine the a posteriori probabilities. With these probabilities we are able to choose a dominant subtrellis, which results in a significant complexity reduction.

Franceschini et al. [33] also use the idea of trellis-decomposition and subtrellises (multiple trellises), to focus on estimating channel parameters. Variation of these parameters is tackled by applying the so-called intermix intervals, in which special manipulations (mix-metric techniques) on the forward and backward metrics are performed. Since we cannot track channel variations here, we apply a 2D approach which is based on the assumption that there are independent channel realizations within distinct blocks. We will explain later, in Section 2.1, why we cannot track the channel phase.

1.2. Paper Outline

In this paper, we will focus both on iterative and noniterative decoding techniques for DAB-like systems. In the next section, we will give a short outline of the DAB system. In Section 3, we will start our analysis by considering noniterative methods for the single-carrier case and introduce trellis-decomposition with a dominant subtrellis approach. In Section 4, we expand our single-carrier methods to the multicarrier case and introduce a 2D-block approach for demodulation. Iterative methods based on serial concatenation of convolutional codes (SCCC) and trellis-decomposition with two dominant subtrellis approaches are considered in Section 5. In Section 6, we generalize our iterative methods for the single-carrier case to the multicarrier case with 2D-block demodulation. Results of applying our approach to a practical case are shown in Section 7. Finally, Section 8 draws conclusions on our decoding procedures based on 2D blocks and trellis-decomposition for DAB-like systems.

2. Description of a Digital Audio Broadcasting (DAB) System

2.1. Overview

Terrestrial digital broadcasting systems like DAB, DAB+, and T-DMB, all members of the "DAB family," comprise a combination of convolutional coding (CC), interleaving, and $\pi/4$-DE-QPSK modulation followed by OFDM, see Figure 1. Time multiplexing of the transmitted services allows the receiver to perform per-service symbol processing [1], see Figure 2, where 1536 is the number of "active" OFDM subcarriers for a DAB transmission in Mode-I [1]; hence the receiver can decode a certain service without having to process the OFDM symbols that do not correspond to this service. Consequently, only at particular time instants within a DAB transmission frame does a small number (usually up to four) of OFDM symbols need to be processed. This results in "idle time" for the demodulation and decoding processes.

Note that, due to this "idle time," the mix-metric techniques of [33] cannot be applied to DAB receivers. However, if all the transmitted services are decoded and there is no idle time, mix-metric techniques could be a valuable extension to the 2D iterative processing methods based on trellis decomposition that we will develop here.

In the following subsections we will describe the transmit processes (convolutional encoding, differential modulation, and OFDM) in more detail.

2.2. Convolutional Coding and Interleaving

The convolutional code that is used within DAB has basic code rate $R_c=1/4$, constraint length $K=7$, and generator polynomials $g_0=133$, $g_1=171$, $g_2=145$, and $g_3=133$. Larger code rates can be obtained via puncturing of the mother code, see Hagenauer et al. [34]. The time and frequency interleavers in DAB perform bit and bit-pair interleaving, respectively. As a result the code bits leaving the convolutional encoder are permuted and partitioned over the subcarriers of a number of subsequent OFDM symbols (in subsequent frames). The bits for each subcarrier are grouped in pairs, and each such pair is mapped onto a phase (difference) that, therefore, can assume four different values. The mapping that is used here is based on the Gray principle, that is, labels that correspond to adjacent phase differences differ only in a single bit position.
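As an illustration of the mother code described above, the following minimal Python sketch encodes a bit sequence at rate 1/4. The generator values are taken from the text and read as octal numbers; the shift-register bit ordering, the function name `conv_encode`, and the omission of puncturing and tail bits are assumptions made for this example only.

```python
# Generator polynomials from the text, interpreted as octal numbers.
GENERATORS = (0o133, 0o171, 0o145, 0o133)
CONSTRAINT_LENGTH = 7


def conv_encode(bits):
    """Rate-1/4 mother-code encoder: four code bits per input bit.

    Assumed convention: the newest input bit enters at the most significant
    position of a 7-bit register, and each output bit is the parity of the
    register masked with one generator. No puncturing, no tail bits.
    """
    register = 0
    out = []
    for bit in bits:
        register = register // 2 + bit * 2 ** (CONSTRAINT_LENGTH - 1)
        for g in GENERATORS:
            out.append(bin(register & g).count("1") % 2)
    return out
```

Feeding a single 1 followed by six 0s returns the impulse response, whose Hamming weight equals the total number of ones in the four generators.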

2.3. Differential Modulation in Each Subcarrier

For each subcarrier, $\pi/4$-DE-QPSK modulation is applied. A sequence $\mathbf{b}=(b_1,b_2,\ldots,b_N)$ consisting of $N$ symbols (phase differences) $b_n$ for $n=1,2,\ldots,N$ carries the information that is to be transmitted via this subcarrier. The symbols $b_n$, $n=1,2,\ldots,N$, assume values in the (offset) alphabet $\{e^{j(p\pi/2+\pi/4)},\ p=0,1,2,3\}$. The transmitted sequence $\mathbf{s}=(s_0,s_1,\ldots,s_N)$ of length $N+1$ follows from $\mathbf{b}$ by applying differential phase modulation, that is,
$$s_n = b_n s_{n-1}, \quad \text{for } n=1,2,\ldots,N, \tag{1}$$
where for the first symbol $s_0=1$.
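The differential modulation rule (1) can be sketched in a few lines of Python. The function names and the representation of each symbol by its index p are choices made for this illustration; the decoder shown only covers the noise-free case.

```python
import cmath
import math

# Offset alphabet of pi/4-DE-QPSK: b = exp(j(p*pi/2 + pi/4)), p = 0..3.
OFFSET_ALPHABET = [cmath.exp(1j * (p * math.pi / 2 + math.pi / 4)) for p in range(4)]


def differential_encode(p_indices):
    """Return s_0..s_N with s_0 = 1 and s_n = b_n * s_{n-1}, as in (1)."""
    s = [1 + 0j]
    for p in p_indices:
        s.append(OFFSET_ALPHABET[p] * s[-1])
    return s


def differential_decode(s):
    """Recover the indices p_n from the phase of s_n * conj(s_{n-1}) (no noise)."""
    out = []
    for prev, cur in zip(s, s[1:]):
        angle = cmath.phase(cur * prev.conjugate()) % (2 * math.pi)
        out.append(round((angle - math.pi / 4) / (math.pi / 2)) % 4)
    return out
```

Because only phase differences carry information, a common rotation of all `s` values leaves the decoded indices unchanged.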

2.4. OFDM in DAB

OFDM in DAB is realized using a $B$-point complex IFFT, where $B$ is 256, 512, 1024, or 2048. To compute the $n$-th time-domain OFDM symbol $\tilde{\mathbf{s}}_n=(\tilde{s}_{1,n},\tilde{s}_{2,n},\ldots,\tilde{s}_{B,n})$, we determine
$$\tilde{s}_{t,n} = \frac{1}{\sqrt{B}}\sum_{m=1}^{B} s_{m,n}\,e^{j2\pi(t-1)(m-1)/B}, \quad \text{for } t=1,2,\ldots,B, \tag{2}$$
where $s_{m,n}$ is the $n$-th differentially encoded symbol corresponding to the $m$-th subcarrier, or equivalently the $m$-th element in the $n$-th frequency-domain OFDM symbol, see Figure 1. Note that the IFFT is a computationally efficient inverse discrete Fourier transform (IDFT) for values of $B$ that are powers of 2. To prevent intersymbol interference (ISI) resulting from multipath reception, a cyclic prefix of length $L_{\mathrm{cp}}$ is added to the sequence $\tilde{\mathbf{s}}_n$. This leads to the sequence $\mathbf{s}_n=(\tilde{s}_{B-L_{\mathrm{cp}}+1,n},\ldots,\tilde{s}_{B,n},\tilde{s}_{1,n},\tilde{s}_{2,n},\ldots,\tilde{s}_{B,n})$ that is finally transmitted.

We assume that the channel is slowly varying with an impulse response shorter than the cyclic-prefix length. Moreover, we assume that the channel coherence bandwidth and coherence time span multiple OFDM subcarriers and multiple OFDM symbols. Therefore, the channel-phase and gain might be assumed to be fixed for a number of adjacent subcarriers and consecutive symbols. This is the assumption on which we base our investigations. The channel phase and gain are assumed constant (yet unknown to the receiver) over a 2D block of symbols, see Figure 3.

The receiver, in the case of perfect synchronization, removes the (received version of the) cyclic prefix, and then applies a $B$-point complex FFT on the time-domain received sequence $\tilde{\mathbf{r}}_n=(\tilde{r}_{1,n},\tilde{r}_{2,n},\ldots,\tilde{r}_{B,n})$, which results in the $B$ received symbols
$$r_{m,n} = \frac{1}{\sqrt{B}}\sum_{t=1}^{B}\tilde{r}_{t,n}\,e^{-j2\pi(m-1)(t-1)/B}, \quad \text{for } m=1,2,\ldots,B. \tag{3}$$
OFDM reception can be regarded as parallel matched filtering corresponding to $B$ complex orthogonal waveforms, one for each subcarrier. This results in a channel model, holding for a 2D block of symbols, that is given by
$$r_{m,n} = |h|\,e^{j\phi}s_{m,n} + n_{m,n}, \tag{4}$$
for some subsequent values of $n$ and $m$, where the channel gain $|h|$ and phase $\phi$ are unknown to the receiver. It should be noted that a phase rotation proportional to $m$, due to a time delay, is removed by linear phase correction (LPC). This technique modifies the phase of each OFDM subcarrier with an appropriate rotation based on the starting position (time delay) of the FFT window within the OFDM symbol. In practice, this delay can be determined quite accurately.
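The transmit and receive transforms (2) and (3) can be illustrated with a direct, non-fast DFT in plain Python. This is a sketch for small block sizes only; it assumes the unitary $1/\sqrt{B}$ normalization on both sides, zero-based indices in place of $(t-1)$ and $(m-1)$, and hypothetical function names.

```python
import cmath
import math


def ofdm_modulate(freq_symbols, cp_len):
    """IDFT as in (2), then prepend the last cp_len samples as cyclic prefix."""
    B = len(freq_symbols)
    time = [
        sum(freq_symbols[m] * cmath.exp(2j * math.pi * t * m / B) for m in range(B))
        / math.sqrt(B)
        for t in range(B)
    ]
    return time[B - cp_len:] + time


def ofdm_demodulate(received, cp_len):
    """Drop the cyclic prefix, then apply the DFT as in (3)."""
    time = received[cp_len:]
    B = len(time)
    return [
        sum(time[t] * cmath.exp(-2j * math.pi * m * t / B) for t in range(B))
        / math.sqrt(B)
        for m in range(B)
    ]
```

On a noise-free channel that only applies a common rotation $e^{j\phi}$, the demodulator output is the transmitted subcarrier symbols multiplied by that rotation, in line with the form of (4).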

In the next subsection, we focus on a single subcarrier.

2.5. Incoherent Reception, Channel Gain Known to Receiver

The sequence $\mathbf{s}$ that is transmitted via a certain subcarrier is now observed by the receiver as sequence $\mathbf{r}=(r_0,r_1,\ldots,r_N)$. Note that compared to the previous subsection we have dropped the subscript $m$ here. Since it is relatively easy to estimate the channel gain, we assume here that it is perfectly known to the receiver, and to ease our analysis we take it to be one. The received sequence now relates to the transmitted sequence $\mathbf{s}$ as follows:
$$r_n = e^{j\phi}s_n + n_n, \quad \text{for } n=0,1,\ldots,N, \tag{5}$$
where we assume that $n_n$ is circularly symmetric complex Gaussian with variance $\sigma^2$ per component. Basically we assume that the random channel phase $\phi$ is real-valued and uniform over $[0,2\pi)$. This channel phase is fixed over all $N+1$ transmissions and unknown to the receiver.

Accepting a small performance loss as in, for example, Peleg et al. [17] and Chen et al. [22], we may assume that the channel phase is discrete and uniform over 32 levels, which are uniformly spaced over $[0,2\pi)$, hence
$$\Pr\left\{\phi=\frac{\pi l}{16}\right\}=\frac{1}{32}, \quad \text{for } l=0,1,2,\ldots,31. \tag{6}$$
We will first study the situation in which we consider a uniformly chosen channel phase in a single subcarrier. Later we will also investigate the setting in which a uniformly chosen channel phase is moreover constant over a number of (adjacent) subcarriers.

2.6. Equivalence between DE-QPSK and 𝜋/4-DE-QPSK

It is well known and straightforward to show that the $\pi/4$-DE-QPSK modulation, which is performed in each of the subcarriers, is equivalent to DE-QPSK. To see this, we define for $n=1,2,\ldots,N$
$$a_n = b_n e^{-j\pi/4}, \tag{7}$$
and for $n=0,1,\ldots,N$
$$x_n = s_n e^{-jn\pi/4}, \quad y_n = r_n e^{-jn\pi/4}, \quad w_n = n_n e^{-jn\pi/4}. \tag{8}$$
It now follows that $a_n\in\mathcal{A}=\{e^{jp\pi/2},\ p=0,1,2,3\}$, $x_0=1$, and
$$\begin{aligned} x_n &= b_n s_{n-1}e^{-jn\pi/4} = b_n e^{-j\pi/4}\,s_{n-1}e^{-j(n-1)\pi/4} = a_n x_{n-1},\\ y_n &= \left(e^{j\phi}s_n + n_n\right)e^{-jn\pi/4} = e^{j\phi}x_n + w_n. \end{aligned} \tag{9}$$
Now we may conclude that also $x_n\in\mathcal{A}$ for all $n=0,1,\ldots,N$, and that $w_n$, just like $n_n$, is circularly symmetric complex Gaussian with variance $\sigma^2$ per component. Moreover, since $\mathbf{b}$ is Gray-coded with respect to the interleaved code bits, so is $\mathbf{a}=(a_1,a_2,\ldots,a_N)$. From now on, we will therefore focus on DE-QPSK.
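The equivalence expressed by (7) to (9) can be checked numerically. The sketch below encodes a $\pi/4$-DE-QPSK sequence, applies the per-sample rotation $e^{-jn\pi/4}$, and leaves it to the caller to confirm that the result obeys the plain DE-QPSK recursion. The helper names are hypothetical.

```python
import cmath
import math

QPSK = [cmath.exp(1j * p * math.pi / 2) for p in range(4)]  # alphabet A


def pi4_encode(p_indices):
    """s_0 = 1, s_n = b_n s_{n-1} with b_n = exp(j(p_n*pi/2 + pi/4))."""
    s = [1 + 0j]
    for p in p_indices:
        s.append(cmath.exp(1j * (p * math.pi / 2 + math.pi / 4)) * s[-1])
    return s


def derotate(seq):
    """Apply the per-sample rotation e^{-j n pi/4} of (8)."""
    return [v * cmath.exp(-1j * n * math.pi / 4) for n, v in enumerate(seq)]


def nearest_qpsk_index(v):
    """Index p of the point of A closest to v."""
    return min(range(4), key=lambda p: abs(v - QPSK[p]))
```

After derotation, every $x_n$ lies on the QPSK alphabet and consecutive samples differ by $a_n = e^{jp_n\pi/2}$, which is the statement $x_n = a_n x_{n-1}$.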

3. Detection and Decoding: Single-Carrier Case, Noniterative

We will start by considering the single-carrier case. For some single subcarrier, we will discuss DE-QPSK modulation with incoherent reception. Based on trellis decoding techniques, we will determine the a posteriori symbol probabilities under the assumption that the (quantized) channel phase is uniform and unknown to the receiver. We also assume the transmitted symbols to be independent of each other and uniform.

3.1. Trellis Representation, Subtrellises, Decomposition

In this section, we will focus on noniterative detection. We start our analysis by noting that if we define $z_n = x_n e^{j\phi}$ for $n=0,1,\ldots,N$, then, since $x_0=1$ and $\phi$ is uniform over $\{\pi l/16,\ l=0,1,\ldots,31\}$, it follows that
$$\Pr\left\{z_0=e^{jl\pi/16}\right\}=\frac{1}{32}, \quad \text{for } l=0,1,\ldots,31, \tag{10}$$
and $z_n\in\mathcal{Z}\triangleq\{e^{jl\pi/16},\ l=0,1,\ldots,31\}$. Moreover, for $n=1,2,\ldots,N$,
$$z_n = a_n z_{n-1}, \quad \text{where } \Pr\left\{a_n=e^{jp\pi/2}\right\}=\frac{1}{4}, \quad \text{for } p=0,1,2,3. \tag{11}$$
The variables $z_n$ for $n=0,1,\ldots,N$ can now be regarded as states in a trellis, and the independent uniformly distributed (iud) symbols $a_1,a_2,\ldots,a_N$ correspond to transitions between states. The resulting graphical representation of our trellis can be found in Figure 4.

If we used the standard BCJR algorithm for computing the a posteriori symbol probabilities in the trellis in Figure 4, we would have to do 32×4 multiplications in the forward pass, 32×4 multiplications in the backward pass, and 4×32×2 multiplications and 4 normalizations in the combination pass, per trellis section, if the a priori probabilities are all equal. In total, this is 512 multiplications and 4 normalizations per trellis section. In this paper we count only multiplications and normalizations, since additions have a smaller complexity than multiplications and normalizations. (In the log-domain, multiplications and normalizations are replaced by additions, and additions are typically approximated by maximizations. This would suggest counting the additions as well, but for reasons of simplicity we neglect them here.)

An important observation for our investigations is that the trellis can be seen to consist of eight subtrellises $\mathcal{T}_0,\mathcal{T}_1,\ldots,\mathcal{T}_7$ that are not connected to each other. A similar observation was made by Chen et al. [22]. We will discuss connections between our work on the trellis decomposition and that of [22] later.

Subtrellis $\mathcal{T}_s$ consists of states $z_n\in\mathcal{Z}_s=\{e^{jl\pi/16},\ l=s+8p,\ p=0,1,2,3\}$, for $s=0,1,\ldots,7$. Figure 4 shows the entire subtrellis $\mathcal{T}_0$, and the first section of subtrellis $\mathcal{T}_1$ and of subtrellis $\mathcal{T}_7$.
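The decomposition can be made concrete by working with the phase index $l$ of a state $e^{jl\pi/16}$ instead of the complex value. In the sketch below (function names are illustrative), a transition with symbol $e^{jp\pi/2}$ adds $8p$ to the index modulo 32, so the residue $l \bmod 8$ never changes along a path.

```python
def subtrellis_states(s):
    """Phase indices l (state e^{j l pi/16}) that belong to subtrellis T_s."""
    return [s + 8 * p for p in range(4)]


def next_state(l, p):
    """Index of z_n = a_n z_{n-1} for z_{n-1} = e^{j l pi/16}, a_n = e^{j p pi/2}."""
    return (l + 8 * p) % 32


def subtrellis_of(l):
    """Subtrellis index s of the state with phase index l."""
    return l % 8
```

The eight index sets partition the 32 states, and each set is closed under all four transitions, which is why the subtrellises are not connected to each other.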

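Note that for the likelihood $\gamma_n(z_n)$ corresponding to some state $z_n\in\mathcal{Z}$ for $n=0,1,\ldots,N$ in the trellis $\mathcal{T}$ or in a subtrellis, we can write that
$$\gamma_n(z_n)=\frac{1}{2\pi\sigma^2}\exp\left(-\frac{|y_n-z_n|^2}{2\sigma^2}\right). \tag{12}$$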

3.2. Forward-Backward Algorithm, Subtrellises

In this subsection, we would like to focus on computing the a posteriori symbol probabilities $\Pr\{a_n\mid y_0,y_1,\ldots,y_N\}$ for all $n=1,2,\ldots,N$ and all values $a_n\in\mathcal{A}$. It will be demonstrated that it is a relatively simple exercise to do this. We will show that the resulting a posteriori probability is a convex combination of the a posteriori probabilities corresponding to the eight subtrellises. Computing the a posteriori probabilities for each subtrellis is simple and can be done without performing the BCJR algorithm, as was demonstrated by Colavolpe [5]. The coefficients of the convex combination do not depend on the trellis section index $n$ and are quite easy to determine as we will show here.

3.2.1. Forward Recursion

In our forward pass, we focus on subtrellis $\mathcal{T}_s$, for some $s\in\{0,1,\ldots,7\}$. For that subtrellis we first find out how to compute all the $\alpha$'s. Starting from $\alpha_0(z_0)=1/32$ for all $z_0\in\mathcal{Z}_s$, we can compute the $\alpha$'s recursively from
$$\alpha_n(z_n)=\sum_{(z_{n-1},a_n)\to z_n}\alpha_{n-1}(z_{n-1})\,\frac{1}{4}\,\gamma_{n-1}(z_{n-1}), \tag{13}$$
for $n=1,2,\ldots,N$ and $z_n\in\mathcal{Z}_s$. The notation $(z',a)\to z$ stands for all states $z'$ and symbols $a$ that lead to next state $z$.

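Lemma 1. If for $n=0,1,\ldots,N$ we define $K_s(n)\triangleq\sum_{z_n\in\mathcal{Z}_s}(1/4)\gamma_n(z_n)$, then we have
$$\alpha_n(z_n)=\frac{1}{32}\prod_{i=0}^{n-1}K_s(i), \tag{14}$$
for $n=0,1,\ldots,N$ and $z_n\in\mathcal{Z}_s$.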

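Proof. Our proof is based on induction. Clearly for $n=0$ the result holds. Now assume that $\alpha_{n-1}(z_{n-1})=(1/32)\prod_{i=0}^{n-2}K_s(i)$ for $z_{n-1}\in\mathcal{Z}_s$. Since every state $z_n\in\mathcal{Z}_s$ is reached from each of the four states in $\mathcal{Z}_s$ by exactly one symbol, we obtain from (13)
$$\begin{aligned}\alpha_n(z_n)&=\sum_{(z_{n-1},a_n)\to z_n}\frac{1}{32}\prod_{i=0}^{n-2}K_s(i)\,\frac{1}{4}\,\gamma_{n-1}(z_{n-1})\\&=\frac{1}{32}\prod_{i=0}^{n-2}K_s(i)\sum_{z_{n-1}\in\mathcal{Z}_s}\frac{1}{4}\gamma_{n-1}(z_{n-1})=\frac{1}{32}\prod_{i=0}^{n-1}K_s(i),\end{aligned} \tag{15}$$
for all $z_n\in\mathcal{Z}_s$.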

3.2.2. Backward Recursion

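Also in the backward pass we first focus only on subtrellis $\mathcal{T}_s$ for some $s$. In this subtrellis, we would like to compute the $\beta$'s. Taking $\beta_N(z_N)=\gamma_N(z_N)$ for $z_N\in\mathcal{Z}_s$, we can compute all other $\beta$'s from
$$\beta_n(z_n)=\sum_{a_{n+1}}\frac{1}{4}\,\gamma_n(z_n)\,\beta_{n+1}(z_n a_{n+1}), \tag{16}$$
where now $n=0,1,\ldots,N-1$ and $z_n\in\mathcal{Z}_s$.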

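Lemma 2. Based on the definition $K_s(n)\triangleq\sum_{z_n\in\mathcal{Z}_s}(1/4)\gamma_n(z_n)$ for all $n=0,1,\ldots,N$, we get
$$\beta_n(z_n)=\gamma_n(z_n)\prod_{i=n+1}^{N}K_s(i), \tag{17}$$
for $n=0,1,\ldots,N$ and all $z_n\in\mathcal{Z}_s$.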

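Proof. Again our proof is based on induction. Note first that for $n=N$ the result holds. Now assume that $\beta_{n+1}(z_{n+1})=\gamma_{n+1}(z_{n+1})\prod_{i=n+2}^{N}K_s(i)$ for $z_{n+1}\in\mathcal{Z}_s$. Then
$$\begin{aligned}\beta_n(z_n)&=\sum_{a_{n+1}}\frac{1}{4}\,\gamma_n(z_n)\,\gamma_{n+1}(z_n a_{n+1})\prod_{i=n+2}^{N}K_s(i)\\&=\gamma_n(z_n)\sum_{z_{n+1}\in\mathcal{Z}_s}\frac{1}{4}\gamma_{n+1}(z_{n+1})\prod_{i=n+2}^{N}K_s(i)=\gamma_n(z_n)\prod_{i=n+1}^{N}K_s(i),\end{aligned} \tag{18}$$
for all $z_n\in\mathcal{Z}_s$.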
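Lemmas 1 and 2 can be checked numerically by running the recursions (13) and (16) on one subtrellis and comparing with the closed forms (14) and (17). The following sketch indexes states by their phase index $l$; the function names and the dictionary layout are assumptions made for this illustration.

```python
import cmath
import math


def gamma(y, l, sigma2):
    """Likelihood (12) of state z = e^{j l pi/16} given the observation y."""
    z = cmath.exp(1j * l * math.pi / 16)
    return math.exp(-abs(y - z) ** 2 / (2 * sigma2)) / (2 * math.pi * sigma2)


def K(y, s, sigma2):
    """K_s(n) for y = y_n: sum of gamma_n / 4 over the four states of T_s."""
    return sum(gamma(y, s + 8 * p, sigma2) / 4 for p in range(4))


def forward(ys, s, sigma2):
    """alpha_n(z_n) from recursion (13); one dict per n, keyed by phase index."""
    states = [s + 8 * p for p in range(4)]
    alphas = [{l: 1 / 32 for l in states}]
    for n in range(1, len(ys)):
        cur = {l: 0.0 for l in states}
        for l_prev in states:
            for p in range(4):
                l_next = (l_prev + 8 * p) % 32
                cur[l_next] += alphas[-1][l_prev] * gamma(ys[n - 1], l_prev, sigma2) / 4
        alphas.append(cur)
    return alphas


def backward(ys, s, sigma2):
    """beta_n(z_n) from recursion (16), initialised with beta_N = gamma_N."""
    states = [s + 8 * p for p in range(4)]
    N = len(ys) - 1
    betas = [None] * (N + 1)
    betas[N] = {l: gamma(ys[N], l, sigma2) for l in states}
    for n in range(N - 1, -1, -1):
        betas[n] = {
            l: sum(
                gamma(ys[n], l, sigma2) / 4 * betas[n + 1][(l + 8 * p) % 32]
                for p in range(4)
            )
            for l in states
        }
    return betas
```

For any observation sequence, `forward` should return the same value for all four states of a section, equal to $(1/32)\prod_{i<n}K_s(i)$, and `backward` should return $\gamma_n(z_n)\prod_{i>n}K_s(i)$.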

3.3. Combination

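To determine the a posteriori symbol probability for symbol value $a_n\in\mathcal{A}$, we compute the joint probability and density
$$\begin{aligned}\Pr\{a_n\}\,p(\mathbf{y}\mid a_n)&=\sum_{z_{n-1}\in\mathcal{Z}}\alpha_{n-1}(z_{n-1})\,\frac{1}{4}\,\gamma_{n-1}(z_{n-1})\,\beta_n(z_{n-1}a_n)\\&=\sum_{s=0}^{7}\sum_{z_{n-1}\in\mathcal{Z}_s}\alpha_{n-1}(z_{n-1})\,\frac{1}{4}\,\gamma_{n-1}(z_{n-1})\,\beta_n(z_{n-1}a_n)\\&=\sum_{s=0}^{7}\frac{1}{32}\prod_{i=0}^{n-2}K_s(i)\prod_{j=n+1}^{N}K_s(j)\sum_{z_{n-1}\in\mathcal{Z}_s}\frac{1}{4}\,\gamma_{n-1}(z_{n-1})\,\gamma_n(z_{n-1}a_n).\end{aligned} \tag{19}$$
If we consider the "middle" term in (19), then we see that
$$\sum_{a_n}\sum_{z_{n-1}\in\mathcal{Z}_s}\frac{1}{4}\,\gamma_{n-1}(z_{n-1})\,\gamma_n(z_{n-1}a_n)=\sum_{z_{n-1}\in\mathcal{Z}_s}\frac{1}{4}\,\gamma_{n-1}(z_{n-1})\sum_{a_n}\gamma_n(z_{n-1}a_n)=4K_s(n-1)K_s(n). \tag{20}$$
From this we may conclude that
$$p(\mathbf{y})=\sum_{s=0}^{7}\Pr\{s\}\,p(\mathbf{y}\mid s)=\sum_{s=0}^{7}\frac{1}{8}\prod_{i=0}^{N}K_s(i), \tag{21}$$
with
$$\Pr\{s\}=\frac{1}{8}, \quad p(\mathbf{y}\mid s)=\prod_{i=0}^{N}K_s(i), \quad \text{for } s=0,1,\ldots,7. \tag{22}$$
Now observing that
$$\Pr\{s\mid\mathbf{y}\}=\frac{(1/8)\prod_{i=0}^{N}K_s(i)}{\sum_{s'=0}^{7}(1/8)\prod_{i=0}^{N}K_{s'}(i)}, \tag{23}$$
$$\Pr\{a_n\mid\mathbf{y},s\}=\frac{\sum_{z_{n-1}\in\mathcal{Z}_s}(1/4)\,\gamma_{n-1}(z_{n-1})\,\gamma_n(z_{n-1}a_n)}{4K_s(n-1)K_s(n)}, \tag{24}$$
for $s\in\{0,1,\ldots,7\}$ and $a_n\in\mathcal{A}$, we can write that
$$\Pr\{a_n\mid\mathbf{y}\}=\sum_{s=0}^{7}\Pr\{s\mid\mathbf{y}\}\Pr\{a_n\mid\mathbf{y},s\}. \tag{25}$$
The right-hand side of this equation can be interpreted as a convex combination of a posteriori symbol probabilities $\Pr\{a_n\mid\mathbf{y},s\}$, one for each subtrellis, where the weighting coefficients are the a posteriori subtrellis probabilities $\Pr\{s\mid\mathbf{y}\}$. An a posteriori subtrellis probability is the conditional probability that the index $l$ of the discrete channel phase $\phi=\pi l/16$, taken modulo 8, equals $s$ for some $s=0,1,\ldots,7$ given $\mathbf{y}$.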

The demodulator that operates according to (25) has three tasks. First, the eight weighting coefficients (23) have to be computed. Then, for each of the eight subtrellises, for all symbol values $a_n\in\mathcal{A}$ and all $n\in\{1,2,\ldots,N\}$, the a posteriori symbol probabilities have to be computed. Finally, the weighting (25) has to be done. Computing the weighting coefficients requires, for each subtrellis $s\in\{0,1,\ldots,7\}$, the computation of the factors $K_s(n)$ for $n=0,1,\ldots,N$. These factors should then be multiplied and normalized to form $\Pr\{s\mid\mathbf{y}\}$. For these computations, 8 multiplications per trellis section are needed. Computing the a posteriori symbol probabilities $\Pr\{a_n\mid\mathbf{y},s\}$ can be done efficiently by applying the Colavolpe [5] technique to each subtrellis. As in Colavolpe, each such a posteriori symbol probability is based on only two received symbols $y_{n-1}$ and $y_n$, as is shown in (24). This avoids the use of the BCJR method in full generality and leads to significant complexity reductions, that is, only 8×4×4=128 multiplications and 8×4=32 normalizations are needed per trellis section. The weighting operation requires 8×4=32 multiplications, and therefore in total this approach leads to 8+128+32=168 multiplications and 32 normalizations, which is considerably less than what we need for full BCJR.
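The three demodulator tasks can be sketched directly from (23), (24), and (25). The Python code below is a minimal illustration for short sequences, not an optimized implementation; function names are hypothetical, states are represented by their phase index, and a brute-force reference that sums over all 32 phases and all symbol sequences is included so the decomposition can be compared against it.

```python
import cmath
import itertools
import math


def gamma(y, l, sigma2):
    """Likelihood (12) of state z = e^{j l pi/16} given the observation y."""
    z = cmath.exp(1j * l * math.pi / 16)
    return math.exp(-abs(y - z) ** 2 / (2 * sigma2)) / (2 * math.pi * sigma2)


def K(y, s, sigma2):
    """K_s(n): sum of gamma_n / 4 over the four states of subtrellis T_s."""
    return sum(gamma(y, s + 8 * p, sigma2) / 4 for p in range(4))


def subtrellis_posteriors(ys, sigma2):
    """Pr{s | y} for s = 0..7, following (23)."""
    weights = [math.prod(K(y, s, sigma2) for y in ys) / 8 for s in range(8)]
    total = sum(weights)
    return [w / total for w in weights]


def symbol_posterior_given_s(ys, n, s, sigma2):
    """Pr{a_n = e^{j p pi/2} | y, s} for p = 0..3, following (24)."""
    numer = [
        sum(
            gamma(ys[n - 1], s + 8 * q, sigma2) / 4
            * gamma(ys[n], (s + 8 * q + 8 * p) % 32, sigma2)
            for q in range(4)
        )
        for p in range(4)
    ]
    denom = 4 * K(ys[n - 1], s, sigma2) * K(ys[n], s, sigma2)
    return [v / denom for v in numer]


def symbol_posterior(ys, n, sigma2):
    """Pr{a_n | y} as the convex combination (25)."""
    w = subtrellis_posteriors(ys, sigma2)
    per_s = [symbol_posterior_given_s(ys, n, s, sigma2) for s in range(8)]
    return [sum(w[s] * per_s[s][p] for s in range(8)) for p in range(4)]


def brute_force_posterior(ys, n, sigma2):
    """Reference: sum the joint likelihood over all phases and symbol sequences."""
    N = len(ys) - 1
    score = [0.0] * 4
    for l0 in range(32):
        for seq in itertools.product(range(4), repeat=N):
            l, like = l0, gamma(ys[0], l0, sigma2)
            for k, p in enumerate(seq, start=1):
                l = (l + 8 * p) % 32
                like *= gamma(ys[k], l, sigma2)
            score[seq[n - 1]] += like
    total = sum(score)
    return [v / total for v in score]
```

Here `ys` holds $y_0,\ldots,y_N$ and `n` runs from 1 to $N$. The brute-force routine grows as $32\cdot 4^N$ and is only meant as a check for very small $N$.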

3.4. Dominant Subtrellis Approach

Equation (25) shows how the exact a posteriori symbol probabilities can be determined. If the a posteriori subtrellis probabilities are such that one of the probabilities dominates the other ones, then weighting (25) can be approximated by
$$\Pr\{a_n\mid\mathbf{y}\}\approx\Pr\{a_n\mid\mathbf{y},\hat{s}\}, \quad \text{with } \hat{s}=\arg\max_{s}\Pr\{s\mid\mathbf{y}\}. \tag{26}$$
Observe that this approach involves the computation of the a posteriori symbol probabilities, as described in (24), only for the dominant subtrellis $\hat{s}$. This requires only 4×4=16 multiplications and 4 normalizations per trellis section. Together with the computation of the weighting coefficients, 8+16=24 multiplications and 4 normalizations are necessary. Therefore, this reduces the number of multiplications with respect to full weighting by a factor of seven.
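A minimal sketch of the dominant-subtrellis approximation (26) follows, together with the multiplication counts per trellis section quoted in the text. Function and constant names are illustrative; normalizing the four numerators of (24) by their sum is used here in place of dividing by $4K_s(n-1)K_s(n)$, which gives the same result.

```python
import cmath
import math

# Multiplications per trellis section, as counted in the text.
FULL_BCJR_MULTS = 32 * 4 + 32 * 4 + 4 * 32 * 2  # forward + backward + combination
WEIGHTED_MULTS = 8 + 8 * 4 * 4 + 8 * 4          # K products + (24) for 8 subtrellises + weighting
DOMINANT_MULTS = 8 + 4 * 4                      # K products + (24) for one subtrellis


def gamma(y, l, sigma2):
    """Likelihood (12) of state z = e^{j l pi/16} given the observation y."""
    z = cmath.exp(1j * l * math.pi / 16)
    return math.exp(-abs(y - z) ** 2 / (2 * sigma2)) / (2 * math.pi * sigma2)


def K(y, s, sigma2):
    """K_s(n): sum of gamma_n / 4 over the four states of subtrellis T_s."""
    return sum(gamma(y, s + 8 * p, sigma2) / 4 for p in range(4))


def dominant_subtrellis(ys, sigma2):
    """s_hat = argmax_s Pr{s | y}; the common factor 1/8 and the normalization cancel."""
    weights = [math.prod(K(y, s, sigma2) for y in ys) for s in range(8)]
    return max(range(8), key=lambda s: weights[s])


def dominant_symbol_posterior(ys, n, sigma2):
    """Approximation (26): evaluate (24) only for the dominant subtrellis."""
    s_hat = dominant_subtrellis(ys, sigma2)
    numer = [
        sum(
            gamma(ys[n - 1], s_hat + 8 * q, sigma2)
            * gamma(ys[n], (s_hat + 8 * q + 8 * p) % 32, sigma2)
            for q in range(4)
        )
        for p in range(4)
    ]
    total = sum(numer)
    return [v / total for v in numer]
```

The three constants evaluate to 512, 168, and 24, so the dominant-subtrellis variant needs one seventh of the multiplications of full weighting.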

3.5. Simulations

We use in our simulations, just like Peleg et al. [17], the de facto industry standard $R_c=1/2$ convolutional code with generator polynomials $g_0=133$ and $g_1=171$, which is equal to the convolutional code with puncturing index $PI=8$ of Table 29 in [1, Section 11.1.2, page 131]. The DAB, DAB+, and T-DMB bit-reversal time interleaver and block frequency interleaver are modeled by a bitwise uniform block interleaver generated for each simulated code block of bits; hence, any permutation of the coded bits is a permissible interleaver and is selected with equal probability, as is done in [17].

The demodulator calculates, for each OFDM subcarrier, the a posteriori probability given by (25) for $N+1=2,4,8$, and 32. The demodulator is followed by a convolutional decoder, which needs as input soft-decision information about the coded bits. Now, it follows from Gray mapping, that is,
$$\begin{array}{c|cccc} b_1b_2 & 00 & 01 & 11 & 10\\ \hline a & 1 & e^{j\pi/2} & e^{j\pi} & e^{j3\pi/2}\end{array} \tag{27}$$
that the desired metrics related to transmission $n$, that is, the log-likelihood ratios (LLRs) [30], can be expressed as
$$\lambda_n^1=\ln\frac{e^{m(\pi)}+e^{m(3\pi/2)}}{e^{m(0)}+e^{m(\pi/2)}}, \quad \lambda_n^2=\ln\frac{e^{m(\pi/2)}+e^{m(\pi)}}{e^{m(0)}+e^{m(3\pi/2)}}, \tag{28}$$
with symbol metric
$$m(\phi)=\ln\Pr\left\{a_n=e^{j\phi}\mid\mathbf{y}\right\}, \tag{29}$$
and where $\lambda_n^1$ corresponds to bit $b_1$ and $\lambda_n^2$ to bit $b_2$.
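The step from symbol posteriors to bit LLRs in (28) can be written out as follows. This is a small sketch in which `post[p]` stands for $\Pr\{a_n=e^{jp\pi/2}\mid\mathbf{y}\}$, so that $e^{m(p\pi/2)}$ is simply `post[p]`; the dictionary `GRAY` and the function name are introduced for illustration.

```python
import math

# Gray mapping (27): bit pair (b1, b2) -> index p of the symbol a = e^{j p pi/2}.
GRAY = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}


def bit_llrs(post):
    """Return (lambda1, lambda2) of (28) from the four symbol posteriors.

    A positive value favours bit value 1, a negative value favours 0.
    """
    lam1 = math.log((post[2] + post[3]) / (post[0] + post[1]))
    lam2 = math.log((post[1] + post[2]) / (post[0] + post[3]))
    return lam1, lam2
```

For a posterior concentrated on one symbol, the signs of the two LLRs reproduce the Gray label of that symbol, and a uniform posterior gives two zero LLRs.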

Figure 5 shows the bit-error rate (BER) performance with the so-called ideal LLRs for a decomposed trellis for trellis length $N+1=2,4,8$, and 32. On the horizontal axis is the signal-to-noise ratio $E_b/N_0=1/(2\sigma^2)$. The demodulator operates according to (25).

We will compare the performance of this demodulator with that of two well-known procedures described in the literature. Firstly, we compare to "classical" DQPSK [35, Section 4.5-5, page 224], that is, two-symbol differential detection (2SDD). This leads to a posteriori symbol probabilities as in (9) in Divsalar and Simon [3], that is, to
$$\Pr\{a_n\mid y_n,y_{n-1}\}\propto I_0\left(\frac{1}{\sigma^2}\left|y_n a_n^{*}+y_{n-1}\right|\right), \quad \text{for } a_n\in\mathcal{A}, \tag{30}$$
where $I_0(\cdot)$ is the zeroth-order modified Bessel function of the first kind. Secondly, we will compare our results to coherently detected DE-QPSK. We assume that the received sequence is perfectly derotated, that is, $\tilde{\mathbf{y}}=\mathbf{y}e^{-j\phi}$. Then the a posteriori symbol probabilities are given by
$$\Pr\{a_n\mid\tilde{\mathbf{y}}\}\propto\sum_{x_{n-1}\in\mathcal{A}}\exp\left(\frac{1}{\sigma^2}\,\Re\left\{x_{n-1}^{*}\left(\tilde{y}_n a_n^{*}+\tilde{y}_{n-1}\right)\right\}\right), \quad \text{for } a_n\in\mathcal{A}, \tag{31}$$
as described by Colavolpe [5]. Note that (31) is similar to (24) for $s=0$.
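The two reference detectors can be sketched as below. The code assumes the usual forms of these metrics, with a complex conjugate on $a_n$ and $x_{n-1}$ and a real part inside the exponential of (31); the Bessel function is evaluated by its power series because the Python standard library does not provide it. All function names are illustrative.

```python
import cmath
import math

QPSK = [cmath.exp(1j * p * math.pi / 2) for p in range(4)]  # alphabet A


def bessel_i0(x, terms=80):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2) ** 2 / (k * k)
        total += term
    return total


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def sdd2_posterior(y_prev, y_cur, sigma2):
    """2SDD, cf. (30): weights I0(|y_n conj(a) + y_{n-1}| / sigma^2)."""
    return normalize(
        [bessel_i0(abs(y_cur * a.conjugate() + y_prev) / sigma2) for a in QPSK]
    )


def coherent_posterior(yt_prev, yt_cur, sigma2):
    """Coherent detection of DE-QPSK, cf. (31), on derotated samples."""
    weights = []
    for a in QPSK:
        v = yt_cur * a.conjugate() + yt_prev
        weights.append(sum(math.exp((x.conjugate() * v).real / sigma2) for x in QPSK))
    return normalize(weights)
```

Note that `sdd2_posterior` depends only on the magnitude $|y_n a_n^{*}+y_{n-1}|$ and is therefore insensitive to a common phase rotation of both samples, whereas `coherent_posterior` requires derotated input. The series in `bessel_i0` is adequate for moderate arguments; for very small $\sigma^2$ a scaled or asymptotic evaluation would be preferable.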

The simulation results, which are shown in Figure 5, demonstrate that the BER performance curves of 2SDD and trellis length $N+1=2$ are practically identical, as we expect. Moreover, the coherent-detection curve and the curve for very large trellis sizes ($N\to\infty$) are very close. The small performance loss is due to discretizing the channel phase with 32 levels. Furthermore, Figure 5 shows that (a) larger values of $N+1$ result in performance closer to the coherent-detection performance, and (b) for $N+1=32$, ideally computed LLRs for a decomposed trellis perform quite close to coherent detection, that is, the difference in signal-to-noise ratio ($E_b/N_0$) is less than 0.15 dB at a BER of $10^{-4}$.

Next, in Figure 6, we turn to the dominant subtrellis approach, which is denoted by "Max" in the legend. We compare, for trellis length $N+1=2,8$, and 32, the difference in performance between ideal LLRs based on the a posteriori probabilities given by (25) and the approximated LLRs based on the dominant-subtrellis a posteriori probabilities specified in (26). It can be seen from Figure 6 that (a) for larger $N+1$, the difference between the exact and approximated LLRs becomes smaller, and (b) for $N+1=32$, the difference between the ideal LLRs and the approximated LLRs is less than 0.1 dB.

3.6. Some Conclusions

Our simulations demonstrate that for trellis length $N+1=32$, the ideal LLRs and the approximated LLRs have a performance quite close to that of coherent detection. The difference in signal-to-noise ratio is less than 0.25 dB at a BER of $10^{-4}$ for the dominant-subtrellis approach. Therefore, if we focus on a BER of $10^{-4}$ for obtaining an acceptable performance with single-subcarrier transmission, we need a trellis length $N+1\geq 32$.

With a trellis length of $N+1\geq 32$ received symbols, the channel coherence time needs to be in the order of $T_c\geq 32T_s$, where $T_s$ is the OFDM symbol time. This imposes quite a strong restriction on the time-varying behavior of the channel. In practice, the channel may not be coherent for so long, and therefore focussing on trellis length $N+1=32$ might not be realistic. We will discuss this effect in more detail in Section 7, where we study a typical urban channel. There is a second reason for arguing that large values of $N$ are undesirable. DAB systems support, for complexity reduction, per-service symbol processing. For such a service, typically at most $N+1\leq 4$ subsequent OFDM symbols are contained in a single convolutionally encoded word, see Figure 2, and this does not match processing more than four OFDM symbols in a demodulation trellis.

After having concluded that we cannot make 𝑁 too large, it makes sense to investigate the possibility of using a number of (adjacent) subcarriers to jointly determine the a posteriori symbol probabilities for the corresponding DE-QPSK streams. Instead of using a single trellis with length 𝑁+1=32, we could find out whether a similar performance can be obtained with a 2D block of 𝑀=8 trellises of length 𝑁+1=4 corresponding to adjacent subcarriers, see Figure 3. This will be the subject of the next section.

4. Detection and Decoding: Multicarrier Case, Noniterative

4.1. Demodulation Procedures

We have seen that the trellis length $N+1$ needs to be as large as possible. For obtaining an acceptable performance, it should be 32 or larger. Such a length may not always be feasible. Therefore, we want to investigate the question of jointly decoding a (2D) block of received symbols. It would again be convenient if we could decompose the computation of the a posteriori probabilities as in (25) when we consider what was received over several subcarriers.

Now we assume that in each subcarrier $m=1,2,\ldots,M$, a sequence $\mathbf{a}_m=(a_{m,1},a_{m,2},\ldots,a_{m,N})$ is conveyed using differential encoding. For the components of the transmitted sequence $\mathbf{x}_m=(x_{m,0},x_{m,1},\ldots,x_{m,N})$, we can write
$$x_{m,n}=a_{m,n}x_{m,n-1} \tag{32}$$
and $x_{m,0}=1$. We assume that the channel phase is constant over the block of symbols; therefore,
$$y_{m,n}=e^{j\phi}x_{m,n}+w_{m,n}, \tag{33}$$
where $\phi\in\{l\pi/16,\ l=0,1,\ldots,31\}$ is uniform just as before, and the noise variables $w_{m,n}$ are circularly symmetric complex Gaussian with variance $\sigma^2$ per component. The output sequence corresponding to subcarrier $m$ is denoted by $\mathbf{y}_m=(y_{m,0},y_{m,1},\ldots,y_{m,N})$.

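Just like in the single-carrier case, we can determine the a posteriori subtrellis probabilities:
$$\Pr\{s\mid\mathbf{y}_1,\mathbf{y}_2,\ldots,\mathbf{y}_M\}=\frac{\Pr\{s\}\,p(\mathbf{y}_1\mid s)\,p(\mathbf{y}_2\mid s)\cdots p(\mathbf{y}_M\mid s)}{\sum_{s'=0}^{7}\Pr\{s'\}\,p(\mathbf{y}_1\mid s')\,p(\mathbf{y}_2\mid s')\cdots p(\mathbf{y}_M\mid s')}, \tag{34}$$
where $\Pr\{s\}=1/8$ for $s=0,1,\ldots,7$ and
$$p(\mathbf{y}_m\mid s)=\prod_{i=0}^{N}K_{m,s}(i), \tag{35}$$
where $K_{m,s}(n)\triangleq\sum_{z_n\in\mathcal{Z}_s}(1/4)\gamma_{m,n}(z_n)$. Note that for the likelihood corresponding to some state $z_n$ for $n=0,1,\ldots,N$ in the trellis $\mathcal{T}$ or in a subtrellis, we can write that
$$\gamma_{m,n}(z_n)=\frac{1}{2\pi\sigma^2}\exp\left(-\frac{|y_{m,n}-z_n|^2}{2\sigma^2}\right). \tag{36}$$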

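Now the a posteriori symbol probability for $a_{m,n}\in\mathcal{A}$ can be written as
$$\Pr\{a_{m,n}\mid\mathbf{y}_1,\mathbf{y}_2,\ldots,\mathbf{y}_M\}=\sum_{s=0}^{7}\Pr\{s\mid\mathbf{y}_1,\mathbf{y}_2,\ldots,\mathbf{y}_M\}\Pr\{a_{m,n}\mid\mathbf{y}_m,s\}, \tag{37}$$
where
$$\Pr\{s\mid\mathbf{y}_1,\mathbf{y}_2,\ldots,\mathbf{y}_M\}=\frac{(1/8)\prod_{m=1}^{M}\prod_{i=0}^{N}K_{m,s}(i)}{\sum_{s'=0}^{7}(1/8)\prod_{m=1}^{M}\prod_{i=0}^{N}K_{m,s'}(i)}, \tag{38}$$
$$\Pr\{a_{m,n}\mid\mathbf{y}_m,s\}=\frac{\sum_{z_{n-1}\in\mathcal{Z}_s}(1/4)\,\gamma_{m,n-1}(z_{n-1})\,\gamma_{m,n}(z_{n-1}a_{m,n})}{4K_{m,s}(n-1)K_{m,s}(n)}, \tag{39}$$
for $s\in\{0,1,\ldots,7\}$ and $a_{m,n}\in\mathcal{A}$.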

This suggests that the demodulator first determines the a posteriori subtrellis probabilities (weighting coefficients) using (38), for which first the $8\times M\times(N+1)$ $K$-factors have to be computed. Using the weighting coefficients, the convex combination in (37) then leads to the a posteriori symbol probabilities. Finding the a posteriori symbol probabilities $\Pr\{a_{m,n}\mid\mathbf{y}_m,s\}$ can again be done using the Colavolpe [5] method for each subcarrier and for each subtrellis, where again such an a posteriori symbol probability is based on only the two received symbols $y_{m,n-1}$ and $y_{m,n}$, as is shown in (39). Again the BCJR method in full generality is not needed, and the number of required multiplications and normalizations per trellis section is the same as in the single-carrier case.

Equation (37) shows how the exact a posteriori symbol probabilities can be determined. Just like in the single-carrier case, if the a posteriori subtrellis probabilities are such that one of the probabilities dominates the other ones, then weighting (37) can be approximated as follows:
$$\Pr\{a_{m,n}\mid\mathbf{y}_1,\mathbf{y}_2,\ldots,\mathbf{y}_M\}\approx\Pr\{a_{m,n}\mid\mathbf{y}_m,\hat{s}\}, \tag{40}$$
with
$$\hat{s}=\arg\max_{s}\Pr\{s\mid\mathbf{y}_1,\mathbf{y}_2,\ldots,\mathbf{y}_M\}. \tag{41}$$
Again this approach involves the computation of the a posteriori symbol probabilities only for the dominant subtrellis $\hat{s}$. The resulting number of multiplications and normalizations per trellis section is the same as for the single-carrier case.
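The 2D-block demodulator of (37) to (41) differs from the single-carrier one only in that the subtrellis weights are accumulated over all subcarriers of the block. A minimal Python sketch, assuming the block is stored as a list of rows `block[m][n]` holding $y_{m,n}$ with zero-based subcarrier index `m`, and using illustrative function names:

```python
import cmath
import math


def gamma(y, l, sigma2):
    """Likelihood (36) of state z = e^{j l pi/16} given the observation y."""
    z = cmath.exp(1j * l * math.pi / 16)
    return math.exp(-abs(y - z) ** 2 / (2 * sigma2)) / (2 * math.pi * sigma2)


def K(y, s, sigma2):
    """K_{m,s}(n): sum of gamma / 4 over the four states of subtrellis T_s."""
    return sum(gamma(y, s + 8 * p, sigma2) / 4 for p in range(4))


def subtrellis_posteriors_2d(block, sigma2):
    """Pr{s | y_1..y_M} following (38): product of K over the whole block."""
    weights = [
        math.prod(K(y, s, sigma2) for row in block for y in row) / 8
        for s in range(8)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def symbol_posterior_given_s(row, n, s, sigma2):
    """Pr{a_{m,n} | y_m, s} following (39); row holds y_{m,0..N}."""
    numer = [
        sum(
            gamma(row[n - 1], s + 8 * q, sigma2)
            * gamma(row[n], (s + 8 * q + 8 * p) % 32, sigma2)
            for q in range(4)
        )
        for p in range(4)
    ]
    total = sum(numer)  # equals 16 K(n-1) K(n), so this matches (39)
    return [v / total for v in numer]


def symbol_posterior_2d(block, m, n, sigma2):
    """Exact posterior (37): weighting over all eight subtrellises."""
    w = subtrellis_posteriors_2d(block, sigma2)
    per_s = [symbol_posterior_given_s(block[m], n, s, sigma2) for s in range(8)]
    return [sum(w[s] * per_s[s][p] for s in range(8)) for p in range(4)]


def dominant_symbol_posterior_2d(block, m, n, sigma2):
    """Approximation (40)-(41): use only the dominant subtrellis."""
    w = subtrellis_posteriors_2d(block, sigma2)
    s_hat = max(range(8), key=lambda s: w[s])
    return symbol_posterior_given_s(block[m], n, s_hat, sigma2)
```

Since (38) is a product over every received symbol in the block, the subtrellis weights depend only on the set of observations and not on how they are arranged over subcarriers and time.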

4.2. Simulations

In the previous section, we analyzed and simulated the single subcarrier approach. Here we will discuss the simulations corresponding to the multi-carrier method. We will again study the coded BER versus the signal-to-noise ratio 𝐸𝑏/𝑁0=1/(2𝜎2). The BER performance for the ideal LLRs, based on a posteriori probabilities computed as in (37), is shown in Figure 7 with a fixed block size of 𝑀(𝑁+1)=16. This fixed block-size is realized by the parameter pair values (𝑀,𝑁+1)=(1,16),(2,8),(4,4), and (8,2). The detector operates according to (37). The performance of 2SDD and coherently detected DE-QPSK are shown as reference curves.

From Figure 7, it can be observed that a 2D decomposition with the shortest possible trellis length of 𝑁+1=2 and 𝑀=8 adjacent subcarriers performs identically to the largest trellis length 𝑁+1=16 with 𝑀=1 subcarrier, that is, the single-carrier case. Intermediate cases also have an identical performance.

We do not show the results of the dominant subtrellis approach for the multi-carrier case here, since these results are identical to the corresponding results for the single-carrier case shown in Figure 6 in the previous section.

4.3. Conclusion Noniterative Decoding

Our investigations for the noniterative 2D-case show that we are very close to the performance of coherent detection of DE-QPSK even for small values of the trellis length 𝑁+1, by processing simultaneously several subcarriers. A next question is whether we can do better than this. In the literature, see, for example, Peleg et al. in [17] and Chen et al. [22], it is demonstrated that iterative decoding techniques lead to good results for differential encoding. Therefore, in the sequel of this paper we study iterative decoding techniques for DAB-like streams, with a special focus again on 2D blocks.

5. Detection and Decoding: Single-Carrier Case, Iterative

In the following two sections, we consider iterative decoding procedures. Peleg and Shamai [13] first demonstrated that iterative techniques could increase the performance of the demodulation procedures of DE-QPSK streams significantly. We specialize their approach to DAB systems, and solve a problem connected to the length of the trellises for each subcarrier, which in practice is quite small, by turning to 2D blocks for iterative demodulation.

5.1. Serial Concatenation

In the current section, we will investigate iterative decoding procedures for DAB-like systems, which are based on convolutional encoding, interleaving, and DE-QPSK modulation. If we consider DE-QPSK modulation as the inner coding method and convolutional encoding as the outer code, then it is obvious that we can apply techniques developed for serially concatenated coding systems here, see Figure 8. Serially concatenated turbo codes were proposed by Benedetto and Montorsi [10] and later investigated in more detail in Benedetto et al. [12]. Iterating between the DPSK-demodulator and convolutional decoder for the incoherent case was first suggested (for a single carrier) by Peleg and Shamai [13]. Hoeher and Lodge [20] also applied iterative techniques to the incoherent case, but focussed on channel estimation, to be able to use coherent detection. For an overview of related results, all for the single-carrier case, see Chen et al. [22].

We will start in this section by considering the single carrier case and our aim is again to find out what we can gain from decomposing the trellis used in the demodulator into a part that corresponds to the channel phase and a part that relates to differential encoding. In the section that follows, we will consider the multi-carrier setting.

5.2. Peleg Approach

In this subsection, we investigate the forward-backward procedures where we drop the assumption that the symbols 𝑎𝑛, 𝑛=1,2,…,𝑁, are uniformly distributed. Interleaving should still guarantee the independence of the symbols, however.

Just like Peleg et al. [17] we focus on the entire trellis 𝒯. Note, however, that our trellis is different from that of [17], in which tracking of small channel phase variations is made possible by adding “side-step” transitions. We do not have such transitions in our trellis and, therefore, our trellis can be decomposed into eight unconnected subtrellises. In the next subsection, we take advantage of this decomposition; however, first we will consider the undecomposed trellis.

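Again starting from $\alpha_0(z_0) = 1/32$ for all $z_0 \in \mathcal{Z}$, we can compute the $\alpha$'s recursively from

$$\alpha_n(z_n) = \sum_{z_{n-1},\, a_n \to z_n} \alpha_{n-1}(z_{n-1})\, \Pr\{a_n\}\, \gamma_{n-1}(z_{n-1}), \tag{42}$$

for $n = 1, 2, \ldots, N$ and $z_n \in \mathcal{Z}$. Also in the backward pass, we consider the entire trellis $\mathcal{T}$. Taking $\beta_N(z_N) = \gamma_N(z_N)$ for $z_N \in \mathcal{Z}$, we can compute all other $\beta$'s from

$$\beta_n(z_n) = \sum_{a_{n+1}} \gamma_n(z_n)\, \Pr\{a_{n+1}\}\, \beta_{n+1}(z_n a_{n+1}), \tag{43}$$

where again $n = 0, 1, \ldots, N-1$ and $z_n \in \mathcal{Z}$.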

To determine the a posteriori symbol probability for symbol value $a_n \in \mathcal{A}$, we compute the joint probability and density

$$\Pr\{a_n\}\, p(\mathbf{y} \mid a_n) = \sum_{z_{n-1} \in \mathcal{Z}} \alpha_{n-1}(z_{n-1})\, \Pr\{a_n\}\, \gamma_{n-1}(z_{n-1})\, \beta_n(z_{n-1} a_n). \tag{44}$$

This expression also tells us how the resulting extrinsic information can be determined. It can be checked, see Benedetto and Montorsi [10], that the multiplication by the factor $\Pr\{a_n\}$ in the a posteriori information (44) should be omitted to obtain extrinsic information. The extrinsic information is then further processed by the convolutional decoder. The results of the iterative procedure are discussed in Section 5.5.
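As a rough sketch of the recursions (42) and (43) and the combination step (44), the following Python fragment runs a forward-backward pass over the undecomposed trellis with nonuniform a priori symbol probabilities. It is an illustration under assumptions rather than the procedure as simulated: the 32 states are taken to be the unit-circle points with phases 𝑖𝜋/16, a symbol j^q is taken to move state index 𝑖 to (𝑖+8𝑞) mod 32, and a common per-step scaling of the 𝛼's and 𝛽's is added for numerical stability. The name `forward_backward` is hypothetical.

```python
import numpy as np

# Assumed state model (an illustration, not quoted from the text).
STATES = np.exp(1j * np.pi / 16 * np.arange(32))
IDX = np.arange(32)
NXT = (IDX[:, None] + 8 * np.arange(4)[None, :]) % 32   # successor of state i under symbol q
PRV = (IDX[:, None] - 8 * np.arange(4)[None, :]) % 32   # predecessor of state j under symbol q

def forward_backward(y, prior, sigma2):
    """y: N+1 received samples; prior[n-1, q] = Pr{a_n = j**q} for n = 1..N.

    Returns (app, ext): normalized a posteriori and extrinsic symbol
    probabilities, both of shape (N, 4).
    """
    y = np.asarray(y)
    prior = np.asarray(prior, dtype=float)
    N = len(y) - 1
    # g[n, i] = gamma_n(z_i), Gaussian likelihood of state i at time n
    g = np.exp(-np.abs(y[:, None] - STATES[None, :]) ** 2 / (2 * sigma2))
    g /= 2 * np.pi * sigma2
    alpha = np.empty((N + 1, 32))
    beta = np.empty((N + 1, 32))
    alpha[0] = 1 / 32
    for n in range(1, N + 1):                       # forward recursion (42)
        ag = alpha[n - 1] * g[n - 1]
        alpha[n] = (ag[PRV] * prior[n - 1][None, :]).sum(axis=1)
        alpha[n] /= alpha[n].sum()                  # common scaling only
    beta[N] = g[N]
    for n in range(N - 1, -1, -1):                  # backward recursion (43)
        beta[n] = g[n] * (prior[n][None, :] * beta[n + 1][NXT]).sum(axis=1)
        beta[n] /= beta[n].sum()                    # common scaling only
    ext = np.empty((N, 4))
    for n in range(1, N + 1):                       # combination (44) without Pr{a_n}
        ag = alpha[n - 1] * g[n - 1]
        ext[n - 1] = (ag[:, None] * beta[n][NXT]).sum(axis=0)
    app = ext * prior                               # (44) including Pr{a_n}
    ext /= ext.sum(axis=1, keepdims=True)
    app /= app.sum(axis=1, keepdims=True)
    return app, ext
```

Because the factor Pr{𝑎𝑛} is left out of `ext`, changing the a priori distribution of one symbol leaves the extrinsic output for that same symbol unchanged, which is the property the convolutional decoder relies on.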

Using the standard BCJR algorithm for computing the extrinsic symbol probabilities in the trellis in Figure 4, since a priori symbol probabilities are non-uniform now, leads to 32×4×2 multiplications in the forward pass, 32×4×2 multiplications in the backward pass, and 32×4×2 multiplications and 4 normalizations in the combination pass for computing extrinsic information, per trellis section. In total, this is 768 multiplications and 4 normalizations per trellis section per iteration. In the next subsection, we investigate the decomposition of the demodulation trellis.

5.3. Trellis Decomposition

Here we investigate whether we can decompose the entire trellis for the case where the a priori probabilities are nonuniform. We are interested in decomposing (44) in such a way that we can write

$$\Pr\{a_n \mid \mathbf{y}\} = \sum_{s} \Pr\{s, a_n \mid \mathbf{y}\} = \sum_{s} \Pr\{s \mid \mathbf{y}\}\, \Pr\{a_n \mid \mathbf{y}, s\}, \tag{45}$$

for all $a_n \in \mathcal{A}$. The question now is how to compute the a posteriori subtrellis probabilities $\Pr\{s \mid \mathbf{y}\}$ for $s = 0, 1, \ldots, 7$.

It can be shown that

$$\Pr\{s, \mathbf{y}\} = \sum_{z_0 \in \mathcal{Z}_s} \frac{\beta_0(z_0)}{32}, \tag{46}$$

and therefore

$$\Pr\{s \mid \mathbf{y}\} = \frac{\sum_{z_0 \in \mathcal{Z}_s} \beta_0(z_0)/32}{\sum_{s'=0}^{7} \sum_{z_0 \in \mathcal{Z}_{s'}} \beta_0(z_0)/32}. \tag{47}$$

Now for each subtrellis, we can determine the a posteriori symbol probabilities using

$$\Pr\{a_n \mid \mathbf{y}, s\} \propto \Pr\{a_n, s\}\, p(\mathbf{y} \mid a_n, s) = \Pr\{a_n, s\} \sum_{z_{n-1} \in \mathcal{Z}_s} \alpha_{n-1}(z_{n-1})\, \gamma_{n-1}(z_{n-1})\, \beta_n(z_{n-1} a_n), \tag{48}$$

and, by omitting the factor $\Pr\{a_n, s\}$ in (48), the corresponding extrinsic information. Note that this approach first requires a backward pass through the entire trellis $\mathcal{T}$ to find the weighting probabilities $\Pr\{s \mid \mathbf{y}\}$ for $s = 0, 1, \ldots, 7$. This requires $32 \times (4+1) = 160$ multiplications per trellis section, observing that in (43) the factor $\gamma_n(z_n)$ can be put in front of the summation sign. Then for all subtrellises $\mathcal{T}_s$, we do a forward pass (requiring $8 \times 4 \times 4 \times 2 = 256$ multiplications per section) and then combine the results to obtain the extrinsic symbol probabilities $\Pr\{a_n \mid \mathbf{y}, s\}$ for each subtrellis (for which we need $8 \times 4 \times 4 \times 2 = 256$ multiplications and $8 \times 4 = 32$ normalizations per section). Finally these probabilities have to be weighted as in (45), which requires $8 \times 4 = 32$ multiplications. In total, this results in 704 multiplications and 32 normalizations per trellis section per iteration. It should be noted that decomposition of the trellis does not result in a significant complexity reduction with respect to the Peleg approach. In the next subsection, however, we will discuss an approach that does give a relevant complexity reduction.
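The weighting probabilities in (47) need only the backward pass. A minimal Python sketch is given below, under the same assumed state model as before (32 states with phases 𝑖𝜋/16, state 𝑖 lying in subtrellis 𝑖 mod 8, symbols j^q); the function name `subtrellis_posteriors` is hypothetical. The constant 1/(2𝜋𝜎²) in the likelihoods and the factor 1/32 are dropped because they cancel in (47), and the 𝛽's are rescaled by a common factor per step, which does not change the ratios between subtrellises.

```python
import numpy as np

# Assumed state model (illustrative): z_i = exp(j*pi*i/16), symbol j**q maps i -> (i + 8q) mod 32.
STATES = np.exp(1j * np.pi / 16 * np.arange(32))
NXT = (np.arange(32)[:, None] + 8 * np.arange(4)[None, :]) % 32

def subtrellis_posteriors(y, prior, sigma2):
    """Pr{s | y} for s = 0..7 from one backward pass over the whole trellis.

    y: N+1 received samples; prior[n-1, q] = Pr{a_n = j**q}, n = 1..N.
    """
    y = np.asarray(y)
    prior = np.asarray(prior, dtype=float)
    g = np.exp(-np.abs(y[:, None] - STATES[None, :]) ** 2 / (2 * sigma2))
    beta = g[-1]                                    # beta_N up to a constant
    for n in range(len(y) - 2, -1, -1):             # recursion (43)
        beta = g[n] * (prior[n][None, :] * beta[NXT]).sum(axis=1)
        beta = beta / beta.sum()                    # common scaling only
    # State index i = s + 8k: sum beta_0 over the four states of each subtrellis.
    post = beta.reshape(4, 8).sum(axis=0)
    return post / post.sum()
```

The dominant subtrellis of the next subsection is then simply `subtrellis_posteriors(y, prior, sigma2).argmax()`.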

5.4. Dominant Subtrellis Approaches

To achieve a complexity reduction, we investigate a method that is based on first finding, at the start of a new iteration, the dominant subtrellis and then doing the forward-backward processing for demodulation only in this dominant subtrellis.

Finding the dominant subtrellis for an iteration is done based on the a posteriori subtrellis probabilities $\Pr\{s \mid \mathbf{y}\}$ that are computed before starting this iteration. Now assuming that one of the a posteriori subtrellis probabilities dominates the other ones, we can write

$$\Pr\{a_n \mid \mathbf{y}\} \approx \Pr\{a_n \mid \mathbf{y}, \hat{s}\}, \quad \text{with } \hat{s} = \arg\max_{s} \Pr\{s \mid \mathbf{y}\}. \tag{49}$$

This approach involves the computation of the a posteriori symbol probabilities (and corresponding extrinsic information) as described in (42), (43), and (44) only for the dominant subtrellis $\hat{s}$. Computing the a posteriori subtrellis probabilities for each iteration and then focussing only on the forward pass and combination computations is less complex than following the Peleg procedure. For the best subtrellis $\mathcal{T}_{\hat{s}}$, we do a forward pass ($4 \times 4 \times 2 = 32$ multiplications per trellis section) and then we combine the results to obtain the a posteriori (actually extrinsic) symbol probabilities $\Pr\{a_n \mid \mathbf{y}, \hat{s}\}$ for that subtrellis ($4 \times 4 \times 2 = 32$ multiplications and 4 normalizations per section). In total, we now need 224 multiplications and 4 normalizations per trellis section per iteration.

A second approach involves choosing the dominant subtrellis only once, before starting with the iterations. Since the a priori probabilities are all equal before the iterations start, that is, Pr{𝑎𝑛}=1/4, the analysis in Section 3.4 applies. The a posteriori subtrellis probabilities can be computed as in (23). Now we do the iterations only in the subtrellis that was chosen initially. This approach requires 84 multiplications and 4 normalizations per trellis section per iteration and is therefore substantially less complex than the Peleg technique. In our simulations, we will only use this last technique when we address dominant subtrellises.
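The multiplication counts quoted in Sections 5.2 to 5.4 can be tallied with a few lines of Python. The first three totals follow the breakdowns given in the text. The breakdown of the last figure is not stated explicitly; the sketch assumes it consists of a backward pass restricted to the chosen subtrellis, 4×(4+1)=20 multiplications, plus the forward and combination passes of 32 multiplications each.

```python
STATES, SUBTRELLISES, BRANCHES = 32, 8, 4
PER_SUB = STATES // SUBTRELLISES                 # 4 states per subtrellis

# Peleg approach: forward, backward, and combination passes on the full trellis.
peleg = 3 * STATES * BRANCHES * 2

# Backward pass on the full trellis with gamma factored out of the sum in (43).
backward_full = STATES * (BRANCHES + 1)
# Forward pass (or combination pass) inside a single subtrellis.
pass_sub = PER_SUB * BRANCHES * 2

# Decomposition with weighting: backward pass, forward and combination passes
# in all 8 subtrellises, and the final weighting as in (45).
decomposed = backward_full + 2 * SUBTRELLISES * pass_sub + SUBTRELLISES * BRANCHES

# Dominant subtrellis re-selected in every iteration.
dominant_each_iteration = backward_full + 2 * pass_sub

# Dominant subtrellis selected once (assumed breakdown, see the lead-in).
dominant_once = PER_SUB * (BRANCHES + 1) + 2 * pass_sub

print(peleg, decomposed, dominant_each_iteration, dominant_once)
# prints: 768 704 224 84
```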

5.5. Simulations

We simulated the Peleg method described in [17] and determined the BER versus the signal-to-noise ratio 𝐸𝑏/𝑁0=1/(2𝜎2). This BER performance is shown in Figure 9 for practically infinite trellis lengths, that is, 𝑁→∞, with ideal LLRs based on the a posteriori probability given by (44). The BER performance is shown for 𝐿=1,2,…,5 iterations, where 𝐿=1 stands for no iterations. Note that since we are using ideal LLRs and infinite trellis lengths, the corresponding curves shown in Figure 9 can be regarded as target curves for the iterative (single-carrier) case. In addition, 2SDD and coherently detected DE-QPSK curves are again shown for reference. The curves corresponding to the approach based on decomposing the trellis and using weighting as in (45) are not in the figure. As expected, the performance of this approach shows no differences with the Peleg approach in (44). From Figure 9, it can be seen that at a BER of 10⁻⁴ the improvement in required signal-to-noise ratio is 4.1 dB after 𝐿=5 iterations. Figure 9 also shows that the improvement decreases with the number of iterations and that the first iteration yields the largest improvement. Similar results were obtained by Peleg et al. [17].

To see how the performance in the iterative case depends on the trellis length 𝑁, we simulated the Peleg approach for 𝑁+1=2, 4, and 32, for 𝐿=5 iterations. The results are in Figure 10. It can be seen that the “iterative coding gain” increases, as expected, with 𝑁 and that, for 𝑁+1=32, the performance is already quite close to that of 𝑁→∞.

Finally, we compared, for 𝑁+1=4 and 32, the difference in BER between the exact LLRs based on the a posteriori (extrinsic) probability given by (44) or (45) and the approximated LLRs based on the a posteriori (extrinsic) probability given by (49). The results are shown in Figure 11. We can conclude from Figure 11 that for larger 𝑁+1 the difference in performance between the exact and approximated LLRs becomes smaller, and that for 𝑁+1=32 the difference between the ideal LLRs and the approximated version, obtained by selecting the dominant subtrellis before starting the iteration process, is less than 0.3 dB.

6. Detection and Decoding: Multicarrier Case, Iterative

6.1. Trellis Decomposition

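Just like in the noniterative multicarrier case, we do the processing based on trellis decomposition and focus on the computation of the a posteriori subtrellis probabilities:

$$\Pr\{s \mid \mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_M\} = \frac{\Pr\{s\}\, p(\mathbf{y}_1 \mid s)\, p(\mathbf{y}_2 \mid s) \cdots p(\mathbf{y}_M \mid s)}{\sum_{s'=0}^{7} \Pr\{s'\}\, p(\mathbf{y}_1 \mid s')\, p(\mathbf{y}_2 \mid s') \cdots p(\mathbf{y}_M \mid s')}. \tag{50}$$

Note that $\Pr\{s\} = 1/8$ for $s = 0, 1, \ldots, 7$ and therefore it follows from (46) that

$$p(\mathbf{y}_m \mid s) = \sum_{z_0 \in \mathcal{Z}_s} \frac{\beta_{m,0}(z_0)}{4}, \tag{51}$$

for each subcarrier $m = 1, 2, \ldots, M$.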

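Now the a posteriori symbol probability for $a_{m,n} \in \mathcal{A}$ can be written as in (37), that is,

$$\Pr\{a_{m,n} \mid \mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_M\} = \sum_{s=0}^{7} \Pr\{s \mid \mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_M\}\, \Pr\{a_{m,n} \mid \mathbf{y}_m, s\}, \tag{52}$$

where $\Pr\{a_{m,n} \mid \mathbf{y}_m, s\}$ is computed as given by (48) for $s \in \{0, 1, \ldots, 7\}$ and $a_{m,n} \in \mathcal{A}$. From these a posteriori probabilities, we can compute the extrinsic information that is needed by the convolutional decoder. Computing extrinsic information is actually a little easier since it involves fewer multiplications.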

This suggests that, for each iteration, the demodulator first determines the a posteriori subtrellis probabilities using (50), for which first a backward pass in each of the 𝑀 trellises corresponding to the subcarriers is needed.

Using the weighting coefficients, the convex combination in (52) leads to the a posteriori symbol probabilities. Finding the a posteriori symbol probabilities Pr{𝑎𝑚,𝑛∣𝐲𝑚,𝑠} should be done in the standard way, taking into account that the backward passes were already carried out.
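The way the per-subcarrier backward passes are combined in (50) and (51) can be sketched as follows. The fragment is a hedged illustration only: it assumes that the values 𝛽𝑚,0(𝑧0) of all 𝑀 subcarriers are already available as an array with 32 entries per subcarrier, and that state index 𝑖 belongs to subtrellis 𝑖 mod 8. The function name `combine_subcarriers` is hypothetical. The product over the subcarriers is evaluated in the log domain to avoid underflow for larger 𝑀.

```python
import numpy as np

def combine_subcarriers(beta0):
    """beta0[m, i] = beta_{m,0}(z_i) for subcarrier m and state i = 0..31.

    Returns the a posteriori subtrellis probabilities Pr{s | y_1, ..., y_M}
    as in (50); the uniform prior Pr{s} = 1/8 cancels.
    """
    beta0 = np.asarray(beta0, dtype=float)
    M = beta0.shape[0]
    # p(y_m | s) = sum over the four states of subtrellis s of beta_{m,0} / 4, eq. (51)
    p = beta0.reshape(M, 4, 8).sum(axis=1) / 4      # shape (M, 8)
    log_post = np.log(p).sum(axis=0)                # product over m, log domain
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()
```

A common scaling of the 𝛽's within one subcarrier, as typically applied during the backward recursion, multiplies all eight products by the same constant and therefore leaves the result unchanged.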

6.2. Dominant Subtrellis Approach

Equation (52) shows how the exact a posteriori symbol probabilities can be determined, in each iteration. Just like in the single-carrier case, if the a posteriori subtrellis probabilities are such that one of the probabilities dominates the other ones, then the convex combination (52) can be approximated as follows:

$$\Pr\{a_{m,n} \mid \mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_M\} \approx \Pr\{a_{m,n} \mid \mathbf{y}_m, \hat{s}\}, \tag{53}$$

with

$$\hat{s} = \arg\max_{s} \Pr\{s \mid \mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_M\}. \tag{54}$$

Again this approach involves, in each iteration, the computation of the a posteriori symbol probabilities only for the dominant subtrellis $\hat{s}$.

If we compute the dominant subtrellis only before the start of the iteration process, we obtain a significant complexity reduction since the analysis in Section 3.4 applies. Moreover, all iterations are done in the initially chosen subtrellis. The methods described here will be evaluated in the next subsection.

6.3. Simulations

We have seen before that in the noniterative multi-carrier case the performance was more or less determined by the size 𝑀(𝑁+1) of the block. If the channel cannot be assumed to be constant for large values of 𝑁+1, we can always increase the number of subcarriers 𝑀 if the frequency selectivity allows this. Note that keeping 𝑁+1 small also has advantages related to service symbol processing [1]. Here the situation is slightly different as is demonstrated in Figure 12. Increasing 𝑀 has a positive effect on the performance; however, since the trellis-length 𝑁+1 remains constant (and is quite small), the effect of iterating is limited. We see, however, that by increasing 𝑀 from 1 to 8 we get an improvement of roughly 0.7dB.

Finally, we compare for 𝑁+1=4 and 32, for 𝑀=8, the difference between the performance of exact LLRs based on the a posteriori (extrinsic) probabilities given by (52) and the approximated LLRs based on the a posteriori (extrinsic) probabilities given by (53), see Figure 13. We can observe from Figure 13 that, as expected, the larger 𝑁+1 is, the smaller the difference between the exact and approximated LLRs becomes. For 𝑁+1=4, the difference between the ideal LLRs and the approximation, by selecting the dominant subtrellis before starting to iterate, is roughly 0.3dB.

7. Performance for TU-6 Channel Model

So far we have used AWGN channels with unknown channel phase and fixed (unit) gain in our analysis and simulations. To investigate the performance in practice, we have used the TU-6 (Typical Urban, 6 taps) channel model defined in [36], which is commonly used to test DAB, DAB+, or T-DMB transmission. Two maximum Doppler frequencies are chosen, 𝑓𝑑=10 and 20 Hz, which for DAB transmission in Band III correspond to relative speeds between transmitter and receiver of 45 and 90 km/h, respectively.

We use our methods for DAB transmission in Mode-I, where the inverse subcarrier spacing 𝑇𝑢=1ms and where the cyclic-prefix period 𝑇𝑐𝑝=246𝜇s [1].

Now, with these settings, the normalized Doppler rate 𝑓𝑑𝑇𝑢 is 0.01 and 0.02, respectively.

Note that to prevent ISI in an OFDM scheme, the delay differences on separate propagation paths need to be less than the cyclic-prefix period [2], that is, the channel impulse response length 𝜏𝑚 must satisfy 𝜏𝑚≤𝑇𝑐𝑝. Within the DAB system, 𝑇𝑐𝑝≈(63/256)𝑇𝑢<𝑇𝑢/4 [1] and therefore the coherence bandwidth 𝐵𝑐≈1/𝜏𝑚>4(1/𝑇𝑢), which is at least 4 OFDM subcarriers. For Doppler frequency 𝑓𝑑=20 Hz, the coherence time 𝑇𝑐≈1/(2𝑓𝑑)=25 ms, which is 20 OFDM symbols (including cyclic prefix).
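The numbers in this section can be checked with a short calculation. The sketch below uses the Mode-I values of 𝑇𝑢 and 𝑇𝑐𝑝 given above; the carrier frequency of 240 MHz (the upper edge of Band III) is an assumption made here to relate speed and Doppler frequency, and the function names are hypothetical.

```python
C = 299_792_458.0   # speed of light in m/s
T_U = 1e-3          # Mode-I inverse subcarrier spacing in s
T_CP = 246e-6       # Mode-I cyclic-prefix period in s
FC = 240e6          # assumed carrier frequency in Hz (upper edge of Band III)

def doppler_hz(speed_kmh, carrier_hz=FC):
    # Maximum Doppler frequency f_d = v * f_c / c
    return speed_kmh / 3.6 * carrier_hz / C

def coherence_time_s(fd_hz):
    # Rule of thumb used in the text: T_c is about 1 / (2 f_d)
    return 1.0 / (2.0 * fd_hz)

def symbols_in_coherence_time(fd_hz):
    # Number of OFDM symbols (including cyclic prefix) within T_c
    return coherence_time_s(fd_hz) / (T_U + T_CP)

for speed_kmh, fd in ((45, 10.0), (90, 20.0)):
    print(speed_kmh, "km/h:",
          "f_d from speed =", round(doppler_hz(speed_kmh), 2), "Hz;",
          "f_d*T_u =", fd * T_U, ";",
          "T_c =", coherence_time_s(fd) * 1e3, "ms;",
          "OFDM symbols in T_c =", round(symbols_in_coherence_time(fd), 1))
```

With these values, 25 ms divided by the symbol duration of 1.246 ms gives roughly 20 OFDM symbols at 20 Hz, in line with the figure quoted above.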

The channel gain representative for a 2D block, where it is assumed to be constant, is estimated similarly to (8) in Chen et al. [22], that is,

$$|\hat{h}|^2 = \max\left\{ \frac{1}{M(N+1)} \sum_{m=1,\ldots,M;\; n=0,\ldots,N} |y_{m,n}|^2 - 2\sigma^2,\; 0 \right\}. \tag{55}$$
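A minimal Python sketch of the block-wise gain estimate in (55) is shown below. It assumes that the received 2D block is available as an 𝑀×(𝑁+1) complex array and that 𝜎² is the noise variance per real dimension, as in the likelihood (36); the function name `estimate_gain_sq` is hypothetical.

```python
import numpy as np

def estimate_gain_sq(Y, sigma2):
    """Estimate the squared channel gain of a 2D block as in (55).

    Y: complex array of shape (M, N+1); sigma2: noise variance per real dimension.
    The average received power is reduced by the noise power 2*sigma2 and
    clipped at zero so that the estimate is never negative.
    """
    mean_power = np.mean(np.abs(np.asarray(Y)) ** 2)
    return float(max(mean_power - 2 * sigma2, 0.0))
```

For small blocks such as 𝑀(𝑁+1)=32 the estimate is noisy, and the clipping at zero only guards against a negative value at low signal-to-noise ratio.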

The results of our simulations with the TU-6 model are shown in Figure 14, where the solid lines show the results for 𝑓𝑑𝑇𝑢=0.01 and the dashed lines for 𝑓𝑑𝑇𝑢=0.02. We have results for 𝑁+1=18 with 𝑀=1 and for 𝑁+1=4 with 𝑀=8. We considered iterative procedures with 𝐿=5 iterations. In our simulations, we used the dominant subtrellis approach, where we have chosen the dominant subtrellis before starting the iterations.

The value 𝑁+1=4 might be seen as a representative frame size for services broadcast by the DAB family in transmission Mode I. In this mode, 𝑁+1=18 is the maximum possible number of interleaved OFDM symbols. Note that 𝑁+1=18 is close to the coherence time of our TU-6 channel for a Doppler frequency of 20 Hz.

It can be concluded from Figure 14 that for 𝑁+1=18 and 𝑀=1, reliable transmission is not possible for the TU-6 channel with movement speeds of 45 km/h and 90 km/h. For the 2D-decomposition approach with 𝑁+1=4 and 𝑀=8, however, a considerable improvement in required signal-to-noise ratio is possible compared to 2SDD: roughly 2.4 dB at 10 Hz and 1.6 dB at 20 Hz.

8. Conclusions

We have investigated decoding procedures for DAB-like systems, focussing on trellis decoding and iterative techniques, with a special focus on obtaining an advantage from considering 2D blocks and trellis decomposition. These 2D blocks consist of the intersection of a number of subsequent OFDM symbols and a number of adjacent subcarriers. The idea to focus on blocks was motivated by the fact that the channel coherence time is typically limited to a small number of OFDM symbols, and also by the fact that per-service symbol processing is used, which limits the number of OFDM symbols in a codeword.

We have used trellis decomposition methods that allow us to estimate the unknown channel phase modulo 𝜋/2. This channel phase relates to subtrellises of which we can determine the a posteriori probabilities. Using these probabilities we can weigh the contributions of all the subtrellises to compute the a posteriori symbol probabilities. We can also use these probabilities to choose a dominant subtrellis for providing us with these a posteriori symbol probabilities. Working with dominant subtrellises results in significant complexity reductions. A second important advantage of trellis decomposition is that it allows us to process several subcarriers simultaneously in an efficient way.

We have first investigated noniterative methods. The advantage of these methods is that forward-backward procedures turned out to be extremely simple since we could use Colavolpe processing [5]. The drawback of these noniterative methods is, however, that their gain, relative to the standard 2SDD technique, is modest. Iterative procedures result in a significantly larger gain, however. In this context we must emphasize that part of this gain comes from the fact that we can do 2D processing.

Simulations for the noniterative AWGN case show that (a) trellis lengths of 𝑁+1≥32 are required and (b) 2D dominant subtrellis processing with 𝑀(𝑁+1)=32 outperforms 2SDD by 0.7 dB at a BER of 10⁻⁴.

For the iterative AWGN case with 𝐿=5 iterations, simulations show that 2D dominant subtrellis processing with 𝑀(𝑁+1)=32, where 𝑁+1=32 and 𝑀=1, outperforms 2SDD by 3.7 dB at a BER of 10⁻⁴. However, simulations also reveal that with 𝑀(𝑁+1)=32, where 𝑁+1=4 and 𝑀=8, the iterative coding gain is reduced to 2.5 dB, which is caused by the smaller value of 𝑁+1.

On the other hand, iterative simulations for a practical setting (i.e., the TU-6 model) show that (a) with trellis-length 𝑁+1=18 and 𝑀=1 (one subcarrier) no reliable communication is possible, but that (b) with a modest trellis-length 𝑁+1=4 and 𝑀=8 subcarriers, the iterative coding advantage is maintained and that the gain is roughly 2.4dB for 10 Hz Doppler frequency, and 1.6 dB for 20 Hz.

Acknowledgments

The authors would like to acknowledge Professor J. W. M. Bergmans, the management and the technical staff of Catena Radio Design B. V., and NXP-Research Eindhoven for their support in accomplishing this work. Moreover, they thank the anonymous reviewers for their comments and valuable suggestions.