#### Abstract

A common OFDM system contains redundancy that is necessary to mitigate interblock interference and that allows computationally efficient single-tap frequency domain equalization in the receiver. Assuming the system implements an outer error correcting code (ECC) and channel state information is available in the receiver, we show that the cyclic prefix insertion can be understood as a weak inner ECC encoding and that the introduced redundancy can be exploited to slightly improve the error performance of such a system. In this paper, an easily implemented modification to an existing SDR OFDM receiver is presented. This modification enables the utilization of the prefix redundancy while preserving full compatibility with existing OFDM-based communication standards.

#### 1. Introduction

Thanks to their flexibility, SDR platforms enable relatively easy modifications to the subblocks of a communication system, which can bring gains to an existing system. Sometimes it is even possible to exploit already standardized protocols in an unexpected new way. In this paper, this feature is illustrated on the example of the standard OFDM transmission technique.

In current standards, for example, IEEE 802.16e, the OFDM system uses a cyclic prefix (CP) to mitigate the effects of channel impairment. For each transmitted OFDM symbol, the length of the prefix is a fraction of the useful symbol length. The main purpose of the cyclic prefix is protection against intersymbol interference (ISI), or more precisely interblock interference (IBI), in connection with simple equalization in the frequency domain. Most current receivers throw the cyclic prefix away or use it only for channel estimation. In this paper, we present a method for exploiting the redundancy introduced by the cyclic prefix by means of decoding two serially concatenated codes. To be more precise, we understand the prefix insertion as an inner repetition coding in which only a part of the samples is repeated. With code rate $R = 8/9$ (assuming the repetition of one eighth of the samples as in [1]), this code is very weak, but in cooperation with a powerful outer code (which is always present in a practical system), it could bring an error performance gain. Of course, the received copy of the prefix data is corrupted by IBI and has to be processed first.

In the next section, we give a short description of current OFDM systems along with a brief overview of recent methods for exploiting the cyclic prefix. The third section reviews the principle of concatenated coding and illustrates the possible OFDM error performance improvement using a simulation with real-life settings. The fourth section is devoted to the description of the received prefix processing, which is necessary for successful extraction of the repetition data. The last section describes the whole modified OFDM receiver and then presents simulation results showing the actual error performance improvement in a multipath environment.

#### 2. Current OFDM Systems and Modifications

##### 2.1. OFDM System Overview

Figure 1 shows a simplified standard OFDM system model (defined in [2]). Each of the depicted functional blocks is implemented in software, using a high-level programming language such as ANSI C [3], as a separate function or module, possibly taking advantage of underlying platform specific hardware acceleration. Since this paper primarily focuses on software processing of floating or fixed point vectors, the analog and hybrid blocks, such as amplifiers and A/D converters, are intentionally omitted in Figure 1.

OFDM processing begins with blocks of ECC encoded binary data that are first digitally modulated in the frequency domain (CM block) and then transformed to the time domain by the IDFT. A cyclic prefix is attached in the CPI block. The OFDM symbol (or time domain block), consisting of many (1024 and more) samples, travels through a multipath environment, which can be modeled by a convolution with the channel impulse response $\mathbf{h}$. The prolonging of the block caused by the channel convolution is the source of ISI and IBI. Furthermore, AWGN is superimposed onto the signal. The receiver first selects a major subset of the samples of the received OFDM symbol in the CPR block; the rest of the block samples are discarded because they are corrupted by IBI. The selected samples are then transformed to the frequency domain, where they can be easily equalized by an efficient single-tap FDE equalizer. Easy equalization is a consequence of the cyclic prefix insertion and removal. Equalization in the time domain would require a much more complicated computation of the channel convolution inversion. The quality of the equalization process, however, depends on the precision of the CSI estimate. The output of the equalization is a noisy estimate of the original signal space mapped data block. After symbol detection, this block is finally transformed to a block of log-likelihood ratio (LLR) values for further soft-input error correcting decoding.
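The chain described above can be illustrated with a minimal NumPy sketch. It is not the implementation from this paper: the block size, the 3-tap channel, and the BPSK mapping are assumptions chosen only to keep the example short, noise is omitted, and the CSI is taken to be perfectly known.

```python
import numpy as np

def ofdm_transmit(symbols, cp_len):
    """IDFT followed by cyclic prefix insertion (CPI)."""
    block = np.fft.ifft(symbols)
    return np.concatenate([block[-cp_len:], block])

def ofdm_receive(rx, h, n, cp_len):
    """Cyclic prefix removal (CPR), DFT, and single-tap FDE."""
    useful = rx[cp_len:cp_len + n]        # discard the prefix and the channel tail
    freq = np.fft.fft(useful)
    channel_freq = np.fft.fft(h, n)       # CSI, assumed perfectly known here
    return freq / channel_freq

rng = np.random.default_rng(0)
n, cp_len = 64, 8                         # toy sizes, not taken from any standard
h = np.array([1.0, 0.5, 0.25])            # hypothetical 3-tap channel impulse response
bits = rng.integers(0, 2, size=n)
symbols = 1.0 - 2.0 * bits                # assumed BPSK mapping: 0 -> +1, 1 -> -1
tx = ofdm_transmit(symbols, cp_len)
rx = np.convolve(tx, h)                   # multipath channel, noise omitted
equalized = ofdm_receive(rx, h, n, cp_len)
```

In this noise-free setting the equalized samples coincide with the transmitted constellation points, because the prefix makes the linear channel convolution look circular over the selected samples.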

##### 2.2. Advanced Cyclic Prefix Usage

The channel estimation and symbol detection blocks are omitted from Figure 1 as this paper focuses on the decoding of the code cascade. However, several conceptually different approaches for increasing throughput and/or exploiting the cyclic prefix of a coded OFDM system already exist.

The first one reduces the CP size to less than the channel delay spread and overcomes the resulting IBI by modifying the iterative ECC decoder to work over two consecutive OFDM symbols, so that it is able to fix the errors caused by the insufficient CP size [4].

A different approach exploits the CP (of size greater than the channel delay spread) to improve the channel estimation [5–8] or symbol detection [9]. However, the methods described in [6, 8] practically apply only to the xDSL environment and are not suitable for the variable channel conditions of a wireless transmission, while the redundant CP samples in [9] are used only for reducing the noise variance of the replicated samples using a time domain maximal ratio combining (MRC) algorithm in a single carrier system, not in OFDM.

Residual intersymbol interference cancellation (RISIC) is presented in [5]. The principle is to cancel the residual IBI resulting from an insufficient CP by iterating channel estimation, cyclicity restoration, and soft output decoding. The algorithm is defined purely for a setting where a space-time error correcting code is present in a MIMO-OFDM system.

A more general method of turbo frequency domain equalization (turbo FDE) is presented in [7]. Here a soft elementary signal estimation (ESE) block works together with a soft-input soft-output (SISO) ECC decoder in an iterative manner so that the estimates of channel symbols are iteratively improved by the results of the decoder. This method uses the CP insertion and removal only to ensure the circulant property of the channel matrix. The iterative application of a complex APP decoding algorithm results in a great increase in computational complexity.

Finally, the third concept is based on the observation that the cyclic prefix size defined in a communication standard is designed for the worst-case scenario. Therefore, when the channel delay spread is shorter than the prefix duration, the prefix information can be used for channel estimation (CE) without extensive postprocessing [10]. The drawback is the limited applicability of this method to modern standards, such as IEEE 802.16e, where the size of the prefix varies according to the channel propagation conditions.

#### 3. Concatenation of Codes

##### 3.1. OFDM Is a Concatenated Code

The principle of serial concatenation of codes is well known. As shown in Figure 2, the transmitter simply feeds the output of one encoder to the input of the other through a pseudorandom or rectangular interleaver.

The situation in the receiver is more complicated; usually two SISO decoders cooperate in an iterative manner, exchanging extrinsic information as described in [11]. After a number of iterations, a final hard decision is made. This decoding scheme is well known and is also used in turbo code decoding.

In today’s systems, OFDM is always used along with a powerful error correcting code, such as a turbo or LDPC code, as shown in Figure 1. If we understand the cyclic prefix insertion as an inner encoder and the IDFT as an interleaver, then the coded OFDM transmitter is a serially concatenated encoder system, and the decoder can be redesigned to an iterative form. Because the inner code is very weak (partial repetition) and the decoding of the outer code is computationally intensive (it is an iterative process itself), a simplified noniterative soft-output scheme is suggested.

In Figure 4, only the forward branch of the decoder in Figure 3 is performed, so the weak inner repetition code with its very simple decoding is used only to improve the log-likelihood values of the samples from the channel that enter the powerful outer code decoder. The first logical step in the analysis of concatenated OFDM decoding is to neglect the negative effects of IBI, ISI, and the transitions between the time and frequency domains. This simplification leads to a very simple model, with the transmitter and receiver shown in Figures 2 and 4. As mentioned before, the SISO decoding of the repetition code is very simple. The extrinsic information for any bit is just the copied channel LLR of its repetition [12]. Therefore, if a second copy of a part of the bits is available, the first stage (the decoding of the partial repetition code) is done by a simple addition of LLR values. After deinterleaving, these fortified LLRs enter the unmodified outer code decoder.
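The LLR addition described above can be sketched in a few lines. The function name, the index array, and the numeric LLR values are hypothetical and serve only as an illustration of the simplified model in which IBI, ISI, and the domain transitions are neglected.

```python
import numpy as np

def combine_repetition_llrs(llr_main, llr_copy, repeated_idx):
    """Soft decoding of a partial repetition code: add the LLRs of both copies.

    repeated_idx must not contain duplicate positions.
    """
    combined = np.array(llr_main, dtype=float)   # working copy, input stays untouched
    combined[repeated_idx] += llr_copy
    return combined

llr_main = np.array([1.2, -0.4, 0.3, -2.0])      # channel LLRs of all bits
llr_copy = np.array([0.9, -1.1])                 # channel LLRs of the repeated bits
repeated_idx = np.array([2, 3])                  # positions that were sent twice
fortified = combine_repetition_llrs(llr_main, llr_copy, repeated_idx)
```

Bits without a second copy keep their original LLR, so the outer decoder needs no modification; it simply receives more reliable inputs at the repeated positions.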

##### 3.2. Empirical Error Performance Upper Bound

Before the actual prefix redundancy extraction efforts, simplified simulation experiments were done, primarily as a proof of concept. The first round of simulations used a simplified OFDM system model. In the simulations, we used the inner partial repetition code of rate $8/9$, which corresponds to one of the WiMAX-defined cyclic prefix sizes for OFDM [1], and a convolutional turbo code of rate $1/3$ as defined in the UMTS standards [13]. We also used the pseudorandom interleaver of UMTS instead of the IDFT. The simulation results are shown in Figure 5. Three curves depict the BER after the 1st, 3rd, and 7th iteration. It is clear that the system with the repetition decoder placed in front of the turbo decoder achieves the same reliability at a signal-to-noise ratio approximately 0.25 dB lower than the turbo system alone.
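As a quick check of the rates involved, assume a prefix of one eighth of the useful block and a rate-1/3 outer turbo code (here $N$, the number of useful samples per block, is a symbol introduced only for this illustration):

$$R_{\text{inner}} = \frac{N}{N + N/8} = \frac{8}{9}, \qquad R_{\text{total}} = R_{\text{outer}}\,R_{\text{inner}} = \frac{1}{3}\cdot\frac{8}{9} = \frac{8}{27} \approx 0.30.$$

Under these assumptions, one ninth (about 11%) of the transmitted samples belongs to the prefix, and a standard receiver discards this part after prefix removal.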

The resulting 0.25 dB improvement can be interpreted as a rough estimate of, or an upper bound on, the actual improvement that can be achieved in real systems. The significant difference between the simplified model and the real system is the corruption of the second copy of the data used in the repetition decoder. This corruption is caused by IBI and cannot be fully remedied. Two solutions to this problem, with differing degrees of success, are described in the following sections.

#### 4. Cyclic Prefix in OFDM

The following section consists of three parts. First, the principle of prefix insertion (CPI) and removal (CPR) for the purpose of simple frequency domain equalization in context of OFDM multipath-environment transmission is reviewed. Second, a more formal matrix-based description of channel and CPI/CPR processes is presented. Finally, the process of extraction of redundant information from cyclic prefix (CP) is described, based on the formal matrix representation.

##### 4.1. Frequency Domain Equalization

The propagation of a signal through a multipath channel with ISI is usually described by convolution with the channel impulse response $\mathbf{h}$ (in this paper a discrete-time version is assumed). In a block-oriented system, such as OFDM, the discrete-time convolution can be written in matrix form

$$\mathbf{y} = \mathbf{H}\mathbf{x}, \tag{1}$$

where $\mathbf{x}$ is a vector of time samples produced by the transmitter, $\mathbf{H}$ is a convolution matrix consisting of channel impulse response values (an example is shown in Figure 7), and $\mathbf{y}$ is the channel output (received sequence). The convolution of the transmitted data block with the channel impulse response prolongs the block from $M$ transmitted samples to $M + L - 1$ received samples, where $L$ is the length of $\mathbf{h}$. If there is no guard interval between two consecutively transmitted blocks, or if the guard interval is shorter than $L - 1$ samples, IBI occurs on the boundaries of the blocks.
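The matrix form of the convolution can be checked numerically. The helper name `convolution_matrix` and the example taps below are assumptions made for this sketch only.

```python
import numpy as np

def convolution_matrix(h, m):
    """Return the (m + len(h) - 1) x m Toeplitz matrix H with H @ x == np.convolve(h, x)."""
    h = np.asarray(h, dtype=float)
    H = np.zeros((m + len(h) - 1, m))
    for col in range(m):
        H[col:col + len(h), col] = h      # each column is a shifted copy of h
    return H

h = np.array([1.0, 0.5, 0.25])            # hypothetical channel impulse response
x = np.arange(1.0, 7.0)                   # transmitted block of 6 samples
H = convolution_matrix(h, len(x))
y = H @ x                                 # received block, prolonged by len(h) - 1 samples
```

The two extra rows of `H` produce the tail that spills into the next block when no sufficient guard interval is present.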

One possible way of coping with IBI is to send an all-zero guard prefix. Another way, used in OFDM, is to use a cyclic prefix: a part of the samples from the end of the transmitted block is copied and prepended to the beginning of the block. In the receiver, only the appropriate subblock of the received sequence is selected, and the redundant samples of the prefix are discarded. The motivation for the CP insertion and removal is that these operations allow us to understand the channel convolution matrix as a circulant matrix. More precisely, a submatrix $\mathbf{H}_u$ can be found in the convolution matrix $\mathbf{H}$ such that if the transmitted vector with the redundant CP is multiplied by $\mathbf{H}_u$, the resulting received vector is the same as if the transmitted vector with no CP inserted were multiplied by a circulant matrix $\mathbf{C}$ (see Figure 8). For a circulant matrix $\mathbf{C}$, the following equation holds:

$$\mathbf{F}\,\mathbf{C}\,\mathbf{F}^{-1} = \boldsymbol{\Lambda}. \tag{2}$$

It means that such a matrix, when multiplied by the Fourier transform matrices $\mathbf{F}^{-1}$ and $\mathbf{F}$ (which represent the IDFT in the transmitter and the DFT in the receiver), results in a diagonal matrix $\boldsymbol{\Lambda}$ with nonzero elements only on the main diagonal [14]. The diagonal elements are equal to the channel frequency response samples. Therefore, the equalization is simply performed by scalar multiplication of each element of the received block by the reciprocal value of the corresponding channel frequency response sample (assumed to be known).
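The diagonalization of a circulant matrix by the DFT can be verified with a short numerical sketch. The matrix size and the channel taps are arbitrary assumptions for this illustration.

```python
import numpy as np

def circulant(h, n):
    """n x n circulant matrix whose first column is h zero-padded to length n."""
    first_col = np.zeros(n)
    first_col[:len(h)] = h
    return np.column_stack([np.roll(first_col, k) for k in range(n)])

n = 8
h = np.array([1.0, 0.5, 0.25])            # hypothetical channel impulse response
C = circulant(h, n)
F = np.fft.fft(np.eye(n))                 # DFT matrix (receiver side)
F_inv = np.conj(F).T / n                  # IDFT matrix (transmitter side)
D = F @ C @ F_inv                         # should be diagonal
channel_freq = np.fft.fft(h, n)           # channel frequency response samples
```

The diagonal of `D` matches `channel_freq`, which is why a single complex division per subcarrier is sufficient for equalization.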

##### 4.2. Channel Convolution Matrix Circularization

The goal of prefix extraction is to process the currently discarded received CP subblock to obtain a second copy of the data bits, more specifically a second set of channel LLR values for the data bits, in order to use these values in a repetition decoder and in that way fortify the successive error correcting soft-input decoder.

To completely understand the process, a more formal description of the transmission based on (1) is needed. The transmitted vector containing the cyclic prefix can be divided into three subblocks $\mathbf{x} = [\mathbf{x}_p \,\|\, \mathbf{x}_d \,\|\, \mathbf{x}_e]$, where $\mathbf{x}_p$ is the cyclic prefix with value equal to $\mathbf{x}_e$ (repeated samples from the end of the block), and $\mathbf{x}_d$ is the data part that is not repeated (“$\|$” denotes the vector concatenation operator). The payload is the vector $[\mathbf{x}_d \,\|\, \mathbf{x}_e]$, while the first vector $\mathbf{x}_p$ is redundant. The received vector can also be divided into subblocks $\mathbf{y} = [\mathbf{y}_p \,\|\, \mathbf{y}_u \,\|\, \mathbf{y}_t]$, where the part $\mathbf{y}_p$ is discarded by design because it is corrupted by IBI: for block number $i$, its samples sum up with the tail samples of block $i-1$. Also, $\mathbf{y}_t$ is the tail subblock, the result of the convolutional prolonging of the block in the channel. For block number $i$, it corrupts the samples at the beginning of block $i+1$. The useful data is the $\mathbf{y}_u$ part, and in a standard OFDM receiver implementation it is the only subvector further processed. As shown in Figure 8, (1) can be expressed in terms of subvectors of $\mathbf{x}$ and $\mathbf{y}$; these subvectors define a partitioning of the convolution matrix $\mathbf{H}$ into 3 submatrices $\mathbf{H}_p$, $\mathbf{H}_u$, and $\mathbf{H}_t$, which can be further partitioned into smaller matrices (mainly for the purpose of finding small, nonzero, and possibly square matrices). (In Figure 8, such a finer partitioning is shown. The label of the lowest rightmost submatrix departs from the regular numbering; this is intentional and keeps the labeling compatible with a slightly different partitioning necessary for the prefix extraction defined below.) Equation (1) is reformulated as follows:

$$\begin{bmatrix}\mathbf{y}_p\\ \mathbf{y}_u\\ \mathbf{y}_t\end{bmatrix} = \begin{bmatrix}\mathbf{H}_p\\ \mathbf{H}_u\\ \mathbf{H}_t\end{bmatrix}\begin{bmatrix}\mathbf{x}_p\\ \mathbf{x}_d\\ \mathbf{x}_e\end{bmatrix}. \tag{3}$$

##### 4.3. Prefix Extraction

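Conforming to Figure 8, the information the receiver needs to extract is the redundant information in the transmitted cyclic prefix $\mathbf{x}_p$ contained in the received vector $\mathbf{y}_p$. So far the redundant information in the received vector $\mathbf{y}_p$ has been discarded, because what the receiver actually observes is not the vector $\mathbf{y}_p$ alone but a combination of samples of two consecutive OFDM symbols (indexed by $i$):

$$\mathbf{r}_p^{(i)} = \mathbf{A}\,\mathbf{x}_p^{(i)} + \mathbf{B}\,\mathbf{x}_e^{(i-1)}. \tag{4}$$

Here $\mathbf{A}$ and $\mathbf{B}$ denote the nonzero submatrices of the channel convolution matrix that act on the prefix of the current block and on the end of the previous block, respectively. However, if the matrices $\mathbf{A}$ and $\mathbf{B}$ are known, which is assumed true, it is possible to extract the required information using an additive correction and a matrix inversion:

$$\hat{\mathbf{x}}_p^{(i)} = \mathbf{A}^{-1}\left(\mathbf{r}_p^{(i)} - \mathbf{B}\,\hat{\mathbf{x}}_e^{(i-1)}\right). \tag{5}$$

Here, $\hat{\mathbf{x}}_e^{(i-1)}$ is a reconstruction of the transmitter output samples for block $i-1$. It is based on the decoded bits from the previous OFDM symbol.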

The reconstruction of the transmitter output in receiver is straightforward (as shown in Figure 10). The decoded bits are encoded again, mapped to signal space constellation samples, and transformed to time domain. As most of the iterative decoding algorithms can produce the codeword, along with the decoded bits, the encoding process can be omitted. All other operations necessary for reconstruction are not computationally intensive.

The first obstacle here is the fact that the extracted time domain samples are not directly useful for repetition decoding. LLR values have to be created for the bits contained in these samples, and this computation must be done in the frequency domain (similarly to the standard OFDM operation, see Figure 1). Only such values can be used in repetition decoding.

This can be done by concatenating the extracted prefix samples with the remaining time domain samples, which creates a second copy of the time block, and transforming it to the frequency domain again. The redundant information contained in a subset of the samples in the time domain will be smeared over all samples in the frequency domain. Therefore, the repetition decoder will affect all the bits.

A serious drawback of this procedure is the fact that the inversion of the channel response submatrix is a numerically unstable operation and, in the presence of noise, can lead to noise amplification that renders a practical limited-precision implementation unreliable. This effect can be reduced by an optimized organization of the computations, but there is also another method.
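The inversion-based extraction can be sketched as follows. This is a noise-free illustration under strong assumptions: the block sizes and channel taps are arbitrary, the names `A` and `B` are introduced here for the two channel submatrices, and the end of the previous block is taken to be known exactly, whereas a real receiver would use its reconstruction from the decoded bits.

```python
import numpy as np

def lower_toeplitz(h, size):
    """size x size lower-triangular Toeplitz matrix acting on the current prefix."""
    A = np.zeros((size, size))
    for k, tap in enumerate(h):
        A += tap * np.eye(size, k=-k)
    return A

def tail_matrix(h, size):
    """Maps the last `size` samples of the previous block to its tail (needs len(h) - 1 <= size)."""
    B = np.zeros((size, size))
    for k, tap in enumerate(h):
        if k > 0:
            B += tap * np.eye(size, k=size - k)
    return B

def with_prefix(block, p):
    return np.concatenate([block[-p:], block])

rng = np.random.default_rng(1)
n, p = 16, 4                                   # toy block and prefix sizes
h = np.array([1.0, 0.5, 0.25])                 # hypothetical channel
prev_block = rng.standard_normal(n)
cur_block = rng.standard_normal(n)
stream = np.convolve(
    np.concatenate([with_prefix(prev_block, p), with_prefix(cur_block, p)]), h)

r_p = stream[n + p:n + 2 * p]                  # observed prefix region of the current block
A = lower_toeplitz(h, p)
B = tail_matrix(h, p)
x_p_hat = np.linalg.solve(A, r_p - B @ prev_block[-p:])   # IBI removal, then inversion
cond_A = np.linalg.cond(A)                     # indicator of possible noise amplification
```

With this mild channel the submatrix is well conditioned; for channels whose first tap is small compared with the later ones, `cond_A` grows and the noise amplification mentioned above becomes a practical concern.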

The basic idea is not to compute any inversion but instead to use the properties of the channel convolution matrix. We assume that an additive correction that uses an estimate of the previous transmitted block (5) has already been applied (the estimated tail of the previous block is subtracted). This correction removes IBI and must be done in the time domain.

The standard processing branch depends on the circulant property, which is a consequence of the cyclic prefix insertion in the transmitter and the appropriate subblock selection in the receiver. For the second copy extraction, the circulant property is instead provided by a correction done in the receiver. The result is a second copy of frequency domain samples that originates from the otherwise ignored prefix samples.

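In Figure 9, the actual situation is shown on the left. The important subvector of the received sequence is now $\mathbf{y}'$, the first $N$ received samples of the block, where $N$ is the payload size (the received prefix part is considered part of $\mathbf{y}'$). It is related to the block of interest $\mathbf{x}' = [\mathbf{x}_p \,\|\, \mathbf{x}_d]$, the cyclic prefix followed by the non-repeated data, by the simple equation $\mathbf{y}' = \mathbf{T}\mathbf{x}'$. The vector $\mathbf{x}'$ is a cyclically shifted version of the payload vector $[\mathbf{x}_d \,\|\, \mathbf{x}_e]$. The matrix $\mathbf{T}$ is not circulant, but if a second additive correction is applied to the received vector (more specifically to $\mathbf{y}'$), then the resulting vector $\mathbf{y}'' = \mathbf{y}' + \mathbf{U}\hat{\mathbf{x}}'$ is related to the transmitted vector through multiplication with a circulant matrix $\mathbf{C} = \mathbf{T} + \mathbf{U}$ and can therefore be effectively equalized by a single-tap frequency domain equalizer, just as the useful block in a standard OFDM system. The second additive correction uses the matrix $\mathbf{U}$ shown in Figure 9 on the right. If the correction is applied to $\mathbf{y}'$, then the vector $\mathbf{y}''$ is equalized in the same way and with the same equalizer values as the useful block of the standard branch.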

As is apparent in Figure 9, the correction matrix can be further divided vertically into two submatrices, one of them zero. This division reduces the complexity, because only a small part of the samples of the transmitted block needs to be estimated. An example of the final matrix partitioning, covering both partitionings in Figures 8 and 9, is shown in Figure 7.
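The circularizing correction can be illustrated numerically. In this sketch the names `T`, `U`, and `C` are labels chosen here for the lower-triangular part, the correction matrix, and the circulant matrix; a single block is transmitted so that no IBI has to be removed first, and the true transmitted samples stand in for the estimate reconstructed from the decoded bits.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 16, 4                                   # toy payload and prefix sizes
h = np.array([1.0, 0.5, 0.25])                 # hypothetical channel
symbols = 1.0 - 2.0 * rng.integers(0, 2, size=n)   # assumed BPSK mapping
block = np.fft.ifft(symbols)
tx = np.concatenate([block[-p:], block])       # cyclic prefix insertion
rx = np.convolve(tx, h)                        # one block only, so no IBI here

first_col = np.zeros(n)
first_col[:len(h)] = h
C = np.column_stack([np.roll(first_col, k) for k in range(n)])   # circulant matrix
T = np.tril(C)                                 # what the channel actually applies
U = C - T                                      # correction matrix, mostly zero

y_shifted = rx[:n]                             # first n samples, starting at the prefix
x_shifted = np.roll(block, p)                  # prefix followed by non-repeated data
y_circ = y_shifted + U @ x_shifted             # second additive correction
second_copy = np.fft.fft(y_circ) / np.fft.fft(h, n)   # same equalizer as standard branch
```

Only the last `len(h) - 1` columns of `U` are nonzero, so only that many samples of the transmitted block have to be estimated. The equalized `second_copy` still carries the cyclic shift by `p` samples, which appears as a per-subcarrier phase rotation.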

#### 5. Modified OFDM Receiver

##### 5.1. Receiver Design

In Figure 10, the modified OFDM receiver is shown. It consists basically of a standard receiver fortified with a second copy extraction and a simple repetition decoder. An estimate of a subset of the samples of the transmitter output is computed based on the decoded output of the standard branch. Because a potent ECC is assumed to be present in the system, the reconstruction of the transmitter output will be a “good estimate” of the actual values. After the two additive corrections (IBI removal and circularization) are applied, the resulting time domain samples (a subset of them containing new information) are transformed to the frequency domain, where they can be easily equalized in the same way as shown earlier. Because the 2nd copy of the received block is rotated in the time domain by the size of the prefix, another single-tap phase correction must be applied in the frequency domain (the “spectral shift” block). This correction is multiplicative and depends only on the size of the shift, which is constant for a specific prefix size. A second set of channel LLR values is computed and added to the LLR values from the standard processing branch. This improved sequence then enters the ECC decoder.
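The spectral shift block can be sketched as a single complex multiplication per subcarrier. The function name and the test signal are assumptions for this illustration; the sign of the exponent follows the NumPy DFT convention.

```python
import numpy as np

def spectral_shift(freq_samples, shift):
    """Undo a cyclic time-domain delay of `shift` samples by a phase rotation per subcarrier."""
    n = len(freq_samples)
    k = np.arange(n)
    return freq_samples * np.exp(2j * np.pi * k * shift / n)

rng = np.random.default_rng(3)
n, p = 32, 4                                   # toy block and prefix sizes
symbols = rng.standard_normal(n) + 1j * rng.standard_normal(n)
block = np.fft.ifft(symbols)
rotated = np.roll(block, p)                    # block rotated by the prefix size
recovered = spectral_shift(np.fft.fft(rotated), p)
```

Since the phase factors depend only on `n` and `p`, they can be precomputed once for each prefix size supported by the receiver.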

As indicated earlier, all of the functional blocks are implemented in software. The key property of the modified branch is that it is built using exactly the same components as the standard processing branch with one exception—the spectral shift block that is implemented as a simple scalar complex multiplication. The development and inclusion of the modification is very straightforward in an SDR receiver. Furthermore, the additional processing can be turned off and on adaptively, depending on the transmission quality requirements and available processing time.

##### 5.2. Simulation Results

We simulated a coded OFDM system with an outer RSC turbo code of rate $1/3$ defined in [13] and with a cyclic prefix size equal to 1/8 of the data block size (as defined in [1]) over a multipath channel with AWGN. After turbo coding, each data block of 1024 bits was mapped to three OFDM symbols of 1024 complex samples. The values of the channel impulse response samples were distributed according to [15].

The error performance of the new system is only slightly better (approx. 0.1 dB) than the basic system. The improvement is most visible in the error floor area (below ) of the suboptimal log-domain iterative decoder of the outer code.

#### 6. Conclusion

We have shown that it is possible to exploit the redundancy in a cyclic prefix of OFDM. The modified receiver is fully backward compatible with any existing OFDM-based protocol. The computational complexity is approximately double compared to the standard OFDM receiver. Simulations for specified parameters have shown that a relatively small improvement of 0.1 dB in bit error rate could be achieved thanks to exploitation of the prefix redundancy. Because the modification reuses most of the functional blocks already present in the system, it can be implemented very rapidly in an SDR system using a high-level programming language.

#### Acknowledgment

This work was supported by Scientific Grant Agency of Ministry of Education of Slovak Republic and Slovak Academy of Sciences under contract VEGA 1/0376/09, 2009–2012.