#### Abstract

Single-carrier-frequency division multiple access (SC-FDMA) has recently become the preferred uplink transmission scheme in long-term evolution (LTE) systems. Similar to orthogonal frequency division multiple access (OFDMA), SC-FDMA is highly sensitive to frequency offsets caused by oscillator inaccuracies and Doppler spread, which lead to intercarrier interference (ICI). This work proposes a multistage decision-feedback structure to mitigate the ICI effect and enhance system performance in time-variant environments. Based on the block-type pilot arrangement of the LTE uplink type 1 frame structure, the time-domain least squares (TDLS) method and polynomial-based curve-fitting algorithm are employed for channel estimation. Instead of using a conventional equalizer, this work uses a group frequency-domain equalizer (GFDE) to reduce computational complexity. Furthermore, this work utilizes a dual iterative structure of group parallel interference cancellation (GPIC) and frequency-domain group parallel interference cancellation (FPIC) to mitigate the ICI effect. Finally, to optimize system performance, this work applies a novel error-correction scheme. Simulation results demonstrate the bit error rate (BER) performance is markedly superior to that of the conventional full-size receiver based on minimum mean square error (MMSE). This structure performs well and is a flexible choice in mobile environments using the SC-FDMA scheme.

#### 1. Introduction

In recent years, as cellular communication services have expanded from traditional voice traffic to data transmission, the data rate and bandwidth efficiency of the wireless-network physical layer have improved. The international mobile telecommunications-advanced (IMT-Advanced) standard, which is promoted by the International Telecommunication Union (ITU), defines 4th-generation (4G) mobile communication. The highest data rates for the 4G uplink and downlink are 50 Mbps and 100 Mbps, and the highest bandwidth efficiencies of the uplink and downlink are 6.75 bits/s/Hz and 15 bits/s/Hz [1], respectively. Notably, orthogonal frequency division multiple access (OFDMA), as used in IEEE 802.16 Worldwide Interoperability for Microwave Access (WiMAX), has become the physical layer standard [2] on the uplink and downlink to meet bandwidth efficiency requirements. However, OFDMA has the disadvantage of a high peak-to-average-power ratio (PAPR), which reduces the power efficiency [3] of the power amplifier (PA) and makes OFDMA unsuitable for mobile users with limited power. Therefore, the future 4G standard of the 3rd-generation partnership project long-term evolution (3GPP-LTE) system will employ single-carrier frequency division multiple access (SC-FDMA) [4–6], a transmission scheme with relatively low PAPR, as the uplink physical layer standard [7–9] to increase power efficiency.

The SC-FDMA system can be considered a discrete Fourier transform (DFT)-precoded OFDM system and is therefore similar to the traditional OFDM system. As in OFDM, the performance of the SC-FDMA system is degraded by the Doppler effect in mobile time-variant environments. This effect destroys the orthogonality of subcarriers, leading to intercarrier interference (ICI). Moreover, the pilot arrangement of the LTE uplink is block type, differing from the comb type of the conventional OFDM system, so the channel estimation methods must be modified accordingly.

Some channel estimation methods with the block-type pilot arrangement have been developed. In [10], Karakaya proposed a Kalman-filter-based approach to mitigate ICI under high Doppler spread scenarios by tracking variation in channel taps and utilized an interpolation algorithm based on polynomial fitting for channel estimation. However, the ICI effect in highly mobile environments degrades system performance. Notably, [10] did not focus on ICI cancellation techniques in the frequency domain. In [11], Wang proposed a single-user and multiuser channel estimation scheme. The channel taps of each user can be estimated by the orthogonal features of the designed constant amplitude zero autocorrelation (CAZAC) sequence. However, [11] assumed a non-time-varying channel scenario; thus, this method cannot be applied to high-speed mobile environments. In [12], Zheng proposed an interpolation-based channel estimator of the frequency domain and proved that interpolation of the least squares (LS) method and that of the minimum mean square error (MMSE) method are equivalent. In [13], Zhang proposed a frequency-domain decision-feedback equalizer and applied the Lagrange multiplier method to replace general inverse-matrix operations. Compared with the traditional decision-feedback equalizer (DFE), [13] is less complex.

This work focuses on channel estimation and ICI cancellation techniques for receiver design. First, time-domain least squares (TDLS) channel estimation is applied for the block-type pilot arrangement, and a curve-fitting estimator interpolates the missing channel information between pilot symbols. Next, a low-complexity channel equalizer, the group frequency-domain equalizer (GFDE), is applied. Furthermore, to account for the Doppler effect of mobile time-variant environments, this work applies dual iterative interference cancellation, consisting of group parallel interference cancellation (GPIC) and frequency-domain group parallel interference cancellation (FPIC), to reduce the ICI effect. Finally, this work proposes a novel error-correction scheme to optimize system performance. Simulations demonstrate that the proposed receiver design performs well and is a flexible choice in mobile environments using the SC-FDMA scheme.

The remainder of this paper is organized as follows. Section 2 introduces the SC-FDMA model and the time-variant channel model. Section 3 divides the receiver design into five parts. Section 3.1 presents the TDLS channel estimation scheme and the polynomial curve-fitting-based channel interpolation method. Section 3.2 describes the low-complexity GFDE. Sections 3.3 and 3.4 cover the dual ICI cancellation by GPIC and FPIC. Section 3.5 presents the error correction, which consists of time-domain least squares and group maximum likelihood. Performance simulation results for the proposed system are given in Section 4. Section 5 offers conclusions.

#### 2. System Model

Figure 1 shows the structure of the SC-FDMA transceiver. Constellation symbols of the th user can be grouped into a data block .

Then, is transformed to the frequency domain using the -point fast Fourier transform (FFT): where and is a normalized FFT matrix, that is, . Next, the th element of is mapped onto the th subcarrier as follows: where denotes the resource allocation sets of the localized and distributed subcarrier mapping of the th user shown in Figure 2, and is a set of indices whose elements are . Resource allocation data block is then acquired.

**Figure 2:** Localized and distributed subcarrier mapping, panels (a) and (b).

Following subcarrier mapping, is transformed into the time domain using the -point inverse fast Fourier transform (IFFT): where and is a normalized IFFT matrix, that is, . In this work, we assume only one user exists, .
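The transmitter chain described above (DFT precoding, subcarrier mapping, IFFT) can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the authors' implementation: the function names, the `offset` parameter, the toy sizes (4 data symbols, 16 subcarriers), and the exact form of the localized and distributed index sets are assumptions made for the example.

```python
import cmath
import math


def dft(x, inverse=False):
    """Normalized (unitary) DFT or IDFT of a list of complex numbers."""
    n = len(x)
    sign = 1 if inverse else -1
    scale = 1 / math.sqrt(n)
    return [
        scale * sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def subcarrier_indices(m, n, mode, offset=0):
    """Assumed allocation sets: contiguous (localized) or evenly spaced (distributed)."""
    if mode == "localized":
        return [offset + i for i in range(m)]
    if mode == "distributed":
        spacing = n // m
        return [offset + i * spacing for i in range(m)]
    raise ValueError("mode must be 'localized' or 'distributed'")


def scfdma_modulate(symbols, n, mode="localized", offset=0):
    """M-point DFT precoding, subcarrier mapping, then N-point IDFT."""
    m = len(symbols)
    freq = dft(symbols)                        # DFT precoding
    grid = [0j] * n                            # unused subcarriers stay zero
    for value, k in zip(freq, subcarrier_indices(m, n, mode, offset)):
        grid[k] = value                        # subcarrier mapping
    return dft(grid, inverse=True)             # back to the time domain


# Toy example: four unit-energy QPSK symbols on a 16-subcarrier grid.
data = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
tx_localized = scfdma_modulate(data, n=16, mode="localized")
tx_distributed = scfdma_modulate(data, n=16, mode="distributed")
```

Because both transforms are normalized, the signal energy is preserved through the chain. With the distributed mapping at offset 0, the time-domain output is a scaled repetition of the input block, which illustrates the single-carrier character of the waveform.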

Then, the received signal over mobile fading channel can be expressed as where is the transmitted signal with cyclic prefix (CP) insertion, is CP length, is the time-variant channel response of the th path at discrete sampling time , is maximum delay spread, and is additive white Gaussian noise (AWGN) with . After removing the CP at the receiver, the received signal can be written as where is the noise vector, is the transmitted signal, and is an circulant matrix as follows:

#### 3. The Proposed Multistage Decision Feedback Receiver

To cancel the ICI effect caused by mobile environments, the proposed multistage decision-feedback receiver design is analyzed. Figure 3 shows the block diagram of the multistage decision-feedback receiver.

First, channel response is estimated by the TDLS [14]. The estimated channel response is utilized to facilitate the multistage decision-feedback design, which has low complexity, ICI cancellation, and improved data correction. The details of these procedures are described and analyzed as follows.

##### 3.1. The Time-Domain Least Squares Algorithm

This section describes the channel estimation scheme for estimating the channel response between pilot symbols of two consecutive slots within a frame. In the LTE uplink type 1 frame structure with extended CP shown in Figure 4, a reference signal is allocated in the fourth symbol in one slot with a total of six symbols. The reference signal is the Zadoff-Chu sequence with low autocorrelation sidelobes [15]. The pilot arrangement in the LTE uplink is a kind of block-type arrangement [9], meaning that pilot signals are inserted into all subcarriers of the frequency domain.

In the following, TDLS estimation is analyzed. Figure 5 shows the block diagram of TDLS.

For a time-invariant channel, the received pilot signal can be expressed as where is an circulant matrix formed by the pilot symbol and is the channel response vector with paths. To approximate the time-variant channel, the received pilot signal can be rewritten as where is the average channel response vector with paths. Notably, denotes the average channel response within the consecutive sampling period of the th path. To increase the effectiveness of the channel response, and can be expressed as where is the th element of the received pilot vector and denotes the group number. represents the submatrix from the th row to the th row and the 1st column to the th column of matrix . Next, the average channel response vector of group can be obtained as In rearranging the average channel response vector above, we define as the channel response matrix within the pilot symbol.
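The time-invariant case of the least squares estimate can be illustrated with a short sketch. It assumes a time-domain Zadoff-Chu pilot of odd length, no noise, and three hypothetical channel taps; all function names and values are invented for the example. It does not cover the grouped, time-variant form described above, where each row group needs a general least squares solve.

```python
import cmath
import math


def zadoff_chu(length, root=1):
    """Zadoff-Chu sequence of odd length; root must be coprime with length."""
    return [cmath.exp(-1j * math.pi * root * n * (n + 1) / length) for n in range(length)]


def circular_convolve(pilot, taps):
    """Noise-free received pilot: y[n] = sum_l h[l] * p[(n - l) mod N]."""
    n_len = len(pilot)
    return [
        sum(h * pilot[(n - l) % n_len] for l, h in enumerate(taps))
        for n in range(n_len)
    ]


def tdls_estimate(received, pilot, num_paths):
    """LS estimate of the channel taps for a time-invariant channel.

    The columns of the circulant pilot matrix P are cyclic shifts of the pilot.
    For a Zadoff-Chu pilot these shifts are orthogonal, so P^H P = N * I and the
    LS solution (P^H P)^-1 P^H y reduces to a scaled correlation.
    """
    n_len = len(pilot)
    return [
        sum(received[n] * pilot[(n - l) % n_len].conjugate() for n in range(n_len)) / n_len
        for l in range(num_paths)
    ]


pilot = zadoff_chu(13)
true_taps = [0.8 + 0.1j, -0.3 + 0.4j, 0.1 - 0.2j]    # hypothetical channel
received = circular_convolve(pilot, true_taps)
estimated = tdls_estimate(received, pilot, num_paths=len(true_taps))
```

In the noise-free case the estimate matches the true taps up to rounding error. With a pilot that lacks ideal cyclic autocorrelation, the full normal equations would have to be solved instead of the correlation shortcut.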

After obtaining the channel response within the pilot symbol, this work utilizes a polynomial curve-fitting scheme based on the linear model estimator [10] to interpolate the missing channel information between pilot symbols in Figure 4. The linear model [16] of fading channel profile can be expressed as where denotes the th path response of under two consecutive pilot symbols of the time slots with size ; is the channel response of the th path derived from the TDLS; is the th time instant of the th time slot, where . The relation between time slot and is shown in Figure 6; is a matrix consisting of two Vandermonde matrices, where and can be the composited elements of time instants ; polynomial orders represent time slots. is a vector of polynomial coefficients of the th path. Next, the polynomial coefficients can be obtained by the least squares solution: Therefore, the channel response of polynomial-based curve-fitting between two consecutive pilot symbols can be approximated as Thus, the MSE performance of curve-fitting channel estimation can be defined as where is the ideal channel response of the th time instant of the th path. After estimating the channel impulse response, this work reconstructs the frequency domain channel response matrix [17] as

Moreover, the MSE performance of curve fitting depends on the polynomial order and time interval . In particular, the size of variable directly affects the performance of curve fitting. As two extreme examples, Figures 6 and 7 show the real-part mobile channel responses with the size of and , respectively. In the case of and , estimated channel responses by TDLS are obtained, and the variation between the TDLS channel and the real channel is larger, but the bias between the curve-fitting channel and the real channel is smaller. Conversely, in the case of and , estimated channel responses by TDLS are obtained, and the variation between the TDLS channel and the real channel is smaller, but the bias between the curve-fitting channel and the real channel is larger. As these examples show, the tradeoff between the size of and curve-fitting accuracy should be considered. This tradeoff is examined further in the simulation results.
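The least squares polynomial fit can be sketched as follows for a single channel path. This is an illustration under stated assumptions, not the authors' code: the time axis is normalized to the interval from 0 to 1, the sample instants and coefficients are hypothetical, and one Vandermonde matrix over all sample instants stands in for the two-slot construction described in the text.

```python
def solve(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0j] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def polyfit_ls(times, samples, order):
    """Least squares polynomial coefficients c_0..c_order for complex samples."""
    vander = [[t ** q for q in range(order + 1)] for t in times]    # Vandermonde matrix
    gram = [
        [sum(row[i] * row[j] for row in vander) for j in range(order + 1)]
        for i in range(order + 1)
    ]
    proj = [sum(row[i] * s for row, s in zip(vander, samples)) for i in range(order + 1)]
    return solve(gram, proj)                                        # normal equations


def polyval(coeffs, t):
    """Evaluate the fitted polynomial at time t."""
    return sum(c * t ** q for q, c in enumerate(coeffs))


# Hypothetical tap estimates at time instants inside two pilot symbols.
pilot_times = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]
true_coeffs = [1 + 0.5j, -0.4 + 0.2j, 0.3 - 0.1j]
tap_estimates = [polyval(true_coeffs, t) for t in pilot_times]

fitted = polyfit_ls(pilot_times, tap_estimates, order=2)
tap_between_pilots = polyval(fitted, 0.5)    # interpolated value at a data-symbol instant
```

The polynomial order and the number of sample instants per pilot symbol play the roles of the two parameters whose tradeoff is discussed above: more instants give more points to fit, while each point averages over fewer samples.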

##### 3.2. Group Frequency Domain Equalizer

This section describes the GFDE of the first stage. In a time-variant environment, is a sparse frequency-domain channel matrix whose energy is concentrated on the diagonal elements; the gain of an element decreases as its distance from the diagonal increases. Therefore, this work applies the GFDE to reduce the ICI effect and computational complexity. Figure 8 shows the block diagram of the GFDE.

First, the frequency-domain signal vector with size is derived from the subcarrier demapping of signal vector . Then, is divided into groups, as shown in Figure 8. Each group with size can be expressed as where , , , is the relative AWGN with respect to the group , and denotes the th block matrix of . Then, data detection can be obtained roughly by the group minimum mean square error (GMMSE) where where is a identity matrix and is the noise variance of . In the first stage, data detection of the GFDE suffers from the ICI of the marginal elements of each group block of . In the second stage that follows, the ICI effect of group is mitigated.
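The group-wise equalization can be illustrated with a small sketch that equalizes each diagonal block of a banded channel matrix separately. The regularized normal-equation form of the MMSE weight, the toy 4-by-4 channel, and all names are assumptions for illustration. The sketch also omits the DFT despreading that an SC-FDMA receiver applies after equalization.

```python
def solve(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0j] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def group_mmse(channel, received, group_size, noise_var):
    """Equalize each diagonal block of the channel matrix independently."""
    n = len(received)
    detected = []
    for start in range(0, n, group_size):
        idx = range(start, min(start + group_size, n))
        block = [[channel[r][c] for c in idx] for r in idx]
        y_g = [received[r] for r in idx]
        size = len(y_g)
        # Regularized normal equations: (H^H H + noise_var * I) x = H^H y
        gram = [
            [
                sum(block[k][i].conjugate() * block[k][j] for k in range(size))
                + (noise_var if i == j else 0.0)
                for j in range(size)
            ]
            for i in range(size)
        ]
        proj = [sum(block[k][i].conjugate() * y_g[k] for k in range(size)) for i in range(size)]
        detected.extend(solve(gram, proj))
    return detected


# Toy banded channel: unit diagonal with weak coupling between neighboring subcarriers.
n = 4
channel = [[1.0 if r == c else 0.2 if abs(r - c) == 1 else 0.0 for c in range(n)] for r in range(n)]
symbols = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
received = [sum(channel[r][c] * symbols[c] for c in range(n)) for r in range(n)]

full_size = group_mmse(channel, received, group_size=4, noise_var=0.0)
grouped = group_mmse(channel, received, group_size=2, noise_var=0.0)
errors = [abs(g - s) for g, s in zip(grouped, symbols)]
```

In this toy case the full-size solve recovers the symbols exactly, while the two-by-two groups leave a residual error that is largest on the elements bordering a neighboring group. That residual is the marginal ICI that the later stages target.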

##### 3.3. The Group Parallel Interference Cancellation of Frequency-Domain Soft-Decision Feedback

In the previous section, GFDE is applied to reduce computational complexity and equalize the channel effect. However, the performance degradation of GFDE is caused by the loss of some channel information: (20)

For example in (20), the GFDE equalizes the channel effect of the dotted circle. Due to the characteristic of the sparse frequency-domain channel matrix , the marginal ICI close to the diagonal of the solid circle still has larger energies and affects the system performance. In order to mitigate the marginal ICI effect, the GPIC of frequency domain soft decision feedback is applied in this section. Figure 9 shows the block diagram of GPIC.

The GPIC is an iterative decision-feedback structure, as shown in Figure 9. In the first loop (), let of size be the initial decision; then, signal can be derived by marginal ICI cancellation where and are the th and th elements of , respectively. Next, is derived by the GFDE. When the loop index is greater than or equal to 2 (), let be the decision data and repeat (21) and (22) to refine signal . After iterative processing, the decision data can be derived by

##### 3.4. The Group Parallel Interference Cancellation of Time-Domain Hard-Decision Feedback

The parallel interference cancellation (PIC) [18] can be applied to improve data accuracy and make the initial decision more reliable. In this section, which covers the third stage, this work utilizes the FPIC of time-domain hard-decision feedback to mitigate the ICI effect further. Figure 10 shows the block diagram of FPIC.

In the first loop () of the third stage, the ICI term of the th subcarrier signal can be reconstructed by

Next, the ICI term is subtracted from the subcarrier de-mapping signal , and signal can then be acquired where is the th column vector of , denotes the th composite channel vector, and is AWGN with respect to . The output of the MMSE equalizer is given by The decision output of can be derived by

Next, this work applies the iterative process when the loop index is greater than or equal to 2 (). Let replace and reapply (24)–(27) to improve decision reliability.
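One pass of parallel interference cancellation with hard decisions can be sketched as below. This is a simplified illustration under stated assumptions: the symbols are detected directly on the subcarriers, so the composite channel vector of the text (which would include the DFT precoding) is replaced by a plain channel column, and the toy channel, noise-free setting, and function names are invented for the example.

```python
import math


def qpsk_decide(value):
    """Hard decision to the nearest unit-energy QPSK point."""
    return complex(math.copysign(1, value.real), math.copysign(1, value.imag)) / math.sqrt(2)


def pic_iteration(channel, received, decisions, noise_var):
    """One parallel pass: cancel reconstructed ICI, then detect each subcarrier."""
    n = len(received)
    soft = []
    for k in range(n):
        # Remove the reconstructed contribution of every other subcarrier.
        residual = [
            received[r] - sum(channel[r][j] * decisions[j] for j in range(n) if j != k)
            for r in range(n)
        ]
        column = [channel[r][k] for r in range(n)]
        energy = sum(abs(c) ** 2 for c in column)
        matched = sum(c.conjugate() * v for c, v in zip(column, residual))
        soft.append(matched / (energy + noise_var))      # single-column MMSE weight
    return soft, [qpsk_decide(s) for s in soft]


# Toy banded channel with moderate coupling between neighboring subcarriers.
n = 4
channel = [[1.0 if r == c else 0.3 if abs(r - c) == 1 else 0.0 for c in range(n)] for r in range(n)]
symbols = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
received = [sum(channel[r][c] * symbols[c] for c in range(n)) for r in range(n)]

initial_soft = [received[k] / channel[k][k] for k in range(n)]   # one-tap first guess
initial_decisions = [qpsk_decide(s) for s in initial_soft]
refined_soft, refined_decisions = pic_iteration(channel, received, initial_decisions, noise_var=0.0)
```

When the initial decisions are correct, the cancellation removes the ICI entirely and the refined soft values coincide with the transmitted symbols. Wrong initial decisions instead inject extra interference, which is why the reliability of the earlier stages matters.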

##### 3.5. Error Correction Consisting of Time-Domain Least Squares and Group Maximum Likelihood

In this section, this work proposes a novel error-correction scheme on the fourth stage. Figure 11 shows the block diagram of error correction.

After obtaining decision data , let undergo the SC-FDMA modulation of (1)–(3) to construct the transmitted signal . Equation (10) can be rewritten as where is the circulant matrix formed by the signal . Then, the reconstructed channel can be expressed in the following form from (11):

In rearranging the reconstructed channel response vector above, we define as the channel response matrix of each path within the data blocks. We assume FFT size is 128 and the occupied subcarrier number of the desired user is 72. Figure 12 shows the reconstructed real-part channel impulse response.

A burst error in the reconstructed channel is readily observed when the decision data contain errors. To find the subcarrier set of the decision error according to the burst error of the channel, this work builds a lookup table using a histogram-based approach. The subcarrier set of decision error is derived by dividing the occupied subcarriers of the desired user into 12 blocks of equal size (six subcarriers each for the 72 occupied subcarriers assumed above). Table 1 shows the lookup table of the burst error location between the channel and decision data.

Additionally, this error-correction scheme is only suitable for a high signal-to-noise ratio (SNR), that is, dB, because the performance of the TDLS is sensitive to the SNR. The following presents the details of the proposed error-correction scheme.

(1) Let conduct SC-FDMA modulation and use the received signal to reconstruct the channel impulse response by the TDLS, where denotes the th path of the reconstructed channel between two consecutive pilot symbols, and .

(2) Compare with of curve fitting, then find the range of the error burst where , denotes the location of the error burst, and is the percentage of channel difference. If no channel difference exists under the condition , skip error correction. Otherwise, identify the position of the maximum error burst.

(3) Find the subcarrier set of decision error with respect to by Table 1.

(4) Apply a localized maximum-likelihood (ML) search to correct the decision data of subcarrier sets . The technical procedures of the localized ML are as follows: where is a set of all possible transmit symbol vectors that consists of elements, is the subcarrier size of , are constellation points, and denotes the reconstructed channel response, which is constructed by the TDLS with respect to candidates .
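The localized search in step (4) can be illustrated by enumerating every candidate symbol vector on the suspect subcarriers while keeping the remaining decisions fixed. Note the substitution: the metric in this sketch is the plain Euclidean distance between the received vector and the re-modulated candidate, not the TDLS-reconstructed channel comparison described above. The channel, the suspect set, and all names are hypothetical.

```python
import itertools
import math

QPSK = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]


def residual_energy(channel, received, symbols):
    """Squared distance between the received vector and the candidate's prediction."""
    n = len(received)
    return sum(
        abs(received[r] - sum(channel[r][c] * symbols[c] for c in range(n))) ** 2
        for r in range(n)
    )


def localized_ml(channel, received, decisions, suspect_set):
    """Exhaustive search over the suspect subcarriers only; other decisions are kept."""
    best = list(decisions)
    best_cost = residual_energy(channel, received, best)
    for candidate in itertools.product(QPSK, repeat=len(suspect_set)):
        trial = list(decisions)
        for position, symbol in zip(suspect_set, candidate):
            trial[position] = symbol
        cost = residual_energy(channel, received, trial)
        if cost < best_cost:
            best, best_cost = trial, cost
    return best


# Toy setup: two wrong decisions on the subcarriers flagged by the lookup step.
n = 4
channel = [[1.0 if r == c else 0.3 if abs(r - c) == 1 else 0.0 for c in range(n)] for r in range(n)]
symbols = list(QPSK)
received = [sum(channel[r][c] * symbols[c] for c in range(n)) for r in range(n)]

wrong_decisions = [symbols[0], QPSK[0], QPSK[3], symbols[3]]
corrected = localized_ml(channel, received, wrong_decisions, suspect_set=[1, 2])
```

The search cost grows as the constellation size raised to the number of suspect subcarriers (16 candidates here), which is why restricting the search to one block from the lookup table keeps it tractable.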

After applying the above error-correction scheme, let replace and redo steps 1–4. The iterative structure can be used to optimize the performance of data accuracy.

#### 4. Simulation Results

Based on the previous analyses, simulations are performed to assess the performance of the proposed multistage decision-feedback equalization schemes for LTE uplink systems. Table 2 lists the parameter settings of the simulation environment.

First, this work discusses the MSE performance of the time-varying channel estimation of the TDLS. The MSE performance depends on the polynomial order and time interval . Figure 13 shows the MSE performance of TDLS with different size . The MSE performance clearly varies with the polynomial order and range. Based on a rule of thumb drawn from the simulation result in Figure 13, the adaptive order with different range can be expressed as

Based on the adaptive order in (33), Figure 14 shows the MSE performance of TDLS with different size . When time interval is , MSE performance improves as the time interval increases. However, when the time interval is too large (), the group number is not sufficient to support statistical curve fitting; thus, MSE performance degrades. According to the simulation result in Figure 14, the optimum selection of the time interval size is considered.

The second simulation examines the effect of the GFDE block size. As Figure 15 shows, the performance of the GFDE improves as the block size increases. At of GFDE, gain losses are about 2 dB and 1 dB for and , respectively, as compared to the full-size MMSE equalizer. To reduce computational complexity, the suitable block size of GFDE is .

Figure 16 shows the BER performance of cascades of multistage equalizers. BER performance clearly improves as the number of stages increases. The performance of the proposed GFDE () and GPIC can approach that of the full-size MMSE equalizer below dB. To enhance performance, the proposed FPIC and error-correction schemes are employed. The performance of the overall receiver design () is improved by about 1-2 dB, as compared with that of the full-size MMSE equalizer in Figure 16.

Furthermore, the computational complexity of the proposed receiver is compared with that of the full-size MMSE receiver as follows. In the first stage, the GFDE scheme of (18)-(19), the number of complex multiplications is approximately . In the second stage, the GPIC scheme of (21)–(23), the number of complex multiplications is approximately . In the third stage, the FPIC scheme of (24)–(27), the decision data take fixed values (i.e., QPSK symbol: ). Therefore, the complex multiplications needed for the ICI reconstruction can be reduced, and the number of complex multiplications of FPIC is approximately . The fourth stage, the error-correction scheme, includes the SC-FDMA modulation of (3), the channel reconstruction of (29), and the ML search of (32); its number of complex multiplications is approximately . In comparison, the number of complex multiplications of the full-size MMSE is approximately .

For example, consider the case (GFDE, GPIC, FPIC, and error correction) in Figure 16 with , , , , and : the number of complex multiplications of the proposed receiver is about 242,360. For the full-size MMSE receiver with in Figure 16, the number of complex multiplications is about 2,097,152. Clearly, the proposed multistage receiver has lower computational complexity than the full-size MMSE receiver.
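The two multiplication counts quoted above can be compared with a short calculation. The FFT size of 128 is taken from the assumption stated in Section 3.5. Reading the full-size figure as the cube of the FFT size is an inference from the numbers, not a formula given in the text.

```python
fft_size = 128                   # FFT size assumed in Section 3.5
full_size_mmse = 2_097_152       # complex multiplications quoted for the full-size MMSE
proposed_receiver = 242_360      # complex multiplications quoted for the proposed receiver

# The full-size figure coincides with the cube of the FFT size (an observation only).
is_cubic = full_size_mmse == fft_size ** 3
reduction_factor = full_size_mmse / proposed_receiver

print(f"matches fft_size cubed: {is_cubic}")
print(f"reduction factor: {reduction_factor:.2f}")
```

Under these figures the proposed receiver needs roughly one ninth of the complex multiplications of the full-size MMSE receiver.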

#### 5. Conclusions

This work proposes a multistage decision-feedback receiver design for LTE uplink systems. First, TDLS channel estimation is applied for the block-type pilot arrangement; then the curve-fitting estimator is applied to interpolate missing channel information between pilot symbols. The low-complexity channel equalizer GFDE is employed. Furthermore, to account for the Doppler effect of mobile time-variant environments, this work employs dual iterative interference cancellation by GPIC and FPIC to mitigate the ICI effect. Finally, the novel error-correction scheme combining TDLS and group maximum likelihood is applied to optimize the system performance. Simulation results demonstrate that the proposed channel estimation scheme performs well when a suitable time interval and polynomial order are chosen. Owing to the properties of each stage, the proposed receiver design markedly improves BER performance. This multistage design is more flexible than the traditional structure and is feasible for mobile time-variant environments.

#### Acknowledgments

This work is sponsored by the National Science Council, Taiwan, under Contract NSC 101-2220-E-155-006. The authors would like to thank the editor and anonymous reviewers for their helpful comments and suggestions in improving the quality of this paper.