Abstract

With 5G communication systems on the horizon, efficient interference management in heterogeneous multicell networks is more vital than ever. This paper investigates the linear precoder design for downlink multicell multiple-input multiple-output orthogonal frequency-division multiplexing (MIMO-OFDM) systems, where base stations (BSs) coordinate to reduce the interference across space and frequency. In order to minimize the overall feedback overhead in next-generation systems, we consider precoding schemes that require statistical channel state information (CSI) only. We apply the random matrix theory to approximate the ergodic weighted sum rate of the system with a closed form expression. After formulating the approximation for general channels, we reduce the results to a more compact form using the Kronecker channel model for which several multicarrier concepts such as frequency selectivity, channel tap correlations, and intercarrier interference (ICI) are rigorously represented. We find the local optimal solution for the maximization of the approximate rate using a gradient method that requires only the covariance structure of the MIMO-OFDM channels. Within this covariance structure are the channel tap correlations and ICI information, both of which are taken into consideration in the precoder design. Simulation results show that the rate approximation is very accurate even for very small MIMO-OFDM systems and the proposed method converges rapidly to a near-optimal solution that competes with networked MIMO and precoders based on instantaneous full CSI.

1. Introduction

Multicell multiple-input multiple-output orthogonal frequency-division multiplexing (MIMO-OFDM) is a promising technology for next-generation telecommunication networks. Both MIMO and OFDM are known to equip wireless networks with excellent capabilities; combined, they achieve the high capacities expected of 5G systems. While MIMO-OFDM systems perform very well, the characteristics of MIMO and OFDM raise several challenges in the system design [1–6]. Competing technologies for 5G networks include filtered-OFDM (F-OFDM) and universal filtered multicarrier (UFMC), which are receiving attention in the literature [7, 8].

MIMO wireless cellular systems can achieve excellent data rates if proper coordination among base stations (BSs) is employed to suppress the intercell and intracell interference. The highest system capacity is achieved by networked MIMO, where both the channel state information (CSI) and the data streams of all users in different cells are made available at all BSs [9, 10]. A more practical and less complex form of coordination is joint linear precoding with only the CSI of the users shared among BSs [11–13]. In this case, each BS obtains the CSI of its intracell users and shares it with its adjacent BSs. Such a precoding approach is very efficient for a small number of BS antennas and can alleviate the interference to a significant degree. As the number of BS antennas increases, the signaling overhead for passing the CSI among BSs becomes taxing [14]. This problem is aggravated in frequency-selective channels employing OFDM, where the CSI is different on each subcarrier. To overcome this difficulty, the precoders can be designed based on statistical CSI at the transmitter rather than instantaneous CSI [3, 15–20]. Since the statistical CSI changes much more slowly than the instantaneous CSI, the BSs need to be updated less frequently, which results in much less signaling overhead. While there is a performance loss due to the lack of exact CSI, precoder design based on statistical CSI is an efficient approach for large-scale MIMO systems because the performance loss is negligible [18, 20].

In order to design MIMO precoders with statistical CSI, the ergodic rate, which is a function of the covariance matrices of the MIMO channels, must be maximized over the precoders [16]. To facilitate the maximization, the ergodic rate can be approximated by a compact deterministic expression when the MIMO system is large-scale. The optimization can then be carried out on this deterministic approximation, which is a function of the statistics of the channels [20–23]; thus, the solution is a set of precoders that are functions of the statistical CSI. Deterministic approximation to the Shannon rate can be achieved based on techniques such as Bai and Silverstein [22, 24, 25], Gaussian method [26, 27], Lindeberg principle [20, 26, 28, 29], or a combination of them [22, Theorem 6.9], [26]. All of these methods approximate the Shannon transform of large dimensional random matrices, but they are based on slightly different assumptions.

The deterministic equivalents of the rates for large-scale MIMO systems that are derived in recent works [20–23, 25, 26, 29] are functions of the channel statistics, so they can be used as objective functions for precoder design with statistical CSI. This has been done in [21–23] where closed form optimal precoders are derived for simple single-user scenarios. In [20], an uplink multiuser multicell MIMO system is considered and suboptimal linear precoders are found based on the Lindeberg approximation to the ergodic rate. We shall extend this methodology to the problem of downlink multiuser multicell MIMO-OFDM linear precoder design with statistical CSI. Considering the downlink leads to a somewhat different power constraint compared to the uplink, and the OFDM assumption gives rise to concepts such as frequency selectivity [17, 30, 31], channel tap correlations [17, 31–34], and intercarrier interference (ICI) [1, 35–38], all of which must be taken into account in the precoder design. Our derivation of statistical precoding for the downlink multicell MIMO-OFDM system is a unification of previous works [20, 29], and while several details are different from these two works, this derivation is not the main contribution. The main point of this work is applying the already established results of random matrix theory to the MIMO-OFDM scenario in order to study the impact of frequency selectivity, tap correlations, and ICI on the statistical precoder design and system performance.

We shall study the downlink multicell MIMO-OFDM linear precoder design with statistical CSI using the deterministic approximation of the Shannon rate, a problem studied before only under instantaneous CSI [12]. In particular, with instantaneous CSI, the channel statistics are not used in precoder design and one may try to optimize the instantaneous rate, as is done in [12] where it has been shown that such an optimization problem is nonconvex and a suboptimal solution has been proposed using the Karush-Kuhn-Tucker (KKT) conditions. In this work, however, we assume that there is only statistical CSI available at the BSs and we design linear precoders that maximize the ergodic weighted sum rate. As opposed to the instantaneous case, for analysis of the ergodic rate, we need a statistical model for the MIMO channel which is usually determined by a covariance structure [23, 39]. We first consider the general correlated channel model with arbitrary probability distribution described in [23] that includes several statistical models such as the independent and identically distributed (i.i.d.) Rayleigh fading channel and the Kronecker channel [39]. Then we reduce the results to a simpler form for the Kronecker channel model.

To maximize the ergodic rate, inspired by [20, 23, 29], we find a deterministic approximation to the Shannon rate of the MIMO-OFDM system using random matrix theory [21, 40]. The methodology will be applied to the MIMO-OFDM system with tap correlations and intercarrier interference. Then, assuming that the BSs have the statistical CSI of all users in the form of covariance matrices, we form a weighted sum rate maximization problem and propose a suboptimal solution using the KKT conditions along with the gradient descent method. Our simulations show that the approximation is quite accurate even for small MIMO-OFDM systems and that the proposed algorithm converges rapidly to a maximum that substantially improves on isotropic precoding. The results only slightly underperform those obtained with perfect instantaneous CSI [12]. We then extend to the case where the frequency-selective channel suffers from correlation among channel taps, and we show that the precoders become frequency dependent under tap correlations. We study the effects of tap correlation on the precoder design and system performance. Next, we allow ICI among OFDM subcarriers, which is caused by carrier frequency offset due to synchronization errors and Doppler shifts [1]. The ICI introduces a new source of interference in addition to the intercell and intracell interference. We find the deterministic approximation to the rate under ICI and then study its impact on the precoder design and system performance. Our simulations show that while spatial correlations, tap correlations, and ICI decrease the system sum rate, our method alleviates this performance loss by incorporating the correlation information and ICI intensity information into the precoder design. It must be noted that our statistical CSI based method is also applicable to networked MIMO with full cooperation, where the BSs share the channel statistics and the transmit data.

The remainder of the paper is organized as follows. In Section 2, we provide the system model and formulate the optimization problem. In Section 3, we obtain the deterministic approximation to the ergodic rate function. In Section 4, we give a gradient-descent-based algorithm to obtain the suboptimal precoders. In Section 5, we discuss the channel model and how the formulas simplify for the separable channel models. In Section 6, we extend to the case where there is ICI. In Section 7, we present simulation results, and finally, conclusions are given in Section 8.

We denote matrices, vectors, and scalars by upper-case bold letters as in , lower-case bold letters as in , and nonbold letters as in and , respectively. Moreover, , , and denote conjugate, transpose, and conjugate-transpose, respectively. The element on the th row and th column of a matrix is denoted by , and the th element of a vector is denoted by . Vectorization, trace, and expected value operators are denoted by , , and , respectively. The all-one and all-zero vectors of size and identity matrix of size are denoted by , , and , respectively. The operators and represent real part and imaginary part, respectively.

2. System Model and Problem Formulation

2.1. System Model

We consider a downlink multicell MIMO-OFDM wireless network shown in Figure 1, with BSs each serving users over subcarriers. All BSs have transmitting antennas while each user has receiving antennas with . The th user under BS is denoted by . The data vector to be precoded and transmitted by BS to its th user, that is, the data vector for user , over subcarrier is denoted by , and the corresponding linear precoding matrix is denoted by . The downlink channel matrix from BS to user over subcarrier is denoted by . The thermal noise at user on subcarrier is denoted by . Based on the above notations, the received signal at user on subcarrier is given by (1), where the second and third terms represent the intracell and intercell interference, respectively. We assume that and are i.i.d. Gaussian with and . We also assume that MIMO channels are independent across users and BSs.

Each user is assumed to know its own instantaneous CSI for detecting the data [16, 20, 22, 23, 29], which leads to the instantaneous data rate for user on subcarrier given by (2), where denotes the total interference plus noise on subcarrier given by (3), where the first, second, and third terms represent the noise power, intracell interference power, and intercell interference power, respectively.
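As a minimal sketch of how such an instantaneous rate can be evaluated numerically, the snippet below computes a log-determinant rate as the difference of two log-determinants: one for signal plus interference plus noise and one for interference plus noise alone. The function name, the argument layout, and the use of natural logarithms (nats) are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def instantaneous_rate(H_desired, G_desired, interferers, noise_power=1.0):
    """Rate in nats: log det(R + S) - log det(R).

    S is the covariance of the desired signal, H G G^H H^H.
    R is noise plus the covariance of every interfering (channel, precoder)
    pair, covering both intracell and intercell interferers.
    All names here are hypothetical and chosen for illustration only.
    """
    n_r = H_desired.shape[0]
    R = noise_power * np.eye(n_r, dtype=complex)
    for H, G in interferers:
        HG = H @ G
        R = R + HG @ HG.conj().T
    HG = H_desired @ G_desired
    S = HG @ HG.conj().T
    # slogdet is numerically safer than log(det(.)) for larger matrices.
    _, logdet_total = np.linalg.slogdet(R + S)
    _, logdet_interf = np.linalg.slogdet(R)
    return logdet_total - logdet_interf
```

Writing the rate as a difference of two log-determinants mirrors the decomposition into two Shannon transforms used in Section 3.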

2.2. Problem Formulation

Our goal is to design precoder matrices based only on the second-order statistics of the MIMO channels. While the second-order statistics fully describe the statistical CSI for zero mean Gaussian channels, it is only partial statistical CSI for non-Gaussian channels. Defining as the set of precoder matrices, we formulate the weighted sum rate optimization problem aswhere denotes expectation over the channel matrices and is the maximization weight associated with user . The constraint is due to the fact that BS is subject to the transmit power limit of . Note that the power constraint introduces coupling across subcarriers, while the rate function given by (2) is independent for each subcarrier and coupled across users and BSs only, a feature of OFDM systems.

Define the covariance matrix of as in (4). Unlike precoder design based on instantaneous CSI as in [12], the precoders in are restricted to be functions of . The above approach to precoder design has several advantages. It is very practical, since acquiring full CSI for all induces huge communication overhead due to CSI exchanges. Also, since the higher-order statistics can be costly to obtain, the second-order statistics often serve as the minimal statistics for various kinds of estimations.

In order to solve the nonconvex problem , we need to find the expectation . However, this expectation makes the problem hard to analyse even for the single-user case [16]. Therefore, it is useful to replace the objective function by the analytically tractable approximation . Thanks to random matrix theory, as will be shown in the next section, we can find the deterministic approximation that allows analytical expressions for the ergodic rate [20–23, 29]. The approximation is appropriate for our problem as is a function of second-order statistics. While we have asymptotic convergence when the number of BS antennas is large, our simulations show that the approximation is also accurate for MIMO-OFDM systems with a very small number of antennas.

We now replace the ergodic rate with its deterministic approximation and form the analytically tractable problemWhile analytically tractable, problem is still nonconvex. Since the global optimum to a nonconvex optimization problem is generally hard to obtain, we will develop an algorithm to find the local optimal solution to . In the next sections, we will first derive the asymptotic approximation and then propose an algorithm to obtain locally optimum precoder matrices.

3. Asymptotic Approximation of the Rate

In this section we derive the asymptotic approximation to the ergodic rate function. From (2) and (3), it follows that the rate function can be rewritten asTo express the rate more compactly, definein whichso that the rate is formed by the difference between two Shannon transforms asThe matrix is associated with the total received signal while the matrix is associated with interference plus noise.

In order to approximate , we need to find the asymptotic approximation to . According to the random matrix theory [21, 22, 40], the ergodic Shannon transform of can be asymptotically approximated by a deterministic function that only depends on the second-order statistics of . Therefore, we only need the second-order statistics of and to characterize .

3.1. Second-Order Statistics of and

We shall now calculate the covariance matrix of defined as which is associated with the total received signal, and the covariance matrix of defined as which is associated with interference plus noise. From the definition of and statistical independence of channel matrices across BSs, it follows thatwhere is the subcovariance matrix corresponding to BS and given bySimilarly, from the definition of and statistical independence of channel matrices across BSs, it follows thatwhere is the subcovariance matrix corresponding to BS and given byNow, we only need to calculate the subcovariance matrices and to completely determine the covariance matrices and . It is straightforward to show thatwhere is defined in (4).

We remark here that, in contrast to the uplink transmission considered in [20] where each subcovariance matrix is , in the downlink transmission each subcovariance matrix is , so each dimension is approximately times larger. The increased size of the covariance matrix is due to the fact that the downlink transmitter (BS) has intended receivers (users), while the uplink transmitter (user) has only one intended receiver (BS). In addition to this, there are covariance matrix groups in the MIMO-OFDM system, one for each subcarrier.

The deterministic approximation will depend on the eigenvalues of and , so we shall define two eigenvalue matrices. Define the eigenvalue matrix and eigenvalue matrix , which are formed by stacking the eigenvalues of and in sized columns, respectively. Such a definition is required to apply the theorems from random matrix theory. Note that the first dimension of eigenvalue matrices is the receiver antenna number . It is easy to see thatwhere eigenvalue matrix and eigenvalue matrix are formed by stacking the eigenvalues of and in sized columns, respectively.

3.2. Approximation of the Rate Function

In this subsection, we will derive the approximation to the rate based on the random matrix theory. We start by reviewing some concepts from the random matrix theory. For an random matrix , the correlation function is defined as . Define the kernel of the correlation function by a set of orthonormal eigenfunctions satisfying where is the kernel eigenvalue [21]. We say the kernel is factorable if for some functions and . Also, define as the eigenvalue matrix which is formed by stacking the eigenvalues of in sized columns. Matrices , , and can represent , , and (or those of ), respectively.

Proposition 1. Suppose the following technical conditions hold for the correlation function of : (a) its kernel is factorable; (b) its eigenvalues multiplied by are uniformly bounded. Then, based on [29, Theorem  2] and [23, Theorem  3], the random variable will converge almost surely to its deterministic approximation for fixed as . We havewhere denotes the eigenvalue matrix and , are solutions to the following fixed point equations:

Now based on this approximation technique and the second-order statistics of and derived previously, we can obtain the asymptotic approximation to the ergodic rate in the following theorem.

Theorem 2. Suppose the following conditions are satisfied for the correlation functions of and : (a) they have factorable kernels; (b) their eigenvalues multiplied by are uniformly bounded. Then for we havein which is a deterministic function given bywhere , , , and are the solutions to the following fixed point equations:where and range according to the vector sizes.

Proof. We will apply Proposition 1 to each term in (8). Consider the first term in which must satisfy conditions (a) and (b) of Theorem 2 in accordance with the same conditions in Proposition 1. Since , the ratio of dimensions is always fixed as . Condition guarantees that the precoders have full column rank so (2) remains valid. Now, from Proposition 1, the first term in (8) divided by is approximated asIn the first term of approximation , we can express in terms of using (22). In the second term of approximation , we can express in terms of using (21). For the third term of approximation , we can writeand express in terms of using (21). So all the terms in are expressed in terms of , . In a similar manner, can be expressed in terms of , . Forming along with some mathematical manipulations completes the proof.

Some remarks are in order. It is shown in [23] that many channel models including the unitary-independent-unitary (UIU) model, Kronecker model, and independent nonidentically distributed (IND) model satisfy conditions (a) and (b) of Theorem 2. While the approximation is guaranteed to converge as , it is accurate even for very small number of antennas (e.g., , ) as will be shown in the simulation section. The existence of positive solutions to the fixed point equations in Theorem 2 is proved in [21]. Computationally, the equations can be solved numerically by iteratively substituting the value on the right-hand side into the left-hand side. The convergence result suggests that the ergodic rate can be approximated by .
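The substitution procedure mentioned above can be sketched generically as follows. The solver repeatedly plugs the current value into the right-hand side until successive iterates agree. The toy map used in the example is an assumption for illustration only and is not one of the fixed point equations of Theorem 2; the function name and tolerances are likewise hypothetical.

```python
import numpy as np

def solve_fixed_point(update, x0, tol=1e-10, max_iter=1000):
    """Solve x = update(x) by repeated substitution.

    `update` evaluates the right-hand side of the fixed point equations for
    the current iterate; iteration stops when the largest change is below tol.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = np.asarray(update(x), dtype=float)
        if np.max(np.abs(x_new - x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed point iteration did not converge")

# Toy example (not from the paper): x = 1 / (1 + x) has the positive
# solution (sqrt(5) - 1) / 2.
toy_solution = solve_fixed_point(lambda x: 1.0 / (1.0 + x), x0=np.array([1.0]))
```

In the setting of Theorem 2, `update` would evaluate all coupled right-hand sides at once and return the stacked unknowns, so that the same loop applies unchanged.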

4. Weighted Sum Rate Maximization

In this section, we develop an algorithm to obtain a suboptimal solution to problem . We will often need to differentiate one matrix with respect to another matrix; therefore, to facilitate calculations, the following definition is presented [41].

Definition 3. Let be a function mapping , define its derivative as the matrix with elementsFurthermore, let be a function mapping ; define its derivative as the matrix

The most important property of this definition is that the chain rule holds, which enables the differentiation of complicated functions. Adopting this definition, we can now seek to solve the optimization problem.

Problem is nonconvex, so a global maximum is generally hard to obtain. Fortunately, the KKT conditions still serve as the necessary conditions for local optima, so we seek a suboptimal solution that satisfies the KKT conditions. To this end, we define which is the complex-real isomorphism of . Problem then becomeswhere is a set that has complex-real isomorphism with respect to .

To further simplify the problem, we can reformulate it as an unconstrained optimization. To do so, similar to [42], we make a change of variables from to the spherical coordinates that belongs to if and to if . The transformation is described bywhere denotes the lexicographical order of vectors. Under this parametrization, the power constraint is automatically satisfied since the variables are on the surface of a hypersphere with radius .
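To illustrate why such a change of variables removes the power constraint, the sketch below maps a vector of angles to a point on a hypersphere using the standard hyperspherical parametrization. The exact form and ordering used in (30) are not reproduced here, so the layout of the angles, the function name, and the variable names are assumptions.

```python
import numpy as np

def angles_to_point(angles, power):
    """Map d-1 angles to a point in R^d on the sphere of radius sqrt(power).

    Standard hyperspherical coordinates:
      x_1 = cos(a_1), x_2 = sin(a_1) cos(a_2), ..., x_d = sin(a_1)...sin(a_{d-1}).
    The squared norm of the result equals `power` for any choice of angles.
    """
    angles = np.asarray(angles, dtype=float)
    x = np.empty(angles.size + 1)
    sin_prod = 1.0
    for i, a in enumerate(angles):
        x[i] = sin_prod * np.cos(a)
        sin_prod *= np.sin(a)
    x[-1] = sin_prod
    return np.sqrt(power) * x
```

Because every angle vector yields a feasible point, an unconstrained search over the angles never leaves the power constraint set.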

Since the local optima occur at the interior of the domain of angle parameters, the KKT condition for problem is simplified towhere holds all the angle parameters corresponding to BS . The above expression simply states that the gradient vanishes at any optimum point. So finding a local optimum solution is equivalent to pursuing a set of points , at which the gradient vanishes. In practice, the equation for setting the gradient to zero is highly nonlinear and it is impossible to solve it directly. However, starting from any initial point for , we can use a simple gradient search method to increase the objective function after each iteration and gradually approach a point where the gradient is zero. In order to do so, we need to derive the vector , which is the transpose of the gradient vector. The remaining materials of this section will thus be devoted to the derivation of .

4.1. Derivation of

It immediately follows by the chain rule thatNow we need to find and in order to form the chain rule and obtain (33). We shall find derivatives with respect to the subvectors of , that is, with respect to vectors . The components of can be obtained by differentiating (30) which yields the following:The components of can be found through the complex-real isomorphism asSubstituting (34) and (35) in (33) completes the derivation provided that is available.

Calculation of is more complicated since, according to Theorem 2, is a function of , , , and , which are in turn functions of , . The chain rule givesThere are three derivatives involved in each term of (36). In what follows, we shall calculate each of them.

4.1.1. First Chain of (36)

The first chains can be obtained by differentiating , as given in Theorem 2, which yields

4.1.2. Second Chain of (36)

The second chains can be obtained by differentiating the fixed point equations given in Theorem 2. It can be shown thatwhere denotes the th standard basis vector in and

4.1.3. Third Chain of (36)

Using the chain rule together with (16), we haveWe now need to calculate the terms in the above formulas. Note that the nonzero terms in the above matrices are equal to and expressed through the chain rule.

Based on the results in [41], the derivative of the eigenvalues with respect to the matrix can be explicitly written as a function of the eigenvectors, so we obtainwhere and denote the th eigenvector of and , respectively.
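The eigenvector-based form of the eigenvalue derivative can be checked numerically. The sketch below uses the standard first-order perturbation result for a Hermitian matrix with distinct eigenvalues, namely that the change of the i-th eigenvalue along a Hermitian direction dA is u_i^H dA u_i. The function name is hypothetical, and the expression is the textbook directional form rather than the exact matrix-derivative layout of Definition 3.

```python
import numpy as np

def eigenvalue_derivatives(A, dA):
    """Directional derivatives of the eigenvalues of Hermitian A along dA.

    Returns u_i^H dA u_i for each eigenvector u_i, ordered like the
    ascending eigenvalues returned by numpy.linalg.eigh. Assumes the
    eigenvalues of A are distinct.
    """
    _, U = np.linalg.eigh(A)
    # einsum evaluates conj(U[j, i]) * dA[j, k] * U[k, i], summed over j and k.
    return np.real(np.einsum("ji,jk,ki->i", U.conj(), dA, U))
```

A finite-difference comparison of the eigenvalues of A + h dA and A - h dA for a small step h is a convenient sanity check when implementing the third chain.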

Finally, differentiating (15) and using the results in [41], we obtain the formulas for and given by in which is the commutation matrix defined in [41] satisfying for every matrix , andWith all the chains derived, (36) is completely characterized which in turn enables an explicit expression for given by (33).

4.2. Local Optimum Solution

With the gradient computed, a gradient search method can be applied. We now propose Algorithm 1 to find a local optimum solution for problem .

(1) Initialize:
initialize , and ;
(2) while stopping criterion not met do
(3) compute , using (28), (30);
(4) solve for , , , , , using Theorem 2;
(5) compute , , , , using (33);
(6) , and ;
(7) end while
(8) return , ;
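The overall structure of such a gradient search can be sketched as follows. This is not the paper's Algorithm 1: the objective is a toy function standing in for the approximate weighted sum rate, the gradient is obtained by finite differences rather than from (33), and the backtracking step-size rule is an assumption added so that the objective increases at every accepted iteration.

```python
import numpy as np

def numerical_gradient(f, theta, h=1e-6):
    """Central-difference gradient of the scalar function f at theta."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = h
        grad[i] = (f(theta + step) - f(theta - step)) / (2 * h)
    return grad

def gradient_ascent(f, theta0, step=0.1, tol=1e-8, max_iter=500):
    """Increase f from theta0 until the gradient (nearly) vanishes."""
    theta = np.array(theta0, dtype=float)
    value = f(theta)
    for _ in range(max_iter):
        grad = numerical_gradient(f, theta)
        if np.linalg.norm(grad) < tol:
            break
        t = step
        # Halve the step until the objective strictly increases.
        while t > 1e-12 and f(theta + t * grad) <= value:
            t *= 0.5
        if t <= 1e-12:
            break  # no measurable improvement is possible
        theta = theta + t * grad
        value = f(theta)
    return theta, value

# Toy objective (hypothetical): maximum value 2 at theta = (0, 0.5).
toy_objective = lambda th: np.cos(th[0]) + np.cos(th[1] - 0.5)
theta_opt, value_opt = gradient_ascent(toy_objective, [1.0, -1.0])
```

In the precoder design, `theta` would hold the angle parameters of all BSs, and each objective evaluation would involve mapping the angles to precoders and solving the fixed point equations of Theorem 2.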

Algorithm 1 is not limited to multicell networks with partial cooperation, so with small modifications it can be used for precoder design based on statistical CSI for networked MIMO systems. In networked MIMO, in addition to the channel statistics, it is assumed that each BS has all the transmit data, so the whole system can be viewed as a MIMO super-cell with statistical CSI at the transmitter.

5. MIMO-OFDM Kronecker Channel Model

While the results obtained so far are valid for general correlation channel models, in this section, we will reduce the results to a more compact form with less computational complexity by considering the Kronecker channel model. The Kronecker model arises in practice when the immediate surrounding dominates the spatial correlation and the intermediate scattering clusters exist in a narrow angular range seen from the antennas [34, 39]. We shall start by describing the MIMO-OFDM channel and expressing it with the Kronecker structure.

5.1. Statistical Representation

A wideband MIMO channel is characterized by channel taps . The channel matrix on the th subcarrier is then given by . Now the correlation among channel taps is given by the tap correlation matrix that is defined as , . On the other hand, the Kronecker model assumes that the correlation of transmitter side and the receiver side is separable, so for each channel tap , we have where and are receiver and transmitter correlation matrices, respectively. Based on the above definitions, it is easy to show that the channel correlation matrix on subcarrier is given byTherefore, the channel correlation matrix is characterized by the Kronecker product of transmitter and receiver correlation matrices and , multiplied by the quadratic form which depends on the tap correlation matrix and the th Fourier vector. Finally, the following statistical representation can be considered for the channel:where is a white random matrix whose elements are uncorrelated with zero mean and unit variance. Through the properties of the Kronecker product, the above statistical representation yields . We shall use (45) to describe the MIMO-OFDM Kronecker channel model.

If the channel taps are uncorrelated, is diagonal and becomes independent of ; hence, the channel correlation function is the same over all subcarriers. However, whenever the channel taps are correlated, is not diagonal and the channel statistics are different on each subcarrier. Since the precoders depend on the channel statistics, when there is tap correlation, the MIMO-OFDM precoding matrices are frequency dependent, which limits the system performance [17, 31–34]. But when there is no tap correlation, the precoders are the same across all frequencies. We shall study the effect of tap correlation on the system sum rate in the simulation results section.
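The dependence of the per-subcarrier statistics on tap correlation can be illustrated with a small numerical sketch. It evaluates a quadratic form of a tap correlation matrix with a Fourier vector for each subcarrier. The Fourier vector convention, the function name, and the example matrices are assumptions made for illustration and may differ from the paper's normalization.

```python
import numpy as np

def tap_gain(C, n, num_subcarriers):
    """Quadratic form f_n^H C f_n for subcarrier n.

    C is an L x L Hermitian tap correlation matrix, and f_n has entries
    exp(-2j*pi*n*l/N) for taps l = 0..L-1 (an assumed DFT convention).
    """
    L = C.shape[0]
    f = np.exp(-2j * np.pi * n * np.arange(L) / num_subcarriers)
    return np.real(f.conj() @ C @ f)

N = 8
C_uncorrelated = np.diag([1.0, 0.5])             # diagonal: no tap correlation
C_correlated = np.array([[1.0, 0.5], [0.5, 1.0]])  # correlated taps

gains_uncorrelated = [tap_gain(C_uncorrelated, n, N) for n in range(N)]
gains_correlated = [tap_gain(C_correlated, n, N) for n in range(N)]
```

With the diagonal matrix, every subcarrier sees the same value (the trace of C), whereas the correlated example varies with the subcarrier index, which is why the precoders become frequency dependent.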

5.2. Eigendecomposition under Kronecker Model

Based on (45), the downlink channel matrix between BS and user can be expressed aswhere , , and are the corresponding receiver, transmitter, and tap correlation matrices, respectively, and is a white random matrix whose elements are uncorrelated with zero mean and unit variance. Obviously, we have that along with (15) yieldsObserve that the above subcovariance matrices decompose into a separated form, so their eigenvalue matrices admit the following decomposition:in which the vector holds the eigenvalues of , and the vectors and hold the eigenvalues of and , respectively. Now the eigenvalue matrices and can then be found by inserting (48) into (16).

Computationally, we no longer need to compute the eigenvalues of very large matrices and , but instead, we need only compute the eigenvalues of the lower dimensional matrices , , and . Using the above eigendecomposition, the computational complexity is reduced from to .
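The computational saving rests on a standard property of the Kronecker product: the eigenvalues of a Kronecker product of two Hermitian matrices are all pairwise products of the factors' eigenvalues. A minimal sketch, with hypothetical function and variable names, is given below.

```python
import numpy as np

def kron_eigenvalues(R, T):
    """Eigenvalues of kron(R, T) obtained from the small factors only.

    R and T are Hermitian; the result is the sorted list of all products
    of one eigenvalue of R with one eigenvalue of T.
    """
    products = np.outer(np.linalg.eigvalsh(R), np.linalg.eigvalsh(T))
    return np.sort(products.ravel())
```

Only the two small eigenvalue problems are solved here, instead of one eigenvalue problem whose dimension is the product of the two sizes.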

5.3. Derivation of under Kronecker Model

The procedure given in Section 4.1 can be applied here with great simplifications for and which constitute the third chain of (36) computed in Section 4.1.3. Note that the third chain was the most complicated component of the derivative. Considering (16), it suffices to find and . It can be shown from (48) thatwhere and denote the th eigenvector of and , respectively, and was defined in (43). Using the above equations, the computational complexity is reduced from to .

6. Intercarrier Interference

In this section, we allow ICI among OFDM subcarriers. The case without ICI is then a special case of this scenario. The ICI occurs when there is carrier frequency offset due to synchronization errors and Doppler shifts [1, 38, 43]. This leads to loss of orthogonality among subcarriers which introduces more interference to the system. To accommodate our method to this situation, we will approximate the rate under ICI and then extend the proposed algorithm to facilitate the precoder design with statistical CSI.

6.1. System Model and Problem Formulation under ICI

When there is ICI, the received signal at user on subcarrier can be modelled aswhere models the power leaked from subcarrier to subcarrier due to the ICI. Following [1], we model the ICI through the normalized frequency offset denoted by , which is the ratio of the actual frequency offset to the intercarrier spacing. It is shown in [1] that relates to the byHere at the receiver, in addition to the summations over , that model intercell and intracell interference, the received signal over subcarrier depends also on all other subcarriers through ICI and thus the summation over . Note that the formulation reduces to that of the non-ICI scenario when , or equivalently . The achievable instantaneous data-rate under ICI for user on subcarrier is then given byIf we definethen we can writeBased on this result, we can utilize the same random matrix method employed for the non-ICI scenario to derive the approximated rate .
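To give a feel for how the leakage depends on the normalized frequency offset, the sketch below evaluates a commonly used carrier-frequency-offset model in which the leakage coefficient between two subcarriers is a normalized sum of complex exponentials. The exact expression from [1] is not reproduced in this text, so this particular model, the function name, and the indexing by subcarrier distance are assumptions.

```python
import numpy as np

def ici_power(num_subcarriers, eps):
    """Leaked power versus subcarrier distance m = 0..N-1 for offset eps.

    Assumed model: S(m) = (1/N) * sum_n exp(2j*pi*n*(m + eps)/N),
    with the leaked power taken as |S(m)|^2.
    """
    N = num_subcarriers
    n = np.arange(N)
    m = np.arange(N)
    S = np.exp(2j * np.pi * np.outer(m + eps, n) / N).sum(axis=1) / N
    return np.abs(S) ** 2

power_no_offset = ici_power(8, 0.0)    # all power stays on the subcarrier
power_with_offset = ici_power(8, 0.1)  # part of the power leaks to neighbours
```

Under this model, a zero offset leaves all power on the intended subcarrier, consistent with the reduction to the non-ICI scenario noted above, and the leaked powers always sum to one.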

The optimization problem we consider is similar to but using the rate under ICI, that is, . As before, we replace the rate by its approximation for tractability. Therefore, the following optimization problem is formed:

6.2. Second-Order Statistics under ICI

Due to the ICI, in addition to per-carrier statistics, the BSs now share the additional information of cross-carrier correlation. Specifically, the cross-carrier covariance matrices, are assumed to be available at all BSs. This additional information is important for the BSs to suppress the ICI. For Kronecker model, the transmitter, receiver, and tap correlation matrices are sufficient because we have with .

We will now find the second-order statistics of and which depend on . It can be shown thatwhere and are block matrices holding blocks with block given bywith , , and , .

Finally, we define the eigenvalue matrices aswhere eigenvalue matrix and eigenvalue matrix are formed by stacking the eigenvalues of and in sized columns, respectively.

6.3. Rate Approximation under ICI

Using the second-order statistics derived above, and by extending Theorem 2, we present the rate approximation under ICI in the following corollary.

Corollary 4. For we havein which is a deterministic function given bywhere , , , and are the solutions to the following fixed point equations:where and range according to the vector sizes.

6.4. Weighted Sum Rate Optimization under ICI

After the change of variables to spherical coordinates in a similar manner to Section 4, the KKT conditions for are , . Under ICI, the rate is a function of precoders over all subcarriers and not just , so the gradient is modified as Here, due to the ICI, we need to sum over all subcarriers to calculate the gradient function. The above derivative can be obtained with the chain rule in a similar manner to Section 4; however, we need the derivatives with respect to all subcarriers and not just , as is seen from (63). Another difficulty involved under ICI is the calculation of and which are rather different from the case without ICI. We shall only derive the above two terms under ICI because other terms are straightforward to find.

For general channels, it can be shown that where and are the th eigenvectors of and , respectively, and are given by

For Kronecker channels, it can be shown that and are given by in which and and are the th eigenvectors of and , respectively, and The computational complexity is greatly reduced for the Kronecker channel model from to .

We can now propose Algorithm 2 to find a suboptimal solution to . Note that this algorithm is computationally more demanding than Algorithm 1, since the number of components in is times larger due to the ICI.

(1) Initialize:
initialize , and ;
(2) while stopping criterion not met do
(3) compute , using (28), (30);
(4) solve for , , , , , using Corollary 4;
(5) compute , using (63);
(6) , and ;
(7) end while
(8) return , ;
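The overall structure of such a gradient iteration under a power constraint can be sketched as follows. This is a simplified illustration, not the algorithm above: the toy objective (a single real-valued link with rate log(1 + (h·f)^2)) replaces the approximated weighted sum rate, rescaling onto the power sphere replaces the spherical-coordinate parameterization, and all names and parameter values are hypothetical.

```python
import math


def project_to_power(f, power):
    """Rescale the precoding vector so that its squared norm equals `power`."""
    norm = math.sqrt(sum(v * v for v in f))
    scale = math.sqrt(power) / norm
    return [v * scale for v in f]


def toy_rate(f, h):
    """Toy objective (assumption): log(1 + (h . f)^2) for a real channel h."""
    gain = sum(a * b for a, b in zip(h, f))
    return math.log(1.0 + gain * gain)


def toy_rate_gradient(f, h):
    """Gradient of toy_rate with respect to f."""
    gain = sum(a * b for a, b in zip(h, f))
    coeff = 2.0 * gain / (1.0 + gain * gain)
    return [coeff * a for a in h]


def gradient_ascent(h, f0, power, step=0.5, tol=1e-12, max_iter=1000):
    """Ascent step followed by rescaling onto the power constraint."""
    f = project_to_power(f0, power)
    for _ in range(max_iter):
        grad = toy_rate_gradient(f, h)
        f_new = project_to_power([v + step * g for v, g in zip(f, grad)], power)
        if max(abs(a - b) for a, b in zip(f, f_new)) < tol:
            return f_new
        f = f_new
    return f


if __name__ == "__main__":
    f_opt = gradient_ascent(h=[3.0, 4.0], f0=[1.0, 0.0], power=1.0)
    print(f_opt, toy_rate(f_opt, [3.0, 4.0]))
```

For this toy objective the iteration aligns the precoder with the channel direction while keeping the transmit power fixed, which mirrors the role of the stopping criterion and the update step in the listing above.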

7. Simulation Results

In this section, we demonstrate the approximation accuracy and evaluate the performance of our algorithms. We consider a downlink multicell MIMO-OFDM system, where the number of cells and users in each cell is set to be , and the number of subcarriers is set to 8, that is, , , and . The number of antennas for users and BSs is set to 3 unless stated otherwise; that is, and . The users are uniformly distributed in cells with 0.5 km radius. Without loss of generality, the weighting for the sum rate maximization is uniform in the simulations, that is, , . We assume the power constraints are the same for all BSs, that is, , . Since the noise power is normalized to unity, the transmitter signal-to-noise ratio is .

We consider the Kronecker channel model introduced in Section 5 with correlation matrices in which denotes the distance between BS and user , and , are the spatial correlation factors as defined in [44]. The constant is chosen so that . While not explicitly specified, we also choose , to be random and uniformly distributed in the interval for each user. The tap correlation matrix is chosen as where is the tap correlation factor and with dB as defined in [34]. The number of channel taps is assumed to be throughout our simulations; that is, . The ICI is modelled according to (51).
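To make the structure of a Kronecker covariance concrete, the following minimal sketch builds two correlation matrices and combines them with a Kronecker product. It assumes the common real-valued exponential correlation model, in which entry (i, j) equals the correlation factor raised to |i - j|; this form, the ordering of the product, and the function names are illustrative assumptions and are not taken from [44] or from the expressions above.

```python
def exponential_correlation(size, rho):
    """Exponential correlation matrix (assumed model): entry (i, j) = rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(size)] for i in range(size)]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


if __name__ == "__main__":
    r_tx = exponential_correlation(3, 0.5)   # transmit side, 3 antennas
    r_rx = exponential_correlation(3, 0.3)   # receive side, 3 antennas
    full = kron(r_tx, r_rx)                  # 9 x 9 covariance of the stacked channel
    for row in full:
        print(["%.3f" % v for v in row])
```

Storing the two small factors instead of the full product is what makes the Kronecker model attractive: only the factor matrices need to be shared among the BSs.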

7.1. Approximation Accuracy

We compare the exact ergodic sum rate obtained by simulations and its approximation given by Theorem 2 or Corollary 4. The results are depicted in Figure 2, where it is seen that the approximations are extremely accurate. Note that although the approximations are asymptotic, exceptional accuracy is observed for our small size MIMO system (, ); this fact justifies this approximation approach for practical antenna sizes. It is also seen that the approximation is accurate across a wide range of SNRs.

7.2. System Performance

Now we demonstrate the performance of our proposed method given by Algorithm 1, which is based only on statistical CSI at the BSs. For comparison, we also depict the results of two other methods. The first is the algorithm proposed in [12], which assumes the same system model as this work but requires perfect instantaneous (full) CSI at the BSs. The second is networked MIMO based on statistical CSI, where, in addition to the channel statistics, each BS is assumed to have all the transmit data, so the whole system can be viewed as a MIMO super-cell with statistical CSI at the transmitter. These two systems require much heavier BS coordination, but as we shall see, our method is competitive with them. As mentioned before, our proposed algorithm is not limited to multicell networks with partial cooperation; with small modifications, it can also be used for precoder design based on statistical CSI in networked MIMO systems.

Figure 3 depicts the results, where it is observed that although our method uses statistical CSI with limited BS cooperation, the achievable weighted sum rate is comparable to that with full CSI and networked MIMO for a wide range of SNRs; note that statistical CSI incurs a much lower signaling overhead and requires less frequent updates compared to full CSI. Here, we also see the results for different tap correlation factors. While tap correlations degrade system performance, using our method to incorporate the correlation information into precoder design alleviates the performance loss, as seen in Figure 3. Moreover, it is observed that lacking knowledge of the tap correlations or neglecting them (assuming uncorrelated channel taps) leads to performance loss. When no precoders are employed (identity matrix precoding), the sum rate is considerably lower, so precoder design is crucial in the MIMO-OFDM system.

7.2.1. Correlation Factors

In Figures 4 and 5, we investigate the effect of tap correlation and spatial correlation factors on the system performance. We see that both tap correlation and spatial correlation decrease the weighted sum rate. However, the spatial correlation has a more prominent effect on the sum rate. It is seen that as long as the correlation information is incorporated into precoder design via our proposed algorithm, the performance loss is not significant for a wide range of correlation degrees. It is worth noting that the performance loss becomes slightly larger at lower SNRs.

7.2.2. Number of Antennas

Figure 6 shows the sum rate as the number of transmitter antennas increases; here the number of receiver antennas is fixed to . The weighted sum rate is seen to increase with . As can be seen, the slopes of the performance increase for Algorithm 1 and for the method in [12] based on full CSI are similar as increases. This suggests that our proposed method is reliable for arbitrarily large MIMO-OFDM systems.

7.3. System Performance under ICI

Now we present the system performance when there is ICI. Due to the cross interference from other subcarriers, the sum rate decreases, as seen in Figure 7. For nonnegligible FO, there can be a serious decrease in the sum rate. However, smart precoder design implemented by Algorithm 2 can suppress ICI and achieve a reasonable sum rate. It is observed that the performance gain of our method over the nonprecoding scheme increases with SNR. Moreover, the advantage is more pronounced for smaller frequency offsets. However, we note that the gap between the ICI-free system and the system with ICI also increases with SNR.

In Figure 8, we plot the weighted sum rate versus the ICI intensity factor for  dB. We see that the system sum rate is highly sensitive to ICI for small .

7.4. Convergence Rate

Now we compare the convergence rates of Algorithms 1 and 2 for  dB and , , , and . The initial precoders are chosen to be the identity matrix. Under various degrees of ICI, we see from Figure 9 that Algorithm 1 always converges faster than Algorithm 2. When there is ICI, the convergence rate of Algorithm 2 is similar for various degrees of ICI.

8. Conclusions

We investigated linear precoding for downlink multicell MIMO-OFDM systems based on statistical CSI. The main contribution of this work was applying the already established results of random matrix theory to the MIMO-OFDM scenario in order to study the impact of frequency selectivity, tap correlations, and ICI on the statistical precoder design and system performance. The asymptotic approximations to the ergodic rates in ICI and ICI-free scenarios were derived, based on which we formulated two nonconvex sum rate maximization problems and proposed locally optimal gradient-based solutions to them. Simulation results showed that while spatial correlations, tap correlations, and ICI decrease the system sum rate, our method alleviates this performance loss by incorporating the correlation information and ICI intensity information into the precoder design.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the Leading Talents Program of Guangdong Province under Grant 00201510, the Basic Research Program of Shenzhen under Grant JCYJ20151117161854942, and the Shenzhen Peacock Program under Grant KQTD2015071715073798.