Abstract

In order to apply the sphere decoding algorithm in multiple-input multiple-output communication systems and to make it feasible for real-time applications, its computational complexity must be decreased. To this end, this paper provides useful insights into the effect of the initial and final sphere radii and into estimating them at low cost. It also discusses practical ways of initiating the algorithm properly and terminating it before the normal end of the process, as well as the cost of these methods. In addition, a novel algorithm is introduced that applies the presented techniques according to a threshold factor defined in terms of the number of transmit antennas and the noise variance. Simulation results show that the proposed algorithm offers desirable performance and reasonable complexity, satisfying practical constraints.

1. Introduction

The Nondeterministic Polynomial-time hard (NP-hard) complexity of Maximum Likelihood (ML) decoding, the optimal detector, generally prohibits its use in practical Multiple-Input Multiple-Output (MIMO) systems [1], especially when a large signal constellation and/or many transmit antennas are involved. Some suboptimum detection algorithms, such as Kannan's algorithm, which searches only over restricted parallelograms [2], the KZ algorithm [3] based on the Korkin-Zolotarev-reduced basis, and the Sphere Decoding (SD) algorithm of Fincke and Pohst [4, 5], can perform detection with much lower complexity, but at the cost of some performance degradation.

SD was first introduced in [6] to perform ML detection; it achieves reduced complexity by searching for the closest lattice point only over the points that lie within a certain sphere around the given vector. Although it significantly reduces the computational complexity of ML detection, it still requires a huge amount of computation in MIMO systems. There are several approaches to reduce the complexity of the SD algorithm, such as the Schnorr-Euchner (SE) enumeration [7], descending probabilistic ordering [8], the increasing-radius sphere decoder [9], the parallel competing branch algorithm [10], and reduced-dimension maximum-likelihood search [11]. Other approaches trading performance for complexity include the radius scheduling method [12], the probabilistic tree pruning algorithm [13], sequential Fano decoders [14], and semidefinite relaxation [15]. The work of [16] proposes using the SE refinement of the Pohst enumeration in the closest-lattice-point search and concludes, based on numerical results, that the SE enumeration is more efficient than the Viterbo-Boutros implementation of [6]. Building on the method of [16], an algorithm is presented in [17] and shown to be robust to the initial choice of the sphere radius. This concept has been further developed in several pieces of research that try to improve its performance and computational complexity, making it feasible for practical real-time systems.

Various papers have analyzed the complexity of SD. For example, [18] shows that the expected complexity of SD, that is, the expected number of operations required by the algorithm, depends on both the number of transmit antennas, $m$, and the Signal-to-Noise Ratio (SNR), $\rho$. It is also shown in [18] that when the SNR is high, the expected number of operations required by SD can be approximated by a polynomial function for small $m$. An exact but complicated expression for the expected number of operations required by the sphere decoder has also been obtained in [18]. In [19], by deriving a lower bound, it is shown that the expected complexity of SD applied to a large class of problems is exponential in $m$. In this work, the complexity of the SE-SD algorithm is discussed from a new point of view.

Stopping criteria can be used to reduce the complexity of SD, since they terminate the decoding process earlier and thus prevent a large amount of extra calculation. Some researchers have developed such criteria for special scenarios. This paper also discusses the convergence radius and proposes a new stopping criterion.

SD searches a lattice through a given set of points bounded by the search sphere with the received point as its center. Therefore, the method requires determining an initial search radius, $d_0$. Choosing a suitable $d_0$ plays a crucial role in finding the nearest lattice point within the sphere. The initial radius should be not only large enough for the sphere to contain at least one lattice point but also small enough to keep the enumeration complexity of finding the closest point among the contained lattice points practical. Even though it is claimed that SE-SD is less sensitive to the initial radius than the original SD, [17] shows that the complexity of SE-SD is still controlled by $d_0$. To the best of our knowledge, there are no general guidelines for choosing an appropriate $d_0$. This paper focuses on the suitable $d_0$ and discusses a way of finding it easily, which is applicable to complexity-limited wireless communication systems. This paper also demonstrates that the initial radius can significantly affect not only the computational complexity but also the Bit Error Rate (BER) of the SE-SD algorithm.

In this paper, $\mathbb{Z}$, $\mathbb{R}$, and $\mathbb{C}$ are the sets of integer, real, and complex numbers, respectively. $\mathcal{CN}(\mu, \sigma^2)$ denotes the circularly symmetric complex normal distribution.

In the assumed MIMO system, $\tilde{s}$ is the complex transmitted vector of dimension $m$, whose elements are members of a square $L$-point Quadrature Amplitude Modulation ($L$-QAM) constellation. It is assumed that the channel coefficient matrix, $\tilde{H} \in \mathbb{C}^{n \times m}$, which is comprised of i.i.d. $\mathcal{CN}(0, 1)$ entries, can be estimated accurately at the receiver. The noise vector, comprised of i.i.d. $\mathcal{CN}(0, \sigma^2)$ entries, is denoted by $\tilde{v}$, where $\sigma^2$ is set by the SNR $\rho$ and the average signal energy $E_s$ of the constellation. A symbol $\tilde{s}$ is transmitted over the $m$ antennas, and

$\tilde{x} = \tilde{H}\tilde{s} + \tilde{v}$  (1)

is received.

To obtain a lattice representation of this multiple-antenna system, the complex matrix equation (1) is transformed into a real matrix equation. Defining

$x = [\Re(\tilde{x})^T \; \Im(\tilde{x})^T]^T$, $\quad H = \begin{bmatrix} \Re(\tilde{H}) & -\Im(\tilde{H}) \\ \Im(\tilde{H}) & \Re(\tilde{H}) \end{bmatrix}$,  (2)

where $s$ and $v$, similar to $x$, are obtained by stacking the real and imaginary parts of $\tilde{s}$ and $\tilde{v}$, (1) is transformed to

$x = Hs + v$,  (3)

where, for easier notation, $M = 2m$ and $N = 2n$.
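As a concrete illustration, the following Python sketch builds this real-valued model for a random Rayleigh channel. It is a minimal sketch under our notational assumptions (16-QAM coordinates, unit-variance channel entries); all variable names are ours, not the paper's.

```python
import numpy as np

def complex_to_real(H_c, x_c):
    """Stack real and imaginary parts per (2) so that x = H s + v over R."""
    H = np.block([[H_c.real, -H_c.imag],
                  [H_c.imag,  H_c.real]])   # (2n) x (2m) real generator
    x = np.concatenate([x_c.real, x_c.imag])
    return H, x

# Example: random Rayleigh channel and one 16-QAM symbol vector.
rng = np.random.default_rng(0)
m = n = 4
H_c = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
pam = np.array([-3.0, -1.0, 1.0, 3.0])      # real/imag coordinates of 16-QAM
s_c = rng.choice(pam, m) + 1j * rng.choice(pam, m)
sigma2 = 0.1
v_c = np.sqrt(sigma2 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
x_c = H_c @ s_c + v_c
H, x = complex_to_real(H_c, x_c)            # real model of dimension M = 2m
```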

The remainder of this paper is organized as follows. Section 2 briefly describes sphere decoding and the SE-SD algorithm. The initial radius and the complexity of the algorithm are discussed through simulation results over a wide range of SNRs and channel sizes in Section 3. Section 4 presents stopping criteria for the algorithm. In Section 5, a new low-complexity algorithm is introduced. Finally, Section 6 draws the conclusion.

2. Sphere Decoding

A finite lattice can be defined as $\Lambda = \{ Hs \mid s \in \mathcal{D}^M \}$, where $H$ is the generator matrix of the lattice and $s$ contains the coordinates of the lattice points. The ML estimate is the point that minimizes the Euclidean distance between $x$ and $Hs$, as follows:

$\hat{s}_{\mathrm{ML}} = \arg\min_{s \in \mathcal{D}^M} \| x - Hs \|^2$,

where $\| \cdot \|$ represents the vector norm and $\mathcal{D}$ represents the set of real coordinates of the points of the $L$-QAM constellation.
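The following brute-force sketch makes the cost of direct ML detection explicit: it visits all $|\mathcal{D}|^M$ candidates, which is exactly the exponential search that SD prunes. It is an illustration, not a practical decoder, and the function name is ours.

```python
import itertools
import numpy as np

def ml_bruteforce(H, x, coords):
    """Exhaustive ML search over all len(coords)**M candidates."""
    M = H.shape[1]
    best, best_d2 = None, np.inf
    for cand in itertools.product(coords, repeat=M):
        s = np.array(cand, dtype=float)
        d2 = float(np.sum((x - H @ s) ** 2))
        if d2 < best_d2:
            best, best_d2 = s, d2
    return best, best_d2

# For the real model above (m = 4, 16-QAM) this already visits
# 4**8 = 65536 candidates; the count grows as L**m.
```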

As mentioned earlier, SD searches the lattice through a given set of points bounded by a sphere with the received point as center and a specific initial radius. Whenever a point is found inside the sphere, the radius is reduced to the distance between that point and the received point. Under the assumption $N \ge M$, the channel matrix can be transformed by the QR decomposition $H = QR$, where $R$ is an upper triangular matrix and $Q$ is an orthogonal matrix. Therefore, the ML problem is simplified to finding the lattice point that satisfies the following condition:

$\| y - Rs \|^2 \le d^2$,  (4)

where $y = Q^H x$ and $(\cdot)^H$ represents Hermitian matrix transposition. The inequality can be rewritten and then expanded as

$d^2 \ge (y_M - r_{M,M} s_M)^2 + (y_{M-1} - r_{M-1,M} s_M - r_{M-1,M-1} s_{M-1})^2 + \cdots$,  (5)

where $r_{i,j}$ denotes the $(i, j)$ entry of $R$. The previous inequality results in $M$ different inequalities. By taking advantage of the upper triangular property of $R$, the first term of the right-hand side of (5) depends only on $s_M$, and thus $s_M$ belongs to the following interval:

$\lceil (y_M - d) / r_{M,M} \rceil \le s_M \le \lfloor (y_M + d) / r_{M,M} \rfloor$,  (6)

where $\lceil \cdot \rceil$ and $\lfloor \cdot \rfloor$ denote rounding to the nearest larger and smaller element in the set of numbers that spans the lattice, respectively. The intervals for $s_{M-1}, \ldots, s_1$ are found in a similar fashion, in which the interval for $s_k$ is a function of $s_{k+1}, \ldots, s_M$ only. Generally speaking, all possible candidates should be searched within a sphere with radius $d$ and dimension $M$.
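A direct transcription of (6) is given below, assuming the QR decomposition is normalized so that $r_{M,M} > 0$ and that the lattice coordinates are integers (for the $\{\pm 1, \pm 3\}$ QAM coordinates one would round to that set instead); the function name is ours.

```python
import math

def top_level_interval(y_M, r_MM, d):
    """Range of s_M allowed by (6); integer-coordinate version."""
    return math.ceil((y_M - d) / r_MM), math.floor((y_M + d) / r_MM)

# e.g. top_level_interval(2.3, 1.1, 2.0) -> (1, 3)
```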

The first candidate for $s_M$, namely $\hat{s}_M$, is the midpoint of this interval, that is, the zero-forcing estimate rounded to the nearest valid coordinate:

$\hat{s}_M = \lfloor y_M / r_{M,M} \rceil$,  (7)

where $\lfloor \cdot \rceil$ denotes rounding to the nearest element of $\mathcal{D}$.

By SE enumeration, the candidates are spanned in a zigzag order, starting from this midpoint. Hence, at each level $k$, the SE enumeration produces the sequence of candidates $\hat{s}_k, \hat{s}_k + \delta, \hat{s}_k - \delta, \hat{s}_k + 2\delta, \ldots$ (with $\delta$ the spacing of $\mathcal{D}$) if $y_k / r_{k,k} \ge \hat{s}_k$, and $\hat{s}_k, \hat{s}_k - \delta, \hat{s}_k + \delta, \hat{s}_k - 2\delta, \ldots$ otherwise.
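A small sketch of this zigzag order over the integers (spacing $\delta = 1$); the names are ours, ties at the midpoint follow Python's rounding, and $r_{k,k} > 0$ is assumed.

```python
def se_candidates(y_k, r_kk, num=6):
    """First `num` Schnorr-Euchner candidates at one level, in zigzag order."""
    mid = round(y_k / r_kk)
    out, s = [], mid
    delta = 1 if (y_k / r_kk - mid) >= 0 else -1
    for _ in range(num):
        out.append(s)
        s = s + delta
        # zigzag: flip direction and grow the step by one each time
        delta = -delta + (1 if delta < 0 else -1)
    return out

# e.g. se_candidates(3.3, 1.0) -> [3, 4, 2, 5, 1, 6]
```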

A full search can be depicted as a search tree, like Figure 1, whose root corresponds to the $M$th entry of the possible symbol and each node at the $k$th level represents one of the possible values for the $k$th entry of the symbol. The search proceeds from the root down to the 1st level (leaf nodes); at the $k$th level, all possible entries are found such that the partial symbol lies in the sphere. When the search reaches a leaf node, all entries of the symbol have been determined. Therefore, each path through the tree corresponds to a possible symbol. As a result, SD can be viewed as a pruning algorithm on this tree, from which a branch can be removed at any level based on violation of the constraint given by (4).

The work of [17] proposed an algorithm with six recursive steps to implement SE-SD. After initialization, which starts the search from the highest level of the tree and sets the initial radius, the algorithm begins with step 2 by offering the first candidate for the root of the tree. The candidate node at each level is offered by steps 2 and 6 through SE enumeration. By using the lattice boundary (the maximum and minimum elements of $\mathcal{D}$) in steps 2 and 6, the algorithm works only within the finite square $L$-QAM constellation. Step 3 examines the constraint given by (5), which leads to two cases. (1) If the candidate is valid and a leaf node is reached, the symbol is recorded as the ML candidate, the radius is updated in step 5, and the search restarts from the top level; if the valid candidate is found at any other level, the search proceeds one level lower. (2) If the candidate is not valid and the algorithm is at the top level, there is no valid symbol left in the sphere and the algorithm terminates; otherwise, the algorithm moves one level up and the next candidate of that level is tested.

This paper uses the SE-SD algorithm introduced in [20], which is a modified version of the algorithm proposed in [17] that, through lattice boundary awareness, never considers points outside the finite lattice. This algorithm is preferred because of its lower complexity without any performance degradation.
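To tie the pieces of this section together, here is a self-contained Python sketch of a boundary-aware, depth-first SE-SD in the spirit of [17, 20]. It is our illustrative re-implementation, not the exact routine of those papers: the zigzag order is realized by sorting the finite coordinate set by distance from the per-level midpoint, which also enforces the lattice boundary, and the break implements the pruning of (4)-(5).

```python
import numpy as np

def se_sphere_decode(H, x, coords, d0=np.inf):
    """Depth-first SE-SD over a finite PAM coordinate set `coords`."""
    M = H.shape[1]
    Q, R = np.linalg.qr(H)                # H = Q R, R upper triangular
    y = Q.T @ x
    coords = np.asarray(coords, dtype=float)
    best, best_d2 = None, d0 ** 2
    s = np.zeros(M)

    def search(k, partial_d2):
        nonlocal best, best_d2
        # midpoint of the admissible interval at level k, cf. (6)-(7)
        y_k = y[k] - R[k, k + 1:] @ s[k + 1:]
        # finite-lattice candidates in SE (zigzag) order
        order = np.argsort(np.abs(coords - y_k / R[k, k]))
        for c in coords[order]:
            d2 = partial_d2 + (y_k - R[k, k] * c) ** 2
            if d2 >= best_d2:
                break                     # SE order: later candidates only worse
            s[k] = c
            if k == 0:
                best, best_d2 = s.copy(), d2   # leaf: record point, shrink sphere
            else:
                search(k - 1, d2)

    search(M - 1, 0.0)
    return best, np.sqrt(best_d2)         # best is None if the sphere was empty
```

With d0 = np.inf, the first leaf reached is the Babai point, matching the observation recalled in Section 3.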

3. Initial Radius of SE-SD Algorithm

Several approaches have been proposed to find an appropriate initial radius. Because of the advantages of the Schnorr-Euchner enumeration, the conventional methods choose positive infinity as the initial radius. Obviously, this approach avoids declaring an empty sphere. It is also clear that the first point found with $d_0 = \infty$ corresponds to the Babai point [16]. Thus, a more suitable choice for $d_0$ is the distance between the Babai point and $x$, the Babai radius $d_B$, since this radius guarantees the existence of at least one lattice point inside the sphere. Generally, it is not clear whether this choice of initial radius leads to too many lattice points lying inside the sphere [18]. In [21], it is shown through some examples that this sphere contains at least one point if the radius is computed exactly. However, in practice this radius cannot be calculated exactly, due to rounding errors introduced by floating-point computation. The work of [21] therefore derives an upper bound on the computational error of $d_B$ and defines the initial radius as $d_B$ inflated by this bound, expressed in terms of the unit of round-off. The work of [22] mentions that this method is useful when the noise variance is relatively small. The work of [23] proposes a method that utilizes the result of QR decomposition and reordering of $H$ to obtain the Babai point, and defines the initial radius as the distance between the received signal and the lattice point mapped by this suboptimal solution.
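A hedged sketch of computing the Babai (nearest-plane / SIC) point and the resulting Babai radius $d_B$ discussed above; the quantization to the finite coordinate set is our addition for a finite lattice, and the names are ours.

```python
import numpy as np

def babai_radius(H, x, coords):
    """Return the Babai point and the Babai radius d_B = ||x - H s_Babai||."""
    coords = np.asarray(coords, dtype=float)
    Q, R = np.linalg.qr(H)
    y = Q.T @ x
    M = H.shape[1]
    s = np.zeros(M)
    for k in range(M - 1, -1, -1):        # successive interference cancellation
        c = (y[k] - R[k, k + 1:] @ s[k + 1:]) / R[k, k]
        s[k] = coords[np.argmin(np.abs(coords - c))]   # slice to nearest symbol
    return s, float(np.linalg.norm(x - H @ s))
```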

Another candidate for $d_0$ is the covering radius of the lattice, defined as the smallest radius of spheres centered at the lattice points that cover the entire space [24]. This is clearly the smallest radius that guarantees the existence of a point inside the sphere for any received point. The problem with this choice of $d_0$ is that determining the covering radius of a given lattice is itself NP-hard [25].

Some works consider a small fixed number as $d_0$ for all cases, which is increased if no lattice point is found in the sphere. The work of [26] sets the initial radius to the distance between the received signal and the lattice point mapped by the Minimum Mean Square Error (MMSE) solution. In general, the approaches that use suboptimal solutions to find $d_0$ contribute extra complexity.

A useful approach is to choose $d_0$ according to the noise distribution, so (3) can be helpful in determining the desired $d_0$: at the transmitted point, $\| x - Hs \|^2 = \| v \|^2$ is, up to scaling, a chi-square random variable with $N$ degrees of freedom. Therefore, the radius may be chosen as a scaled variance of the noise [18]:

$d_0^2 = \alpha N \sigma^2$.  (8)

In such a way, a lattice point can be found inside the sphere with high probability:

$\int_0^{\alpha N / 2} \frac{\lambda^{N/2 - 1}}{\Gamma(N/2)}\, e^{-\lambda}\, d\lambda = 1 - \epsilon$,  (9)

where the integrand is the probability density function of the chi-square random variable with $N$ degrees of freedom, and $1 - \epsilon$ is set to a value close to 1. If no point is found, the probability is increased and a new $d_0$ is calculated; consequently, the search is restarted with the new radius.
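A sketch of this rule using SciPy's chi-square quantile function. The normalization assumes each real-model noise component has variance $\sigma^2/2$, which is our reading of the model, so the constant may differ from the paper's exact (8).

```python
import numpy as np
from scipy.stats import chi2

def noise_based_radius(sigma2, N, eps=0.01):
    """Smallest d0 with P(||v||^2 <= d0^2) = 1 - eps, cf. (8)-(9).

    Assumes ||v||^2 ~ (sigma2 / 2) * chi-square with N degrees of freedom.
    """
    return np.sqrt(0.5 * sigma2 * chi2.ppf(1.0 - eps, df=N))

# If the sphere turns out to be empty, decrease eps (raise the target
# probability) and restart the search with the enlarged radius.
```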

It is important to note that this radius is chosen based on the statistics of the noise and not on the lattice. Making the choice based on the lattice quickly leads to NP-hard problems (such as determining the covering radius). Moreover, as noted in [1], selecting the radius based on the noise has a beneficial effect on the computational complexity. The work of [27] proposes an empirical definition of $d_0$ in special situations (64- and 16-QAM) over a small SNR range.

To investigate the behavior of the algorithm, we find the average number of flops, as a measure of complexity, and the BER of the SE-SD algorithm for various $d_0$, $\rho$, and $m$ through computer simulations. Our experimental setup corresponds to the transmission of $L$-QAM constellations over a multiple-antenna flat Rayleigh fading channel, which is reasonable for many communication problems. The channel matrix changes randomly after every 100 transmitted symbols. In order to plot BER or complexity versus initial radius for a certain configuration, a set of 100 initial radii is examined, and $10^8$ random symbols are tested for each particular $(d_0, \rho)$ pair. Note that only the flops of the search process are counted, without considering the cost of the QR decomposition. In practice, at least one lattice point must be found by the algorithm, and if the sphere does not contain any point, the initial radius is multiplied by a fixed factor greater than one and the algorithm is restarted.

The work of [17] investigates the effect of $d_0$ on the average complexity of the SE-SD algorithm for a 16-QAM constellation over an SNR range up to 25 dB and shows that, over this range, the complexity is less sensitive to the initial radius than that of the original SD. However, the relationship between the computational complexity and the initial radius is not discussed in [17].

Although (9) implies that the probability of finding a lattice point inside the sphere changes as the initial radius changes, we are not aware of any published work on the effect of $d_0$ on the performance of SE-SD for finite lattices. Figure 2 illustrates the significant effect of the initial radius on the performance of the algorithm. It shows BER as a function of the initial radius at three SNR values (one of them $\rho = 21.46$ dB), corresponding to $\sigma^2$ = 0.203, 0.1, and 0.056. A particular $d_0$ that leads to the lowest BER, called the "best-performance initial radius" $d_{BP}$, can be seen in each subfigure of Figure 2. For instance, at one of these SNRs, the BER at $d_{BP}$ is 21% less than that obtained at the average Babai radius. The average Babai radius is depicted by a line in Figure 2, and the figure shows that $d_{BP}$ is close to, but not exactly equal to, it.

A large number of curves of complexity versus $d_0$ have been obtained from simulations for various antenna configurations, with $\rho$ swept in uniform steps over the typical SNR range of wireless communication applications. Due to limited space, only six examples of these simulation results are depicted in Figures 3 and 4. These figures show the complexity of the SE-SD algorithm as a function of the initial radius for several $\sigma^2$ and $\rho$ values. All of the simulated curves follow a similar pattern and can be fitted by a ratio of two fourth-order polynomials [28].

Below a certain radius, the complexity soars as the initial radius decreases, because the smaller spheres force the algorithm to restart several times, which contributes huge computational complexity. The curves show that the complexity attains its lowest value when $d_0$ is near the average Babai radius; this value can be defined as the "lowest-complexity initial radius", $d_{LC}$. Three cases arise, depending on $\sigma^2$ and $d_0$.

(1) The curves for low $\sigma^2$ (Figures 3(a) and 3(b)) are smoother, and for any $d_0$ bigger than $d_{LC}$ the complexity is almost constant. In this case, a big initial radius should be chosen to obtain low complexity; however, it should be small enough not to degrade the performance noticeably (recalling Figure 2).

(2) Moderate-$\sigma^2$ results can be seen in Figures 4(a) and 3(c). Here the lowest complexity is obtained when $d_0$ is around $d_{LC}$, and the complexity grows considerably as $d_0$ increases; for instance, the complexity grows by about 40% when a noticeably larger radius is chosen. Therefore, in this case the most suitable $d_0$ should be chosen slightly less than the average Babai radius.

(3) High $\sigma^2$ makes the complexity of the algorithm less sensitive to any $d_0$ greater than $d_{LC}$, as Figures 4(b) and 4(c) illustrate. Although the rise in complexity can be neglected for high $d_0$, the performance degradation is significant. We conclude that an initial radius greater than $d_{LC}$ and less than 1 is acceptable in this situation as a compromise between performance and complexity.

From Figures 3 and 4 it is evident that decreasing $\sigma^2$ lowers $d_{LC}$ and also smooths the curves, which supports the recommendation in [18] to choose $d_0$ according to the noise variance.

Simulation results depicting the average complexity and BER of the SE-SD algorithm as a function of $d_0$ for a fixed antenna configuration and $L$ = 4, 16, 64, and 128 show that neither $d_{BP}$ nor $d_{LC}$ changes as the constellation size grows. This leads to the conclusion that the size of the constellation does not affect the performance and the complexity behavior of SE-SD with respect to the initial radius.

While one approach is to choose the Babai radius $d_B$ as the initial radius [16, 17], Figures 2, 3, and 4 show that neither $d_{BP}$ nor $d_{LC}$ is exactly equal to $d_B$. Using $d_B$ as the initial radius in some practical scenarios causes a noticeable increase in BER and complexity due to the rounding problem. In addition, calculating $d_B$ costs extra complexity, which is not negligible in some scenarios.

Figure 5 shows $d_B$ and the proper $d_0$ (obtained as a compromise between performance and complexity) of the SE-SD algorithm as a function of the number of transmit antennas for $\sigma^2$ = 0.65, 0.203, and 0.05. As can be seen from Figures 5(b) and 5(c), in a low-$\sigma^2$ scenario, unlike $d_B$, the suitable $d_0$ is not affected by the number of transmit antennas and remains almost constant for any number of antennas. Thus, in this case, finding a suitable $d_0$ for a small problem size (low $m$) also solves the initial-radius problem for a big one (high $m$). Since the complexity of finding $d_0$ for scenarios with a small number of antennas is considerably less than for large ones, the cost of finding a suitable initial radius is reduced significantly.

However, as can be seen from Figure 5(a), a high-$\sigma^2$ scenario requires calculating the Babai radius for use as the initial radius. Because of the rounding problem, in such scenarios the suitable $d_0$ is approximately 8% smaller than $d_B$.

4. Stopping Criteria for SE-SD Algorithm

A stopping criterion is a potential means of saving computational complexity in iterative algorithms like SE-SD. The work of [29] suggests that if an enumerated lattice point is found at a distance of less than half the length of the shortest lattice vector (the packing radius) from $x$, it is clearly a nearest lattice point, and thus the enumeration process can be terminated right away; [29] utilizes a lower bound on the packing radius as the stopping test. The work of [30] introduces a parameter based on the target symbol error rate and the noise statistics, and the proposed probabilistic search stops when the current distance falls within it.

The idea of [31] is to first run a lattice reduction-aided SIC detector. If this results in a valid vector, with all elements within the symbol alphabet, the algorithm stops. Otherwise, it proceeds by running the sphere decoder.

As mentioned, when the algorithm finds a symbol in the sphere, it calculates a new radius for the sphere. When the algorithm reaches its convergence radius, it still tries to find a symbol inside the new sphere, but it cannot succeed, because there is none. These attempts to find a new symbol after reaching the convergence radius cause significant extra calculations. Thus, if the SE-SD algorithm terminates as soon as it reaches the convergence radius, this large amount of unnecessary computation can be prevented and the complexity of the algorithm reduced considerably.

Finding the convergence radius is itself quite complicated. Through computer simulations of SE-SD, for each scenario we recorded the average final radius, $d_F$, the radius of the last sphere in which the algorithm cannot find any lattice point despite a large amount of calculation. Figure 5 shows $d_F$, $d_B$, and the proper $d_0$ of the SE-SD algorithm as a function of the number of transmit antennas for $\sigma^2$ = 0.65, 0.203, and 0.05. The final radius, like the Babai radius, increases as the number of antennas grows. This growth is negligible in low-$\sigma^2$ scenarios (Figures 5(b) and 5(c)), but it cannot be ignored for high $\sigma^2$ (Figure 5(a)). The gap between the final radius and the Babai radius at high $\sigma^2$ appears to be a function of the number of antennas and the noise variance. When the suitable $d_0$ is chosen 8% less than $d_B$ for high $\sigma^2$, the final radius is empirically found to be related to $d_B$ by a simple scaling, which we refer to as (10). So, in this case, calculating $d_B$ yields not only a proper initial radius but also the final one.

When a new sphere radius is calculated in step 5 of the algorithm, it should be compared with the estimated $d_F$; when it is found to be less than $d_F$, the SE-SD algorithm terminates, avoiding useless extra calculations after reaching the convergence radius. This technique can be called early-terminated SE-SD.
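A sketch of early-terminated SE-SD obtained by grafting this test onto the Section 2 sketch: the only change is the check against the estimated $d_F$ inside the radius update at a leaf. The exception class and names are ours.

```python
import numpy as np

class _Converged(Exception):
    """Raised when the shrunken sphere radius reaches the estimated d_F."""

def epse_sd(H, x, coords, d0, dF):
    """SE-SD with the early-termination test of Section 4."""
    M = H.shape[1]
    Q, R = np.linalg.qr(H)
    y = Q.T @ x
    coords = np.asarray(coords, dtype=float)
    best, best_d2 = None, d0 ** 2
    s = np.zeros(M)

    def search(k, partial_d2):
        nonlocal best, best_d2
        y_k = y[k] - R[k, k + 1:] @ s[k + 1:]
        for c in coords[np.argsort(np.abs(coords - y_k / R[k, k]))]:
            d2 = partial_d2 + (y_k - R[k, k] * c) ** 2
            if d2 >= best_d2:
                break
            s[k] = c
            if k == 0:
                best, best_d2 = s.copy(), d2
                if best_d2 <= dF ** 2:   # reached the estimated convergence radius
                    raise _Converged
            else:
                search(k - 1, d2)

    try:
        search(M - 1, 0.0)
    except _Converged:
        pass                             # best already holds the decoded symbol
    return best
```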

Although the problem of finding a suitable $d_F$ can be solved easily using $d_B$ and (10), there remains the question of the cost of calculating $d_B$. To quantify the complexity growth, we take into account the complexity of three variants of SE-SD:

(i) SE-SD with $d_0 = d_B$,

(ii) SE-SD with $d_0$ chosen according to the previous section's recommendation, namely, proposed-initiation SE-SD (PSE-SD),

(iii) early-terminated PSE-SD (EPSE-SD).

Table 1 indicates the percentage reduction in the complexity of these three variants of SE-SD, in comparison with the case of $d_0$ equal to a certain rough value like 20, as a function of the number of transmit antennas for a fixed constellation and noise variance. A negative number in the table means an increase in complexity.

According to Table 1, when $m$ is greater than 10, the complexity declines by at least 9% with $d_0 = d_B$ compared to the case $d_0 = 20$. The reduction rises to at least 16% using PSE-SD. In addition, utilization of EPSE-SD results in a substantial further decrease in complexity. Therefore, the slight additional complexity entailed by calculating $d_B$ leads to a significant reduction in the total complexity of SE-SD.

However, for a small number of antennas, the number of flops spent calculating $d_B$ is comparable to the overall SE-SD complexity, and the complexity reduction of EPSE-SD becomes negligible. For example, Figure 4(a) indicates that, in one small-size case, the average number of flops for decoding a burst of 100 symbols is of the same order as the roughly 14600 flops needed to calculate $d_B$.

Table 2 demonstrates a comparison between the three techniques, SE-SD with $d_0 = d_B$, PSE-SD, and EPSE-SD, for several noise variances (including $\sigma^2$ = 0.056). In this table, the percentage reduction in the complexity of the three variants reflects the discrepancy between small and large problem sizes: on one hand, EPSE-SD causes a significant fall in complexity for large problem sizes; on the other hand, calculating $d_B$ increases the number of flops of PSE-SD dramatically for small ones.

In fact, the noticeable benefit of using $d_F$ as a criterion for early termination of SE-SD is evident in cases of high noise variance or big problem size. For instance, the complexity declines by 52% in one such simulated case.

5. Proposed TF-Based Algorithm

As a result of the presented discussion, EPSE-SD is not the most efficient decoding algorithm in every case, so a criterion is needed for choosing among the initiation and termination techniques of SE-SD. We propose the TF-based algorithm, which selects among decoding techniques via a Threshold Factor (TF) defined as a function of the number of transmit antennas, $m$, and the noise variance, $\sigma^2$.

There are four major cases according to the value of TF. First, when TF is greater than 300, the TF-based algorithm performs only SIC decoding. In this regime (for instance, when $\sigma^2$ is greater than 2.74 for the simulated antenna configuration), SE-SD entails a huge amount of computational complexity while its performance is more or less identical to SIC decoding.

In the second case, when TF is between 3.2 and 300, the algorithm initially performs SIC decoding to find the Babai point, then calculates a suitable $d_0$ and $d_F$ using $d_B$, and finally performs SE-SD. According to Tables 1 and 2, in this regime EPSE-SD achieves acceptable performance with reasonable complexity.

In the third case, when TF is between 0.4 and 3.2, the algorithm performs SE-SD with a fixed initial radius. Because the complexity of finding $d_B$ is here comparable to that of SE-SD itself, it is not logical to use either PSE-SD or EPSE-SD. Based on Figures 2 and 4, in this regime a fixed $d_0$ equal to 1 yields sensible complexity without any performance degradation.

Finally, the fourth case is when TF is less than 0.4, and the algorithm performs SE-SD with a large initial radius. In this case, SE-SD finds the closest lattice point quickly, at the cost of quite a small number of flops. The proposed algorithm is summarized as Algorithm 1.

(1) Calculate TF
(2) If TF > 300 Then Do SIC decoding
(3) If 3.2 < TF ≤ 300 Then Do SIC decoding,
 Calculate $d_0$, $d_F$. Do EPSE-SD
(4) If 0.4 < TF ≤ 3.2 Then let $d_0 = 1$. Do SE-SD
(5) If TF ≤ 0.4 Then let $d_0$ be a large fixed value. Do SE-SD
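The dispatcher below sketches Algorithm 1 in Python, reusing the babai_radius, se_sphere_decode, and epse_sd sketches from earlier sections (they must be in scope). The TF expression itself and the empirical relation (10) are not reproduced in this excerpt, so tf is taken as an input and the $d_F$ estimate is an explicitly labeled placeholder; the 8% back-off on $d_B$ follows Section 3.

```python
import numpy as np

def tf_based_decode(tf, H, x, coords):
    """Dispatch among the decoding modes of Algorithm 1 by the value of TF."""
    if tf > 300:                                  # case 1: SIC decoding only
        return babai_radius(H, x, coords)[0]
    if tf > 3.2:                                  # case 2: EPSE-SD
        s_sic, dB = babai_radius(H, x, coords)    # SIC also yields d_B
        d0 = 0.92 * dB                            # ~8% below d_B (Section 3)
        dF = 0.5 * dB                             # PLACEHOLDER for relation (10)
        return epse_sd(H, x, coords, d0, dF)
    if tf > 0.4:                                  # case 3: fixed d0 = 1
        return se_sphere_decode(H, x, coords, d0=1.0)[0]
    return se_sphere_decode(H, x, coords, d0=np.inf)[0]   # case 4: large d0
```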

6. Conclusion

In order to make SE-SD feasible for real applications, techniques are needed to decrease the complexity of this algorithm. We presented new methods for initiating and terminating the SE-SD algorithm that contribute to the goal of achieving reasonable complexity.

We showed that for a high number of transmit antennas, using the Babai distance as the initial radius leads to considerable performance degradation due to the big problem size. A suitable initial sphere radius, which results in low complexity and desirable performance over the SNR range of wireless communication applications, can be found by our proposed method. Moreover, this paper offers a technique to estimate the final radius to which SE-SD converges. Utilizing the estimated final radius as a criterion for early termination of SE-SD is a way of controlling the complexity of the algorithm, because it avoids a considerable amount of unnecessary calculation. Simulation results show that employing the proposed initiation and early termination of SE-SD causes a significant reduction in complexity; for example, in one simulated large-antenna, high-noise case, the technique reduces the complexity by 52%.

To estimate the initial and final radii, the Babai distance must be found through SIC decoding. Therefore, the presented technique is not useful in cases where the extra complexity of SIC decoding is comparable to that of the SE-SD process itself. This investigation therefore proposed an algorithm that selects among different techniques according to a threshold factor defined in terms of the number of transmit antennas and the noise variance. Using the threshold factor, the novel algorithm offers reasonable complexity without any performance degradation.

Acknowledgments

This work was supported by the Developing Research and Strategic Planning Department of the Mobile Communication Company of Iran (MCI) as part of the advanced communication systems project. The authors would like to thank Mr. Vahid Sadoughi, the CEO of MCI, for his financial support. They would also like to thank Mrs. Samaneh Movassaghi (University of Technology, Sydney) and Mr. Mahyar Shirvani Moghaddam (University of Sydney) for the great help they provided. Finally, the authors thank the editor and anonymous reviewers of the International Journal of Antennas and Propagation (IJAP) for their very helpful comments and suggestions, which have improved the presentation of the paper.