Research Article  Open Access
Diego Javier Reinoso Chisaguano, Minoru Okada, "Low Complexity Submatrix Divided MMSE Sparse-SQRD Detection for MIMO-OFDM with ESPAR Antenna Receiver", VLSI Design, vol. 2013, Article ID 206909, 11 pages, 2013. https://doi.org/10.1155/2013/206909
Low Complexity Submatrix Divided MMSE Sparse-SQRD Detection for MIMO-OFDM with ESPAR Antenna Receiver
Abstract
Multiple input multiple output-orthogonal frequency division multiplexing (MIMO-OFDM) with an electronically steerable passive array radiator (ESPAR) antenna receiver can improve the bit error rate performance and obtain additional diversity gain without increasing the number of radio frequency (RF) front-end circuits. However, due to the large size of the channel matrix, the computational cost of the detection process using Vertical-Bell Laboratories Layered Space-Time (V-BLAST) detection is too high for practical implementation. Using the minimum mean square error sparse-sorted QR decomposition (MMSE sparse-SQRD) algorithm for the detection process, the average computational cost can be considerably reduced, but it is still higher than that of a conventional MIMO-OFDM system without ESPAR antenna receiver. In this paper, we propose a low complexity submatrix divided MMSE sparse-SQRD algorithm for the detection process of MIMO-OFDM with ESPAR antenna receiver. The computational cost analysis and simulation results show that, on average, the proposed scheme can further reduce the computational cost and achieve a complexity comparable to that of conventional MIMO-OFDM detection schemes.
1. Introduction
In multipath fading channels, multiple input multiple output (MIMO) antenna systems can achieve a large increase in channel capacity [1]. MIMO-OFDM combines the advantages of MIMO systems with orthogonal frequency division multiplexing (OFDM) modulation, achieving good performance in frequency selective fading channels. Due to these advantages, MIMO-OFDM allows high data rates in wireless communication systems. It is used in the wireless local area network (WLAN) standard IEEE 802.11n [2] and is also considered for next-generation systems.
One of the limitations of MIMO-OFDM is that it requires one radio frequency (RF) front-end circuit for every receiver and transmitter antenna. Compared with MIMO-OFDM 2Tx-2Rx (2×2), MIMO-OFDM 2Tx-4Rx (2×4) can achieve better diversity gain and bit error rate performance, but it requires an additional RF front-end circuit, A/D converter, and FFT block for every additional branch.
In [3, 4] a MIMO-OFDM 2×2 scheme with electronically steerable passive array radiator (ESPAR) antenna receiver diversity has been proposed. It utilizes, for every receiver, a 2-element ESPAR antenna whose directivity is changed at the frequency of the OFDM symbol rate. Compared to conventional MIMO-OFDM 2×2 systems, this scheme gives additional diversity gain and improves the bit error rate performance without increasing the number of RF front-end circuits. For the detection, the zero forcing (ZF) Vertical-Bell Laboratories Layered Space-Time (V-BLAST) algorithm [5, 6] is used but, due to the large size of the channel matrix, the required computational effort is very high.
In order to reduce the computational cost of the detection process of the scheme proposed in [3, 4], the authors proposed in [9] the use of a minimum mean square error sparse-sorted QR decomposition (MMSE sparse-SQRD) algorithm based on the SQRD algorithm introduced in [7, 8]. The computational cost reduction is achieved by exploiting the sparse structure of the channel matrix. This detection algorithm considerably reduces the average computational cost and also improves the bit error rate performance compared to the original scheme [3, 4]. For further reduction in the computational cost, the authors proposed in [10] a submatrix divided MMSE sparse-SQRD algorithm for the detection process of MIMO-OFDM with ESPAR antenna receiver. This algorithm divides the channel matrix into smaller submatrices, which reduces the computational cost at the price of a small degradation in the bit error rate performance.
This paper is an extension of [10] that includes bit error rate and computational cost results for higher order submatrix division schemes. It also introduces another approach to further reduce the bit error rate degradation caused by the submatrix division algorithm.
The rest of this paper is organized as follows. Sections 2 and 3 give a brief background description of OFDM and MIMO-OFDM with ESPAR antenna receiver. Section 4 presents detection algorithms based on QR decomposition. Section 5 gives a detailed explanation of the MMSE sparse-SQRD algorithm. Section 6 describes the proposed submatrix divided scheme. The computational cost analysis and simulation results are presented in Sections 7 and 8, respectively. Finally, Section 9 presents the conclusions.
2. OFDM with ESPAR Antenna
ESPAR is a small size and low power consumption antenna [11, 12]. It is composed of a radiator element connected to the RF front-end and one or more parasitic (passive) elements terminated by variable capacitances. The beam directivity can be controlled by modifying the variable capacitances. This antenna requires only one RF front-end and is therefore also known as a single RF port antenna array.
In [13] an OFDM receiver using an ESPAR antenna is proposed. In this scheme the directivity of the ESPAR antenna is changed by a periodic wave whose frequency is the OFDM symbol rate. The block diagram of this scheme is shown in Figure 1.
A two-element ESPAR antenna is utilized. The periodic variation of the directivity causes inter-carrier interference (ICI) in the received signal. The ICI is caused by the addition of phase-shifted components to the received signal. The frequency domain equalizer in Figure 1 uses both the shifted and non-shifted components in the detection. Due to this effect the scheme obtains diversity gain and therefore improves the bit error rate performance.
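The ICI mechanism described above can be illustrated with a small numerical sketch. It is not the receiver of [13]: it assumes, purely for illustration, that both paths are flat (one complex gain each, the hypothetical `h_direct` and `h_shifted`) and that the parasitic-element contribution is weighted by a complex exponential completing one period per OFDM symbol. Under these assumptions every subcarrier leaks a copy of itself into the next subcarrier, which is the shifted component that the equalizer can exploit.

```python
import cmath

def dft(x):
    """Plain O(N^2) discrete Fourier transform (kept dependency-free)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def received_spectrum(symbols, h_direct, h_shifted):
    """Spectrum seen after the FFT when the directivity varies once per symbol.

    Assumption (illustrative only): the parasitic path is weighted by
    exp(j*2*pi*t/N), i.e. one period per OFDM symbol of length N.
    """
    n = len(symbols)
    # Time-domain OFDM symbol: inverse DFT of the data on the subcarriers.
    time = [sum(symbols[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]
    # Non-shifted path plus periodically weighted (shifted) path.
    rx = [(h_direct + h_shifted * cmath.exp(2j * cmath.pi * t / n)) * time[t]
          for t in range(n)]
    return dft(rx)

# One active subcarrier (index 3): its energy also appears on subcarrier 4.
spectrum = received_spectrum([0, 0, 0, 1, 0, 0, 0, 0], 1.0, 0.5)
```

In this toy model `spectrum[k]` equals `h_direct * symbols[k] + h_shifted * symbols[k - 1]`, so each received subcarrier carries two differently faded copies of the transmitted data.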
3. MIMO-OFDM with ESPAR Antenna Receiver
Based on [13], a MIMO-OFDM receiver with ESPAR antenna was proposed in [3, 4] and is described in this section. The block diagrams of the receiver and transmitter are shown in Figures 2 and 3, respectively.
The transmitter is based on the WLAN standard IEEE 802.11n [2]. For simplicity, the forward error correction (FEC) and interleaver blocks are not considered in the system. The receiver uses a 2-element ESPAR antenna whose directivity is also periodically changed according to the OFDM symbol rate. An MMSE channel estimator derived in [3, 4] is used and the detection process is carried out by the ZF V-BLAST detector.
3.1. Channel Estimation
For the channel estimation [13], a pilot symbol and its cyclically shifted version are used. Equation (1) expresses the received signal after the FFT processor at each receive antenna in terms of the channel responses between that receive antenna and each transmit antenna, for both the phase non-shifting and the phase-shifting elements. It also contains a matrix that represents the frequency shift due to the directivity variation of the ESPAR antenna, and the additive white Gaussian noise (AWGN) vector.
From (1), the autocorrelation matrix is expressed in terms of the noise variance and the covariance matrix that represents the delay profile of the channel. Assuming that the phase non-shifting and phase-shifting elements are spatially separated enough to be uncorrelated, the cross-correlation matrices follow, and applying the MMSE criterion yields the channel response. The size of the resulting channel matrix is determined by the number of data subcarriers.
3.2. Detection
For the detection process the ZF V-BLAST [6] algorithm is used. In this algorithm the received signal vector is multiplied by a filter matrix obtained from the Moore-Penrose pseudoinverse of the channel matrix. The filter matrix is calculated recursively, each time after zeroing one column of the channel matrix, so the pseudoinverse has to be calculated many times in this scheme. Due to the large size of the channel matrix, calculating the pseudoinverse demands a very high computational effort, and for this reason the detection process is the main limitation of this scheme.
4. QR Decomposition-Based Detection
The detection algorithms in this section operate on three quantities: the vector of transmitted symbols, the vector of noise components, and the vector of received symbols.
4.1. MMSE-QRD
As in [8], the MMSE detection criterion is applied by defining an extended channel matrix and an extended vector of received symbols. The extended channel matrix is built from the channel matrix, the noise standard deviation, and an identity matrix; the extended vector of received symbols is built from the received vector and a column vector of zeros.
The QR decomposition of the extended channel matrix is the product of a unitary matrix and an upper triangular matrix. The extended vector of received symbols is written in terms of this decomposition in (8). Multiplying (8) by the Hermitian transpose of the unitary matrix gives (9). The statistical properties of the noise term remain unchanged because the matrix is unitary.
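For orientation, the MMSE-QRD relations described above can be written out as follows. The notation is an assumption: it follows the usual presentation of [8] rather than the symbols of this article, with $\mathbf{s}$, $\mathbf{n}$, and $\mathbf{x}$ the transmitted, noise, and received vectors, $\mathbf{H}$ the channel matrix, $\sigma_n$ the noise standard deviation, and underlined quantities denoting the extended versions.

```latex
\begin{align*}
\mathbf{x} &= \mathbf{H}\mathbf{s} + \mathbf{n}, \\
\underline{\mathbf{H}} &=
  \begin{bmatrix} \mathbf{H} \\ \sigma_n \mathbf{I} \end{bmatrix}
  = \underline{\mathbf{Q}}\,\underline{\mathbf{R}},
  \qquad
  \underline{\mathbf{x}} =
  \begin{bmatrix} \mathbf{x} \\ \mathbf{0} \end{bmatrix}, \\
\underline{\mathbf{x}} &= \underline{\mathbf{H}}\mathbf{s} + \underline{\mathbf{n}}
  = \underline{\mathbf{Q}}\,\underline{\mathbf{R}}\,\mathbf{s} + \underline{\mathbf{n}},
  \qquad
  \underline{\mathbf{n}} =
  \begin{bmatrix} \mathbf{n} \\ -\sigma_n \mathbf{s} \end{bmatrix}, \\
\mathbf{y} &= \underline{\mathbf{Q}}^{H}\underline{\mathbf{x}}
  = \underline{\mathbf{R}}\,\mathbf{s} + \underline{\mathbf{Q}}^{H}\underline{\mathbf{n}}.
\end{align*}
```

Because $\underline{\mathbf{R}}$ is upper triangular, the last element of $\mathbf{y}$ depends on a single transmitted symbol, so the symbols can be detected one at a time from the bottom row upwards, subtracting the contribution of the symbols already decided.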
4.2. MMSE-SQRD
In [8] an MMSE sorted QRD detection algorithm based on the modified Gram-Schmidt algorithm is introduced. The algorithm starts from the extended channel matrix and first calculates the norms of its column vectors. At every step, the remaining column with the minimum norm is found and the columns are exchanged before the orthogonalization process. This algorithm calculates an improved upper triangular matrix that reduces the error propagation through the detection layers. During the calculation, a permutation vector records the column exchange operations so that the detected symbols can be reordered at the end of the algorithm.
After the unitary and upper triangular matrices are calculated, the filtered received vector is obtained according to (9) and the symbols are detected iteratively. After the symbols are detected, they are reordered using the permutation vector to recover the original sequence of the detected symbols.
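The sorted decomposition and the layer-by-layer detection described above can be sketched in a few lines. This is a dense, dependency-free illustration in the spirit of [8], not the authors' implementation: the function names `mmse_sqrd` and `detect`, the list-of-columns storage, and the nearest-point decision over a caller-supplied `constellation` are all assumptions made for the example.

```python
def mmse_sqrd(H, sigma):
    """Sorted QR decomposition of the extended matrix [H; sigma*I].

    Uses modified Gram-Schmidt. Q is returned as a list of columns,
    R as a list of rows, p as the column permutation.
    """
    m, n = len(H), len(H[0])
    # Column c of the extended matrix: column c of H followed by sigma * e_c.
    Q = [[H[r][c] for r in range(m)] + [sigma if r == c else 0.0 for r in range(n)]
         for c in range(n)]
    R = [[0j] * n for _ in range(n)]
    p = list(range(n))
    norms = [sum(abs(v) ** 2 for v in col) for col in Q]
    for i in range(n):
        # Pick the remaining column with the minimum norm and exchange it in.
        k = min(range(i, n), key=lambda j: norms[j])
        Q[i], Q[k] = Q[k], Q[i]
        norms[i], norms[k] = norms[k], norms[i]
        p[i], p[k] = p[k], p[i]
        for row in R:
            row[i], row[k] = row[k], row[i]
        # Normalize column i, then remove its component from later columns.
        R[i][i] = norms[i] ** 0.5
        Q[i] = [v / R[i][i] for v in Q[i]]
        for j in range(i + 1, n):
            R[i][j] = sum(a.conjugate() * b for a, b in zip(Q[i], Q[j]))
            Q[j] = [b - R[i][j] * a for a, b in zip(Q[i], Q[j])]
            norms[j] -= abs(R[i][j]) ** 2
    return Q, R, p

def detect(H, x, sigma, constellation):
    """Detect the transmitted symbols layer by layer and undo the sorting."""
    m, n = len(H), len(H[0])
    Q, R, p = mmse_sqrd(H, sigma)
    # The extended received vector is [x; 0], so only the first m entries
    # of each column of Q contribute to the filtered vector.
    y = [sum(Q[c][r].conjugate() * x[r] for r in range(m)) for c in range(n)]
    s_sorted = [0j] * n
    for i in reversed(range(n)):
        interference = sum(R[i][j] * s_sorted[j] for j in range(i + 1, n))
        z = (y[i] - interference) / R[i][i]
        s_sorted[i] = min(constellation, key=lambda c: abs(z - c))
    # p[i] is the original position of the symbol detected in layer i.
    s = [0j] * n
    for i, original in enumerate(p):
        s[original] = s_sorted[i]
    return s
```

A quick usage check with a small hypothetical 3×2 channel and unnormalized QPSK points: build `x` as `H` times a known symbol vector, call `detect(H, x, 0.01, [1+1j, 1-1j, -1+1j, -1-1j])`, and compare the result with the transmitted vector.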
5. MMSE Sparse-SQRD Algorithm
The extended channel matrix is shown in (10); as can be seen, it is a sparse matrix. The MMSE sparse-SQRD algorithm is based on the MMSE-SQRD algorithm [8] and exploits the sparse structure of the extended channel matrix to reduce the computational cost of the detection process.
Analysing the matrix given in (10), we can see that every column has only five nonzero elements, so only these elements need to be used to calculate the norms of its column vectors. The positions of the nonzero elements are also fixed, so this information is provided to the algorithm as an input matrix. Using this information, the norm calculation is performed in lines 5–9 of Algorithm 1.

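In the orthogonalization process of the algorithm, the following two calculations are performed iteratively: $r_{i,k} = \mathbf{q}_i^{H}\mathbf{q}_k$ (11) and $\mathbf{q}_k = \mathbf{q}_k - r_{i,k}\,\mathbf{q}_i$ (12), where $r_{i,k}$ denote the elements of the upper triangular matrix and $\mathbf{q}_i$, $\mathbf{q}_k$ are column vectors of the matrix being orthogonalized.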
In (11), the multiplications involving zero elements of the column vectors do not influence the final result, so they can be avoided. A vector containing only the indices of the nonzero elements of the column vectors is obtained in line 14, so the number of operations required to calculate (11) is reduced without influencing the final result. This is shown in lines 19–21 of the algorithm. The same strategy is also used in lines 15–17 and 23–25.
Also, due to the sparse structure of (10), the result of (11) can be zero. In this case calculating (12) is unnecessary because it does not change the value of the column vector, so it can be skipped using the condition in line 22.
In addition, the calculation of (9) can be simplified by using only a submatrix of the unitary matrix that has the same size as the channel matrix. This is shown in lines 30-31 of the algorithm.
Using these criteria, the MMSE sparse-SQRD algorithm achieves the same bit error rate performance as the MMSE-SQRD algorithm but with a considerable reduction in computational cost.
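The two savings discussed in this section (touching only nonzero entries, and skipping the update when the projection is zero) can be sketched as below. This is not Algorithm 1 itself: storing each column as a dictionary that maps a row index to its nonzero value is an assumption chosen to keep the example short, and the function names are hypothetical.

```python
def sparse_norm_sq(col):
    """Squared norm of a sparse column, using only its stored nonzero entries."""
    return sum(abs(v) ** 2 for v in col.values())

def sparse_orthogonalize(q_i, q_k):
    """One modified Gram-Schmidt step on sparse columns.

    q_i is an already normalized column, q_k a later column. Returns the
    projection coefficient and the updated q_k.
    """
    # Inner product: only rows that are nonzero in both columns contribute.
    shared_rows = q_i.keys() & q_k.keys()
    r = sum(q_i[row].conjugate() * q_k[row] for row in shared_rows)
    if r == 0:
        # The update would subtract zero, so it is skipped entirely.
        return r, q_k
    updated = dict(q_k)
    for row, value in q_i.items():
        updated[row] = updated.get(row, 0) - r * value
    return r, updated

# Columns with disjoint supports: nothing is computed beyond the check.
r_zero, unchanged = sparse_orthogonalize({0: 1.0}, {1: 2.0})

# Columns sharing row 0: the update fills in row 2 of the second column.
r, updated = sparse_orthogonalize({0: 0.6, 2: 0.8}, {0: 1.0, 1: 1.0})
```

Note that the update can introduce new nonzero entries (row 2 in the second example), which is why the set of nonzero indices has to be tracked during the orthogonalization and not only at the start.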
6. Submatrix Divided Proposed Algorithm
In order to further reduce the computational cost of the detection process an algorithm based on submatrix division of the channel matrix is proposed. The block diagram of the proposed scheme is shown in Figure 4.
This detection scheme is composed of a submatrix builder block and a set of MMSE sparse-SQRD detectors. The submatrix builder is fed with the received symbols from the FFT processors and the channel state information obtained by the channel estimator. Its function is to build the submatrices and vectors for the detectors. Every detector is fed with a vector of received symbols and a channel submatrix.
From now on, the number of subcarriers is taken from the IEEE 802.11n standard [2]. The submatrix builder uses two vectors of received (transmitted and interfered) symbols, one from the FFT1 processor and one from the FFT2 processor. For simplicity we consider that the extended channel submatrix is created inside each detector. We now explain in detail the quarter-size submatrix division case (k = 4), considering two variations with 2- or 4-symbol overlapping.
6.1. Quarter-Size Submatrix (k = 4) with 2-Symbol Overlapping
In this case we divide the channel matrix into four submatrices, which are shown in (14), (15), (16), and (17), respectively. Each of the four detectors is fed with its own vector of received symbols
and produces its own vector of detected symbols.
The submatrix division introduces a degradation in the bit error rate performance, so we now explain the procedure used to minimize this effect. First, the non-shifted channel matrix elements associated with subcarrier −14 are included in two of the submatrices. In the same way, the non-shifted elements associated with subcarrier +15 are included in two of the submatrices.
Two symbols are overlapped between a pair of received vectors. Using the information from the detection process of detector 1, these two symbols are compensated in the other vector of the pair according to (20), which yields the compensated symbols. Two symbols are likewise overlapped between a second pair of received vectors, and they are compensated according to (21) using the information from the detection process of detector 3.
During the sorting process of detector 1, the columns containing the non-shifted channel matrix elements associated with subcarrier −14 are used first regardless of their norms. This reduces the degradation introduced by these elements in the upper layers during the detection process. The same is done in detector 3 with the non-shifted elements of subcarrier +15.
In the vectors of detected symbols, the overlapped detected elements are discarded because they have a higher probability of error.
6.2. Quarter-Size Submatrix (k = 4) with 4-Symbol Overlapping
In this subsection another variation with 4-symbol overlapping is introduced. Its objective is to further reduce the degradation in the bit error rate performance caused by the submatrix division. As in the previous subsection, we divide the channel matrix into four submatrices, which are shown in (23), (24), (25), and (26), respectively. Each of the four detectors is again fed with its own vector of received symbols and produces its own vector of detected symbols.
In this variation four symbols are overlapped between a pair of received vectors. As in the previous subsection, symbols in one vector of the pair are compensated according to (20). Four symbols are also overlapped between a second pair of received vectors, and symbols in one vector of that pair are compensated in the same way according to (21).
Also, the channel matrix elements associated with subcarriers −14 and −13 are included in two of the submatrices. In the same way, the elements associated with subcarriers +15 and +16 are included in two of the submatrices. During the sorting process of detector 1, the columns containing the channel matrix elements associated with subcarriers −14 and −13 are used first regardless of their norms. The same is done in detector 3 with the elements of subcarriers +15 and +16.
In the vectors of detected symbols, the overlapped elements are discarded because they have a higher probability of error.
7. Computational Cost
The computational cost is analysed in terms of the number of complex floating point operations (flops) required. As in [8], for simplicity we count each complex addition as one flop and each complex multiplication as three flops. A closed-form expression for the number of flops of the proposed submatrix divided algorithm cannot be obtained because this number depends on the sorting, which varies with the channel realization, so we obtained the average number of flops from the simulation results. For comparison, the number of flops required by the ML detector [14] depends on the constellation size, and the number required by the ZF V-BLAST algorithm, as in [15], depends on the number of columns and rows of the channel matrix.
Tables 1 and 2 present a computational cost comparison in terms of the average number of flops per subcarrier for the cases of 2- and 4-symbol overlapping, respectively. The tables show the number of flops per subcarrier for different submatrix sizes and different modulation schemes. They also include the number of flops for a full-size channel matrix, when the submatrix division scheme is not used. We can see that the average number of flops per subcarrier decreases as the submatrix division order increases. The eighteenth-size submatrix, which is the maximum achievable division of the scheme, gives the minimum average computational cost. We can also see that the average number of flops is similar for the different modulation schemes, and that the number of flops for the 4-symbol overlapping option is larger than for the 2-symbol overlapping option.
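As a small worked example of this counting convention (one flop per complex addition, three per complex multiplication), the cost of an inner product between two columns can be tallied as follows. The helper name and the column length of 100 used for the dense case are hypothetical; only the figure of five nonzero elements per column comes from Section 5.

```python
ADD_FLOPS = 1  # one complex addition
MUL_FLOPS = 3  # one complex multiplication

def inner_product_flops(terms):
    """Flops for an inner product with `terms` products under this convention."""
    if terms == 0:
        return 0
    # `terms` multiplications, then `terms - 1` additions to sum the products.
    return terms * MUL_FLOPS + (terms - 1) * ADD_FLOPS

sparse_cost = inner_product_flops(5)    # only the five nonzero entries
dense_cost = inner_product_flops(100)   # hypothetical full column length
```

With these numbers the sparse inner product costs 19 flops against 399 for the hypothetical dense column, which gives a feel for why restricting the arithmetic to nonzero entries matters when the matrix is large.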


Table 3 shows, as a reference, the number of flops per subcarrier of the conventional MIMO 2×2 V-BLAST and MIMO 2×2 MLD schemes, both without ESPAR antenna receiver. It also includes the computational cost of the eighteenth-size submatrix division MMSE sparse-SQRD algorithm with 2- and 4-symbol overlapping. We can see that the average number of flops per subcarrier of the proposed submatrix division based algorithm is similar to that of MIMO 2×2 V-BLAST and lower than that of the MIMO 2×2 MLD scheme for 16QAM and 64QAM modulation.

To calculate the total computational cost required by the receiver, the number of flops required by the two FFT blocks, considering the data symbol and the pilot symbol, is obtained from [16] as a function of the FFT size. The flops required by the channel estimator used for the ESPAR antenna receiver, presented in Section 3.1, are also taken into account.
Table 4 presents the total flops per subcarrier required by the receiver using QPSK modulation. It also includes the complexity of the FFT, channel estimator, and detection blocks for the different systems. The MIMO-OFDM systems analysed in this table are the original system with ESPAR antenna receiver using the ZF V-BLAST detector [3, 4], the system using full-size channel matrix detection, the system using the proposed submatrix divided detection with 4-symbol overlapping, and the 2×2 V-BLAST system without ESPAR antenna receiver. We can observe that the proposed submatrix divided scheme with 4-symbol overlapping reduces both the computational cost required for the detection and the total number of flops per subcarrier required by the receiver.
8. Simulation Results
To determine the bit error rate performance of the proposed algorithm, a software simulation model of MIMO-OFDM with ESPAR antenna receiver was developed in C++ using the IT++ [17] communications library. It is important to note that the system does not include FEC and interleaver. In the simulation, the proposed low complexity submatrix divided MMSE sparse-SQRD detection is implemented with quarter-size, eighth-size, and eighteenth-size submatrices. Both options, with 2- and 4-symbol overlapping, are implemented for these submatrix sizes. The configuration settings of the simulation are shown in Table 5.

Figures 5 and 6 show the bit error rate performance using QPSK modulation for the cases of 2- and 4-symbol overlapping, respectively. These figures include the performance of the proposed algorithm for quarter-size, eighth-size, and eighteenth-size submatrices. To show the degradation in the bit error rate performance introduced by the algorithm, the performance of a full-size channel matrix without division is included, together with the performance of conventional MIMO-OFDM 2×2 V-BLAST and MIMO-OFDM 2×2 MLD systems without ESPAR antenna receiver. As Figure 6 shows, with QPSK modulation and 4-symbol overlapping the bit error rate degradation is minimal even for the eighteenth-size submatrix. Also, at the BER level used for comparison, the proposed scheme with eighteenth-size submatrices, which achieves the minimum computational cost, obtains an additional gain of about 11 dB compared to a conventional MIMO-OFDM 2×2 V-BLAST system without ESPAR antenna receiver.
In the same way, the bit error rate using 16QAM modulation is shown in Figures 7 and 8. With 16QAM modulation the degradation in the bit error rate performance is larger than in the QPSK results. In this case too, the degradation is smaller with 4-symbol overlapping. With 16QAM, at the BER level used for comparison, the proposed scheme with eighteenth-size submatrices obtains an additional gain of about 8.5 dB compared to a conventional MIMO-OFDM 2×2 V-BLAST system.
The results for 64QAM are shown in Figures 9 and 10. In this case the degradation is much larger, and the best result is obtained with the 4-symbol overlapping option.
In Figure 11 the BER performance of the proposed submatrix divided scheme with 4-symbol overlapping is compared with that of conventional MIMO 2×2 V-BLAST and MIMO 2×4 V-BLAST without ESPAR antenna, using QPSK modulation. This is not a fair comparison in terms of the number of RF front-ends at the receiver side, because MIMO 2×4 V-BLAST requires four RF front-ends whereas MIMO 2×2 V-BLAST and the proposed submatrix divided scheme require only two. Nevertheless, the figure shows that the proposed scheme does not reach the BER performance of MIMO 2×4 V-BLAST without ESPAR antenna but gives a considerable improvement over the BER of MIMO 2×2 V-BLAST without ESPAR antenna receiver. The figure also shows that the slopes of the proposed scheme and of MIMO 2×4 V-BLAST are similar and steeper than that of MIMO 2×2 V-BLAST. Therefore, the proposed scheme achieves a diversity order similar to that of MIMO 2×4 V-BLAST without ESPAR antenna receiver.
9. Conclusion
In this paper, we have proposed a low complexity submatrix divided MMSE sparse-SQRD algorithm for the detection of MIMO-OFDM with ESPAR antenna receiver. The computational cost analysis shows that this algorithm can further reduce the average computational effort, achieving a complexity comparable to that of common MIMO-OFDM detection schemes. We analysed two variations using 2- and 4-symbol overlapping. According to the results, the option with 4-symbol overlapping obtains the best bit error rate performance, at a higher computational cost than the other option. The proposed detection scheme is flexible, so the best tradeoff between computational cost and bit error rate can be selected depending on the design constraints.
The main application of MIMO-OFDM with ESPAR antenna receiver is to improve the bit error rate performance and diversity gain without increasing the number of RF front-end circuits. With the proposed low complexity detection scheme, this improvement in performance can be obtained at a low computational cost. The proposed detection scheme is specifically designed to reduce the computational cost of the detection of MIMO-OFDM with ESPAR antenna receiver, but it can also be applied to the detection of similar systems that have a large channel matrix.
In future research we will work on the channel estimator, because its computational cost needs to be reduced. We will also add FEC and an interleaver to the system to further improve the bit error rate performance.
References
[1] E. Telatar, “Capacity of multi-antenna Gaussian channels,” European Transactions on Telecommunications, vol. 10, no. 6, pp. 585–595, 1999.
[2] IEEE Computer Society, IEEE Standard for Information Technology Telecommunication and Information Exchange between Systems Local and Metropolitan Area Networks Specific Requirements, IEEE Computer Society, New York, NY, USA, 2009.
[3] I. G. P. Astawa and M. Okada, “ESPAR antenna-based diversity scheme for MIMO-OFDM systems,” in Proceedings of the 2009 Thailand-Japan MicroWave, pp. 1–4, February 2010.
[4] I. G. P. Astawa and M. Okada, “An RF signal processing based diversity scheme for MIMO-OFDM systems,” IEICE Transactions on Communications, vol. 95, no. 2, pp. 515–524, 2012.
[5] G. J. Foschini, G. D. Golden, R. A. Valenzuela, and P. W. Wolniansky, “Simplified processing for high spectral efficiency wireless communication employing multi-element arrays,” IEEE Journal on Selected Areas in Communications, vol. 17, no. 11, pp. 1841–1852, 1999.
[6] P. W. Wolniansky, G. J. Foschini, G. D. Golden, and R. A. Valenzuela, “V-BLAST: an architecture for realizing very high data rates over the rich-scattering wireless channel,” in Proceedings of the URSI International Symposium on Signals, Systems, and Electronics (ISSSE '98), pp. 295–300, October 1998.
[7] D. Wubben, J. Rinas, R. Bohnke, V. Kuhn, and K. D. Kammeyer, “Efficient algorithm for detecting layered space-time codes,” in Proceedings of the ITG Conference on Source and Channel Coding, pp. 399–405, Berlin, Germany, January 2002.
[8] D. Wubben, R. Bohnke, V. Kuhn, and K. D. Kammeyer, “MMSE extension of V-BLAST based on sorted QR decomposition,” in Proceedings of the IEEE 58th Vehicular Technology Conference (VTC '03-Fall), vol. 1, pp. 508–512.
[9] D. J. Reinoso Ch and M. Okada, “Computational cost reduction of MIMO-OFDM with ESPAR antenna receiver using MMSE Sparse-SQRD detection,” in Proceedings of the 27th International Technical Conference on Circuit/Systems, Computers and Communications, Sapporo, Japan, July 2012.
[10] D. J. R. Chisaguano and M. Okada, “ESPAR antenna assisted MIMO-OFDM receiver using sub-matrix divided MMSE sparse-SQRD detection,” in Proceedings of the International Symposium on Communications and Information Technologies (ISCIT '12), pp. 198–203, Gold Coast, Australia, October 2012.
[11] T. Ohira and K. Iigusa, “Electronically steerable parasitic array radiator antenna,” Electronics and Communications in Japan II, vol. 87, no. 10, pp. 25–45, 2004.
[12] T. Ohira and K. Gyoda, “Electronically steerable passive array radiator antennas for low-cost analog adaptive beamforming,” in Proceedings of the IEEE International Conference on Phased Array Systems and Technology, pp. 101–104, Dana Point, Calif, USA, May 2000.
[13] S. Tsukamoto and M. Okada, “Single-RF diversity for OFDM system using ESPAR antenna with periodically changing directivity,” in Proceedings of the 2nd International Symposium on Radio Systems and Space Plasma, pp. 1–4, Sofia, Bulgaria, August 2010.
[14] M. Chouayakh, A. Knopp, and B. Lankl, “Low complexity two stage detection scheme for MIMO systems,” in Proceedings of the IEEE Information Theory Workshop on Information Theory for Wireless Networks (ITW '07), pp. 1–5, Solstrand, Norway, July 2007.
[15] J. Benesty, Y. Huang, and J. Chen, “A fast recursive algorithm for optimum sequential signal detection in a BLAST system,” IEEE Transactions on Signal Processing, vol. 51, no. 7, pp. 1722–1730, 2003.
[16] S. G. Johnson and M. Frigo, “A modified split-radix FFT with fewer arithmetic operations,” IEEE Transactions on Signal Processing, vol. 55, no. 1, pp. 111–119, 2007.
[17] “Welcome to IT++!,” 2010, http://itpp.sourceforge.net/devel/index.html.
Copyright
Copyright © 2013 Diego Javier Reinoso Chisaguano and Minoru Okada. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.