ISRN Signal Processing
Volume 2011, Article ID 615934, 8 pages
http://dx.doi.org/10.5402/2011/615934
Research Article

Low-Complexity Inverse Square Root Approximation for Baseband Matrix Operations

1Department of Computer Systems, Tampere University of Technology, P.O. Box 553, 33101 Tampere, Finland
2Nokia Multimedia Imaging, Nokia Devices R&D, Nokia, Visiokatu 1, 33720 Tampere, Finland
33GP/DSE, ST-Ericsson, Hermiankatu 1 B, 33720 Tampere, Finland
4Nokia Devices R&D, Nokia, Elektroniikkatie 3, 90570 Oulu, Finland

Received 8 December 2010; Accepted 11 January 2011

Academic Editors: E. D. Übeyli and E. Salerno

Copyright © 2011 Perttu Salmela et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. P. Darwood, P. Alexander, and I. Oppermann, “LMMSE chip equalization for 3GPP WCDMA downlink receivers with channel coding,” in Proceedings of the IEEE International Conference on Communications (ICC '01), vol. 5, pp. 1421–1425, Helsinki, Finland, June 2001.
  2. G. Baker, J. Gunnels, G. Morrow, B. Riviere, and R. van de Geijn, “PLAPACK: high performance through high-level abstraction,” in Proceedings of the International Conference on Parallel Processing (ICPP '98), pp. 414–422, Minneapolis, Minn, USA, August 1998.
  3. J. Demmel, “LAPACK: a portable linear algebra library for supercomputers,” in Proceedings of the IEEE Control Systems Society Workshop on Computer-Aided Control System Design (CACSD '89), pp. 1–7, Tampa, Fla, USA, December 1989.
  4. H. Kwan, R. L. Nelson, and E. E. Swartzlander, “Cascade implementation of an iterative inverse-square-root algorithm, with overflow lookahead,” in Proceedings of the IEEE 12th Symposium on Computer Arithmetic, pp. 115–122, Bath, UK, July 1995.
  5. S. F. Oberman, “Floating point division and square root algorithms and implementation in the AMD-K7 microprocessor,” in Proceedings of the 14th IEEE Symposium on Computer Arithmetic, pp. 106–115, Adelaide, Australia, April 1999.
  6. W. F. Wong and E. Goto, “Fast hardware-based algorithms for elementary function computations using rectangular multipliers,” IEEE Transactions on Computers, vol. 43, no. 3, pp. 278–294, 1994.
  7. K. Turkowski, “Computing the inverse square root,” Tech. Rep. 95, Media Technologies: Computer Graphics Advanced Technology Group Apple Computer, Inc., October 1994.
  8. C. Lomont, “Fast inverse square root,” Tech. Rep., Department of Mathematics, Purdue University, West Lafayette, Ind, USA, February 2003.
  9. V. K. Jain, S. Shrivastava, A. D. Snider, D. Damerow, and D. Chester, “Hardware implementation of a nonlinear processor,” in Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS '99), vol. 6, pp. I-509–I-514, Orlando, Fla, USA, June 1999.
  10. J. A. Piñeiro and J. D. Bruguera, “High-speed double-precision computation of reciprocal, division, square root, and inverse square root,” IEEE Transactions on Computers, vol. 51, no. 12, pp. 1377–1388, 2002.
  11. N. Takagi, “A hardware algorithm for computing reciprocal square root,” in Proceedings of the 15th IEEE Symposium on Computer Arithmetic, pp. 94–100, Vail, Colo, USA, June 2001.
  12. T. Lang and E. Antelo, “Radix-4 reciprocal square-root and its combination with division and square root,” IEEE Transactions on Computers, vol. 52, no. 9, pp. 1100–1114, 2003.
  13. N. Takagi, “Generating a power of an operand by a table look-up and a multiplication,” in Proceedings of the 13th IEEE Symposium on Computer Arithmetic, pp. 126–131, Asilomar, Calif, USA, July 1997.
  14. N. Takagi, “Powering by a table look-up and a multiplication with operand modification,” IEEE Transactions on Computers, vol. 47, no. 11, pp. 1216–1222, 1998.
  15. M. J. Schulte and K. E. Wires, “High-speed inverse square roots,” in Proceedings of the 14th IEEE Symposium on Computer Arithmetic, pp. 124–131, Adelaide, Australia, April 1999.
  16. M. D. Ercegovac, T. Lang, J. M. Muller, and A. Tisserand, “Reciprocation, square root, inverse square root, and some elementary functions using small multipliers,” IEEE Transactions on Computers, vol. 49, no. 7, pp. 628–637, 2000.
  17. J. N. Coleman and E. I. Chester, “A 32-bit logarithmic arithmetic unit and its performance compared to floating-point,” in Proceedings of the 14th IEEE Symposium on Computer Arithmetic, pp. 142–151, Adelaide, Australia, April 1999.
  18. M. Haselman, M. Beauchamp, A. Wood, S. Hauck, K. Underwood, and K. S. Hemmert, “A comparison of floating point and logarithmic number systems for FPGAs,” in Proceedings of the 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '05), pp. 181–190, Napa, Calif, USA, April 2005.
  19. C. H. Chen and C. Y. Lee, “Cost effective lighting processor for 3D graphics application,” in Proceedings of the International Conference on Image Processing (ICIP '99), vol. 2, pp. 792–796, Kobe, Japan, October 1999.
  20. A. Happonen, E. Hemming, and M. Juntti, “A novel coarse grain reconfigurable processing element architecture,” in Proceedings of the IEEE International Midwest Symposium on Circuits and Systems, vol. 3, pp. 827–830, Cairo, Egypt, December 2003.
  21. J. E. Stine and M. J. Schulte, “Symmetric table addition method for accurate function approximation,” Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, vol. 21, no. 2, pp. 167–177, 1999.
  22. F. de Dinechin and A. Tisserand, “Some improvements on multipartite table methods,” in Proceedings of the 15th IEEE Symposium on Computer Arithmetic, pp. 128–135, Vail, Colo, USA, June 2001.
  23. K. Rounioja and J. A. Parviainen, “Arithmetic processing unit for reciprocal operations,” in Proceedings of the International Symposium on System-on-Chip (SoC '03), pp. 109–112, Tampere, Finland, November 2003.
  24. G. H. Golub, Matrix Computations, Johns Hopkins University Press, Baltimore, Md, USA, 1989.
  25. S. Y. Kung, VLSI Array Processors, Prentice-Hall, Upper Saddle River, NJ, USA, 1987.
  26. R. Andraka, “Survey of CORDIC algorithms for FPGA based computers,” in Proceedings of the ACM/SIGDA 6th International Symposium on Field Programmable Gate Arrays (FPGA '98), pp. 191–200, Monterey, Calif, USA, February 1998.
  27. H. Corporaal, Microprocessor Architectures from VLIW to TTA, John Wiley & Sons, New York, NY, USA, 1998.
  28. P. Salmela, A. Burian, H. Sorokin, and J. Takala, “Complex-valued QR decomposition implementation for MIMO receivers,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '08), pp. 1433–1436, Las Vegas, Nev, USA, April 2008.
  29. P. Salmela, A. Happonen, T. Järvinen, A. Burian, and J. Takala, “DSP implementation of Cholesky decomposition,” in Proceedings of the Joint 1st Workshop on Sensor Networks and Symposium on Trends in Communications, pp. 6–9, Bratislava, Slovakia, June 2006.