Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines

Choi, Jaeyoung; Dongarra, Jack J.; Ostrouchov, L. Susan; Petitet, Antoine P.; Walker, David W.; Whaley, R. Clint

doi:https://doi.org/10.1155/1996/483083

Scientific Programming


Open Access

Volume 5 | Article ID 483083 | https://doi.org/10.1155/1996/483083


Jaeyoung Choi,¹ Jack J. Dongarra,¹,² L. Susan Ostrouchov,¹ Antoine P. Petitet,¹ David W. Walker,² and R. Clint Whaley¹

Received 22 Sept 1994

Accepted 22 May 1995

Abstract

This article discusses the core factorization routines included in the ScaLAPACK library. These routines allow the factorization and solution of a dense system of linear equations via LU, QR, and Cholesky factorization. They are implemented using a block cyclic data distribution, and are built using de facto standard kernels for matrix and vector operations (BLAS and its parallel counterpart PBLAS) and message-passing communication (BLACS). In implementing the ScaLAPACK routines, a major objective was to parallelize the corresponding sequential LAPACK routines using the BLAS, BLACS, and PBLAS as building blocks, leading to straightforward parallel implementations without a significant loss in performance. We present the details of the implementation of the ScaLAPACK factorization routines, as well as performance and scalability results on the Intel iPSC/860, Intel Touchstone Delta, and Intel Paragon System.
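The block cyclic data distribution mentioned in the abstract can be illustrated with a small index-mapping sketch. This is not ScaLAPACK code: the function names, zero-based indexing, and the assumption that the first block lives on process row and column 0 are all chosen here for illustration only. It shows how a global matrix entry would be assigned to a process in a two-dimensional grid and where it would sit in that process's local array.

```python
def block_cyclic_owner(g, block, nprocs):
    """Map a zero-based global index g to (process, local index) in one dimension.

    Assumes blocks of size `block` are dealt out round-robin to `nprocs`
    processes, starting with process 0 (an illustrative convention).
    """
    block_id = g // block                # which global block holds g
    proc = block_id % nprocs             # blocks are dealt cyclically
    local_block = block_id // nprocs     # blocks already stored on proc
    local = local_block * block + g % block
    return proc, local


def owner_2d(i, j, mb, nb, prow, pcol):
    """Apply the 1-D mapping independently to rows and columns.

    mb, nb     : hypothetical row and column block sizes
    prow, pcol : hypothetical process grid dimensions
    Returns ((process row, process column), (local row, local column)).
    """
    pr, li = block_cyclic_owner(i, mb, prow)
    pc, lj = block_cyclic_owner(j, nb, pcol)
    return (pr, pc), (li, lj)


if __name__ == "__main__":
    # Entry (5, 2) of a matrix with 2x2 blocks on a 2x3 process grid.
    print(owner_2d(5, 2, mb=2, nb=2, prow=2, pcol=3))
```

Because consecutive blocks go to different processes, each process keeps a share of the remaining work as a factorization advances through the matrix, which is the usual motivation for a cyclic layout over a purely blocked one.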

Copyright

Copyright © 1996 Hindawi Publishing Corporation. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
