Basic Linear Algebra Subprograms - Implementations

Implementations

Accelerate
Apple's framework for Mac OS X and iOS, which includes tuned versions of BLAS and LAPACK.
ACML
The AMD Core Math Library, supporting the AMD Athlon and Opteron CPUs under Linux and Windows.
C++ AMP BLAS
The C++ AMP BLAS Library is an open source implementation of BLAS for Microsoft's AMP language extension for Visual C++.
ATLAS
Automatically Tuned Linear Algebra Software, an open source implementation of BLAS APIs for C and Fortran 77.
ESSL
IBM's Engineering and Scientific Subroutine Library, supporting the PowerPC architecture under AIX and Linux.
Eigen BLAS
A Fortran 77 and C BLAS library implemented on top of the open source Eigen library, supporting x86, x86-64, ARM (NEON), and PowerPC architectures. (Note: as of Eigen 3.0.3, the BLAS interface is not built by default and the documentation refers to it as "a work in progress which is far to be ready for use".)
Goto BLAS
Kazushige Goto's BSD-licensed implementation of BLAS, tuned in particular for Intel Nehalem/Atom, VIA Nano, and AMD Opteron processors.
HP MLIB
HP's math library, supporting the IA-64, PA-RISC, x86, and Opteron architectures under HP-UX and Linux.
Intel MKL
The Intel Math Kernel Library, supporting Intel Pentium, Core, and Itanium CPUs under Linux, Windows, and Mac OS X. (There are some doubts about future support for the older Pentium architecture.)
MathKeisan
NEC's math library, supporting the NEC SX architecture under SUPER-UX and Itanium under Linux.
Netlib BLAS
The official reference implementation on Netlib, written in Fortran 77.
Netlib CBLAS
Reference C interface to the BLAS. It is also possible (and popular) to call the Fortran BLAS from C.
PDLIB/SX
NEC's Public Domain Mathematical Library for the NEC SX-4 system.
SCSL
SGI's Scientific Computing Software Library, containing BLAS and LAPACK implementations for SGI's IRIX workstations.
Sun Performance Library
Optimized BLAS and LAPACK for SPARC, Core and AMD64 architectures under Solaris 8, 9, and 10 as well as Linux.
SurviveGotoBLAS2
An optimized BLAS by Ei-ji Nakama that attempts to continue the work of Kazushige Goto.
OpenBLAS
An optimized BLAS based on Goto BLAS and hosted on GitHub, supporting Intel Sandy Bridge and MIPS-architecture Loongson processors.
cuBLAS
An optimized BLAS for NVIDIA-based GPU cards.
