
BLAS: gemm vs. gemv

Why does BLAS have a gemm function for matrix-matrix multiplication and a separate gemv function for matrix-vector multiplication? Isn't matrix-vector multiplication just a special case of matrix-matrix multiplication where one matrix only has one row/column?


Mathematically, matrix-vector multiplication is a special case of matrix-matrix multiplication, but that is not necessarily true of how the two are realized in a software library.

They support different options. For example, gemv supports strided access to the vectors on which it is operating, whereas gemm does not support strided matrix layouts. In the C language bindings, gemm requires that you specify the storage ordering of all three matrices, whereas that is unnecessary in gemv for the vector arguments because it would be meaningless.
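
To make the difference concrete, here is a minimal sketch using the CBLAS interface (assuming a CBLAS implementation such as OpenBLAS or ATLAS is linked; the sizes and values are made up for illustration). The gemv call describes each vector operand with just a pointer and a stride (here, a column of a row-major matrix), while the gemm call has to be told the storage order and leading dimension of every matrix involved:

    #include <stdio.h>
    #include <cblas.h>

    int main(void) {
        /* A is 3x3, B is 3x4, both stored row-major. */
        double A[9]  = {1, 2, 3,
                        4, 5, 6,
                        7, 8, 9};
        double B[12] = {1, 0, 0, 2,
                        0, 1, 0, 2,
                        0, 0, 1, 2};
        double C[12] = {0};   /* will hold A*B (3x4)              */
        double y[3]  = {0};   /* will hold A * (last column of B) */

        /* gemm: storage order and leading dimensions are needed for
           all three matrices.  C = 1.0*A*B + 0.0*C                 */
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    3, 4, 3, 1.0, A, 3, B, 4, 0.0, C, 4);

        /* gemv: each vector is just a pointer plus a stride.  Here x
           is the last column of B, reached with incX = 4 (the row
           length), and y is contiguous.  y = 1.0*A*x + 0.0*y       */
        cblas_dgemv(CblasRowMajor, CblasNoTrans,
                    3, 3, 1.0, A, 3, &B[3], 4, 0.0, y, 1);

        for (int i = 0; i < 3; ++i)        /* both columns print 12, 30, 48 */
            printf("y[%d] = %g, C[%d][3] = %g\n", i, y[i], i, C[i*4 + 3]);
        return 0;
    }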

Besides supporting different options, there are families of optimizations that might be performed on gemm that are not applicable to gemv. If you know that you are doing a matrix-vector product, you don't want the library to waste time figuring out that's the case before switching into a code path that is optimized for that case; you'd rather call it directly instead.
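
As a rough illustration (hypothetical sizes, assuming row-major storage and a CBLAS implementation), both of the following calls compute y = A*x, but the second routes a matrix-vector product through the general matrix-matrix routine, which then has to cope with a one-column operand:

    #include <cblas.h>

    enum { N = 1024 };
    static double A[N * N], x[N], y[N];   /* contents omitted for brevity */

    int main(void) {
        /* Direct path: dispatches straight into the matrix-vector kernel. */
        cblas_dgemv(CblasRowMajor, CblasNoTrans, N, N,
                    1.0, A, N, x, 1, 0.0, y, 1);

        /* Same result via gemm, treating x and y as N-by-1 matrices. */
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    N, 1, N, 1.0, A, N, x, 1, 0.0, y, 1);
        return 0;
    }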


When you optimize gemv and gemm, different techniques apply:

  • For the matrix-matrix operation you use blocked algorithms, with block sizes that depend on the cache sizes.
  • For optimizing the matrix-vector product you use so-called fused Level 1 operations (e.g. fused dot products or fused axpy operations); a rough sketch of both approaches follows below.
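
Here is a very rough sketch of the two styles in plain C (these are not the kernels any actual BLAS library ships; the names gemm_blocked and gemv_rows, the block size BS, the loop order, and the absence of explicit vectorization are all simplifications for illustration):

    /* Schematic cache-blocked matrix-matrix multiply (C += A*B, all
       n x n, row-major).  BS is a hypothetical block size chosen so
       that three BS x BS tiles stay resident in cache; real kernels
       pick it per cache level and add vectorized micro-kernels.     */
    #define BS 64

    void gemm_blocked(int n, const double *A, const double *B, double *C) {
        for (int ii = 0; ii < n; ii += BS)
            for (int kk = 0; kk < n; kk += BS)
                for (int jj = 0; jj < n; jj += BS)
                    /* Multiply one BS x BS tile of A with one tile of B. */
                    for (int i = ii; i < ii + BS && i < n; ++i)
                        for (int k = kk; k < kk + BS && k < n; ++k) {
                            double a = A[i * n + k];
                            for (int j = jj; j < jj + BS && j < n; ++j)
                                C[i * n + j] += a * B[k * n + j];
                        }
    }

    /* Matrix-vector product written as one dot product per row
       (y = A*x).  Each element of A is touched exactly once, so
       blocking for reuse of A buys nothing; real implementations
       fuse several row dot products to reuse the loads of x.        */
    void gemv_rows(int n, const double *A, const double *x, double *y) {
        for (int i = 0; i < n; ++i) {
            double dot = 0.0;
            for (int j = 0; j < n; ++j)
                dot += A[i * n + j] * x[j];
            y[i] = dot;
        }
    }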

Let me know if you want more details.


I think it just fits the BLAS hierarchy better, with its Level 1 (vector-vector), Level 2 (matrix-vector), and Level 3 (matrix-matrix) routines. And it may be optimizable a bit better if the library knows the operand is only a vector.
