Computing the norm of a vector expression ||aW+bX+cY||

2023-02-23 08:17 问答作者：

I am a PhD student. In the introduction of my thesis, I am insterested by the compromise between expressivity and performances of Linear Algebra tools.

As a simple example, I use the computation of the norm of a vector expression. The C code for my example is:

float normExpression3(float a, float *W, float b, float *X, float c, float*Y){
double norm = 0;
for (int i=0; i<n; ++i) // n in [3e6; 2e8]
{
    float tmp = a*W[i]+b*X[i]+c*Y[i];
    norm+=tmp*tmp;
}
return sqrtf(norm);

}

I compare the performances achieved with different techniques. As the vectors are big (several million elements), the performances are limited by the memory bandwidth. However, there are huge differences beetween the different approaches.

The optimized C version I wrote is not expressive (a new function has to be written to as a 4th vector) and very ugly (threaded and vectorized) but achieved 6.4 GFlops. On the other hand, MATLAB code is very nice:

result = norm(a*W+b*X+c*Y)

but only achieves 0.28 GFlops.

C++ templates expressions à la Blitz++ provide both expressivity and performances to the user (6.5 GFlops).

As a part of my analysis, I would like to know how functionnal languages can compare to these approaches. I thought about showing an example either in Haskell or in OCaml (AFAIK, both are reputed well suited for this kind of operation).

I know none of these languages. I could learn on of them to provide my example but this would'nt be a fair comparison: I am not sure to be able to provide an implementation allowing both expressivity and performances.

So my two questions are : 1) which language is best suited ? 2) how could the norm of vector expressions be computed efficiently without compromising the generality of the 开发者_如何转开发implementation ?

By advance, thank you !

Wilfried K.

Edit : corrected the type of the norm accumulator for float to double

For what is worth, the following is an OCaml version of your function:

let normExpression3 a w b x c y =
    let n = Array.length w in
    if not (n = Array.length x && n = Array.length y)
        then invalid_arg "normExpression3";
    let (@) = (Array.unsafe_get : float array -> int -> float) in
    let rec accum a w b x c y n i norm =
        if i = n then sqrt norm else
        let t = a *. (w @ i) +. b *. (x @ i) +. c *. (y @ i) in
        accum a w b x c y n (i + 1) (norm +. t)
    in accum a w b x c y n 0 0.

It makes some allowances for performance, namely:

Unchecked array access (or rather, array bounds checks manually hoisted out of the loop)
Monomorphic array access
Recursive inner loop to avoid boxing and unboxing the float accumulator
Lambda-lifting of the inner loop to avoid referencing closed-over values

The last optimization should be checked against a closed-over inner loop, since with so many parameters register spilling might dominate over the cost of referencing closed-over parameters.

Note that one normally wouldn't bother with this kind of optimizations unless one were to compete in a benchark ;-) Note also that you'll need to test this with a 64-bit OCaml, since arrays are otherwise limited to 4 mega elements.

1) which language is best suited ?

Either are used for this kind of task. The primary concern would be availability of required libraries (e.g. for vectors or matrices), and whether there was any need for parallelism.

Libraries such as vector and repa in Haskell are well-suited. And in the case of repa, you get parallelism as well.

2) how could the norm of vector expressions be computed efficiently without compromising the generality of the implementation ?

One approach would be to use meta-programming techniques to generate specialized implementations of the computational kernels from high level descriptions. In functional languages this is a relatively common technique, based on small domain-specific languages, with custom code generators.

See e.g. Specialising Simulator Generators for High-Performance Monte-Carlo Methods or the work in OCaml on FFTW.

Not so much as a functional language answer per se, but please note that your implementation to calculate norm (in C) is way different how matlab actually calculates it.

And yes, there are indeed very good reasons for that. Most probably your approximation of norm is pretty much useless (as how it's implemented currently) for any real use case. Please don't underestimate the 'difficulties' related to calculating numerically superior approximations of norms.

As don said, consider Repa. Here is some simple code to get you started.

import Data.Array.Repa as R

len :: Int
len = 50000

main = do
    let ws = R.fromList (Z :. len) [0..len-1]
        xs = R.fromList (Z :. len) [10498..10498 + len - 1]
        ys = R.fromList (Z :. len) [8422..8422 + len - 1]
    print (multSum 52 73 81 ws xs ys)

multSum a b c ws xs ys = R.map (a*) ws +^ R.map (b*) xs +^ R.map (c*) ys

This still leaves you to find a good way to get the data from disk and into a Repa array. I think reading it all in as a Llazy ByteString and using Repa.fromFunction should do, perhaps someone will chime in with a smarter way.

继续阅读：haskell ocaml

Computing the norm of a vector expression ||aW+bX+cY||

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？