Are efficient "repeatedly used intermediates" possible in C++ expression template programming?

2023-03-31 21:46

Here's one thing I haven't seen explicitly addressed in C++ expression template programming, where the goal is to avoid building unnecessary temporaries (by creating trees of "inlinable templated objects" that only get collapsed at the assignment operator). Suppose, for illustration, we're modeling 1-D sequences of values, with elementwise application of arithmetic operators like +, *, etc. Call the basic class for fully-created sequences Seq (which holds a fixed-length list of doubles for the sake of concreteness) and consider the following illustrative pseudo-C++ code.

void f(Seq &a,Seq &b,Seq &c,Seq &d,Seq &e){
    AType t=(a+2*b)/(a+b+c); // question is about what AType can be
    Seq f=d*t;
    Seq g=e*e*t;
    //do something with f and g
}

where there are expression templated overloads for +, etc, elsewhere. For the line defining t:

I can implement this code if I make AType be Seq, but then I've created this full intermediate variable when I don't need it (except in how it enables computation of f and g). But at least it's only calculated once.
I can also implement this making AType be the appropriate templated expression type, so that a full Seq isn't created at the commented line, but consumed chunk-by-chunk in f and g. But then the same computation involved in creating every particular chunk will be repeated in both f and g. (I suppose in theory an incredibly smart compiler might realise the same computation is being done twice and CSE-it, but I don't think any do and I wouldn't want to rely on an optimiser always being able to spot the opportunities.)
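The trade-off between these two options can be made concrete with a small sketch. Everything here is a hypothetical stand-in, not the real classes from the question: `et::Seq`, the `Div` and `Mul` nodes, `materialize`, and the `divisions` counter are assumptions introduced only to count how often an element of t is computed, and t is simplified to `a / b`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace et {

static int divisions = 0;  // instrumentation: element evaluations of the quotient

// Hypothetical fully-evaluated sequence.
struct Seq {
    std::vector<double> v;
    double operator[](std::size_t i) const { return v[i]; }
    std::size_t size() const { return v.size(); }
};

// Lazy quotient node: nothing is computed until an element is requested.
template <class L, class R>
struct Div {
    const L& l; const R& r;
    double operator[](std::size_t i) const { ++divisions; return l[i] / r[i]; }
    std::size_t size() const { return l.size(); }
};

// Lazy product node.
template <class L, class R>
struct Mul {
    const L& l; const R& r;
    double operator[](std::size_t i) const { return l[i] * r[i]; }
    std::size_t size() const { return l.size(); }
};

template <class L, class R>
Div<L, R> operator/(const L& l, const R& r) { return {l, r}; }

template <class L, class R>
Mul<L, R> operator*(const L& l, const R& r) { return {l, r}; }

// Collapse an expression into storage, one element at a time.
template <class E>
Seq materialize(const E& e) {
    Seq out{std::vector<double>(e.size())};
    for (std::size_t i = 0; i < e.size(); ++i) out.v[i] = e[i];
    return out;
}

}  // namespace et

// Option 1: AType is Seq, so t is stored in full and computed once.
int divisions_with_seq_t(std::size_t n) {
    using namespace et;
    Seq a{std::vector<double>(n, 6.0)}, b{std::vector<double>(n, 3.0)};
    Seq d{std::vector<double>(n, 2.0)}, e{std::vector<double>(n, 5.0)};
    divisions = 0;
    Seq t = materialize(a / b);
    Seq f = materialize(d * t);
    Seq g = materialize(e * t);
    assert(f[0] == 4.0 && g[0] == 10.0);
    return divisions;
}

// Option 2: AType is the expression type, so t is re-evaluated by each consumer.
int divisions_with_expr_t(std::size_t n) {
    using namespace et;
    Seq a{std::vector<double>(n, 6.0)}, b{std::vector<double>(n, 3.0)};
    Seq d{std::vector<double>(n, 2.0)}, e{std::vector<double>(n, 5.0)};
    divisions = 0;
    auto t = a / b;  // safe here only because a and b are named lvalues
    Seq f = materialize(d * t);
    Seq g = materialize(e * t);
    assert(f[0] == 4.0 && g[0] == 10.0);
    return divisions;
}
```

For sequences of length n, the first function performs n divisions and allocates a full Seq for t, while the second allocates nothing for t but performs 2n divisions.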

My understanding is that there's no clever code rewriting and/or usage of templates that allows each chunk of t to be calculated only once while t is still calculated chunkwise rather than all at once?

(I can vaguely imagine AType could be some kind of object that contains both an expression template type and a cached value that gets written after it's evaluated the first time, but that doesn't seem to help with the need to synchronise the two implicit loops in the assignments to f and g.)

In googling, I have come across one Master's thesis on another subject that mentions in passing that manual "common subexpression elimination" should be avoided with expression templates, but I'd like to find a more authoritative "it's not possible" or a "here's how to do it".

The closest Stack Overflow question is "Intermediate results using expression templates", which seems to be about the type-naming issue rather than the efficiency issue in creating a full intermediate.

Since you obviously don't want to do the entire calculation twice, you have to cache it somehow. The easiest way to cache it seems to be for AType to be a Seq. You say this has the downside of a full intermediate variable, but that's exactly what you want in this case. That full intermediate is your cache, and cannot be trivially avoided.

If you profile the code and this is a chokepoint, then the only faster way I can think of is to write a special function to calculate f and g in parallel (in a single loop), but that'd be super-confusing, and very much not recommended.

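#include <cstddef>

// Evaluate both results in one pass so each element of t is computed once.
// Expr is whatever expression template type the overloaded operators produce.
template <class Expr>
void eval_both(const Seq &d, const Seq &e, const Expr &t, Seq &f, Seq &g)
{
    for(std::size_t i=0; i<d.size(); ++i) {
        const auto ti = t[i]; // evaluate the expression once per element
        f[i] = d[i]*ti;
        g[i] = e[i]*e[i]*ti;
    }
}
void f(Seq &a,Seq &b,Seq &c,Seq &d,Seq &e)
{
    // Unevaluated expression; assumes the library allows holding it in a named variable.
    auto t = (a+2*b)/(a+b+c);
    Seq f, g;
    eval_both(d, e, t, f, g); // helper renamed so it no longer clashes with the local g
    //do something with f and g
}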
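The single-loop idea can also be written as a self-contained sketch. The names here are assumptions for illustration: `Seq` is aliased to `std::vector<double>`, `TExpr` is a hand-written stand-in for the unevaluated type of `(a+2*b)/(a+b+c)` that a real expression template library would generate, and its `evaluations` counter exists only to show how often each element is computed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Seq = std::vector<double>;  // hypothetical stand-in for the question's Seq

// Stand-in for the lazy expression (a+2*b)/(a+b+c); holds references only.
struct TExpr {
    const Seq& a; const Seq& b; const Seq& c;
    mutable std::size_t evaluations;  // instrumentation only
    TExpr(const Seq& a_, const Seq& b_, const Seq& c_)
        : a(a_), b(b_), c(c_), evaluations(0) {}
    double operator[](std::size_t i) const {
        ++evaluations;
        return (a[i] + 2 * b[i]) / (a[i] + b[i] + c[i]);
    }
};

// Fused evaluation: one loop fills both outputs, so t[i] is computed once
// per element and no full-length temporary for t is ever allocated.
template <class Expr>
void eval_both(const Seq& d, const Seq& e, const Expr& t, Seq& f, Seq& g) {
    const std::size_t n = d.size();
    f.resize(n);
    g.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double ti = t[i];
        f[i] = d[i] * ti;
        g[i] = e[i] * e[i] * ti;
    }
}

// Runs the fused loop on three-element inputs and reports how many
// times an element of t was evaluated.
std::size_t fused_demo() {
    Seq a{1.0, 2.0, 3.0}, b{1.0, 1.0, 1.0}, c{1.0, 0.0, 2.0};
    Seq d{2.0, 2.0, 2.0}, e{3.0, 3.0, 3.0};
    TExpr t(a, b, c);
    Seq f, g;
    eval_both(d, e, t, f, g);
    // At i = 0: t = (1 + 2) / (1 + 1 + 1) = 1, so f = 2 and g = 9.
    assert(f[0] == 2.0 && g[0] == 9.0);
    return t.evaluations;
}
```

The cost is that the two assignments are no longer independent statements: every consumer of t has to be listed in the one fused function, which is why this reads worse than simply making t a Seq.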
