C++ pimpl idiom wastes an instruction vs. C style?

2022-12-31 18:45 Q&A author:

(Yes, I know that one machine instruction usually doesn't matter. I'm asking this question because I want to understand the pimpl idiom, and use it in the best possible way; and because sometimes I do care about one machine instruction.)

In the sample code below, there are two classes, Thing and OtherThing. Users would include "thing.hh". Thing uses the pimpl idiom to hide its implementation. OtherThing uses a C style: non-member functions that return and take pointers. This style produces slightly better machine code. I'm wondering: is there a way to use the C++ style (i.e., make the functions into member functions) and yet still save the machine instruction? I like this style because it doesn't pollute the namespace outside the class.

Note: I'm only looking at calling member functions (in this case, calc). I'm not looking at object allocation.

Below are the files, commands, and the machine code, on my Mac.

thing.hh:

class ThingImpl;
class Thing
{
    ThingImpl *impl;
public:
    Thing();
    int calc();
};

class OtherThing;    
OtherThing *make_other();
int calc(OtherThing *);

thing.cc:

#include "thing.hh"

struct ThingImpl
{
    int x;
};

Thing::Thing()
{
    impl = new ThingImpl;
    impl->x = 5;
}

int Thing::calc()
{
    return impl->x + 1;
}

struct OtherThing
{
    int x;
};

OtherThing *make_other()
{
    OtherThing *t = new OtherThing;
    t->x = 5;
    return t;   // the original omitted this return, which is undefined behavior
}

int calc(OtherThing *t)
{
    return t->x + 1;
}

main.cc (just to test the code actually works...)

#include "thing.hh"
#include <cstdio>

int main()
{
    Thing *t = new Thing;
    printf("calc: %d\n", t->calc());

    OtherThing *t2 = make_other();
    printf("calc: %d\n", calc(t2));
}

Makefile:

all: main

thing.o : thing.cc thing.hh
    g++ -fomit-frame-pointer -O2 -c thing.cc

main.o : main.cc thing.hh
    g++ -fomit-frame-pointer -O2 -c main.cc

main: main.o thing.o
    g++ -O2 -o $@ $^

clean: 
    rm *.o
    rm main

Run make and then look at the machine code. On the Mac I use otool -tv thing.o | c++filt. On Linux I think it's objdump -d thing.o. Here is the relevant output:

Thing::calc():
0000000000000000 movq (%rdi),%rax
0000000000000003 movl (%rax),%eax
0000000000000005 incl %eax
0000000000000007 ret
calc(OtherThing*):
0000000000000010 movl (%rdi),%eax
0000000000000012 incl %eax
0000000000000014 ret

Notice the extra instruction because of the pointer indirection. The first function looks up two fields (impl, then x), while the second only needs to get x. What can be done?

One instruction is rarely a thing to spend much time worrying over. Firstly, the compiler may cache the pImpl in a more complex use case, thus amortising the cost in a real-world scenario. Secondly, pipelined architectures make it almost impossible to predict the real cost in clock cycles. You'll get a much more realistic idea of the cost if you run these operations in a loop and time the difference.
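A minimal sketch of such a timing loop, for illustration only. Everything here beyond Thing, OtherThing and calc is an assumption: the NOINLINE macro (GCC/Clang only), the time_loop and compare_calc helpers, and the destructor added to Thing. The question keeps the two calc functions in a separate translation unit; this sketch puts them in one file and marks them noinline to approximate that, so treat any numbers it prints as rough.

```cpp
#include <chrono>
#include <cstdio>

// Assumption: a "do not inline" marker that only has an effect on GCC/Clang.
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

struct ThingImpl { int x; };

class Thing {
    ThingImpl *impl;
public:
    Thing() : impl(new ThingImpl{5}) {}
    ~Thing() { delete impl; }                 // added for the sketch
    Thing(const Thing &) = delete;
    Thing &operator=(const Thing &) = delete;
    NOINLINE int calc() { return impl->x + 1; }        // two loads: impl, then x
};

struct OtherThing { int x; };
NOINLINE int calc(OtherThing *t) { return t->x + 1; }  // one load: x

// Hypothetical helper: calls f() n times and returns the elapsed nanoseconds.
// The running sum is stored in *sink so the loop is not optimized away.
template <class F>
long long time_loop(long n, F f, long long *sink) {
    auto start = std::chrono::steady_clock::now();
    long long sum = 0;
    for (long i = 0; i < n; ++i) sum += f();
    auto stop = std::chrono::steady_clock::now();
    *sink = sum;
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Times both variants, prints the results, and reports whether the sums agree.
bool compare_calc(long n) {
    Thing thing;
    OtherThing other{5};
    long long s1 = 0, s2 = 0;
    long long ns1 = time_loop(n, [&] { return thing.calc(); }, &s1);
    long long ns2 = time_loop(n, [&] { return calc(&other); }, &s2);
    std::printf("pimpl:   %lld ns (sum %lld)\n", ns1, s1);
    std::printf("C style: %lld ns (sum %lld)\n", ns2, s2);
    return s1 == s2 && s1 == 6LL * n;
}
```

Calling compare_calc with a large n (say, a few hundred million) and repeating the run several times gives a first impression; a proper benchmark would also control for warm-up and CPU frequency scaling.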

Not too hard, just use the same technique inside your class. Any halfway decent optimizer will inline the trivial wrapper.

class ThingImpl;
class Thing
{
    ThingImpl *impl;
    static int calc(ThingImpl*);
public:
    Thing();
    int calc() { return calc(impl); }
};
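Filled out into something that compiles, the idea might look like the sketch below. The split into "thing.hh" and "thing.cc" parts is shown as comments in one file; the destructor and the deleted copy operations are additions for the sketch, not part of the answer. Note that the load of impl does not disappear: it moves into the inlined wrapper at the call site, where the optimizer has a chance to keep it in a register across several calls.

```cpp
// --- what would live in thing.hh ---
struct ThingImpl;                        // users still see only a forward declaration

class Thing {
    ThingImpl *impl;
    static int calc(ThingImpl *);        // out-of-line worker, receives impl directly
public:
    Thing();
    ~Thing();                            // added for the sketch
    Thing(const Thing &) = delete;
    Thing &operator=(const Thing &) = delete;
    int calc() { return calc(impl); }    // trivial wrapper, a candidate for inlining
};

// --- what would live in thing.cc ---
struct ThingImpl { int x; };

Thing::Thing() : impl(new ThingImpl{5}) {}
Thing::~Thing() { delete impl; }

// Same body as the C-style calc(OtherThing*): a single load of x.
int Thing::calc(ThingImpl *p) { return p->x + 1; }
```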

There's the nasty way, which is to replace the pointer to ThingImpl with a big-enough, suitably aligned array of unsigned chars, then construct the ThingImpl in it with placement new, reach it through a reinterpret_cast, and explicitly call its destructor.
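A rough sketch of that in-place storage variant, under stated assumptions: kSize, kAlign, storage and as_impl are hypothetical names, the size and alignment are hard-coded to match an int and must be kept in sync with ThingImpl by hand, and std::launder is used only when the library provides it (C++17).

```cpp
#include <cstddef>
#include <new>       // placement new; std::launder where available

// --- what would live in thing.hh ---
class Thing {
    // Assumed values: they must be at least as large as the real ThingImpl needs.
    static constexpr std::size_t kSize = sizeof(int);
    static constexpr std::size_t kAlign = alignof(int);
    alignas(kAlign) unsigned char storage[kSize];   // ThingImpl lives in here
public:
    Thing();
    ~Thing();
    Thing(const Thing &) = delete;
    Thing &operator=(const Thing &) = delete;
    int calc();
};

// --- what would live in thing.cc ---
struct ThingImpl { int x; };

// Hypothetical helper: view the raw buffer as the ThingImpl constructed in it.
static ThingImpl *as_impl(unsigned char *buf) {
#if defined(__cpp_lib_launder)
    return std::launder(reinterpret_cast<ThingImpl *>(buf));
#else
    return reinterpret_cast<ThingImpl *>(buf);
#endif
}

Thing::Thing() {
    // Catch a stale size or alignment at compile time instead of at run time.
    static_assert(sizeof(ThingImpl) <= kSize, "storage too small for ThingImpl");
    static_assert(alignof(ThingImpl) <= kAlign, "storage under-aligned for ThingImpl");
    new (storage) ThingImpl{5};                     // placement new into the buffer
}

Thing::~Thing() { as_impl(storage)->~ThingImpl(); } // explicit destructor call

int Thing::calc() { return as_impl(storage)->x + 1; }   // no pointer to chase
```

The trade-off is that the header now leaks the size of the implementation, so changing ThingImpl can force users to recompile, which is part of what pimpl was meant to avoid.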

Or you could just pass the Thing around by value, since it should be no larger than the pointer to the ThingImpl, though may require a little more than that (reference counting of the ThingImpl would defeat the optimisation, so you need some way of flagging the 'owning' Thing, which might require extra space on some architectures).

I disagree about your usage: you are not comparing like with like.

#include "thing.hh"
#include <cstdio>

int main()
{
    Thing *t = new Thing;                // 1
    printf("calc: %d\n", t->calc());

    OtherThing *t2 = make_other();       // 2
    printf("calc: %d\n", calc(t2));
}

At // 1 you in fact have two calls to new: one explicit, and one implicit (made by the constructor of Thing).
At // 2 you have one call to new, implicit (inside make_other).

You should allocate Thing on the stack. That probably would not remove the double-dereference instruction, but it could change its cost (by removing a cache miss).

However the main point is that Thing manages its memory on its own, so you can't forget to delete the actual memory, while you definitely can with the C-style method.

I would argue that automatic memory handling is worth an extra memory instruction, especially because, as has been said, the dereferenced pointer will probably be cached if you access it more than once, so the cost amounts to almost nothing.
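The Thing posted in the question has no destructor, so as written it never frees its ThingImpl. One way to get the automatic cleanup described here is to hold the implementation in a std::unique_ptr; the sketch below is an assumption about how that could look, not the asker's code, and use_thing is a hypothetical caller.

```cpp
#include <memory>

// --- what would live in thing.hh ---
struct ThingImpl;

class Thing {
    std::unique_ptr<ThingImpl> impl;   // owns the implementation object
public:
    Thing();
    ~Thing();                          // declared here, defined where ThingImpl is complete
    int calc();
};

// --- what would live in thing.cc ---
struct ThingImpl { int x; };

Thing::Thing() : impl(new ThingImpl{5}) {}
Thing::~Thing() = default;             // unique_ptr deletes the ThingImpl
int Thing::calc() { return impl->x + 1; }

// --- a caller ---
int use_thing() {
    Thing t;           // Thing itself is on the stack; only the impl is heap-allocated
    return t.calc();   // the impl is released automatically when t goes out of scope
}
```

The destructor has to be defined in the implementation file because std::unique_ptr needs the complete ThingImpl type at the point where it deletes it.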

Correctness is more important than performance.

Let the compiler worry about it. It knows far more about what is actually faster or slower than we do. Especially on such a minute scale.

Having items in classes has far, far more benefits than just encapsulation. PIMPL's a great idea, if you've forgotten how to use the private keyword.
