开发者

how to implement an atomic assignment on AIX/powerpc?

I'm porting a kernel extentsion to 32/64 bit AIX on multi-processor PowerPC, written in C. I don't need more than atomic read operation and atomic write operations (I have no use for fetch-and-add, compare-and-swap etc.) Just to clarify: to me, "atomicity" means not only "no interleaving", but also "visibility across multiple cores". The operations operate on pointers, so operations on 'int' variables are useless to me.

If I declare the variable "volatile", the C standard says the variable can be modified by unknown factors and is therefore not subject to optimizations.

From what I read, it seems that regular reads and writes are supposed to be non-interleaved, and the linux kernel souces seem to agree. it says:

__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));

stw is "store word", which is supposedly atomic, but I don't know what the "%U0%X0" means. I do not understand how this assembly instruction imposes visibility. When I compile my kernel extension, 'std' is used for the assignment I want, but it should also be atomic for a 64 bit machine, from what I read. I have very little understanding of the specifics of PowerPC and its instruction set, However I did not find in the assembly listing of the compiled file any memory barrier instructions ("sync" or "eieio").

The kernel provides the fetch_and_addlp() service which can be used to implement atomic read (v = fetch_and_addlp(&x, 0), for example).

So my questions are:

  1. is it enough to declare the variable 'volatile' to achieve read and write atomicity in the sense of visibility and no-interleaving?

  2. if the answer t开发者_开发问答o 1 is "no", how is such atomicity achieved?

  3. what is the meaning of "%U0%X0" in the Linux PowerPC atomic implementation?


There are idiosyncrasies in the GCC inline assembly syntax.

in the line,

__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));

the m is an output operand and the r is an input operand. The %1 and %0 refer to the argument order (0->m, 1->r)

the stw assembly instruction takes 2 arguments and the %U0%X0 are constraints on the arguments. These constraints are to force GCC to analyze the arguments and make sure you dont do something stupid. As it turns out, `U' is powerpc-specific (I'm used to the X64 constraint set :). The full list of constraints can be found in :

http://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints


I have managed to answer questions 1 and 2, but not 3:

  1. No, its not enough.
  2. Memory barriers are still required. I used the XLC built in __lwsync(). This should both prevents reordering by the processor and publishes the change to other processors.
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜