What counts as a flop?

Posted: 2023-01-13 06:15

Say I have a C program that in pseudoish is:

For i=0 to 10
    x++
    a=2+x*5
next

Is the number of FLOPs for this (1 [x++] + 1 [x*5] + 1 [2+(x*5)]) * 10 [loop], for 30 FLOPs? I am having trouble understanding what a flop is.

Note that the [...] indicate where my counts for "operations" come from.

For the purposes of FLOPS measurements, usually only additions and multiplications are included. Things like divisions, reciprocals, square roots, and transcendental functions are too expensive to include as a single operation, while things like loads and stores are too trivial.

In other words, your loop body contains 2 adds and 1 multiply, so (assuming x is floating point) each loop iteration is 3 ops; if you run the loop 10 times you've done 30 ops.
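To make that count concrete, here is a minimal sketch in C that tallies the adds and multiplies by hand while running the loop. The names `flop_count` and `count_loop_flops` are made up for illustration, and it assumes `x` and `a` are `float` and that the loop runs exactly `iterations` times.

```c
#include <stddef.h>

/* Hypothetical tally of the operations counted in this answer. */
struct flop_count {
    size_t adds;
    size_t muls;
};

/* Runs the loop from the question with float variables and tallies
   each floating-point add and multiply by hand. */
struct flop_count count_loop_flops(int iterations)
{
    struct flop_count n = {0, 0};
    float x = 0.0f;
    float a = 0.0f;

    for (int i = 0; i < iterations; i++) {
        x = x + 1.0f;          /* x++ : one add */
        n.adds++;

        float t = x * 5.0f;    /* x*5 : one multiply */
        n.muls++;

        a = 2.0f + t;          /* 2 + (x*5) : one add */
        n.adds++;
    }
    (void)a;  /* a is computed but never read, as in the question */
    return n;
}
```

Calling `count_loop_flops(10)` should give 20 adds and 10 multiplies, i.e. the 30 ops mentioned above. The loads, stores, and the integer loop counter are deliberately left out of the tally.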

Note that when measuring MIPS, your loop would be more than 3 instructions because it also includes loads and stores that the FLOPS measurement doesn't count.

FLOPS stands for floating-point operations per second. If you are dealing with integers then you don't have any floating-point operations in your code.

The posters have made it clear that FLOPS (detailed here) are concerned with floating-point (as opposed to integer) operations per second, so you have to count not only how many operations you're performing, but also the period of time in which they run.

If "x" and "a" are floats, you're making a good attempt at counting the number of operations in your code, but you'd have to check the object code to see how many floating-point instructions are actually used. E.g., if "a" is not subsequently used, an optimizing compiler might not bother to compute it.

Also, some floating-point operations (such as adding) might be much faster than others (such as multiplying), so a loop of only float adds could run at many more FLOPS than a loop of only float multiplies on the same machine.
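As a rough illustration of the points above (operations per unit of time, and the compiler possibly dropping unused work), here is a sketch in C. The function name `measure_flops` is hypothetical, and counting 3 operations per iteration is the assumption from the question, not something read back from the object code. The `volatile` qualifiers are one way to stop the compiler from deleting the arithmetic, but they also force extra loads and stores, so the number it returns is only a crude estimate.

```c
#include <time.h>

/* Hypothetical helper: executes `iterations` passes of the loop body
   (assumed to be 3 floating-point operations each) and returns an
   estimated rate in floating-point operations per second. */
double measure_flops(long iterations)
{
    /* volatile keeps the compiler from optimizing the work away */
    volatile float x = 0.0f;
    volatile float a = 0.0f;

    clock_t start = clock();
    for (long i = 0; i < iterations; i++) {
        x = x + 1.0f;          /* 1 add */
        a = 2.0f + x * 5.0f;   /* 1 multiply + 1 add */
    }
    clock_t stop = clock();

    double seconds = (double)(stop - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return 0.0;            /* run was too short for the clock to resolve */

    return (3.0 * (double)iterations) / seconds;
}
```

With only 10 iterations the elapsed time is far below what `clock()` can resolve, so you would need something like `measure_flops(50000000L)` to get a meaningful figure. The result will vary from machine to machine and from one compiler setting to another.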

FLOPs (the lowercase s indicates the plural of FLOP, per Martinho Fernandes' comment) refer to machine-language floating-point instructions, so the count depends on how many instructions your code compiles down to.

First off, if all of these variables are integers, then there are no FLOPs in this code. Let's assume, however, that your language recognizes all of these constants and variables as single-precision floating point variables (using single-precision makes loading the constants easier).

This code could compile to (on MIPS):

Assignment of variables: x is in $f1, a is in $f2, i is in $f3.
All other floating point registers are compiler-generated temporaries.
$f4 stores the loop exit condition of 10.0
$f5 stores the floating point constant 1.0
$f6 stores the floating point constant 2.0
$t1 is an integer register used for loading constants
    into the floating point coprocessor.

     lui $t1, *upper half of 0.0*
     ori $t1, $t1,  *lower half of 0.0*
     mtc1 $t1, $f3
     lui $t1, *upper half of 10.0*
     ori $t1, $t1,  *lower half of 10.0*
     mtc1 $t1, $f4
     lui $t1, *upper half of 1.0*
     ori $t1, $t1,  *lower half of 1.0*
     mtc1 $t1, $f5
     lui $t1, *upper half of 2.0*
     ori $t1, $t1,  *lower half of 2.0*
     mtc1 $t1, $f6
st:  c.gt.s $f3, $f4
     bc1t end
     add.s $f1, $f1, $f5
     lui $t1, *upper half of 5.0*
     ori $t1, $t1,  *lower half of 5.0*
     mtc1 $t1, $f2
     mul.s $f2, $f2, $f1
     add.s $f2, $f2, $f6
     add.s $f3, $f3, $f5
     j st
end: # first statement after the loop

So according to Gabe's definition, there are 4 FLOPs inside the loop (3x add.s and 1x mul.s). There are 5 FLOPs if you also count the loop comparison c.gt.s. Multiply this by 10 for a total of 40 (or 50) FLOPs used by the program.

A better optimizing compiler might recognize that the value of a isn't used inside the loop, so it only needs to compute the final value of a. It could generate code that looks like

     lui $t1, *upper half of 0.0*
     ori $t1, $t1,  *lower half of 0.0*
     mtc1 $t1, $f3
     lui $t1, *upper half of 10.0*
     ori $t1, $t1,  *lower half of 10.0*
     mtc1 $t1, $f4
     lui $t1, *upper half of 1.0*
     ori $t1, $t1,  *lower half of 1.0*
     mtc1 $t1, $f5
     lui $t1, *upper half of 2.0*
     ori $t1, $t1,  *lower half of 2.0*
     mtc1 $t1, $f6
st:  c.gt.s $f3, $f4
     bc1t end
     add.s $f1, $f1, $f5
     add.s $f3, $f3, $f5
     j st
end: lui $t1, *upper half of 5.0*
     ori $t1, $t1,  *lower half of 5.0*
     mtc1 $t1, $f2
     mul.s $f2, $f2, $f1
     add.s $f2, $f2, $f6

In this case, you have 2 adds and 1 comparison inside the loop (multiplied by 10 gives you 20 or 30 FLOPs), plus 1 multiplication and 1 addition outside the loop. Thus, your program now takes 22 or 32 FLOPs, depending on whether we count comparisons.
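The same comparison can be sketched at the C level. This is only an illustration of the two instruction sequences above, not compiler output: the names `tally`, `naive_loop`, and `hoisted_loop` are made up, the loop counter is a `float` to mirror the MIPS code, comparisons are not counted, and the loop is assumed to run 10 times as in the totals above.

```c
/* Hypothetical result type: final value of a plus a hand-kept FLOP tally. */
struct tally {
    float a;
    int flops;
};

/* Mirrors the first listing: a is recomputed on every iteration. */
struct tally naive_loop(float n)
{
    struct tally r = {0.0f, 0};
    float x = 0.0f;
    float i = 0.0f;

    while (i < n) {              /* comparison not counted */
        x = x + 1.0f;   r.flops++;   /* add.s */
        r.a = x * 5.0f; r.flops++;   /* mul.s */
        r.a = r.a + 2.0f; r.flops++; /* add.s */
        i = i + 1.0f;   r.flops++;   /* add.s (loop counter) */
    }
    return r;
}

/* Mirrors the second listing: a is computed once, after the loop. */
struct tally hoisted_loop(float n)
{
    struct tally r = {0.0f, 0};
    float x = 0.0f;
    float i = 0.0f;

    while (i < n) {              /* comparison not counted */
        x = x + 1.0f;   r.flops++;   /* add.s */
        i = i + 1.0f;   r.flops++;   /* add.s (loop counter) */
    }
    r.a = x * 5.0f;   r.flops++;     /* mul.s, outside the loop */
    r.a = r.a + 2.0f; r.flops++;     /* add.s, outside the loop */
    return r;
}
```

For `n = 10.0f` both functions should end with the same value of `a`, while the tallies come out as 40 and 22, matching the counts above when comparisons are left out.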

Is x an integer or a floating-point variable? If it's an integer, then your loop may not contain any flops.
