开发者

Ways to speed up a huge case statement? C++

I am running through a file and dealing with 30 or so different fragment types. So every time, I read in a fragment and compare it's type (开发者_开发技巧in hex) with those of the fragments I know. Is this fast or is there another way I can do this quicker?

Here is a sample of the code I am using:

// Iterate through the fragments and address them individually
    for(int i = 0; i < header.fragmentCount; i++) 
    {
        // Read in memory for the current fragment
        memcpy(&frag, (wld + file_pos), sizeof(struct_wld_basic_frag));

        // Deal with each frag type
        switch(frag.id) 
        {
        // Texture Bitmap Name(s)
        case 0x03:
            errorLog.OutputSuccess("[%i] 0x03 - Texture Bitmap Name", i);
            break;
        // Texture Bitmap Info
        case 0x04:
            errorLog.OutputSuccess("[%i] 0x04 - Texture Bitmap Info", i);
            break;
        // Texture Bitmap Reference Info
        case 0x05:
            errorLog.OutputSuccess("[%i] 0x05 - Texture Bitmap Reference Info", i);
            break;
        // Two-dimensional Object
        case 0x06:
            errorLog.OutputSuccess("[%i] 0x06 - Two-dimensioanl object", i);
            break;

It runs through about 30 of these and when there are thousands of fragments, it can chug a bit. How would one recommend I speed this process up?

Thank you!


If all of these cases are the same except for the format string, consider having a array of format strings, and no case, as in:

const char *fmtStrings[] = {
  NULL, NULL, NULL,
  "[%i] 0x03 - Texture Bitmap Name",
  "[%i] 0x04 - Texture Bitmap Info",
  /* ... */
};

// ...
errorLog.OutputSuccess(fmtStrings[i], i);
// (range checks elided)

This should be less expensive than a switch, as it won't involve a branch misprediction penalty. That said, the cost of this switch is probably less than the cost of actually formatting the output string, so your optimization efforts may be a bit misplaced.


The case statement should be very fast, because when your code is optimized (and even sometimes when it isn't) it is implemented as a jump table. Go into the debugger and put a breakpoint on the switch and check the disassembly to make sure that's the case.


I think performing the memcpy is probably causing a lot of overhead. Maybe use your switch statement on a direct access to your data at (wld + file_pos).


I'm skeptical that the 30 case statements are the issue. That's just not very much code compared to whatever your memcpy and errorLog methods are doing. First verify that your speed is limited by CPU time and not by disk access. If you really are CPU bound, examine the code in a profiler.


If your fragment identifiers aren't too sparse, you can create an array of fragment type names and use it as a lookup table.

static const char *FRAGMENT_NAMES[] = {
    0,
    0,
    0,
    "Texture Bitmap Name", // 0x03
    "Texture Bitmap Info", // 0x04
    // etc.
};

...

const char *name = FRAGMENT_NAMES[frag.id];

if (name) {
    errorLog.OutputSuccess("[%i] %x - %s", i, frag.id, name);
} else {
    // unknown name
}


If your log statements are always strings of the form "[%i] 0xdd - message..." and frag.id is always an integer between 0 and 30, you could instead declare an array of strings:

std::string messagesArray[] = {"[%i] 0x00 - message one", "[%i] 0x01 - message two", ...}

Then replace the switch statement with

errorLog.OutputSuccess(messagesArray[frag.id], i);


If the possible fragment type values are all contiguous, and you don't want to do anything much more complex than printing a string upon matching, you can just index into an array, e.g.:

  const char* typeNames[] = {"Texture Bitmap Name", "Texture Bitmap Info", ...};

  /* for each frag.id: */
  if (LOWER_LIMIT <= frag.id && frag.id < UPPER_LIMIT) {
    printf("[%i] %#02x - %s\n", i, frag.id, typeNames[frag.id-LOWER_LIMIT]);
  } else {
   /* complain about error */
  }


It's impossible to say for sure without seeing more, but it appears that you can avoid the memcpy, and instead use a pointer to walk through the data.

struct_wld_basic_frag *frag = (struct_wld_basic_frag *)wld;

for (i=0; i<header.fragmentCount; i++)
    errorlog.OutputSuccess(fragment_strings[frag[i].id], i);

For the moment, I've assumed an array of strings for the different fragment types, as recommended by @Chris and @Ates. Even at worst, that will improve readability and maintainability without hurting speed. At best, it might (for example) improve cache usage, and give a major speed improvement -- one copy of the code to call errorlog.outputSuccess instead of 30 separate copies could make room for a lot of other "stuff" in the cache.

Avoiding copying data every time is a lot more likely to do real good though. At the same time, I should probably add that it's possible for this to cause a problem -- if the data isn't correctly aligned in the original buffer, attempting to use the pointer won't work.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜