
How to optimize or reduce RAM size in embedded system software?

I am working on embedded software projects in the automotive domain. In one of my projects, the application software consumes almost 99% of RAM memory. The actual RAM available is 12 KB. We use the TMS470R1B1 Titan F05 microcontroller. I have done some optimisation, like finding unused messages in the software and deleting them, but it still hasn't reduced RAM usage enough. Could you please suggest some good ways to reduce RAM through software optimisation?


Unlike speed optimisation, RAM optimisation might be something that requires "a little bit here, a little bit there" all through the code. On the other hand, there may turn out to be some "low hanging fruit".

Arrays and Lookup Tables

Arrays and look-up tables can be good "low-hanging fruit". If you can get a memory map from the linker, check that for large items in RAM.

Check for look-up tables that haven't used the const declaration properly, which puts them in RAM instead of ROM. Especially look out for look-up tables of pointers, which need the const on the correct side of the *, or may need two const declarations. E.g.:

const my_struct_t * param_lookup[] = {...};  // Table is in RAM!
my_struct_t * const param_lookup[] = {...};  // In ROM
const char * const strings[] = {...};    // Two const may be needed; also in ROM

Stack and heap

Perhaps your linker config reserves large amounts of RAM for heap and stack, larger than necessary for your application.

If you don't use heap, you can possibly eliminate that.

If you measure your stack usage and it's well under the allocation, you may be able to reduce the allocation. For ARM processors, there can be several stacks, one for each of several operating modes, and you may find that the stacks allocated for the exception or interrupt operating modes are larger than needed.
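If you want a measurement to justify shrinking the stack allocation, one common technique is to fill the stack region with a known pattern at reset and later check how much of it was never overwritten. A minimal sketch, assuming an ARM-style descending stack; the __stack_start and __stack_size symbols are invented here, so substitute whatever names your linker file actually exports:

#include <stdint.h>
#include <stddef.h>

#define STACK_FILL_PATTERN  0xA5A5A5A5u

extern uint32_t __stack_start[];   /* hypothetical: lowest address of the stack region      */
extern uint32_t __stack_size;      /* hypothetical: linker symbol whose address is the size */

/* Call as early as possible after reset, while the stack is still shallow. */
void stack_fill(void)
{
    volatile uint32_t marker;              /* sits near the current stack pointer */
    uint32_t *p = __stack_start;

    /* The stack grows downwards, so everything below the currently active
       frames is unused; fill only that part to avoid trashing live data. */
    while (p < (uint32_t *)&marker) {
        *p++ = STACK_FILL_PATTERN;
    }
}

/* Call at the end of a test run: counts untouched words from the bottom of the
   stack upwards, i.e. the worst-case headroom observed since stack_fill(). */
size_t stack_unused_bytes(void)
{
    size_t words = (size_t)&__stack_size / sizeof(uint32_t);
    size_t untouched = 0;

    while (untouched < words && __stack_start[untouched] == STACK_FILL_PATTERN) {
        untouched++;
    }
    return untouched * sizeof(uint32_t);
}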

Other

If you've checked for the easy savings, and still need more, you might need to go through your code and save "here a little, there a little". You can check things like:

Global vs local variables

Check for unnecessary use of static or global variables, where a local variable (on the stack) can be used instead. I've seen code that needed a small temporary array in a function, where the array was declared static, evidently because "it would take too much stack space". If this happens often enough in the code, making such variables local again would actually reduce overall memory usage. It might require an increase in the stack size, but it saves more memory in reduced global/static variables. (As a side benefit, the functions are more likely to be re-entrant and thread-safe.)
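A hypothetical before/after of the same idea; the function and buffer are made up, but the point is that the 64 bytes now exist only while the function is running:

#include <stdint.h>
#include <string.h>

void format_frame(const uint8_t *payload, uint8_t len)
{
    /* Before: static uint8_t scratch[64];  -- 64 bytes of RAM reserved forever. */
    uint8_t scratch[64];    /* now on the stack, only for the duration of the call */

    memset(scratch, 0, sizeof(scratch));
    if (len > sizeof(scratch) - 1u) {
        len = sizeof(scratch) - 1u;
    }
    scratch[0] = len;
    memcpy(&scratch[1], payload, len);
    /* ... transmit scratch ... */
}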

Smaller variables

Variables that can be smaller, e.g. int16_t (short) or int8_t (char) instead of int32_t (int).
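A made-up illustration: when the value ranges are known, fixed-width types from <stdint.h> (combined with ordering members largest-first to avoid padding) can cut a structure's size substantially:

#include <stdint.h>

/* Before: every field a full int (4 bytes each on this target) = 16 bytes. */
struct sensor_state_wide {
    int temperature;      /* -40..125 */
    int error_count;      /* 0..255   */
    int message_id;       /* 0..2047  */
    int is_valid;         /* 0 or 1   */
};

/* After: sized to the actual ranges and ordered to avoid padding = 6 bytes. */
struct sensor_state_narrow {
    int16_t  temperature;
    uint16_t message_id;
    uint8_t  error_count;
    uint8_t  is_valid;
};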

Enum variable size

enum variable size may be bigger than necessary. I can't remember what ARM compilers typically do, but some compilers I've used in the past by default made enum variables 2 bytes even though the enum definition really only required 1 byte to store its range. Check compiler settings.
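If the compiler option can't be changed (GCC has -fshort-enums; other toolchains have their own switches, so check your manual), a common workaround is to keep the enum definition but store its values in a uint8_t. A sketch with an invented gear_t example, including a build-time check that the values still fit in one byte:

#include <stdint.h>

typedef enum {
    GEAR_PARK,
    GEAR_REVERSE,
    GEAR_NEUTRAL,
    GEAR_DRIVE
} gear_t;

/* Store the value in a uint8_t instead of a gear_t variable... */
typedef struct {
    uint8_t gear;         /* holds a gear_t value, always 1 byte */
    uint8_t door_open;
} vehicle_state_t;

/* ...and fail the build if someone adds a value that no longer fits a byte. */
typedef char gear_fits_in_byte[(GEAR_DRIVE <= 0xFF) ? 1 : -1];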

Algorithm implementation

Rework your algorithms. Some algorithms have a range of possible implementations with a speed/memory trade-off. E.g. AES encryption can use on-the-fly key calculation, which means you don't have to keep the entire expanded key in memory. That saves memory, but it's slower.
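AES is too big to show here, but the same speed/memory trade-off appears in something as small as a CRC: the sketch below computes CRC-32 bit by bit instead of keeping the 256-entry, 1 KB lookup table that the fast table-driven version needs (in RAM if the table is generated at start-up, in ROM if it is declared const). Slower, but the table disappears:

#include <stdint.h>
#include <stddef.h>

uint32_t crc32_bitwise(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* 0xEDB88320 is the reflected CRC-32 polynomial. */
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return ~crc;
}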


Deleting unused string literals won't have any effect on RAM usage because they aren't stored in RAM but in ROM. The same goes for code.

What you need to do is cut back on actual variables and possibly the size of your stack(s). I'd look for arrays that can be resized and for unused variables. Also, it's best to avoid dynamic allocation because of the danger of memory fragmentation.

Aside from that, you'll want to make sure that constant data such as lookup tables are stored in ROM. This can usually be achieved with the const keyword.


Make sure the linker produces a MAP file - it will show you where the RAM is used. Sometimes you can find things like string literals/constants that are kept in RAM. Sometimes you'll find there are unused arrays/variables put there by someone else.

If you have the linker map file, it's also easy to attack the modules that use the most RAM first.


Here are the tricks I've used on the Cell:

  • Start with the obvious: squeeze 32-bit words into 16s where possible, rearrange structures to eliminate padding, cut down on slack in any arrays. If you've got any arrays of more than eight structures, it's worth using bitfields to pack them down tighter (see the first sketch after this list).
  • Do away with dynamic memory allocation and use static pools. A constant memory footprint is much easier to optimize and you'll be sure of having no leaks (see the second sketch after this list).
  • Scope local allocations tightly so that they don't stay on stack longer than they have to. Some compilers are very bad at recognizing when you're done with a variable, and will leave it on the stack until the function returns. This can be bad with large objects in outer functions that then eat up persistent memory they don't have to as the outer function calls deeper into the tree.
  • alloca() doesn't clean up until a function returns, so can waste stack longer than you expect.
  • Enable function body and constant merging in the compiler, so that if it sees eight different consts with the same value, it'll put just one in the text segment and alias them with the linker.
  • Optimize executable code for size. If you've got a hard realtime deadline, you know exactly how fast your code needs to run, so if you've any spare performance you can make speed/size tradeoffs until you hit that point. Roll loops, pull common code into functions, etc. In some cases you may actually get a space improvement by inlining some functions, if the prolog/epilog overhead is larger than the function body.

The last one is only relevant on architectures that store code in RAM, I guess.
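A hypothetical record illustrating the first bullet: reordering members removes padding, and bitfields squeeze small-range fields into a single byte. Exact sizes are compiler/ABI dependent:

#include <stdint.h>

/* Naive layout: 1 + (3 pad) + 4 + 1 + 1 + (2 pad) = 12 bytes per element. */
struct msg_loose {
    uint8_t  valid;
    uint32_t timestamp;
    uint8_t  channel;     /* only 0..15 is ever used */
    uint8_t  priority;    /* only 0..7 is ever used  */
};

/* Reordered + bitfields: typically 8 bytes per element,
   i.e. 120 bytes saved on a 30-element array.           */
struct msg_tight {
    uint32_t timestamp;
    uint8_t  valid    : 1;
    uint8_t  channel  : 4;
    uint8_t  priority : 3;
};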
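For the second bullet, a minimal fixed-size block pool that replaces malloc/free with a constant footprint. The block size, count and function names are invented here, and real code would add interrupt locking if allocation can happen from ISRs:

#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE   32u
#define POOL_BLOCK_COUNT  8u

/* uint32_t storage keeps every block word-aligned. */
static uint32_t pool_mem[POOL_BLOCK_COUNT][POOL_BLOCK_SIZE / sizeof(uint32_t)];
static uint8_t  pool_used[POOL_BLOCK_COUNT];   /* 0 = free, 1 = in use */

void *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        if (!pool_used[i]) {
            pool_used[i] = 1;
            return pool_mem[i];
        }
    }
    return NULL;   /* pool exhausted: a constant, known worst case */
}

void pool_free(void *block)
{
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        if (block == pool_mem[i]) {
            pool_used[i] = 0;
            return;
        }
    }
}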


With respect to functions, the following are ways to optimise RAM usage:

  1. Make sure the number of parameters passed to a function is carefully analysed. On ARM architectures, as per the AAPCS (ARM Architecture Procedure Call Standard), a maximum of 4 parameters can be passed in registers; the rest are pushed onto the stack.

  2. Also consider using a global variable instead of a parameter for data that is passed to a frequently called function with the same value every time.

  3. The deeper the function calls, the heavier the use of the stack. Use a static analysis tool to find the worst-case function call path and look for ways to reduce it. When function A calls B, B calls C, C calls D, D calls E, and so on ever deeper, registers can't be used at every level to pass parameters, so the stack will be used.

  4. Look for opportunities to club two parameters into one wherever applicable. Remember that all ARM registers are 32-bit, so further packing is also possible.

     void abc(bool a, bool b, uint16_t c, uint32_t d, uint8_t e);  // uses registers and the stack
     void abc(uint8_t ab, uint16_t c, uint32_t d, uint8_t e);      // first 2 params clubbed, so all 4 parameters can be passed in registers

  5. Have a re-look at nested interrupts. In any architecture we have scratch-pad registers and preserved registers; the preserved registers need to be saved before servicing the interrupt. In the case of nested interrupts, a lot of stack space is needed to back up the preserved registers to and from the stack.

  6. If objects such as structures are passed to a function by value, a lot of data (depending on the struct size) is pushed onto the stack at every call, which eats up stack space easily. This can be changed to pass by reference (see the sketch after this list).
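A sketch for point 6; the can_frame_t structure is made up, but it shows the difference between copying roughly 70 bytes onto the stack per call and passing a single 32-bit pointer:

#include <stdint.h>

typedef struct {
    uint8_t  data[64];
    uint32_t timestamp;
    uint16_t id;
} can_frame_t;

/* By value: roughly sizeof(can_frame_t) bytes of stack per call. */
uint16_t checksum_by_value(can_frame_t frame);

/* By reference: one 32-bit pointer; const documents that it is read-only. */
uint16_t checksum_by_ref(const can_frame_t *frame)
{
    uint32_t sum = frame->timestamp + frame->id;

    for (uint16_t i = 0; i < sizeof(frame->data); i++) {
        sum += frame->data[i];
    }
    return (uint16_t)sum;
}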

Regards,

Barani Kumar Venkatesan


Adding to the previous answers.

If you are running your program from RAM for faster execution, you can create a user-defined section containing all the initialization routines that you are sure won't run more than once after your system boots up. After all the initialization functions have executed, you can reuse that region for the heap.

The same can be applied to data sections that are no longer needed after a certain stage in your program.
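A GCC-style sketch of the idea (TI's toolchain would use #pragma CODE_SECTION and the linker command file instead of __attribute__; the section and symbol names here are invented): collect the one-shot init code into its own section so the linker can place it in a RAM region that is later handed over to the heap.

#include <stdint.h>

__attribute__((section(".init_once")))
void hw_init(void)
{
    /* ...configure clocks, pins, peripherals; runs exactly once at boot... */
}

__attribute__((section(".init_once")))
void calibrate_sensors(void)
{
    /* ...one-shot calibration... */
}

/* After these have run, the linker-defined bounds of .init_once (hypothetical
   symbols) can be handed to the allocator as extra heap space.              */
extern uint8_t __init_once_start[];
extern uint8_t __init_once_end[];

void reclaim_init_region(void)
{
    /* e.g. heap_add_region(__init_once_start,
                            (size_t)(__init_once_end - __init_once_start));
       where heap_add_region() is whatever your allocator provides.          */
}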
