How can I run ARM code from external memory?
I am using an LPC2132 ARM chip to develop a program. However, my program has grown larger than the space on the chip.
How can I connect my chip to some sort of external memory chip to hold additional executable code? Is this possible? If not, what do people normally do when they run out of chip space?
Mark's answer is a good one. One question -- are you running short of RAM, or flash, or both? The solutions / answers might depend...
A couple of years ago, I found myself in a similar situation, running out of room (both flash and RAM) on the LPC2148. Of the pin-compatible parts, that was the one with the largest flash and RAM, so it was an unfortunate situation of "make do with what you have". And as Mark said, the wrong chip was chosen (well, actually, the requirements and functionality grew beyond what the chip was originally supposed to do... I'm sure no one else has ever experienced that ;-) )
Anyway, I found myself in a "battle of bytes". Here are the things I remember doing (mind you, a lot of this code I inherited from the customer...)
- [+RAM, -ROM] make anything const that can be
- [+ROM] use Thumb where possible (see Mark's comments)
- [+ROM] use look-up tables where possible
- [+ROM] re-factor & combine common functionality (esp. convert heavily-used function-like macros into subroutines)
- [+ROM] anything that's a function called from one place - put it directly in-line instead of in a function
- [+ROM, +RAM] change all floating point usage to fixed-point (see the sketch after this list)
- [+ROM, +RAM] eliminate unused variables + constants (use lint & linker map to find/eliminate/verify)
- [+ROM] try replacing switch w/ if/else, and vice-versa
- [+ROM] make sure your linker is configured to eliminate "dead" (unused) code
- [+ROM] re-work strings + constants so that identical "things" are defined in only one place
- [+ROM] (gasp, sigh) replace data hiding functions w/ macros (or inline if you can) -- beware of preemption, race conditions, mutual exclusion, etc...
- [+ROM, +RAM] - eliminate all debugging / temp code - usually there are I/O pin toggles/prints/etc... that aren't conditionally compiled out
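To make the const-table and fixed-point bullets concrete, here is a minimal sketch; the names and values are illustrative, not from the original project. A Q16.16 fixed-point type replaces float math with integer multiplies and shifts, and marking a table const lets the linker leave it in flash instead of copying it into scarce RAM at startup.

```c
/*
 * Minimal sketch (illustrative names, not from any real project):
 * Q16.16 fixed point in place of float, and a const table kept in flash.
 */
#include <stdint.h>

typedef int32_t q16_16_t;                    /* signed 16.16 fixed point */

#define Q16_ONE        (1L << 16)            /* 1.0 in Q16.16 */
#define INT_TO_Q16(x)  ((q16_16_t)((x) * Q16_ONE))

static inline q16_16_t q16_mul(q16_16_t a, q16_16_t b)
{
    /* 64-bit intermediate avoids overflow before shifting back down */
    return (q16_16_t)(((int64_t)a * b) >> 16);
}

/* const keeps this table in flash; quarter-wave sine at 0/30/60/90 degrees */
static const q16_16_t sine_table[4] = {
    0, Q16_ONE / 2, 56756 /* ~0.866 */, Q16_ONE
};
```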
Man there are a bunch more but I have to run to a meeting. All I remember is that it was progress, tens & hundreds of bytes at a time, that ended up yielding some pretty significant savings. I ended up recovering about 20% from flash & RAM, and that was enough to complete the project. It took me maybe ~2 weeks to clean this stuff up, but the cost savings were well worth it.
I'll try to come back & post more tactics, I just can't right now. For the record, I've been in situations where I had to load/swap code in & out of RAM at run-time from serial flash as needed (algorithms, tables, etc..) and it was awful. First, try to tighten your current code as much as possible. It's also a somewhat intellectual exercise and it forces you to get under the hood & understand what the hell your compiler is really doing.
Last point: write good tight code throughout the project, but do this kind of optimization at the end, when it's necessary and a business case justifies it.
Looking at the datasheet for that part available here:
http://www.keil.com/dd/docs/datashts/philips/lpc2131_32_34_36_38.pdf
It doesn't appear to have an external bus interface for memory-mapped flash or SDRAM, nor does it have an MMU.
It does have SPI ports, which could be used to interface to SD cards, EEPROM, or serial flash for off-chip storage, but these would not be memory mapped; you would have to handle moving code segments in and out yourself, and given the very limited RAM on that chip, that would be difficult.
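As a rough sketch of what "moving code segments in and out" can look like: copy an overlay blob from serial flash into a RAM buffer and call it through a function pointer. Here spi_flash_read() is a hypothetical driver call, the buffer size is arbitrary, and the overlay would have to be built as position-independent code (or linked to the buffer's address).

```c
/* Rough sketch only -- driver call and sizes are assumptions. */
#include <stdint.h>
#include <stddef.h>

extern void spi_flash_read(uint32_t flash_addr, void *dst, size_t len); /* hypothetical driver */

#define OVERLAY_MAX_SIZE 2048
static uint32_t overlay_buf[OVERLAY_MAX_SIZE / 4];   /* word-aligned RAM region */

typedef int (*overlay_entry_t)(int arg);

int run_overlay(uint32_t flash_addr, size_t len, int arg)
{
    if (len > sizeof(overlay_buf))
        return -1;

    spi_flash_read(flash_addr, overlay_buf, len);

    /* Entry point assumed at the start of the blob; casting a data pointer
     * to a function pointer is implementation-defined but is the usual
     * approach on bare-metal ARM. For Thumb code the address would also
     * need its low bit set. */
    overlay_entry_t entry = (overlay_entry_t)(void *)overlay_buf;
    return entry(arg);
}
```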
It may be "enough" to move data into the external storage and keep only code in the on-chip flash; this would simplify your challenge at the expense of increased latency when accessing that data. You can also look at using the Thumb instruction set, which reduces code size at the expense of some speed, and at having the compiler optimize for code density instead of speed.
If not, what do people normally do when they run out of chip space?
The unfortunate answer here is you chose the wrong chip for your application and/or need to rethink how your application is architected to make it fit in this chip.
EDIT:
It also looks like there are some almost pin-compatible parts with more resources. The LPC2138 has 512 kB of flash and 32 kB of RAM (compared to 64/16 on your part). There are also a couple of sizes in between the two available.
A quick glance at the pinouts suggests the only difference is a second on-board ADC that is multiplexed with some of the other pins. Obviously look into this fully, but it looks like you could just swap in one of the higher-end parts without modifying the rest of the board.
If you have to connect external memory (meaning hardware changes are necessary), why not use a chip with bigger memory. In fact some chips will be fully pin compatible and have more flash, so you avoid redesign (only chip replacement).
If not, what do people normally do when they run out of chip space?
The first thing they'd do would be to optimise their application. I am not talking about running the compiler optimiser (although that may be part of the solution), but applying techniques such as Dan has suggested. Look at the space efficiency of your data structures and algorithms; there is often a trade-off between space and execution speed, and while you may not need the fastest possible algorithm, you do need to save space.
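For instance (purely illustrative structures, not from any real project), on a 32-bit ARM target simply reordering structure members, largest first, removes alignment padding and recovers RAM:

```c
#include <stdint.h>

struct sample_padded {      /* 12 bytes: padding after 'flag' and 'id' */
    uint8_t  flag;
    uint32_t count;
    uint8_t  id;
    uint16_t value;
};

struct sample_packed {      /* 8 bytes: no padding needed */
    uint32_t count;
    uint16_t value;
    uint8_t  flag;
    uint8_t  id;
};
```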
You need to know your target and whether it is feasible in the first instance. By how much does your application exceed the available space, and how large is it currently? The linker map or build log should tell you this. If you have not addressed optimisation yet, I have seldom seen an application that could not have at least 5% knocked off relatively painlessly, and more with concerted effort even before using the optimiser.
The linker map will also tell you the amount of memory used by each function/module, so you can target your optimisation where it will have the greatest effect. You may also be surprised from the map file at what library code has become linked, and you could ask yourself why and whether it could be eliminated.
Using compiler optimisation limits the ability to use a debugger easily, but you do not need to optimise every module. So if you need to debug but also use compiler optimisation, optimise all modules except the ones you are debugging at any particular time.
Be aware however that code that appears to work but is flawed or uses undefined language behaviour may change its behaviour (i.e. fail) following compiler optimisation; leaving you with code that fails, but cannot be debugged. The best strategy to help avoid this situation is to build the code with the maximum warning level your compiler allows (and set warnings to errors), and eliminate all warnings. If possible use a static analysis tool such as Lint.
If you have not already done it, the quickest and most drastic saving in your case would likely be to compile to the Thumb rather than ARM instruction set.
Finally when all else fails, your part is a member of a family of devices LPC2131/32/34/36/38, the 'largest' part having 512K Flash/32K RAM, so you could change to a different part in the same family and largely retain software compatibility. Check the datasheet if you also need pin compatibility.
Go for a TI OMAP processor. These run code from DDR2 or DDR3 memory, and some models operate at 1 GHz. The only drawback is that these processors are all BGA packages, and DDR2/3 PCB layout is not simple or easy to get right the first time.
Otherwise, you're going to have to develop some sort of swappable-module scheme for your code and connect an external memory chip to hold those modules.