Interpreting jump tables / branch tables
I've been slowly picking things up with assembly. I am working on a Canon Rebel T1i, here is a small snippet of a code flow chart that I am trying to understand. To my knowledge, I believe the c开发者_运维问答amera has a 132MHz ARM v5 processor:
http://i.imgur.com/PtWC9.png
I have searched the bottom of google attempting to understand how jump tables work, and no matter how much I read I just can't connect things together to understand it. I understand a jump table is similar to a case statement, but I don't understand just how it moves through the table.
Ex: in this example there is only one CMP operation, so I don't understand how exactly this is working. Any help will be greatly appreciated!!
I dont think you have enough info on the screen shot to understand how it connects to your question. But a jump table in general...
In C think of an array of functions, and you have initialized each element in the array of functions, at some point later your code makes some decision and uses an index to choose one of those functions. As you mentioned a case statement, could be implemented that way but that would be the exception not the rule, all depends on the variable being used in the switch and the size/width/nature of the elements in the case statement.
You have been picking up assembly, so you understand registers, doing math with registers, storing things in registers, etc. The program counter can be used by many instructions as just another register, the difference is when you write something to it, you change what instruction is executed next.
Lets try a case statement example:
switch(bob&3)
{
case 0: ted(); break;
case 1: joe(); break;
case 2: jim(); bob=2; break;
case 3: tim(); bob=7; break;
}
What you COULD (probably would not) do is:
casetable:
.word a
.word b
.word c
.word d
caseentry:
ldr r1,=bob
ldr r0,[r1]
ldr r2,=casetable
and r0,#3
ldr pc,[r2,r0,lsl #2]
a:
bl ted
b caseend
b:
bl joe
b caseend
c:
bl jim
mov r0,#2
ldr r1,=bob
str r0,[r1]
b caseend
d:
bl tim
mov r0,#7
ldr r1,=bob
str r0,[r1]
b caseend
caseend:
So the four words after the label casetable: are the addresses where the code starts for each of the cases, case0 starts at a: case1 code starts at b: and so on. What we need to do is take the variable used by the switch statement and mathematically compute an address for the item in the table. Then we need to load the address from the table into the program counter. Writing to the program counter is the same as performing a jump.
So the C sample was crafted intentially to make this easy. First load the contents of the bob variable into r0. And it with 3. The items in the jump table are 32 bit addresses, or 4 bytes so we need to multiply r0 times 4 to get the offset in the table. A shift left of 2 is the same as a multiply by 4. And we need to add r0<<2 to the base address for the jump table. So essentially we are computing address_of(casetable)+((bob&3)<<2) The read memory at that computed address and load that value into the program counter.
With arm (you mentioned this was arm) you can do much of this in one instruction:
ldr pc,[r2,r0,lsl #2]
Load into the register pc, the contents of the memory location [r2+(r0<<2)]. r2 is the address of casetable, and r0 is bob&3.
Basically a jump table boils down to mathmatically computing an offset into a table of addresses. The table of addresses are addresses you want to jump/branch to depending on one of the parameters used in the math operation, in my example above bob is that variable. And the addresses a,b,c,d are the address choices I want to pick from based on the contents of bob. There are a zillion fun and interesting ways to do this sort of thing, but it all boils down to computing at runtime the address to branch to, and shoving that address into the program counter in a way that causes the particular processor to perform what is essentially a jump.
Note another, perhaps easier to read way to compute and jump in my example would be:
mov r3,r0,lsl #2
add r3,r2
bx r3
The cores that support thumb use the bx instruction with a register often, normally you see bx lr to return from a branch link (subroutine) call. bx lr means pc = lr. bx r3 means pc = r3.
I hope this is what you were asking about, if I have misunderstood the question, please elaborate.
EDIT:
Looking at the code on your screen shot.
cmp r0,#4
addls pc,pc,r0,lsl #2
The optional math (ADDLS add if lower or same) computes the new program counter value (a jump table is a computation stored in the program counter) based on the program counter itself plus an offset r0 times 4. For arm processors, at the time of execution, the program counter is two instructions ahead. so, mixing those two lines of code and a portion of my example:
cmp r0,#4
addls pc,pc,r0,lsl #2
ldr pc,=a
ldr pc,=b
ldr pc,=c
ldr pc,=d
...
At the time addls is executed the program counter contains the address for the ldr pc,=b instruction. So if r0 contains a 0 then 0<<2 = 0, pc plus 0 would branch to the ldr pc,=b instruction then that instruction causes a branch to the b: label. if r0 contained a 1 at the time of addls then you would execute the ldr pc,=c instruction next and so on. You can make a table as deep as you want this way. Also note that since the add is conditional, if the condition does not happen you will execute that first instruction after the addls, so maybe you want that to be an unconditional branch to branch over the table, or branch backward an loop or maybe it is a nop so that you fall into the first jump, or what I did above is have it branch to some other place. So to understand what is going on you need to example the instructions that follow the addls to figure out what the possible jump table destinations are.
精彩评论