
How to optimize this function with the compiler?

I have been taking a course on compilers and tools this semester. I have read up to intermediate code generation and have also seen the DAG representation used for optimization. One thing is clear about compilers: whatever intermediate code is produced, it has to be mapped to the instruction set of the target system so that we can run our program.

Let us say that I write a compiler for a particular architecture (say A), where addition of two numbers is ADD R1,R2,R3 (from A's instruction set), with R1 the destination and R2, R3 the sources. I have mapped to this instruction, so whenever I want to add two numbers (regardless of type, for simplicity) as represented in the intermediate code, I emit the ADD opcode.
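For concreteness, here is a rough Python sketch of the mapping I mean; the IR tuple layout and register names are just made up:

    # Minimal sketch: mapping a three-address IR "add" onto architecture
    # A's ADD Rdest,Rsrc1,Rsrc2. The IR tuple format is hypothetical.
    def emit(ir_instruction):
        op, dest, src1, src2 = ir_instruction
        if op == "add":
            return f"ADD {dest},{src1},{src2}"
        raise NotImplementedError(f"no mapping for IR op {op!r}")

    # IR for "t1 = t2 + t3", with temporaries already assigned to registers:
    print(emit(("add", "R1", "R2", "R3")))  # -> ADD R1,R2,R3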

Assume a new architecture arrives on the market, in which addition of two numbers uses a different instruction, say AD R1,R2,R3. Now obviously my compiler won't add the numbers!

Now my question is: when I write the compiler for my programming language, do I have to add every architecture with its instruction set so that my compiler does what it needs to do correctly? If so, what methods are there to mitigate this? Because adding every instruction set would badly hurt my compiler's performance.

Correct me if I'm wrong!


You build a compiler for a specific instruction set, which is a subset of a chosen instruction set architecture (ISA). (Many instruction sets have I/O instructions, but compilers almost never generate these.) There may be several different processor designs that execute this ISA, all of which will work with the specific instruction subset you have chosen.

There are three kinds of evolutionary events that occur in practice.

  • You determine your compiler would be better if it used more instructions from the ISA. For instance, you might decide that the MULTIPLY instruction would allow your compiler to generate faster code than the subroutine call you have used for multiplication in the past. In this case, you extend your compiler slightly (see the sketch after this list).

  • The owners of the ISA (Intel, AMD, IBM, ...) add entire new sets of instructions to the ISA, for instance, data-parallel operations on a cache line of data ("SIMD instructions"). You can decide to add some of these to your compiler. This event can be difficult, as new families of instructions generally make different assumptions about how data is laid out and processed.

  • You find some completely different ISA that you want to handle. In this case, you are going to rebuild the back end of your compiler, as the instruction set is completely different in terms of what registers exist, how they are used, etc.
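Here is a hedged sketch of that first kind of event; the feature flag, argument registers, and __mul helper name are all hypothetical:

    # Sketch of extending a code generator when the ISA gains a usable
    # instruction. Mnemonics and the __mul routine name are invented.
    def emit_multiply(dest, src1, src2, isa_has_multiply):
        if isa_has_multiply:
            # New path: one instruction.
            return [f"MULTIPLY {dest},{src1},{src2}"]
        # Old path: move arguments into place and call a helper routine.
        return [f"MOV A0,{src1}",
                f"MOV A1,{src2}",
                "CALL __mul",
                f"MOV {dest},RET"]

    print(emit_multiply("R1", "R2", "R3", isa_has_multiply=True))
    print(emit_multiply("R1", "R2", "R3", isa_has_multiply=False))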

Compiler builders often build their compilers to operate in stages. A late stage, before generation of actual machine code, typically represents the program as an abstract set of operations on fairly low-level data (e.g., operations on fixed-word-size values) with fairly standard abstract operations (ADD, MULTIPLY, COMPARE, JUMP, CALL, STORE, LOAD, ...) that have no commitments to the actual ISA (especially no commitments to registers or specific machine instructions). Doing it this way allows higher-level optimizations to be done independent of the ISA; just think of this as good modularization. The last few stages are specialized to the ISA: usually one stage to allocate registers, followed by a stage that pattern-matches actual instructions against the abstract ones.
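A toy illustration of this staging, with every name invented: constant folding runs on the abstract operations with no knowledge of the ISA, and only the final stage consults a per-ISA instruction table:

    # Toy staged backend. Abstract ops carry no ISA commitments; only
    # the last stage knows instruction mnemonics. All names are invented.
    ISA_TABLE = {
        "A": {"add": "ADD {d},{s1},{s2}", "mul": "MUL {d},{s1},{s2}"},
        "B": {"add": "AD {d},{s1},{s2}",  "mul": "MULTIPLY {d},{s1},{s2}"},
    }

    def fold_constants(abstract_ops):
        """ISA-independent optimization: evaluate adds of two literals."""
        out = []
        for op, d, s1, s2 in abstract_ops:
            if op == "add" and isinstance(s1, int) and isinstance(s2, int):
                out.append(("const", d, s1 + s2, None))
            else:
                out.append((op, d, s1, s2))
        return out

    def select_instructions(abstract_ops, isa):
        """ISA-specific last stage: match abstract ops to real mnemonics."""
        table = ISA_TABLE[isa]
        asm = []
        for op, d, s1, s2 in abstract_ops:
            if op == "const":
                asm.append(f"LOADI {d},{s1}")  # invented immediate load
            else:
                asm.append(table[op].format(d=d, s1=s1, s2=s2))
        return asm

    ops = [("add", "R1", 2, 3), ("mul", "R2", "R1", "R3")]
    print(select_instructions(fold_constants(ops), "A"))
    print(select_instructions(fold_constants(ops), "B"))

Note how fold_constants would not change at all if a third ISA were added; only ISA_TABLE grows.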

There are entire books written on optimizing at the higher level, and other books written on the final code generation stages (and often books that address both in separate chapters). Both Aho & Ullman's Dragon Book and Cooper & Torczon's Engineering a Compiler are quite good on both topics. There is a lot of technology that lets you write down the final instruction set and register layout and will generate much of the last stages for you; GCC has such machinery. That technology is complex and won't fit in this answer; best to go read the books.

Once you get a compiler working for the first ISA in this fashion, you can build a variant using the same technology. You end up with two physical compilers, one for each ISA. They share all the front-end logic, abstract code generation, and optimization. They vary completely in the last stages.

What you should understand is that building a compiler to take full advantage of an instruction set is a complex process.


It all depends on how big the change is. Let's say you have ISA X 1.0 with the instruction ADD R1, R2, R3. If you get a new version X 1.1 which replaces the instruction ADD R1, R2, R3 with AD R1, R2, R3, the change is small. Essentially you have the same instruction with a different name. You can accommodate this change with a flag, cc -arch X1.1, which will emit "AD" instead of "ADD".
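A minimal sketch of that flag-driven renaming, assuming a hypothetical mnemonic table keyed by the architecture string:

    # Same code generator; only the mnemonic lookup changes per -arch.
    # Table contents and arch strings are hypothetical.
    MNEMONICS = {
        "X1.0": {"add": "ADD"},
        "X1.1": {"add": "AD"},  # renamed in X 1.1, same three operands
    }

    def emit_add(arch, dest, src1, src2):
        return f"{MNEMONICS[arch]['add']} {dest},{src1},{src2}"

    print(emit_add("X1.0", "R1", "R2", "R3"))  # ADD R1,R2,R3
    print(emit_add("X1.1", "R1", "R2", "R3"))  # AD R1,R2,R3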

If the change is bigger, like AD R1, R2 (R2 <- R1 + R2), then the new instruction is different from the old ADD. In this case you need to change your code generator and include this instruction in your set of available instructions. Again, cc -arch X1.1 should let the generator know that this instruction is available.
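Lowering a three-address add onto such a two-operand instruction needs slightly more than a rename; here is a sketch, with deliberately naive register handling:

    # Lowering three-address "d = a + b" onto a two-operand AD Rx,Ry
    # meaning Ry <- Rx + Ry. A real code generator would coordinate
    # with the register allocator instead of emitting MOVs blindly.
    def lower_add(d, a, b):
        if d == b:                 # d already holds b: just accumulate a
            return [f"AD {a},{b}"]
        if d == a:                 # d already holds a: accumulate b
            return [f"AD {b},{a}"]
        return [f"MOV {d},{b}",    # copy one source into the destination,
                f"AD {a},{d}"]     # then d <- a + d

    print(lower_add("R3", "R1", "R2"))  # ['MOV R3,R2', 'AD R1,R3']
    print(lower_add("R2", "R1", "R2"))  # ['AD R1,R2']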

If the change is even bigger, like retargeting the compiler to ISA Y, then you need to change all the instructions generated by the code generator. That can be a lot of work.

Whether the existing low-level compiler optimizations will work with the new target is also questionable. In general a well-designed backend (like GCC's) can support many ISAs. It uses a low-level intermediate representation (IR) in which each instruction has several properties, including the opcode name ("ADD" or "AD"). However, the low-level optimizers are not concerned with the name so much as with the other properties of the instruction: How many operands does it have? Which operands are read or written? Does it access memory? Does it change the program flow? Etc.
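As a sketch of that idea, here is a toy instruction record exposing exactly those properties, plus a dead-code pass that never inspects the mnemonic (all names invented):

    # The mnemonic is just a string; the optimizer reads only the
    # declared properties. The pass below deletes instructions whose
    # results are never read and that have no side effects, without
    # knowing whether "add" is spelled ADD or AD. It assumes
    # straight-line code with no registers live at the block's end.
    from dataclasses import dataclass, field

    @dataclass
    class Inst:
        opcode: str                                 # "ADD", "AD", "STORE", ...
        defs: list = field(default_factory=list)    # registers written
        uses: list = field(default_factory=list)    # registers read
        touches_memory: bool = False
        changes_flow: bool = False

    def eliminate_dead(insts):
        live = set()
        kept = []
        for inst in reversed(insts):                # backward liveness scan
            needed = (inst.touches_memory or inst.changes_flow
                      or any(d in live for d in inst.defs))
            if needed:
                kept.append(inst)
                live.difference_update(inst.defs)
                live.update(inst.uses)
        kept.reverse()
        return kept

    prog = [Inst("AD",  defs=["R1"], uses=["R2", "R1"]),
            Inst("ADD", defs=["R4"], uses=["R5", "R6"]),       # dead
            Inst("STORE", uses=["R1"], touches_memory=True)]
    print([i.opcode for i in eliminate_dead(prog)])  # ['AD', 'STORE']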

If you can adapt your target architecture to the compiler framework, then you can successfully reuse the existing optimizations.
