Tools for Automated Source Code Editing
I'm working on a research project to automatically modify code to include advanced mathematical concepts (like adding random effects into a loop or encapsulating an existing function with a new function that adds in a more advanced physical model).
My question to the community is: are there any good tools for manipulating source code directly? I want to do things like
- Swap out functions
- Add variable declarations wherever they are required
- Determine if a function is multiplied by anything
- Determine what functions are called on a line of code
- See what parameters are passed to a function and replace them with alternatives
- Introduce new function calls on certain lines of code
- Leave the rest of the code untouched wherever possible, and write out the results
I never want to actually compile the code. I only want to understand what symbols are used, replace and add code in a syntactically correct way, and be able to declare variables at the right position.
I've been using a minimal flex/bison approach with some success, but I do not feel that it is robust. I hate to take on writing a full language parser just to add some new info to the end of a line or the top of a function. It seems like this is almost what is going to be required, but it also seems like there should be some tools out there to do these types of manipulations already.
The code to be changed is in a variety of languages, but I'm particularly interested in FORTRAN.
Any thoughts?
Our DMS Software Reengineering Toolkit is a general-purpose program transformation system that accepts arbitrary language descriptions, which allows it to manipulate those languages. It has front ends for Fortran, C++, C, Java, C#, COBOL and many other languages. These front ends parse source code to compiler data structures (e.g., complete ASTs), and enable the ASTs to be regenerated as valid language source text, even retaining comments. The DMS APIs allow arbitrary navigation, inspection, and modification of the ASTs, as well as construction of attribute-grammar-based analyzers. DMS provides support machinery for building language-specific symbol tables, as well as control and data flow analysis. Finally, for any language provided to DMS, it can apply source-pattern matches to the AST, as well as source-to-source pattern-driven transformations to match and modify the ASTs, where each transformation can be enabled by an arbitrary analysis predicate.
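To make the parse / modify / regenerate cycle concrete, here is a minimal sketch. It is not DMS and it does not handle Fortran: it assumes Python 3.9+ and uses Python's standard `ast` module on Python source as a stand-in, because the standard library has no Fortran parser. The names `SwapCall`, `swap_function`, `decay` and `stochastic_decay` are hypothetical, chosen only for illustration.

```python
import ast


class SwapCall(ast.NodeTransformer):
    """Rename calls to `old` into calls to `new`; leave all other nodes alone."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite calls nested in the arguments first
        if isinstance(node.func, ast.Name) and node.func.id == self.old:
            replacement = ast.Name(id=self.new, ctx=ast.Load())
            node.func = ast.copy_location(replacement, node.func)
        return node


def swap_function(source, old, new):
    """Parse `source`, swap the called function, and regenerate source text."""
    tree = ast.parse(source)
    tree = SwapCall(old, new).visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    print(swap_function("y = decay(x, rate) * dt", "decay", "stochastic_decay"))
```

One caveat on this stand-in: `ast.unparse` discards comments and the original layout, so it does not leave the rest of the code untouched in the way the question asks for. It only shows the shape of the approach.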
One of your tasks is to find a function call multiplied by something. This DMS pattern would recognize it:
domain Fortran.
pattern match_multiplied_function_call(f: IDENTIFIER, a: arguments, t: term): product
= " \f(\a)*\t ";
which matches the AST where the corresponding syntax is found.
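For readers without DMS at hand, the same idea (matching on tree structure rather than on text) can be sketched with Python's standard `ast` module. This is an analogy only, under the assumption of Python 3.9+ and Python source rather than Fortran; `multiplied_calls` is a hypothetical helper name. Unlike the pattern above, it also reports the mirrored form `t * f(a)`.

```python
import ast


def multiplied_calls(source):
    """Return (function_name, other_operand_text) for each `f(...) * t` or `t * f(...)`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # A product is a BinOp node whose operator is Mult.
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mult):
            # Check both operand orders: call on the left, then call on the right.
            for call, other in ((node.left, node.right), (node.right, node.left)):
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name):
                    hits.append((call.func.id, ast.unparse(other)))
    return hits


if __name__ == "__main__":
    print(multiplied_calls("y = f(x) * t + g(z)"))
```

Because the match is made on the tree, whitespace, line breaks and parenthesization of the arguments do not affect it, which is the main robustness gain over a line-oriented or regex-based approach.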
DMS has been under development and use for over 15 years. It has been used to carry out production analyses and transformations on very large target software systems (for C, 25 million lines, for COBOL 10 million lines, for Fortran 1.5 million lines, etc.).
The Fortran front end handles F77 and F90, and it handles the usual extra gunk found in Fortran programs (smatterings of F2003, Cray pointers, ...) and even handles C preprocessor directives used inside the Fortran text.
I am not sure that this is 100% of what you are looking for but check out ANTLR. Someone even made a Fortran grammar for it.
It appears to be a good environment for working with language representations and seems to be modular enough that it might support the transformations you are talking about.
Just like my predecessor in replying to your question I am not sure whether this is what you're looking for (or even whether it will meet any of your requirements at all), but I know there is the Photran plugin for Eclipse.
I don't use Eclipse, I have never used Photran, but I know some people who do use it, so I just thought I could spread the word...