Deciphering undocumented COM interfaces
I have a pointer to a COM object that implements an undocumented interface. I would really, really like to be able to use said interface. All I have is the IID though. Master software analyst Geoff Chappell has documented a host of these undocumented COM interfaces on his site; see IListView for example. Somehow he even managed to get the function names and signatures. How is something like that even possible? Are they guesses?
Can someone point me in the right direction as to how I would go about something like this? I know the risks of using anything undocumented.
To elaborate, the object I'm interested in is ExplorerFrame.dll's notoriously undocumented ItemsView. By setting an API hook on CoCreateInstance, I can see that the object is created with a certain undocumented IID as its main interface. I'm assuming this is the interface through which the control is manipulated, hence my interest in figuring out its members.
You know, you could write to me and ask! There was a time when I would write explicitly that the names and prototypes come from Microsoft's public symbol files, but I long ago abandoned that as verbiage. What sort of reverse engineer would I be if I was always explaining how I got my information! I'd be insulting those of my readers who are reverse engineers and I'd risk boring those who just want the information (which, let's face it, is typically not riveting).
If you don't have the public symbol files, then typelibs are the next best thing. But, of course, not all interfaces appear in the typelibs - not even all that implement IDispatch.
Given that you have an executable and its public symbol file, getting the IID and listing the methods is very nearly the simplest reverse engineering. It's maybe just a bit too complex for reliable automation - though I'd love to be proved wrong on that.
You likely know of the interface because you have a virtual function table for an implementation. Most likely, you found this because you're reverse engineering a class, in which case you find the virtual function tables for all its interfaces by working from the constructor or destructor. The virtual function table is an array of pointers to functions. The public symbol files give you the decorated names of these functions. A competent reverse engineer can undecorate these symbols by sight, mostly, and Visual C++ provides an UNDNAME tool (and your debugger or disassembler may anyway do the work for you). Finding the IID typically requires inspection of the QueryInterface method, matching against the known offset of the interface's virtual function table from the start of the class.
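As a rough illustration of the layout described above, here is a minimal, portable C++ sketch. It does not touch a real COM object: the Iid type, the two IID values, the Object class and every function name are made up for the example, and the vtables are written out by hand as C-style structs. It also probes QueryInterface at run time, whereas the approach described above reads the QueryInterface code statically; the point is only to show how a vtable slot maps to a method and how an IID pairs with the offset of a vtable pointer inside the class.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for a COM IID (a 16-byte GUID). The values below are made up.
struct Iid { std::uint8_t bytes[16]; };
const Iid IID_First  = {{0x11}};
const Iid IID_Second = {{0x22}};

inline bool SameIid(const Iid& a, const Iid& b) {
    return std::memcmp(a.bytes, b.bytes, sizeof a.bytes) == 0;
}

// At the binary level an interface pointer points at a vtable pointer,
// and the vtable is simply an array of function pointers.
struct Vtbl {
    int (*QueryInterface)(void* self, const Iid& iid, void** out);  // slot 0
    int (*GetValue)(void* self);                                    // slot 1
};
struct Iface { const Vtbl* vtbl; };

// Hypothetical class that implements two interfaces.
struct Object {
    Iface first;   // vtable pointer at offset 0
    Iface second;  // vtable pointer at offset sizeof(void*)
    int value;
};

// Shared QueryInterface logic: each IID maps to one vtable pointer's address.
int QueryCommon(Object* obj, const Iid& iid, void** out) {
    if (SameIid(iid, IID_First))  { *out = &obj->first;  return 0; }
    if (SameIid(iid, IID_Second)) { *out = &obj->second; return 0; }
    *out = nullptr;
    return -1;  // plays the role of E_NOINTERFACE
}

// Methods reached through the second interface receive a pointer to the
// second vtable pointer, so they step back to the start of the object.
Object* FromSecond(void* self) {
    return reinterpret_cast<Object*>(static_cast<char*>(self) - offsetof(Object, second));
}

int FirstQuery(void* self, const Iid& iid, void** out) {
    return QueryCommon(static_cast<Object*>(self), iid, out);
}
int FirstGetValue(void* self) { return static_cast<Object*>(self)->value; }
int SecondQuery(void* self, const Iid& iid, void** out) {
    return QueryCommon(FromSecond(self), iid, out);
}
int SecondGetValue(void* self) { return FromSecond(self)->value * 2; }

const Vtbl kFirstVtbl  = {FirstQuery, FirstGetValue};
const Vtbl kSecondVtbl = {SecondQuery, SecondGetValue};

Object MakeObject(int value) { return Object{{&kFirstVtbl}, {&kSecondVtbl}, value}; }

// Read slot 'index' of the vtable behind an interface pointer the way a
// debugger would: as raw memory, with no knowledge of the Vtbl type.
void* ReadSlot(const void* iface, std::size_t index) {
    static_assert(sizeof(void (*)()) == sizeof(void*), "assumes flat function pointers");
    const char* table;
    std::memcpy(&table, iface, sizeof table);  // first word: address of the vtable
    void* fn;
    std::memcpy(&fn, table + index * sizeof(void*), sizeof fn);
    return fn;
}

// Ask QueryInterface for an IID and report where the returned pointer sits
// relative to the start of the object, or -1 if the IID is refused.
std::ptrdiff_t OffsetForIid(Object* obj, const Iid& iid) {
    void* out = nullptr;
    if (obj->first.vtbl->QueryInterface(&obj->first, iid, &out) != 0) return -1;
    return static_cast<char*>(out) - reinterpret_cast<char*>(obj);
}
```

In this toy setup, OffsetForIid(&obj, IID_Second) yields the offset of the second vtable pointer, and ReadSlot(&obj.second, 1) yields the address of SecondGetValue. With a real binary, that address is what you would look up in the public symbol file to get the decorated name.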
For a simple interface of, say, half a dozen methods, the whole exercise of writing up just a basic listing of IID, offsets and prototypes takes maybe 10 minutes on a good day, and no more than 30 if you're being lazy. Of course, with a lot of these undocumented interfaces, you may then want to check that the implementation and IID are the same in multiple versions - which can quickly turn a good day into a bad one.
By the way, if I guess something or hypothesise, I try to be sure of saying so. For instance, near the end of the documentation you cite of the otherwise undocumented IListView interface, I speak of a window message: you can know the name I give is made up by me because I say "perhaps named something like".
The definitive interpreter of PDB files is MSPDBxx.DLL. The primary tool for interpreting PDB files is the debugger, and by extension nowadays also the Microsoft Visual C++ linker in its guise as the DUMPBIN disassembler. These do not show everything from the PDB files, but they do all the basic stuff, such as listing all the symbols, labelling code and data, and summarising from whatever type information is in the file (which is typically none in public symbol files).
As usual, a competent - well, accomplished - reverse engineer can read these files by sight for information not shown by the standard tools. The most notable example is the section contribution information, which is as close as the public symbol files come to matching code to source files.
How you point the debuggers at symbol files is well documented. My practice for making a listing with DUMPBIN is just to copy both the binary and the corresponding PDB file to the current directory. As long as the filename of the PDB file matches the filename in the binary's debug directory, DUMPBIN works with the PDB file automatically. It really couldn't be easier.
I imagine that non-Microsoft disassemblers and decompilers are at least as capable of using whatever PDB file happens to be available for the target binary.
If your pointer implements IDispatch (which is quite likely), you can QueryInterface for that and then call GetIDsOfNames. You will likely end up guessing which interfaces it might support and calling QueryInterface just to see what works :)