Is rebasing DLLs (or providing an appropriate default load address) worth the trouble?
Rebasing a DLL means fixing up the DLL so that its preferred load address is the address at which the loader is actually able to load it.
This can either be achieved by a tool such as Rebase.exe
or by specifying default load addresses for all your (own) dlls so that they "fit" in your executable process.
The whole point of managing the DLL base addresses this way is to speed up application loads. (Or so I understand.)
The question is now: Is it worth the trouble?
I have the book Windows via C/C++ by Richter/Nasarre and they strongly recommend[a] making sure that the load addresses all match up so that the Loader doesn't have to rebase the loaded DLLs.
They fail to argue however, if this speeds up application load times to any significant amount.
Also, with ASLR it seems dubious that this has any value at all, since the load addresses will be randomized anyway.
Are there any hard facts on the pro/cons of this?
[a]: In my WvC++/5th ed it is in the sections titled Rebasing Modules and Binding Modules on pages 568ff. in Chapter 20, DLL Advanced Techniques.
Patching the relocatable addresses isn't the big deal; that runs at memory speeds and takes microseconds. The bigger issue is that the pages that contain this code now need to be backed by the paging file instead of the DLL file. In other words, when pages containing code are unmapped, they need to be written to the paging file instead of just getting discarded.
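To make the two costs concrete, here is a minimal sketch of what a relocation pass does, under simplifying assumptions: the image is a flat byte buffer, every fixup is a 32-bit absolute address, and the list of fixup offsets is given directly rather than parsed from a real relocation table. The function name `apply_relocations` and the sample addresses are made up for illustration. The patching itself is a trivial loop; the set of touched pages is what can no longer be discarded and re-read from the DLL file.

```python
import struct

PAGE_SIZE = 4096  # x86/x64 page size


def apply_relocations(image, reloc_offsets, preferred_base, actual_base):
    """Patch each 32-bit absolute address in place; return the set of touched pages."""
    delta = actual_base - preferred_base
    dirty_pages = set()
    if delta == 0:
        return dirty_pages  # loaded at the preferred base: nothing to patch
    for off in reloc_offsets:
        (value,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (value + delta) & 0xFFFFFFFF)
        # A 4-byte fixup may straddle a page boundary, so mark both ends.
        dirty_pages.add(off // PAGE_SIZE)
        dirty_pages.add((off + 3) // PAGE_SIZE)
    return dirty_pages


# Hypothetical 3-page image with two absolute addresses in it.
image = bytearray(3 * PAGE_SIZE)
struct.pack_into("<I", image, 0x0010, 0x10001000)
struct.pack_into("<I", image, 0x1800, 0x10002000)

dirty = apply_relocations(image, [0x0010, 0x1800], 0x10000000, 0x20000000)
print(sorted(dirty))
```

In this toy image, pages 0 and 1 become private to the process, while page 2 stays identical to the file on disk and can still be shared or discarded.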
The cost of this isn't that easy to measure, especially on modern machines with lots of RAM. It only counts when the machine starts to get under load, with lots of processes competing for memory, and it also depends on how fragmented the paging file is.
But clearly, rebasing is a very cheap optimization. And it is very easy to see in the Debug + Windows + Modules window: there's a bright icon on the DLLs that had to be relocated. The Address column gives you a good hint as to what base address would be a good choice. Leave ample space between them so you don't constantly have to tweak this as your program grows.
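The advice to leave ample space can be turned into a small planning helper. This is only a sketch: the starting address `0x60000000`, the growth factor and the DLL names are assumptions chosen for illustration, not recommended values. The one hard constraint it relies on is that base addresses have to be multiples of 64 KB.

```python
ALLOCATION_GRANULARITY = 0x10000  # 64 KB: image base addresses must be a multiple of this


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def plan_base_addresses(image_sizes, start=0x60000000, growth_factor=2.0):
    """Return {name: base}, reserving growth_factor times each image's current size."""
    plan = {}
    next_base = start
    for name, size in image_sizes.items():
        plan[name] = next_base
        next_base += align_up(int(size * growth_factor), ALLOCATION_GRANULARITY)
    return plan


# Hypothetical DLLs and their current image sizes in bytes.
sizes = {"core.dll": 0x45000, "ui.dll": 0x12000, "net.dll": 0x8000}
for name, base in plan_base_addresses(sizes).items():
    print(f"{name}: 0x{base:08X}")
```

The resulting addresses would then be passed to the linker's /BASE option or to Rebase.exe for each DLL.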
I'd like to provide one answer myself, although the answers of Hans Passant and others are describing the tradeoffs already pretty well.
After recently fiddling with DLL base addresses in our application, I will here give my conclusion:
I think that, unless you can prove otherwise, providing DLLs with a non-default Base Address is an exercise in futility. This includes rebasing my DLLs.
For the DLLs I control, given the average application, each DLL will be loaded into memory only once anyway, so the load on the paging file should be minimal. (But see the comment of Michael Burr in another answer about a Terminal Server environment.)
If DLLs are provided with a fixed base address (without rebasing) it will actually increase address space fragmentation, as sooner or later these addresses won't match anymore. In our app we had given all DLLs a fixed base address (for other legacy reasons, and not because of address space fragmentation) without using rebase.exe and this significantly increased address space fragmentation for us because you really can't get this right manually.
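A quick way to see when hand-assigned addresses have drifted into each other is to check the address ranges for overlap. The sketch below assumes you already have each module's name, preferred base and image size (for example, copied from a module list); the module names and numbers are invented.

```python
def find_collisions(modules):
    """modules: list of (name, base, size). Return pairs of names whose ranges overlap."""
    ordered = sorted(modules, key=lambda m: m[1])
    collisions = []
    for i, (name_a, base_a, size_a) in enumerate(ordered):
        end_a = base_a + size_a
        for name_b, base_b, _ in ordered[i + 1:]:
            if base_b >= end_a:
                break  # sorted by base, so no later module can overlap name_a
            collisions.append((name_a, name_b))
    return collisions


# Hypothetical layout: a.dll has grown past the slot it was originally given.
modules = [
    ("a.dll", 0x10000000, 0x30000),
    ("b.dll", 0x10020000, 0x10000),
    ("c.dll", 0x10040000, 0x10000),
]
print(find_collisions(modules))
```

Here a.dll now reaches into b.dll's range, so one of the two would be relocated at load time even though both carry a fixed base address.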
Rebasing (via rebase.exe) is not cheap. It is another step in the build process that has to be maintained and checked, so it has to have some benefit.
A large application will always have some DLLs loaded where the base address does not match, because of some hook DLLs (AV) and because you don't rebase 3rd party DLLs (or at least I wouldn't).
If you're using a RAM disk for the paging file, you might actually be better off if loaded DLLs get paged out :-)
So to sum up, I think that rebasing isn't worth the trouble except for special cases like the system DLLs.
I'd like to add a historical piece that I found on Old New Thing: How did Windows 95 rebase DLLs? --
When a DLL needed to be rebased, Windows 95 would merely make a note of the DLL's new base address, but wouldn't do much else. The real work happened when the pages of the DLL ultimately got swapped in. The raw page was swapped off the disk, then the fix-ups were applied on the fly to the raw page, thereby relocating it. The fixed-up page was then mapped into the process's address space and the program was allowed to continue.
Looking at how this process is done (read the whole thing), I personally suspect that part of the "rebasing is evil" stance dates back to the olden days of Win9x and low memory conditions.
Look, now there's a non-historical piece on Old New Thing:
How important is it nowadays to ensure that all my DLLs have non-conflicting base addresses?
Back in the day, one of the things you were exhorted to do was rebase your DLLs so that they all had nonoverlapping address ranges, thereby avoiding the cost of runtime relocation. Is this still important nowadays?
...
In the presence of ASLR, rebasing your DLLs has no effect because ASLR is going to ignore your base address anyway and relocate the DLL into a location of its pseudo-random choosing.
...
Conclusion: It doesn't hurt to rebase, just in case, but understand that the payoff will be extremely rare. Build your DLL with /DYNAMICBASE enabled (and with /HIGHENTROPYVA for good measure) and let ASLR do the work of ensuring that no base address collision occurs. That will cover pretty much all of the real-world scenarios. If you happen to fall into one of the very rare cases where ASLR is not available, then your program will still work. It just may run a little slower due to the relocation penalty.... ASLR actually does a better job of avoiding collisions than manual rebasing, since ASLR can view the system as a whole, whereas manual rebasing requires you to know all the DLLs that are loaded into your process, and coordinating base addresses across multiple vendors is generally not possible.
They fail to argue however, if this speeds up application load times to any significant amount.
The load time change is minimal, because only the locations listed in the DLL's relocation table get updated with the new addresses. However, if you are low on memory - enough that stuff gets paged in/out of the page file - then the system has to keep the relocated pages of the dll in the page file (since the addresses are changed). If the dlls were rebased - and the rebased dlls don't collide with any other dlls - then instead of swapping them out to the page file (and back), the system just overwrites the memory and reloads the dll from the original on the hard drive.
The benefit is only relevant when systems are paging stuff in and out of main memory. The last time I made efforts to keep databases of applications and their base addresses was back in VB6 days, when the computers in our offices and data centers were lucky to have even 256MB of RAM.
Also, with ASLR it seems dubious that this has any value at all, since the load addresses will be randomized anyway.
At the moment ASLR only affects dlls and executables with the dynamic-relocation flag set. This includes Vista/Win7 system dlls and executables, and any developer made items where the developer intentionally set that flag during the build.
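Whether a given binary has that flag set can be read straight from its PE headers. The sketch below is a minimal reader, not a full PE parser: it takes the file contents as bytes, does only basic signature checks, and returns the preferred image base together with the DYNAMIC_BASE and HIGH_ENTROPY_VA bits of the DllCharacteristics field. The function name `read_aslr_flags` is made up; the offsets and flag values follow the documented PE layout.

```python
import struct

IMAGE_DLLCHARACTERISTICS_HIGH_ENTROPY_VA = 0x0020
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040


def read_aslr_flags(pe_bytes):
    """Return (image_base, dynamic_base, high_entropy_va) from a PE image's headers."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    opt = e_lfanew + 4 + 20  # skip the 4-byte signature and 20-byte COFF header
    (magic,) = struct.unpack_from("<H", pe_bytes, opt)
    if magic == 0x10B:    # PE32: 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", pe_bytes, opt + 28)
    elif magic == 0x20B:  # PE32+: 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", pe_bytes, opt + 24)
    else:
        raise ValueError("unknown optional header magic")
    # DllCharacteristics sits at offset 70 in both PE32 and PE32+.
    (dll_chars,) = struct.unpack_from("<H", pe_bytes, opt + 70)
    return (
        image_base,
        bool(dll_chars & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE),
        bool(dll_chars & IMAGE_DLLCHARACTERISTICS_HIGH_ENTROPY_VA),
    )
```

To use it, read a DLL in binary mode and pass the bytes in, for example the contents of a hypothetical mylib.dll. If the second value comes back True, the preferred base is only a fallback and ASLR picks the actual load address.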
If you are going to set the dynamic-relocation flag, then don't bother rebasing the dlls. If all your clients have 4GB of RAM, then don't bother. If your boss is a cheapskate, then maybe.
You have to consider that user DLLs (those not already loaded into another process) have to be read from the HDD. Usually memory mapping is used for that (and it uses lazy loading), so if they have to be relocated, they'll have to be actually read from the HDD before the process can start.
For those loaded by other processes the copy-on-write mechanism is used. So, again, relocating them will mean additional operations.
As for ASLR, it's intended for security purposes, not for performance.
Yes, you should do it. ASLR only impacts "system" DLLs and therefore the ones you are writing should not be impacted by ASLR. Additionally, ASLR doesn't completely "randomize" the location of these system binaries, it simply shuffles them around in the basic spot in the vm map.