Bus error when recompiling the binary
Sometimes, on various Unix architectures, recompiling a program while it is running causes the program to crash with a "Bus error". Can anyone explain under which conditions this happens? First, how can updating the binary on disk do anything to the code in memory? The only thing I can imagine is that some systems mmap the code into memory and when the compiler rewrites the disk image, this causes the mmap to become invalid. What would the advantages be of such a met开发者_如何学运维hod? It seems very suboptimal to be able to crash running codes by changing the executable.
On local filesystems, all major Unix-like systems support solving this problem by removing the file. The old vnode stays open and, even after the directory entry is gone and then reused for the new image, the old file is still there, unchanged, and now unnamed, until the last reference to it (in this case the kernel) goes away.
But if you just start rewriting it, then yes, it is mmap(3)'ed. When the block is rewritten one of two things can happen depending on which mmap(3) options the dynamic linker uses:
- the kernel will invalidate the corresponding page, or
- the disk image will change but existing memory pages will not
Either way, the running program is possibly in trouble. In the first case, it is essentially guaranteed to blow up, and in the second case it will get clobbered unless all of the pages have been referenced, paged in, and are never dropped.
There were two mmap flags intended to fix this. One was MAP_DENYWRITE (prevent writes) and the other was MAP_COPY, which kept a pure version of the original and prevented writers from changing the mapped image.
But DENYWRITE has been disabled for security reasons, and COPY is not implemented in any major Unix-like system.
Well this is a bit complex scenario that might be happening in your case. The reason of this error is normally the Memory Alignment issue. The Bus Error is more common to FreeBSD based system. Consider a scenario that you have a structure something like,
struct MyStruct { char ch[29]; // 29 bytes int32 i; // 4 bytes }
So the total size of this structure would be 33 bytes. Now consider a system where you have 32 byte cache lines. This structure cannot be loaded in a single cache line. Now consider following statements
Struct MyStruct abc; char *cptr = &abc; // char point points at start of structure int32 *iptr = (cptr + 1) // iptr points at 2nd byte of structure.
Now total structure size is 33 bytes your int pointer points at 2nd byte so you can 32 byte read data from int pointer (because total size of allocated memory is 33 bytes). But when you try to read it and if the structure is allocated at the border of a cache line then it is not possible for OS to read 32 bytes in a single call. Because current cache line only contains 31 bytes data and remaining 1 bytes is on next cache line. This will result into an invalid address and will give "Buss Error". Most operating systems handle this scenario by generating two memory read calls internally but some Unix systems don't handle this scenario. To avoid this, it is recommended take care of Memory Alignment. Mostly this scenario happen when you try to type cast a structure into another datatype and try reading the memory of that structure.
The scenario is bit complex, so I am not sure if I can explain it in simpler way. I hope you understand the scenario.
精彩评论