Discover program's image in memory

2023-03-15 18:43

Here's something less-than-important that I've been musing about recently:

I know that my program's virtual address space contains the stack (of each thread) and a heap and some statically allocated memory and all that. But does it also contain the program's image with all the instructions? And is it possible somehow (no matter how platform dependent the trick) to find out the address range of my own image? Is the memory read-only?

In short: Can I make a program that prints itself?

If that can't be done, a lesser question would be, can I print my own stack? I was thinking something like this:

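#include <stdio.h>

const unsigned char * BASE;

void print_stack(void);

int main(int argc, char * argv[]) {
  /* remember an address inside main's frame */
  BASE = (const unsigned char *) &argc;
  /* do stuff */
  print_stack();
  return 0;
}

void print_stack(void) {
  int sentinel;
  const unsigned char * bottom = (const unsigned char *) &sentinel;
  /* walk from this frame toward the address saved in main */
  while (bottom < BASE)
    printf("%02X ", *bottom++);
}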

To answer your first question: yes, the address space contains your program's instructions, since the CPU can only execute what is mapped into it. To get at the address of your instructions, you can take the address of a function and start printing from there. You can then use a library like udis86 to disassemble the bytes. Note, however, that your compiler isn't required to lay out functions in any particular order, so starting at main and reading forward isn't guaranteed to cover everything and may run into unmapped memory.

To get the entire range of instruction memory (you're looking for the .text section, which is loaded as part of an executable segment), you can look up the address and size from your operating system and then read the memory directly. On Linux, that information is in /proc/[pid]/maps; on OS X, you can either use vmmap or ask the kernel via the mach_vm_region() call. You can also use nm to dump the symbols of your program, isolate those that point into .text (they are marked with T or t in nm output), and dump those. This is not a good method, though, since the symbols tell you where each function starts but not where it ends, so you'd have to disassemble everything to account for padding between them.

Mapped memory is generally readable, but not all of it is writable (the .text section normally isn't). One thing to keep in mind: the addresses will probably not be stable from one invocation to the next if your operating system implements ASLR.

To address your second question: yes, you can print your own stack, and even symbolicate it with the help of third-party libraries, but the way you're trying to do it is not portable. The stack typically grows down, i.e. it starts at a high address and moves toward lower addresses. (As an exercise for the reader, disassemble one of your functions in gdb or another disassembler and look at how stack memory gets allocated in the function prologue.) On such a platform BASE will be larger than the address of sentinel, so your while-loop walks upward from print_stack's frame toward main's; where the stack grows up instead, the loop never runs. Either way, comparing pointers into two different objects is undefined behavior in C, so the result isn't guaranteed.

Yes, the code bytes, usually called the program "text" in this context, are part of your virtual address space.

You can take the address of a function, e.g. main(), to get a single valid address within a range of text pages. You will then have to call platform-specific virtual memory APIs to determine the extent of the mapping at that address.

Shared libraries (.so files) will have their text mapped into separate, discontiguous VM regions.
