Determining programmatically what modules are loaded in another process? (OS X)
What I'm trying to do feels pretty straightforward; I'm just not sure exactly how to do it.
Specifically, I want to get a list of the modules (shared/dynamic libraries) that are loaded in another process, along with the address at which each module starts in that process.
It's very straightforward to get this information with GDB: you simply attach to the process and type "info shared". That is exactly the kind of information I would like to get. For example:
Num Basename  Type Address     Reason | | Source
  |        |     |       |          | | | |
  1 Adium      -   0x1000        exec Y Y /Applications/Adium.app/Contents/MacOS/Adium (offset 0x0)
  2 dyld       -   0x8fe00000    dyld Y Y /usr/lib/dyld at 0x8fe00000 (offset 0x0) with prefix "__dyld_"
  3 WebCore    F   0x95b6a000    dyld Y Y /System/Library/Frameworks/WebKit.framework/Versions/A/Frameworks/WebCore.framework/Versions/A/WebCore at 0x95b6a000 (offset 0x95b6a000)
Does anyone know how to do this programmatically? Modules are loaded at addresses chosen at run time, so I need to determine where each one is actually located.
First use task_for_pid() to obtain a task port.
Then find the "dyld all images info address" using task_info:
struct task_dyld_info dyld_info;
mach_msg_type_number_t count = TASK_DYLD_INFO_COUNT;
if (task_info(task, TASK_DYLD_INFO, (task_info_t)&dyld_info, &count) == KERN_SUCCESS)
{
    // retrieve dyld_info.all_image_info_addr;
}
This address will point to a struct dyld_all_image_infos in memory:
struct dyld_all_image_infos {
    uint32_t version;
    uint32_t infoArrayCount;
    const struct dyld_image_info* infoArray;
    // ...
};
The infoArrayCount and infoArray entries are important here. You have to retrieve these values (use mach_vm_read) and iterate through the infoArray. Each entry is a struct dyld_image_info:
struct dyld_image_info {
    const struct mach_header* imageLoadAddress;
    const char* imageFilePath;
    uintptr_t imageFileModDate;
};
In this struct, you are interested in the values of imageLoadAddress (the address at which the library is loaded in the target's memory) and imageFilePath (the address of the NUL-terminated file path in the target's memory).
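One practical wrinkle with imageFilePath is that the string's length is not known up front, and a single large read can fail if it runs past the end of a mapped region. Below is a minimal sketch of one way around that: read at most up to the next page boundary each time, and stop when a NUL shows up. The names remote_read_fn, read_remote_cstring, fake_task and fake_read are made up for this illustration, and the 4096-byte page size is an assumption. On OS X the callback would wrap something like mach_vm_read for the target task; here a fake reader over a local buffer stands in so the logic can be tried anywhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback: copy `len` bytes from remote address `addr` into
 * `dst`; return 0 on success, nonzero on failure. On OS X this would wrap
 * a Mach read call for the target task. */
typedef int (*remote_read_fn)(void *ctx, uint64_t addr, void *dst, size_t len);

#define ASSUMED_PAGE_SIZE 4096u  /* assumption; query the real page size */

/* Read a NUL-terminated string starting at `addr` into `out` (capacity `cap`).
 * No single read crosses a page boundary, so a string that ends just before
 * an unmapped page is still readable.
 * Returns the string length, or -1 on read failure or if no NUL was found. */
long read_remote_cstring(remote_read_fn rd, void *ctx, uint64_t addr,
                         char *out, size_t cap)
{
    size_t have = 0;
    while (have < cap) {
        uint64_t cur = addr + have;
        size_t to_page_end =
            ASSUMED_PAGE_SIZE - (size_t)(cur % ASSUMED_PAGE_SIZE);
        size_t chunk = (cap - have < to_page_end) ? cap - have : to_page_end;
        if (rd(ctx, cur, out + have, chunk) != 0)
            return -1;
        char *nul = memchr(out + have, '\0', chunk);
        if (nul)
            return (long)(nul - out);
        have += chunk;
    }
    return -1; /* buffer filled without finding a terminator */
}

/* Demo stand-in for a remote task: serves reads from a local buffer that
 * pretends to be mapped at `base`. Reads outside the buffer fail, much as
 * a read into an unmapped region would. */
struct fake_task {
    uint64_t base;
    const uint8_t *bytes;
    size_t size;
};

int fake_read(void *ctx, uint64_t addr, void *dst, size_t len)
{
    const struct fake_task *t = ctx;
    if (addr < t->base || len > t->size || addr - t->base > t->size - len)
        return -1;
    memcpy(dst, t->bytes + (addr - t->base), len);
    return 0;
}
```

The trade-off is a few more read calls per path in exchange for not having to guess a safe read size.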
Important note: the fields that are marked as a pointer or as uintptr_t in the structs above have a different byte size depending on whether the running process is 32 or 64 bit. You may be able to determine pointer size by seeing if dyld_info.all_image_info_format is TASK_DYLD_ALL_IMAGE_INFO_32 or TASK_DYLD_ALL_IMAGE_INFO_64 (should work, but I have not tested this myself).
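To make the pointer-size caveat concrete, here is a rough sketch of decoding one dyld_image_info entry from raw bytes that were already copied out of the target, for either pointer width. The image_entry struct and the read_word and decode_image_info functions are hypothetical names for this illustration, and the sketch assumes the target uses the same byte order as the host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical width-independent view of one struct dyld_image_info. */
struct image_entry {
    uint64_t load_address;   /* imageLoadAddress */
    uint64_t file_path;      /* imageFilePath (an address in the target) */
    uint64_t file_mod_date;  /* imageFileModDate */
};

/* Read one pointer-sized field (4 or 8 bytes) from raw target memory.
 * Assumes the target uses the host's byte order. */
static uint64_t read_word(const uint8_t *p, size_t ptr_size)
{
    if (ptr_size == 4) {
        uint32_t v32;
        memcpy(&v32, p, sizeof v32);
        return v32;
    }
    uint64_t v64;
    memcpy(&v64, p, sizeof v64);
    return v64;
}

/* Decode entry `index` of an infoArray that was copied into `raw`.
 * `ptr_size` is 4 for a 32-bit target and 8 for a 64-bit one. Each entry
 * holds three pointer-sized fields, so the stride is 3 * ptr_size.
 * Returns 0 on success, -1 if the arguments are invalid or the entry does
 * not fit in the buffer. */
int decode_image_info(const uint8_t *raw, size_t raw_len, size_t ptr_size,
                      uint32_t index, struct image_entry *out)
{
    if (ptr_size != 4 && ptr_size != 8)
        return -1;
    size_t stride = 3 * ptr_size;
    if (index >= raw_len / stride)
        return -1;
    const uint8_t *p = raw + (size_t)index * stride;
    out->load_address  = read_word(p, ptr_size);
    out->file_path     = read_word(p + ptr_size, ptr_size);
    out->file_mod_date = read_word(p + 2 * ptr_size, ptr_size);
    return 0;
}
```

The same idea applies to the header of dyld_all_image_infos: version and infoArrayCount are two uint32_t fields, so infoArray starts at offset 8 in both layouts, but it is 4 bytes wide in a 32-bit target and 8 bytes wide in a 64-bit one.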
Lastly, this will still not include an entry for the dynamic linker itself. To retrieve that, one way I've found is to iterate through the VM regions (e.g., with mach_vm_region) and find the first region that looks like a Mach-O dynamic linker (check for MH_DYLINKER as the file type; see the Mach-O file format for more info). Last I recall checking, gdb and/or lldb have a function for doing this too. Parsing the Mach-O header is also one possible way to tell whether the process is 32 or 64 bit.
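The header check described above could look roughly like the sketch below, applied to the first bytes of each region once they have been copied out of the target. The three constants mirror the values of MH_MAGIC, MH_MAGIC_64 and MH_DYLINKER from <mach-o/loader.h>, but they are redefined locally with an X_ prefix so the sketch compiles anywhere. header_kind and classify_mach_header are made-up names, and only host-byte-order headers are recognized.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of values from <mach-o/loader.h>, prefixed to avoid clashes. */
#define X_MH_MAGIC     0xfeedfaceu  /* 32-bit Mach-O, host byte order */
#define X_MH_MAGIC_64  0xfeedfacfu  /* 64-bit Mach-O, host byte order */
#define X_MH_DYLINKER  0x7u         /* filetype of the dynamic linker */

struct header_kind {
    int is_macho;     /* nonzero if the magic matched */
    int ptr_size;     /* 4 or 8; valid only when is_macho is set */
    int is_dylinker;  /* nonzero if filetype is MH_DYLINKER */
};

/* Inspect the first bytes of a region copied out of the target.
 * Both mach_header and mach_header_64 begin with magic, cputype,
 * cpusubtype, filetype (four 32-bit fields), so filetype is at offset 12. */
struct header_kind classify_mach_header(const uint8_t *bytes, size_t len)
{
    struct header_kind k = { 0, 0, 0 };
    uint32_t magic, filetype;
    if (len < 16)
        return k;
    memcpy(&magic, bytes, sizeof magic);
    memcpy(&filetype, bytes + 12, sizeof filetype);
    if (magic == X_MH_MAGIC)
        k.ptr_size = 4;
    else if (magic == X_MH_MAGIC_64)
        k.ptr_size = 8;
    else
        return k;
    k.is_macho = 1;
    k.is_dylinker = (filetype == X_MH_DYLINKER);
    return k;
}
```

The first region for which is_dylinker is set would be the candidate for dyld's load address, and ptr_size doubles as the 32/64-bit answer.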
After you retrieve all the dyld image info entries, you may also want to sort them by address.
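Sorting is simple once the entries are in a local array. A small sketch, assuming a hypothetical loaded_image record (not part of any Apple header) that holds the load address and a local copy of the path:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record collected for each loaded image. */
struct loaded_image {
    uint64_t load_address;
    const char *path;   /* local copy of the path string */
};

/* qsort comparator: ascending by load address. It compares rather than
 * subtracts, since a 64-bit difference would not fit in an int. */
static int cmp_by_address(const void *a, const void *b)
{
    const struct loaded_image *x = a;
    const struct loaded_image *y = b;
    return (x->load_address > y->load_address) -
           (x->load_address < y->load_address);
}

/* Sort `count` images in place by ascending load address. */
void sort_images_by_address(struct loaded_image *images, size_t count)
{
    qsort(images, count, sizeof images[0], cmp_by_address);
}
```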
I recommend not looking at newosxbook's code for its vmmap implementation. It is outdated (since it still uses DYLD_ALL_IMAGE_INFOS_OFFSET_OFFSET), and it does some unnecessary brute-forcing.
I would suggest that you could go download the source for gdb as used by the Development Tools.
But, well, I've read that source and I'm not sure that telling anyone to go read it is a productive suggestion.
In any case, you will want to use the various Mach APIs to do this. The relevant headers are in /usr/include/mach/*.h. Specifically, you'll want to start with task_for_pid() and work your way down to the info you need.
Note that task_for_pid() (and any other mechanism used to grub through another task's innards) requires either admin access or membership in the development group on the machine.
There's some existing BSD-licensed code you can take from the Breakpad project that does exactly this:
- http://code.google.com/p/google-breakpad/source/browse/trunk/src/client/mac/handler/dynamic_images.h
- http://code.google.com/p/google-breakpad/source/browse/trunk/src/client/mac/handler/dynamic_images.cc
dyld provides some hooks for GDB, notably a well-known function symbol that gdb can use to get access to a struct that contains this info. See http://www.opensource.apple.com/source/dyld/dyld-132.13/include/mach-o/dyld_images.h You can see how GDB does it here: http://www.opensource.apple.com/source/gdb/gdb-1344/src/gdb/macosx/macosx-nat-dyld.c (look for "macosx_init_addresses"). The internals of lookup_minimal_symbol are too horrible to talk about, but Breakpad's implementation is fairly straightforward.
Thanks to @Zorg for the great explanation. Based on it, I wrote a simple snippet that implements the required functionality, including the part that copies memory out of the target process. Please have a look.
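#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <mach/mach.h>
#include <mach/vm_map.h>
#include <mach-o/dyld_images.h>

#ifndef PATH_MAX
#define PATH_MAX 2048
#endif

// To build:
//   cc -o test_mach test_mach.c
// To run (task_for_pid usually needs elevated privileges):
//   ./test_mach <pid>

// Helper function to read process memory (a la the Win32 API of the same
// name). To make it easier to include elsewhere, it takes a pid and does the
// task_for_pid by itself. Given that iOS invalidates task ports after use,
// that is actually a good idea, since we'd need to re-get the port anyway.
// On success, *size is updated to the number of bytes read; the caller owns
// the returned buffer and can release it with vm_deallocate().
// Returns NULL on failure.
unsigned char *
readProcessMemory (int pid,
                   mach_vm_address_t addr,
                   mach_msg_type_number_t* size) {
  task_t t;
  kern_return_t kr = task_for_pid(mach_task_self(), pid, &t);
  if (kr != KERN_SUCCESS) {
    fprintf (stderr, "task_for_pid(%d) failed - kr 0x%x\n", pid, kr);
    return NULL;
  }
  mach_msg_type_number_t dataCnt = *size;
  vm_offset_t readMem;
  // Use vm_read, rather than mach_vm_read, since the latter is different in
  // iOS.
  kr = vm_read(t,         // vm_map_t target_task
               addr,      // vm_address_t address
               *size,     // vm_size_t size
               &readMem,  // vm_offset_t *data
               &dataCnt); // mach_msg_type_number_t *dataCnt
  if (kr != KERN_SUCCESS) {
    fprintf (stderr, "Unable to read target task's memory @%p - kr 0x%x\n",
             (void *) addr, kr);
    return NULL;
  }
  *size = dataCnt;
  return (unsigned char *) readMem;
}

int main(int argc, char* argv[]) {
  if (argc != 2) {
    fprintf(stderr, "Usage: %s <pid>\n", argv[0]);
    exit(1);
  }
  int pid = atoi(argv[1]);
  task_t task;
  if (task_for_pid(mach_task_self(), pid, &task) != KERN_SUCCESS) {
    fprintf(stderr, "task_for_pid(%d) failed\n", pid);
    exit(1);
  }
  struct task_dyld_info dyld_info;
  mach_msg_type_number_t count = TASK_DYLD_INFO_COUNT;
  if (task_info(task, TASK_DYLD_INFO, (task_info_t) &dyld_info, &count)
      != KERN_SUCCESS) {
    fprintf(stderr, "task_info(TASK_DYLD_INFO) failed\n");
    exit(1);
  }
  // Note: the structs below are interpreted with this program's own pointer
  // size, so the target must have the same bitness as this tool.
  mach_msg_type_number_t size = sizeof(struct dyld_all_image_infos);
  uint8_t* data =
      readProcessMemory(pid, dyld_info.all_image_info_addr, &size);
  if (!data)
    exit(1);
  struct dyld_all_image_infos* infos = (struct dyld_all_image_infos *) data;
  mach_msg_type_number_t size2 =
      sizeof(struct dyld_image_info) * infos->infoArrayCount;
  uint8_t* info_addr = readProcessMemory(pid,
      (mach_vm_address_t) (uintptr_t) infos->infoArray, &size2);
  if (!info_addr)
    exit(1);
  struct dyld_image_info* info = (struct dyld_image_info*) info_addr;
  for (uint32_t i = 0; i < infos->infoArrayCount; i++) {
    // Reading a fixed PATH_MAX bytes can fail when the string lies near the
    // end of a mapped region; such entries are skipped.
    mach_msg_type_number_t size3 = PATH_MAX;
    uint8_t* fpath_addr = readProcessMemory(pid,
        (mach_vm_address_t) (uintptr_t) info[i].imageFilePath, &size3);
    if (fpath_addr) {
      // Print the load address and the path; %.*s bounds the print in case
      // the buffer holds no terminator.
      printf("%p %.*s\n", (const void *) info[i].imageLoadAddress,
             (int) size3, (const char *) fpath_addr);
      vm_deallocate(mach_task_self(), (vm_address_t) fpath_addr, size3);
    }
  }
  return 0;
}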