Obtain source using debugging symbols
Is it possible to obtain the source of a linux shared library (.so
) that was compiled with debugging information (gcc 开发者_如何学Go.. -g
) ? Thank you for your time.
The answer is: it depends - not only is the compile option -g
necessary, but also the final created executable / shared libary may not be stripped during the build process.
Object files created with -g
do contain a kind of sourcecode - just not sourcecode as you like it ...
If you use objdump -S
on such a file, it'll intersperse the disassembly with sourcecode lines.
But what that shows is the actual compiled source - past any operation done by the preprocessor, and past any inlining done by the compiler.
That means you can get surprising output from it; if nothing else, its verbosity can look a bit like Cobol sources. Start with:
#include <algorithm>
#include <functional>
int main(int argc, char **argv)
{
int array[] = { 1, 123, 1234, 12345, 123456 };
std::sort(array, array + sizeof(array)/sizeof(*array), std::less<int>());
return 0;
}
And run it through g++ -O8 -g -o t t.C
and then objdump -S t
. This will give you the output for main()
similar to the following (what exactly you see of course depends on your compiler and/or library):
00000000004005e0 :
#include <algorithm>
#include <functional>
int main(int argc, char **argv)
{
4005e0: 41 57 push %r15
_ValueType<)
__glibcxx_requires_valid_range(__first, __last);
if (__first != __last)
{
std::__introsort_loop(__first, __last,
4005e2: ba 04 00 00 00 mov $0x4,%edx
4005e7: 41 56 push %r14
4005e9: 41 55 push %r13
4005eb: 41 54 push %r12
4005ed: 41 bc 04 00 00 00 mov $0x4,%r12d
4005f3: 55 push %rbp
4005f4: 53 push %rbx
4005f5: 48 83 ec 38 sub $0x38,%rsp
4005f9: 4c 8d 74 24 10 lea 0x10(%rsp),%r14
int array[] = { 1, 123, 1234, 12345, 123456 };
4005fe: c7 44 24 10 01 00 00 movl $0x1,0x10(%rsp)
400605: 00
400606: c7 44 24 14 7b 00 00 movl $0x7b,0x14(%rsp)
40060d: 00
40060e: c7 44 24 18 d2 04 00 movl $0x4d2,0x18(%rsp)
400615: 00
400616: c7 44 24 1c 39 30 00 movl $0x3039,0x1c(%rsp)
40061d: 00
40061e: 4d 8d 7e 14 lea 0x14(%r14),%r15
400622: 49 8d 6e 08 lea 0x8(%r14),%rbp
400626: 4c 89 f7 mov %r14,%rdi
400629: c7 44 24 20 40 e2 01 movl $0x1e240,0x20(%rsp)
400630: 00
400631: c6 04 24 00 movb $0x0,(%rsp)
400635: 4c 89 fe mov %r15,%rsi
400638: e8 73 01 00 00 callq 4007b0 <_ZSt16__introsort_loopIPilSt4lessIiEEvT_S3_T0_T1_>
40063d: eb 01 jmp 400640 <main+0x60>
40063f: 90 nop
if (__first == __last) return;
for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
{
typename iterator_traits<_RandomAccessIterator>::value_type
__val = *__i;
400640: 8b 5d fc mov -0x4(%rbp),%ebx
if (__comp(__val, *__first))
400643: 3b 5c 24 10 cmp 0x10(%rsp),%ebx
_ValueType>)
__glibcxx_requires_valid_range(__first, __last);
if (__first != __last)
{
std::__introsort_loop(__first, __last,
400647: 4b 8d 0c 34 lea (%r12,%r14,1),%rcx
for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
{
typename iterator_traits<_RandomAccessIterator>::value_type
__val = *__i;
if (__comp(__val, *__first))
40064b: 7c 53 jl 4006a0 <main+0xc0>
template<typename _Tp>
struct less : public binary_function<_Tp, _Tp, bool>
{
bool
operator()(const _Tp& __x, const _Tp& __y) const
{ return __x < __y; }
40064d: 8b 55 f8 mov -0x8(%rbp),%edx
{
std::copy_backward(__first, __i, __i + 1);
*__first = __val;
400650: 48 8d 45 f8 lea -0x8(%rbp),%rax
__unguarded_linear_insert(_RandomAccessIterator __last, _Tp __val,
_Compare __comp)
{
_RandomAccessIterator __next = __last;
--__next;
while (__comp(__val, *__next))
400654: 39 d3 cmp %edx,%ebx
400656: 7c 0e jl 400666 <main+0x86>
400658: eb 1c jmp 400676 <main+0x96>
40065a: eb 04 jmp 400660 <main+0x80>
40065c: 90 nop
40065d: 90 nop
40065e: 90 nop
40065f: 90 nop
400660: 48 89 c1 mov %rax,%rcx
400663: 48 89 f0 mov %rsi,%rax
{
*__last = *__next;
400666: 89 11 mov %edx,(%rcx)
400668: 8b 50 fc mov -0x4(%rax),%edx
__last = __next;
--__next;
40066b: 48 8d 70 fc lea -0x4(%rax),%rsi
__unguarded_linear_insert(_RandomAccessIterator __last, _Tp __val,
_Compare __comp)
{
_RandomAccessIterator __next = __last;
--__next;
while (__comp(__val, *__next))
40066f: 39 d3 cmp %edx,%ebx
400671: 7c ed jl 400660 <main+0x80>
400673: 48 89 c1 mov %rax,%rcx
{
*__last = *__next;
__last = __next;
--__next;
}
*__last = __val;
400676: 89 19 mov %ebx,(%rcx)
400678: 49 89 ed mov %rbp,%r13
40067b: 48 83 c5 04 add $0x4,%rbp
40067f: 49 83 c4 04 add $0x4,%r12
__insertion_sort(_RandomAccessIterator __first,
_RandomAccessIterator __last, _Compare __comp)
{
if (__first == __last) return;
for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
400683: 4d 39 ef cmp %r13,%r15
400686: 75 b8 jne 400640 <main+0x60>
std::sort(array, array + sizeof(array)/sizeof(*array), std::less<int>());
return 0;
}
400688: 48 83 c4 38 add $0x38,%rsp
40068c: 31 c0 xor %eax,%eax
40068e: 5b pop %rbx
40068f: 5d pop %rbp
400690: 41 5c pop %r12
400692: 41 5d pop %r13
400694: 41 5e pop %r14
400696: 41 5f pop %r15
400698: c3 retq
400699: eb 05 jmp 4006a0 <main+0xc0>
How helpful the presence of this "sourcecode" would be is left as an exercise to the reader at this point ;-)
Tricky question. The easy answer is No, you can't.
However, if you understand assembly you can use tools like objdump, gdb and others to disassemble the application. And from the assembly a skilled programmer can re-write the application. This is no easy task, and it gets more difficult depending on how complex the target application is.
The fact is that release versions are not (or shouldn't) be compiled with -g
.
If you mean by decompiling, look at decompilers (IDA Pro, e.g.); Having debug information can help greatly, especially if you're not interested in the full source.
You can use the debug symbols to identify starting points of procedures that you are interested in. Using a good reverse engineering tool (like IDA or the very excellent OllyDbg) you can get annotated disassembly for those parts. OllyDbg and IDA are able to a certain extent to generate C code from the disassembly.
Having the symbols, again, helps, but is no magic pill
No. The debugging information just contains information about the symbols, i.e. the variables and functions but does not contain the code itself.
精彩评论