Obtain source using debugging symbols

Is it possible to obtain the source of a Linux shared library (.so) that was compiled with debugging information (gcc .. -g)? Thank you for your time.


The answer is: it depends. Not only is the compile option -g necessary, but the final executable or shared library must also not have been stripped during the build process.
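
One way to check this up front is to look for .debug* sections in the library's section header table. The following is only a rough sketch: it assumes a 64-bit ELF file in the host's byte order and a Linux system that provides <elf.h>, and the function name debug_sections is made up for illustration. readelf -S shows the same section list from the command line.

```cpp
#include <algorithm>
#include <cassert>
#include <elf.h>      // Elf64_Ehdr, Elf64_Shdr (Linux system header)
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the names of all sections whose name starts
// with ".debug". Returns an empty vector if the file cannot be read or is
// not a 64-bit ELF file (assumption: same byte order as the host).
std::vector<std::string> debug_sections(const std::string& path)
{
    assert(!path.empty());
    std::vector<std::string> found;
    std::ifstream in(path, std::ios::binary);
    if (!in) return found;

    // Read and sanity-check the ELF header.
    Elf64_Ehdr eh{};
    if (!in.read(reinterpret_cast<char*>(&eh), sizeof eh)) return found;
    if (std::memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof(Elf64_Shdr) ||
        eh.e_shnum == 0 || eh.e_shstrndx >= eh.e_shnum)
        return found;

    // Read the section header table.
    std::vector<Elf64_Shdr> sh(eh.e_shnum);
    in.seekg(static_cast<std::streamoff>(eh.e_shoff));
    if (!in.read(reinterpret_cast<char*>(sh.data()),
                 static_cast<std::streamsize>(sh.size() * sizeof(Elf64_Shdr))))
        return found;

    // Read the section name string table.
    const Elf64_Shdr& strtab = sh[eh.e_shstrndx];
    std::vector<char> names(strtab.sh_size);
    in.seekg(static_cast<std::streamoff>(strtab.sh_offset));
    if (!in.read(names.data(), static_cast<std::streamsize>(names.size())))
        return found;

    // Collect every section whose name begins with ".debug".
    for (const Elf64_Shdr& s : sh) {
        if (s.sh_name >= names.size()) continue;
        auto first = names.begin() + s.sh_name;
        auto last = std::find(first, names.end(), '\0');
        std::string name(first, last);
        if (name.rfind(".debug", 0) == 0) found.push_back(name);
    }
    return found;
}
```

If debug_sections("libfoo.so") comes back empty for a valid library (libfoo.so is just a placeholder name), the library was most likely built without -g or stripped afterwards.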

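Binaries created with -g do give you access to a kind of source code, just not source code as you would like it. The debug information records which source file and line each instruction came from; if you run objdump -S on such a file, it intersperses the disassembly with the matching source lines. Note that objdump reads those lines from the source files named in the debug information, so the files have to be available on the machine where you run it.

But what that shows follows the compiled code, not your source file: lines appear in the order of the instructions generated from them, after any inlining and reordering done by the compiler, and lines from library headers are mixed in with your own.

That means you can get surprising output from it; if nothing else, its verbosity can look a bit like COBOL sources. Start with: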

#include <algorithm>
#include <functional>
int main(int argc, char **argv)
{
    int array[] = { 1, 123, 1234, 12345, 123456 };
    std::sort(array, array + sizeof(array)/sizeof(*array), std::less<int>());
    return 0;
}

Compile it with g++ -O8 -g -o t t.C and then run objdump -S t. This gives you output for main() similar to the following (what exactly you see depends, of course, on your compiler and library):


00000000004005e0 <main>:
#include <algorithm>
#include <functional>

int main(int argc, char **argv)
{
  4005e0:       41 57                   push   %r15
                                  _ValueType>)
      __glibcxx_requires_valid_range(__first, __last);

      if (__first != __last)
        {
          std::__introsort_loop(__first, __last,
  4005e2:       ba 04 00 00 00          mov    $0x4,%edx
  4005e7:       41 56                   push   %r14
  4005e9:       41 55                   push   %r13
  4005eb:       41 54                   push   %r12
  4005ed:       41 bc 04 00 00 00       mov    $0x4,%r12d
  4005f3:       55                      push   %rbp
  4005f4:       53                      push   %rbx
  4005f5:       48 83 ec 38             sub    $0x38,%rsp
  4005f9:       4c 8d 74 24 10          lea    0x10(%rsp),%r14
        int array[] = { 1, 123, 1234, 12345, 123456 };
  4005fe:       c7 44 24 10 01 00 00    movl   $0x1,0x10(%rsp)
  400605:       00 
  400606:       c7 44 24 14 7b 00 00    movl   $0x7b,0x14(%rsp)
  40060d:       00 
  40060e:       c7 44 24 18 d2 04 00    movl   $0x4d2,0x18(%rsp)
  400615:       00 
  400616:       c7 44 24 1c 39 30 00    movl   $0x3039,0x1c(%rsp)
  40061d:       00 
  40061e:       4d 8d 7e 14             lea    0x14(%r14),%r15
  400622:       49 8d 6e 08             lea    0x8(%r14),%rbp
  400626:       4c 89 f7                mov    %r14,%rdi
  400629:       c7 44 24 20 40 e2 01    movl   $0x1e240,0x20(%rsp)
  400630:       00 
  400631:       c6 04 24 00             movb   $0x0,(%rsp)
  400635:       4c 89 fe                mov    %r15,%rsi
  400638:       e8 73 01 00 00          callq  4007b0 <_ZSt16__introsort_loopIPilSt4lessIiEEvT_S3_T0_T1_>
  40063d:       eb 01                   jmp    400640 <main+0x60>
  40063f:       90                      nop
      if (__first == __last) return;

      for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
        {
          typename iterator_traits<_RandomAccessIterator>::value_type
            __val = *__i;
  400640:       8b 5d fc                mov    -0x4(%rbp),%ebx
          if (__comp(__val, *__first))
  400643:       3b 5c 24 10             cmp    0x10(%rsp),%ebx
                                  _ValueType>)
      __glibcxx_requires_valid_range(__first, __last);

      if (__first != __last)
        {
          std::__introsort_loop(__first, __last,
  400647:       4b 8d 0c 34             lea    (%r12,%r14,1),%rcx

      for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
        {
          typename iterator_traits<_RandomAccessIterator>::value_type
            __val = *__i;
          if (__comp(__val, *__first))
  40064b:       7c 53                   jl     4006a0 <main+0xc0>
  template<typename _Tp>
    struct less : public binary_function<_Tp, _Tp, bool>
    {
      bool
      operator()(const _Tp& __x, const _Tp& __y) const
      { return __x < __y; }
  40064d:       8b 55 f8                mov    -0x8(%rbp),%edx
            {
              std::copy_backward(__first, __i, __i + 1);
              *__first = __val;
  400650:       48 8d 45 f8             lea    -0x8(%rbp),%rax
    __unguarded_linear_insert(_RandomAccessIterator __last, _Tp __val,
                              _Compare __comp)
    {
      _RandomAccessIterator __next = __last;
      --__next;
      while (__comp(__val, *__next))
  400654:       39 d3                   cmp    %edx,%ebx
  400656:       7c 0e                   jl     400666 <main+0x86>
  400658:       eb 1c                   jmp    400676 <main+0x96>
  40065a:       eb 04                   jmp    400660 <main+0x80>
  40065c:       90                      nop
  40065d:       90                      nop
  40065e:       90                      nop
  40065f:       90                      nop
  400660:       48 89 c1                mov    %rax,%rcx
  400663:       48 89 f0                mov    %rsi,%rax
        {
          *__last = *__next;
  400666:       89 11                   mov    %edx,(%rcx)
  400668:       8b 50 fc                mov    -0x4(%rax),%edx
          __last = __next;
          --__next;
  40066b:       48 8d 70 fc             lea    -0x4(%rax),%rsi
    __unguarded_linear_insert(_RandomAccessIterator __last, _Tp __val,
                              _Compare __comp)
    {
      _RandomAccessIterator __next = __last;
      --__next;
      while (__comp(__val, *__next))
  40066f:       39 d3                   cmp    %edx,%ebx
  400671:       7c ed                   jl     400660 <main+0x80>
  400673:       48 89 c1                mov    %rax,%rcx
        {
          *__last = *__next;
          __last = __next;
          --__next;
        }
      *__last = __val;
  400676:       89 19                   mov    %ebx,(%rcx)
  400678:       49 89 ed                mov    %rbp,%r13
  40067b:       48 83 c5 04             add    $0x4,%rbp
  40067f:       49 83 c4 04             add    $0x4,%r12
    __insertion_sort(_RandomAccessIterator __first,
                     _RandomAccessIterator __last, _Compare __comp)
    {
      if (__first == __last) return;

      for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
  400683:       4d 39 ef                cmp    %r13,%r15
  400686:       75 b8                   jne    400640 <main+0x60>
        std::sort(array, array + sizeof(array)/sizeof(*array), std::less<int>());

        return 0;
}
  400688:       48 83 c4 38             add    $0x38,%rsp
  40068c:       31 c0                   xor    %eax,%eax
  40068e:       5b                      pop    %rbx
  40068f:       5d                      pop    %rbp
  400690:       41 5c                   pop    %r12
  400692:       41 5d                   pop    %r13
  400694:       41 5e                   pop    %r14
  400696:       41 5f                   pop    %r15
  400698:       c3                      retq   
  400699:       eb 05                   jmp    4006a0 <main+0xc0>

How helpful the presence of this "sourcecode" would be is left as an exercise to the reader at this point ;-)


Tricky question. The easy answer is No, you can't.

However, if you understand assembly you can use tools like objdump, gdb and others to disassemble the application. And from the assembly a skilled programmer can re-write the application. This is no easy task, and it gets more difficult depending on how complex the target application is.

The fact is that release versions are not (or shouldn't be) compiled with -g.


If you mean decompiling, look at decompilers (IDA Pro, for example). Having debug information can help greatly, especially if you're not interested in the full source.

You can use the debug symbols to identify the starting points of the procedures that you are interested in. Using a good reverse engineering tool (like IDA, or the very excellent OllyDbg if you are dealing with Windows binaries) you can get annotated disassembly for those parts. IDA, with the Hex-Rays decompiler, is able to a certain extent to generate C-like pseudocode from the disassembly.

Having the symbols, again, helps, but is no magic pill.
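
Symbol names in a C++ library are mangled, like the _ZSt16__introsort_loopIPilSt4lessIiEEvT_S3_T0_T1_ that shows up in the objdump output earlier on this page. As a minimal sketch of how to turn such names back into readable signatures, assuming GCC or Clang with the Itanium C++ ABI (the helper name demangle is made up for illustration), you can call abi::__cxa_demangle from <cxxabi.h>:

```cpp
#include <cassert>
#include <cstdlib>    // std::free
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <string>

// Hypothetical helper: returns the demangled form of a symbol name,
// or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled)
{
    assert(mangled != nullptr);
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc, so it has to be released with free.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr)
        return mangled;
    std::string result(out);
    std::free(out);
    return result;
}
```

Passing the mangled name above to demangle() should give you the std::__introsort_loop instantiation with its template arguments spelled out. The c++filt tool does the same from the command line.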


No. The debugging information just contains information about the symbols, i.e. the variables and functions, but does not contain the source code itself.
