Determining the exact line in source code from a kernel crash dump

Hi, I am running a bidirectional 'iperf' test on an interface that uses my driver. To reproduce, run bidirectional I/O on one interface (the other interface is not active):

  • Run iperf -c -P 8 -t 100000 -I 10 on the DUT
  • Run iperf -c with the same parameters from the peer almost immediately (after the first 10 s of the 'iperf send' above are over), with 'iperf -s -w 256K' running on both sides

The crash does not happen in the driver itself but in the 'iperf' context. Here is the stack trace:

 PID: 8855   TASK: f7036550  CPU: 0   COMMAND: "iperf"
 #0 [c074bed0] crash_kexec at c0443233
 #1 [c074bf14] die at c04064d3
 #2 [c074bf44] do_page_fault at c062134b
 #3 [c074bf94] error_code (via page_fault) at c0405abb
    EAX: f5888100  EBX: 00000000  ECX: 00100100  EDX: 00200200  EBP: 00000001
    DS:  007b      ESI: f5888000  ES:  007b      EDI: cb614000
    CS:  0060      EIP: c05c4e94  ERR: ffffffff  EFLAGS: 00010046
 #4 [c074bfc8] net_rx_action at c05c4e94
 #5 [c074bfe4] __do_softirq at c042aa65
--- <soft IRQ> ---
 #0 [f281ac4c] do_softirq at c04073e5
 #1 [f281ac58] do_IRQ at c04074d9
 #2 [f281ac70] common_interrupt at c0405975
    EAX: 39383736  EBX: f281af4c  ECX: 00000428  EDX: 31303938  EBP: f378b042
    DS:  007b      ESI: f378b1c2  ES:  007b      EDI: 09fdb448
    CS:  0060      EIP: c04f1c07  ERR: ffffffba  EFLAGS: 00000202
 #3 [f281aca4] __copy_to_user_ll at c04f1c07
 #4 [f281acb0] memcpy_toiovec at c05bfecc
 #5 [f281acc4] skb_copy_datagram_iovec at c05c059b
 #6 [f281acf4] tcp_rcv_established at c05ef40a
 #7 [f281ad20] tcp_v4_do_rcv at c05f48c5
 #8 [f281ad54] tcp_prequeue_process at c05e6bdd
 #9 [f281ad5c] tcp_recvmsg at c05e90e2
#10 [f281ad9c] sock_common_recvmsg at c05bb1c4
#11 [f281adc0] sock_recvmsg at c05b8dc6
#12 [f281aea0] sys_recvfrom at c05ba6ab
#13 [f281af64] sys_recv at c05ba727
#14 [f281af80] sys_socketcall at c05bab52
#15 [f281afb8] system_call at c0404f44
    EAX: ffffffda  EBX: 0000000a  ECX: b6ba2340  EDX: 00014268
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 09fbe630
    SS:  007b      ESP: b6ba2328  EBP: b6ba2378
    CS:  0073      EIP: 004ad410  ERR: 00000066  EFLAGS: 00000293
crash>

The EIP at the time of the crash is net_rx_action:0xdd/19ca. I compiled the kernel-2.6.18-238 sources (the source version of the OS running on the DUT) and ran 'objdump -S ./net/core/dev.o > dev_o_dmp' on the object built from ./net/core/dev.c, which defines net_rx_action(). In the 'dev_o_dmp' file, net_rx_action() contains many inlined definitions, so it does not exactly mirror the flow of the source file. In that situation, is it safe to add 0xdd to the base address of net_rx_action (say 32FF) => 340C? That is, would 340C be the offending location that gives rise to the 'kernel paging request' crash?

Any tips or recommendations on how to debug this problem would be a great help.


Unfortunately, or fortunately depending on your perspective, at high optimization levels the compiler can generate assembly for which the debug format cannot give a reasonable mapping from C source lines to instructions. Which cases trigger this depends on the compiler, the optimization level, the debug symbol format, the debug symbol level, and the code itself.
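For reference, the base-plus-offset lookup described in the question can be sketched in a few lines of Python. This is only an illustration: the helper names are made up, the base 0x32FF is the hypothetical value quoted in the question, and the parser assumes the usual 'objdump -S' layout in which instruction lines begin with a hex address followed by a colon. Note that 0x32FF + 0xdd works out to 0x33DC; 0x340C would instead correspond to a base of 0x332F, so it is worth re-checking which base the listing actually shows.

```python
import re

def target_address(symbol_base: int, offset: int) -> int:
    """Address of the instruction at symbol+offset within the object listing."""
    return symbol_base + offset

def find_instruction(listing: str, address: int):
    """Return the disassembly line whose address column equals `address`, or None.

    Assumes instruction lines look like '    33dc:<TAB>8b 50 04 ...', as
    printed by objdump; interleaved source lines and symbol headers are skipped.
    """
    pattern = re.compile(r"^\s*([0-9a-f]+):\s")
    for line in listing.splitlines():
        match = pattern.match(line)
        if match and int(match.group(1), 16) == address:
            return line.strip()
    return None

# Hypothetical base from the question ("say 32FF"); offset from the reported EIP.
base = 0x32FF
offset = 0xDD
print(hex(target_address(base, offset)))  # 0x33dc
```

To use it on the real dump, read 'dev_o_dmp' into a string, take the base from the '<net_rx_action>:' header line in that file, and pass both to find_instruction(). The source lines that objdump interleaves just above the match are the candidates for the C statement, subject to the caveat about optimized code.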

You have to assume that line numbers obtained this way could be wrong. That said, I use this technique frequently in my own kernel work and have not had any problems yet (knock on wood). Just remember that if you are faced with something that makes no sense, you could have a bad line number.
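When a line number looks doubtful, the register dump can serve as an independent cross-check. The sketch below parses the registers printed for the faulting frame and flags values that match the kernel's list poison constants. It assumes the 32-bit values 0x00100100 and 0x00200200 defined as LIST_POISON1 and LIST_POISON2 in include/linux/poison.h; the function names are illustrative. In the trace above, ECX and EDX hold exactly those two values, which hints that a list entry already removed with list_del() was being touched. That is a hint to confirm against the disassembly, not a diagnosis.

```python
import re

# Assumed 32-bit poison values from include/linux/poison.h.
POISON_VALUES = {
    0x00100100: "LIST_POISON1",  # stored in entry->next by list_del()
    0x00200200: "LIST_POISON2",  # stored in entry->prev by list_del()
}

# Register lines copied from the faulting frame in the trace above.
CRASH_FRAME = """
    EAX: f5888100  EBX: 00000000  ECX: 00100100  EDX: 00200200  EBP: 00000001
    DS:  007b      ESI: f5888000  ES:  007b      EDI: cb614000
    CS:  0060      EIP: c05c4e94  ERR: ffffffff  EFLAGS: 00010046
"""

def parse_registers(dump: str) -> dict:
    """Map register names to integer values from 'NAME: hexvalue' pairs."""
    pairs = re.findall(r"([A-Z]+):\s+([0-9a-f]+)", dump)
    return {name: int(value, 16) for name, value in pairs}

def flag_poison(registers: dict) -> dict:
    """Return only the registers whose value equals a known poison constant."""
    return {
        name: POISON_VALUES[value]
        for name, value in registers.items()
        if value in POISON_VALUES
    }

print(flag_poison(parse_registers(CRASH_FRAME)))
# {'ECX': 'LIST_POISON1', 'EDX': 'LIST_POISON2'}
```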
