Memcpy segfaulting with valid pointers

I'm using libcurl in my program, and running into a segfault. Before I filed a bug with the curl project, I thought I'd do a little debugging. What I found seemed very odd to me, and I haven't been able to make sense of it yet.

First, the segfault traceback:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe77f6700 (LWP 592)]
0x00007ffff6a2ea5c in memcpy () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff6a2ea5c in memcpy () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff5bc29e5 in x509_name_oneline (a=0x7fffe3d9c3c0,
    buf=0x7fffe77f4ec0 "C=US; O=The Go Daddy Group, Inc.; OU=Go Daddy Class 2 Certification Authority\375\034<M_r\206\233\261\310\340\371\023.Jg\205\244\304\325\347\372\016#9Ph%", size=255) at ssluse.c:629
#2  0x00007ffff5bc2a6f in cert_verify_callback (ok=1, ctx=0x7fffe77f50b0)
    at ssluse.c:645
#3  0x00007ffff72c9a80 in ?? () from /lib/libcrypto.so.0.9.8
#4  0x00007ffff72ca430 in X509_verify_cert () from /lib/libcrypto.so.0.9.8
#5  0x00007ffff759af58 in ssl_verify_cert_chain () from /lib/libssl.so.0.9.8
#6  0x00007ffff75809f3 in ssl3_get_server_certificate ()
   from /lib/libssl.so.0.9.8
#7  0x00007ffff7583e50 in ssl3_connect () from /lib/libssl.so.0.9.8
#8  0x00007ffff5bc48f0 in ossl_connect_step2 (conn=0x7fffe315e9a8, sockindex=0)
    at ssluse.c:1724
#9  0x00007ffff5bc700f in ossl_connect_common (conn=0x7fffe315e9a8,
    sockindex=0, nonblocking=false, done=0x7fffe77f543f) at ssluse.c:2498
#10 0x00007ffff5bc7172 in Curl_ossl_connect (conn=0x7fffe315e9a8, sockindex=0)
    at ssluse.c:2544
#11 0x00007ffff5ba76b9 in Curl_ssl_connect (conn=0x7fffe315e9a8, sockindex=0)
...

The call to memcpy looks like this:

  memcpy(buf, biomem->data, size);
(gdb) p buf
$46 = 0x7fffe77f4ec0 "C=US; O=The Go Daddy Group, Inc.; OU=Go Daddy Class 2 Certification Authority\375\034<M_r\206\233\261\310\340\371\023.Jg\205\244\304\325\347\372\016#9Ph%"
(gdb) p biomem->data
$47 = 0x7fffe3e1ef60 "C=US; O=The Go Daddy Group, Inc.; OU=Go Daddy Class 2 Certification Authority\375\034<M_r\206\233\261\310\340\371\023.Jg\205\244\304\325\347\372\016#9Ph%"
(gdb) p size
$48 = 255

If I go up a frame, I see that the pointer passed in for buf came from a local variable defined in the calling function:

char buf[256];

Here's where it starts to get weird. I can manually inspect all 256 bytes of both buf and biomem->data without gdb complaining that the memory isn't accessible. I can also manually write all 256 bytes of buf using the gdb set command, without any error. So if all the memory involved is readable and writable, why does memcpy fail?

Also interesting is that I can use gdb to manually call memcpy with the pointers involved. As long as I pass a size <= 160, it runs without a problem. As soon as I pass 161 or higher, gdb gets a SIGSEGV. I know buf is larger than 160 bytes, because it was created on the stack as an array of 256. The size of biomem->data is a little harder to determine, but I can read well past byte 160 with gdb.

I should also mention that this function (or rather the curl method I call that leads to this) completes successfully many times before the crash. My program uses curl to repeatedly call a web service API while it runs. It calls the API every five seconds or so, and runs for about 14 hours before it crashes. It's possible that something else in my app is writing out of bounds and stomping on something that creates the error condition. But it seems suspicious that it crashes at exactly the same point every time, although the timing varies. And all the pointers seem ok in gdb, but memcpy still fails. Valgrind doesn't find any bounds errors, but I haven't let my program run with valgrind for 14 hours.

Within memcpy itself, the disassembly looks like this:

(gdb) x/20i $rip-10
   0x7ffff6a2ea52 <memcpy+242>: jbe    0x7ffff6a2ea74 <memcpy+276>
   0x7ffff6a2ea54 <memcpy+244>: lea    0x20(%rdi),%rdi
   0x7ffff6a2ea58 <memcpy+248>: je     0x7ffff6a2ea90 <memcpy+304>
   0x7ffff6a2ea5a <memcpy+250>: dec    %ecx
=> 0x7ffff6a2ea5c <memcpy+252>: mov    (%rsi),%rax
   0x7ffff6a2ea5f <memcpy+255>: mov    0x8(%rsi),%r8
   0x7ffff6a2ea63 <memcpy+259>: mov    0x10(%rsi),%r9
   0x7ffff6a2ea67 <memcpy+263>: mov    0x18(%rsi),%r10
   0x7ffff6a2ea6b <memcpy+267>: mov    %rax,(%rdi)
   0x7ffff6a2ea6e <memcpy+270>: mov    %r8,0x8(%rdi)
   0x7ffff6a2ea72 <memcpy+274>: mov    %r9,0x10(%rdi)
   0x7ffff6a2ea76 <memcpy+278>: mov    %r10,0x18(%rdi)
   0x7ffff6a2ea7a <memcpy+282>: lea    0x20(%rsi),%rsi
   0x7ffff6a2ea7e <memcpy+286>: lea    0x20(%rdi),%rdi
   0x7ffff6a2ea82 <memcpy+290>: jne    0x7ffff6a2ea30 <memcpy+208>
   0x7ffff6a2ea84 <memcpy+292>: data32 data32 nopw %cs:0x0(%rax,%rax,1)
   0x7ffff6a2ea90 <memcpy+304>: and    $0x1f,%edx
   0x7ffff6a2ea93 <memcpy+307>: mov    -0x8(%rsp),%rax
   0x7ffff6a2ea98 <memcpy+312>: jne    0x7ffff6a2e969 <memcpy+9>
   0x7ffff6a2ea9e <memcpy+318>: repz retq
(gdb) info registers
rax            0x0      0
rbx            0x7fffe77f50b0   140737077268656
rcx            0x1      1
rdx            0xff     255
rsi            0x7fffe3e1f000   140737016623104
rdi            0x7fffe77f4f60   140737077268320
rbp            0x7fffe77f4e90   0x7fffe77f4e90
rsp            0x7fffe77f4e48   0x7fffe77f4e48
r8             0x11     17
r9             0x10     16
r10            0x1      1
r11            0x7ffff6a28f7a   140737331236730
r12            0x7fffe3dde490   140737016358032
r13            0x7ffff5bc2a0c   140737316137484
r14            0x7fffe3d69b50   140737015880528
r15            0x0      0
rip            0x7ffff6a2ea5c   0x7ffff6a2ea5c <memcpy+252>
eflags         0x10203  [ CF IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
(gdb) p/x $rsi
$50 = 0x7fffe3e1f000
(gdb) x/20x $rsi
0x7fffe3e1f000: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fffe3e1f010: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fffe3e1f020: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fffe3e1f030: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fffe3e1f040: 0x00000000      0x00000000      0x00000000      0x00000000

I'm using libcurl version 7.21.6, c-ares version 1.7.4, and openssl version 1.0.0d. My program is multithreaded, but I have registered mutex callbacks with openssl. The program is running on Ubuntu 11.04 desktop, 64-bit. libc is 2.13.


Clearly libcurl is over-reading the source buffer, and stepping into unreadable memory (page at 0x7fffe3e1f000 -- you can confirm that memory is unreadable by looking at /proc/<pid>/maps for the program being debugged).

Here's where it starts to get weird. I can manually inspect all 256 bytes of both
buf and biomem->data without gdb complaining that the memory isn't accessible.

There is a well-known Linux kernel flaw: even for memory that is mapped PROT_NONE (and so causes SIGSEGV when the process itself tries to read it), an attempt by GDB to read it with ptrace(PTRACE_PEEKDATA, ...) succeeds. That explains why you can examine all 256 bytes of the source buffer in GDB, even though only the first 160 of them (up to the page boundary at 0x7fffe3e1f000) are actually accessible to the process; the remaining 96 are not.

Try running your program under Valgrind; chances are it will tell you that you are memcpying from a heap-allocated buffer that is too small.


Do you have any possibility of creating a "crumple zone"?

That is, deliberately increasing the size of the two buffers, or in the case of the structure putting an extra unused element after the destination?

You then seed the source crumple zone with something such as "0xDEADBEEF", and the destination crumple zone with some other recognizable pattern. If the destination ever changes, you've got something to work with.

The size of 255 is a bit suggestive: is there any possibility it could somehow be treated as a signed 8-bit quantity, becoming -1, and hence very big once converted to an unsigned size? I can't see how gdb wouldn't show it, but ...
