
scanf %d segfault at large input

So I ran a static code analyzer over some C code, and one thing that surprised me was a warning about:

int val;
scanf("%d", &val);

which said that for large enough input this may result in a segfault. And sure enough, this can actually happen. Now the fix is simple enough (specify a width; after all, we know how many digits a valid integer can have at most, depending on the architecture), but what I'm wondering is WHY this happens in the first place, and why it isn't regarded as a bug in libc (and a simple one to fix at that)?
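For illustration, here's a minimal sketch of the width-specified version I mean (assuming a 32-bit int, which takes at most 10 digits plus an optional sign):

#include <stdio.h>

int main(void) {
    int val;
    /* the width 11 tells scanf to read at most 11 characters for this field,
       so a gigantic run of digits is no longer consumed without bound */
    if (scanf("%11d", &val) == 1)
        printf("Read %d\n", val);
    return 0;
}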

Now I assume there's some reason for this behavior in the first place that I'm missing?

Edit: OK, since the question doesn't seem to be so clear-cut, a bit more explanation: no, the code analyzer doesn't warn about scanf in general, but specifically about scanf reading an integer without a width specified.

So here's a minimal working example:

#include <stdlib.h>
#include <stdio.h>

int main() {
    int val;
    scanf("%d", &val);
    printf("Number not large enough.\n");
    return 0;
}

We can get a segfault by sending a gigantic number (using e.g. Python):

import subprocess
cmd = "./test"
p = subprocess.Popen(cmd, stdin=subprocess.PIPE, shell=True)
p.communicate("9"*50000000000000)
# program will segfault, if not make number larger


If the static analyzer is cppcheck, then it is warning about it because of a bug in glibc which has since been fixed: http://sources.redhat.com/bugzilla/show_bug.cgi?id=13138


Edited, since I missed the fact that you are feeding this to a static code analyzer.

If the format %d matches the size of int, then what overflows should not be what is written into val through the pointer, since that is always exactly an int. Try passing a pointer to long int and see if the analyzer still gives the warning. Then try changing %d to %ld, keeping the long int pointer, and see if the warning comes back.

I suppose the standard should say something about %d and the type it expects. Maybe the analyzer is worried that on some system an int could be shorter than what %d implies? That would sound odd to me.
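To make it concrete, this is a rough sketch of the two variants I mean (the first pairing is deliberately wrong; it is only there to see whether the analyzer reacts to the pointer type or to the missing width):

#include <stdio.h>

int main(void) {
    long val_long;
    scanf("%d", &val_long);   /* deliberately mismatched: %d with a long* */
    scanf("%ld", &val_long);  /* matched: %ld with a long* */
    return 0;
}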


Running your example compiled with gcc (I have python 2.6.6), I obtain:

Traceback (most recent call last):
  File "./feed.py", line 4, in <module>
    p.communicate("9"*50000000000000)
OverflowError: cannot fit 'long' into an index-sized integer
Number not large enough.

Then I tried running this instead:

perl -e 'print "1"x6000000000000000;' |./test

and modified the C part to write

printf("%d Number not large enough.\n", val);

I obtain as output

5513204 Number not large enough.

where the number changes on every run... never a segfault... the GNU scanf implementation is safe... though the resulting number is wrong, of course...


The first step in processing an integer is to isolate the sequence of digits. If that sequence is longer than expected, it may overflow a fixed-length buffer, leading to a segmentation fault.
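To illustrate the failure mode (this is NOT the actual libc code, just a hypothetical converter written the naive way):

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* hypothetical sketch: copy the digit sequence into a fixed buffer first,
   then convert - with more than 63 digits of input this overflows */
static int naive_read_int(FILE *fp) {
    char digits[64];
    int i = 0, c;
    while ((c = getc(fp)) != EOF && isdigit(c))
        digits[i++] = (char)c;   /* no bounds check on i */
    digits[i] = '\0';
    return atoi(digits);
}

int main(void) {
    printf("%d\n", naive_read_int(stdin));
    return 0;
}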

You can achieve a similar effect with doubles. Pushed to extremes, you can write 1 followed by one thousand zeroes, and an exponent of -1000 (net value is 1). Actually, when I was testing this a few years ago, Solaris handled 1000 digits with aplomb; it was at a little over 1024 that it ran into trouble.
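Input of that shape is easy to generate; here is a small sketch (assuming a test program that does scanf("%lf", &d) on stdin and that you pipe this generator's output into it):

#include <stdio.h>

/* print 1 followed by one thousand zeroes and the exponent -1000: net value 1.0 */
int main(void) {
    int i;
    putchar('1');
    for (i = 0; i < 1000; i++)
        putchar('0');
    printf("e-1000\n");
    return 0;
}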

So, there is an element of QoI - quality of implementation. There is also an element of 'to follow the C standard, scanf() cannot stop reading before it comes across a non-digit'. These are conflicting goals.
