How do I debug code that segfaults unless run through gdb?
The code is single-threaded.
Specifically, it is the ahocorasick Python extension module (easy_install ahocorasick).
I isolated the problem to a trivial example:
import ahocorasick
t = ahocorasick.KeywordTree()
t.add("a")
When I run it in gdb, everything is fine, and the same is true when I type these statements into the interactive Python interpreter. However, when I run the script normally, I get a segfault.
To make it even weirder, the line that causes the segfault (identified by core dump analysis) is an ordinary int increment (see the bottom of the function body).
I'm completely stuck at this point. What can I do?
int
aho_corasick_addstring(aho_corasick_t *in, unsigned char *string, size_t n)
{
    aho_corasick_t* g = in;
    aho_corasick_state_t *state,*s = NULL;
    int j = 0;
    state = g->zerostate;
    // As long as we have transitions follow them
    while( j != n &&
           (s = aho_corasick_goto_get(state,*(string+j))) != FAIL )
    {
        state = s;
        ++j;
    }
    if ( j == n ) {
        /* dyoo: added so that if a keyword ends up in a prefix
           of another, we still mark that as a match.*/
        aho_corasick_output(s) = j;
        return 0;
    }
    while( j != n )
    {
        // Create new state
        if ( (s = xalloc(sizeof(aho_corasick_state_t))) == NULL )
            return -1;
        s->id = g->newstate++;
        debug(printf("allocating state %d\n", s->id)); /* debug */
        s->depth = state->depth + 1;
        /* FIXME: check the error return value of
           aho_corasick_goto_initialize. */
        aho_corasick_goto_initialize(s);
        // Create transition
        aho_corasick_goto_set(state,*(string+j), s);
        debug(printf("%u -> %c -> %u\n",state->id,*(string+j),s->id));
        state = s;
        aho_corasick_output(s) = 0;
        aho_corasick_fail(s) = NULL;
        ++j; // <--- HERE!
    }
    aho_corasick_output(s) = n;
    return 0;
}
Other tools can find faults that do not necessarily crash the program: valgrind, Electric Fence, Purify, Coverity, and lint-like tools may be able to help you.
In some cases you may need to build your own Python for these to be usable. Also, for memory-corruption problems, there is (or was; I haven't built extensions in a while) a way to make Python use direct memory allocation instead of its own allocator.
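As a rough sketch of that approach, the helper below builds a valgrind command line for a script and switches the interpreter to the C library allocator. It assumes CPython 3.6 or newer, where the PYTHONMALLOC environment variable exists, and a valgrind binary on the PATH. The function names and the script name crash.py are made up for illustration.

```python
import os
import shutil
import subprocess
import sys


def build_valgrind_command(script_path, python=sys.executable):
    """Return (argv, env) for running a script under valgrind's memcheck."""
    env = dict(os.environ)
    # Assumption: CPython 3.6+. PYTHONMALLOC=malloc makes the interpreter
    # use the C library's malloc instead of its own small-object allocator,
    # so valgrind can see each individual allocation.
    env["PYTHONMALLOC"] = "malloc"
    argv = [
        "valgrind",
        "--tool=memcheck",
        "--track-origins=yes",  # report where uninitialised values came from
        python,
        script_path,
    ]
    return argv, env


def run_under_valgrind(script_path):
    """Run the script under valgrind; return its exit code, or None if
    valgrind is not installed."""
    if shutil.which("valgrind") is None:
        return None
    argv, env = build_valgrind_command(script_path)
    return subprocess.run(argv, env=env).returncode
```

Calling run_under_valgrind("crash.py") on the three-line reproducer would then let valgrind report invalid reads and writes inside the extension, even on runs where the process does not crash.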
Have you tried translating that while loop to a for loop? Maybe there's some subtle misunderstanding with the ++j that will disappear if you use something more intuitive.