
Did I find a libxml2 bug (memory leak in multi-threaded parsing)?

I am currently working on data-processing code that uses libxml2, and I am stuck on a memory leak that I cannot get rid of. Here is a minimal program that reproduces it:

#include <stdlib.h>
#include <stdio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>
#include <omp.h>

int main(void)
{
    xmlDoc *doc;
    int tn;
    char fname[32];

    omp_set_num_threads(2);
    xmlInitParser();
    #pragma omp parallel private(doc,tn,fname)
    {
        tn  = omp_get_thread_num();
        sprintf(fname,"testdoc%d.xml",tn);
        doc = xmlReadFile(fname,NULL,0);
        printf("document %s parsed on thread %d (%p)\n",fname,tn,doc);
        xmlFreeDoc(doc);
    }
    xmlCleanupParser();

    return EXIT_SUCCESS;
}

At runtime, the output is:

document testdoc0.xml parsed on thread 0 (0x1005413a0)
document testdoc1.xml parsed on thread 1 (0x1005543c0)

This confirms that the code really runs multi-threaded and that doc is really private in the parallel region. Note that I followed the thread-safety instructions for libxml2 (http://xmlsoft.org/threads.html). Valgrind reports:

HEAP SUMMARY:
    in use at exit: 9,000 bytes in 8 blocks
  total heap usage: 956 allocs, 948 frees, 184,464 bytes allocated

968 bytes in 1 blocks are definitely lost in loss record 6 of 8
   at 0x1000107AF: malloc (vg_replace_malloc.c:236)
   by 0x1000B2590: xmlGetGlobalState (in /opt/local/lib/libxml2.2.dylib)
   by 0x1000B1A18: __xmlDefaultSAXHandler (in /opt/local/lib/libxml2.2.dylib)
   by 0x100106D18: xmlDefaultSAXHandlerInit (in /opt/local/lib/libxml2.2.dylib)
   by 0x100041BE7: xmlInitParserCtxt (in /opt/local/lib/libxml2.2.dylib)
   by 0x100042145: xmlNewParserCtxt (in /opt/local/lib/libxml2.2.dylib)
   by 0x10004615E: xmlCreateURLParserCtxt (in /opt/local/lib/libxml2.2.dylib)
   by 0x10005B56B: xmlReadFile (in /opt/local/lib/libxml2.2.dylib)
   by 0x100000E03: main.omp_fn.0 (in ./xtest)
   by 0x100028FA3: gomp_thread_start (in /opt/local/lib/gcc44/libgomp.1.dylib)
   by 0x1001E8535: _pthread_start (in /usr/lib/libSystem.B.dylib)
   by 0x1001E83E8: thread_start (in /usr/lib/libSystem.B.dylib)

LEAK SUMMARY:
   definitely lost: 968 bytes in 1 blocks
   indirectly lost: 0 bytes in 0 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 8,032 bytes in 7 blocks
        suppressed: 0 bytes in 0 blocks
Reachable blocks (those to which a pointer was found) are not shown.
To see them, rerun with: --leak-check=full --show-reachable=yes

This happens for me regardless of which XML document is used. I am using libxml2 2.7.8 on Mac OS X 10.6.5 with gcc 4.4.5.

Can anyone reproduce this bug?

Thanks,

Antonin


From the web site you listed above (http://xmlsoft.org/threads.html):

Starting with 2.4.7, libxml2 makes provisions to ensure that concurrent threads can safely work in parallel parsing different documents.

Your example seems to call xmlReadFile on the same document (testdoc.xml) in each thread. The page further states:

Note that the thread safety cannot be ensured for multiple threads sharing the same document, the locking must be done at the application level ...


You should probably bring this up on the libxml2 mailing list.

http://mail.gnome.org/mailman/listinfo/xml
