Python C extension not threadsafe?
I made a c extension out of a python script that was fairly labour intensive. The code itself is well tested and simple. The c extension is called with a few large lists, and it then performs some clever arithmetic and returns a few new lists. The c extension is 100% self sufficient, it doesn't use any other c functions nor does it use any of the python objects' methods (it does use these standard Python methods however: PyFloat_AsDouble, PyList_GetItem, PyList_Size, PyList_New, Py_BuildValue, PyList_Append). Up until now I have only used it in a non-multithreaded environment.
Today I started using it in a multithreaded GUI environment and all hell broke lose. I have a few test cases I use for debugging, and weirdly enough the smaller ones pass through ok while the larger ones cause bus errors and segmentation faults (crashing the GUI completely and bringing up the 'Problem Report For Python' window in OS X). Is the problem that my c extension isn't threadsafe? If so, how can I make it threadsafe? I tried googling the subject, but I haven't really found any good info that I can make sense of. I checked this and this page out, but I don't really understand what they are saying. Which type of code will need the GIL and which won't?
For what it's worth here is the dump:
Date/Time: 2010-10-23 03:48:02.714 +0800
OS Version: Mac OS X 10.6.4 (10F569)
Report Version: 6
Interval Since Last Report: 323080 sec
Crashes Since Last Report: 60
Per-App Interval Since Last Report: 110157 sec
Per-App Crashes Since Last Report: 59
Anonymous UUID: 5BD8D75B-9B21-4267-98A4-BAA31E56CB5C
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x00000000b009286c
Crashed Thread: 2
Thread 0: Dispatch queue: com.apple.main-thread
0 ...ple.CoreServices.CarbonCore 0x90b024c8 ConvertFromUnicodeToTextImplementation + 1976
1 com.apple.HIToolbox 0x951c99e5 CEncodingTranslator::TranslateFromUnicode(char*, unsigned long, unsigned long*, unsigned long*, unsigned long*, unsigned long, short, short) + 549
2 com.apple.HIToolbox 0x951c9d01 CEncodingTranslator::Translate(char*, unsigned long, unsigned long*, unsigned long*, unsigned long*, unsigned long, unsigned long, short, short, short*, unsigned long) + 101
3 com.apple.HIToolbox 0x951a9e51 TXNGetDataEncoded + 278
4 libwx_macd-2.8.0.dylib 0x0188c7ee wxMacMLTEControl::GetLastPosition() const + 52
5 libwx_macd-2.8.0.dylib 0x0188bf73 wxTextCtrl::SetInsertionPointEnd() + 21
6 libwx_macd-2.8.0.dylib 0x0188bfc9 wxTextCtrl::AppendText(wxString const&) + 25
7 _controls_.so 0x1397e357 _wrap_TextCtrl_AppendText + 247 (wxPython.h:48)
8 org.python.python 0x000ca58b PyEval_EvalFrameEx + 21147
9 org.python.python 0x000cc4ba PyEval_EvalCodeEx + 2042
10 org.python.python 0x00041ca2 function_call + 162
11 org.python.python 0x0000f375 PyObject_Call + 85
12 org.python.python 0x000c7d5b PyEval_EvalFrameEx + 10859
13 org.python.python 0x000cc4ba PyEval_EvalCodeEx + 2042
14 org.python.python 0x00041ca2 function_call + 162
15 org.python.python 0x0000f375 PyObject_Call + 85
16 org.python.python 0x000c435e PyEval_CallObjectWithKeywords + 78
17 _core_.so 0x011859f0 wxPyCallback::EventThunker(wxEvent&) + 234 (helpers.cpp:1759)
18 libwx_macd-2.8.0.dylib 0x0180e360 wxEvtHandler::ProcessEventIfMatches(wxEventTableEntryBase const&, wxEvtHandler*, wxEvent&) + 108
19 libwx_macd-2.8.0.dylib 0x0180e406 wxEvtHandler::SearchDynamicEventTable(wxEvent&) + 80
20 libwx_macd-2.8.0.dylib 0x0180f205 wxEvtHandler::ProcessEvent(wxEvent&) + 225
21 libwx_macd-2.8.0.dylib 0x0180ef4a wxEvtHandler::ProcessPendingEvents() + 86
22 libwx_macd-2.8.0.dylib 0x0176cd02 wxAppConsole::ProcessPendingEvents() + 102
23 libwx_macd-2.8.0.dylib 0x01806873 wxMacProcessNotifierAndPendingEvents + 33
24 libwx_macd-2.8.0.dylib 0x0183107e wxApp::MacHandleOneEvent(void*) + 90
25 libwx_macd-2.8.0.dylib 0x0183110e wxApp::MacDoOneEvent() + 120
26 libwx_macd-2.8.0.dylib 0x0184b570 wxEventLoop::Dispatch() + 32
27 libwx_macd-2.8.0.dylib 0x01906e71 wxEventLoopManual::Run() + 97
28 libwx_macd-2.8.0.dylib 0x018dd364 wxAppBase::MainLoop() + 76
29 _core_.so 0x0117c75c wxPyApp::MainLoop() + 52 (helpers.cpp:215)
30 _core_.so 0x011c9e66 _wrap_PyApp_MainLoop + 82 (_core_wrap.cpp:31686)
31 org.python.python 0x000ca58b PyEval_EvalFrameEx + 21147
32 org.python.python 0x000cc4ba PyEval_EvalCodeEx + 2042
33 org.python.python 0x00041ca2 function_call + 162
34 org.python.python 0x0000f375 PyObject_Call + 85
35 org.python.python 0x00021c66 instancemethod_call + 422
36 org.python.python 0x0000f375 PyObject_Call + 85
37 org.python.python 0x000c8ad6 PyEval_EvalFrameEx + 14310
38 org.python.python 0x000cbc88 PyEval_EvalFrameEx + 27032
39 org.python.python 0x000cc4ba PyEval_EvalCodeEx + 2042
40 org.python.python 0x000cc647 PyEval_EvalCode + 87
41 org.python.python 0x000f0ae8 PyRun_FileExFlags + 168
42 org.python.python 0x000f1a23 PyRun_SimpleFileExFlags + 867
43 org.python.python 0x0010a42b Py_Main + 3163
44 org.python.python 0x00001f82 0x1000 + 3970
45 org.python.python 0x00001ea9 0x1000 + 3753
Thread 1: Dispatch queue: com.apple.libdispatch-manager
0 libSystem.B.dylib 0x96068942 kevent + 10
1 libSystem.B.dylib 0x9606905c _dispatch_mgr_invoke + 215
2 libSystem.B.dylib 0x96068519 _dispatch_queue_invoke + 163
3 libSystem.B.dylib 0x960682be _dispatch_worker_thread2 + 240
4 libSystem.B.dylib 0x96067d41 _pthread_wqthread + 390
5 libSystem.B.dylib 0x96067b86 start_wqthread + 30
Thread 2 Crashed:
0 ccookies.so 0x0060a949 my_calc + 249 (ccookies.c:23)
1 org.python.python 0x000ca3e0 PyEval_EvalFrameEx + 20720
2 org.python.python 0x000cbc88 PyEval_EvalFrameEx + 27032
3 org.python.python 0x000cbc88 PyEval_EvalFrameEx + 27032
4 org.python.python 0x000cbc88 PyEval_EvalFrameEx + 27032
5 org.python.python 0x000cc4ba PyEval_EvalCodeEx + 2042
6 org.python.python 0x00041ca2 func开发者_Python百科tion_call + 162
7 org.python.python 0x0000f375 PyObject_Call + 85
8 org.python.python 0x00021c66 instancemethod_call + 422
9 org.python.python 0x0000f375 PyObject_Call + 85
10 org.python.python 0x000c435e PyEval_CallObjectWithKeywords + 78
11 org.python.python 0x0010c79c t_bootstrap + 76
12 libSystem.B.dylib 0x9606f81d _pthread_start + 345
13 libSystem.B.dylib 0x9606f6a2 thread_start + 34
Thread 2 crashed with X86 Thread State (32-bit):
eax: 0x0007d090 ebx: 0x0060a85d ecx: 0x000ef236 edx: 0xb010f920
edi: 0x02315180 esi: 0xb0092890 ebp: 0xb018d378 esp: 0xb0092870
ss: 0x0000001f efl: 0x00010282 eip: 0x0060a949 cs: 0x00000017
ds: 0x0000001f es: 0x0000001f fs: 0x0000001f gs: 0x00000037
cr2: 0xb009286c
I finally managed to get rid of the problem, but in a pretty long winded way. Here goes.
I spent a very very long time trying to make sense of the documentation for c extensions and their thread-safeness. On one of the many google trajectories that night I stumbled on this page describing how to use numpy arrays in c extensions. Since my problems seemed to be performance related (the original c extension worked for smaller datasets) I suspected that my implementation of looping through the python lists and using PyList_GetItem to get the data into their c array counterparts was not up to scratch. (I deduced the following actual number crunching in the c extension wasn't the issue since it was very generic c without any special stuff at all.)
Hence I decided to undertake a complete rewrite of the c extension and my calling python script to use numpy arrays instead of lists. It took a good two days including all the debugging. But now it works like a charm. All datasets are processed ok, there are no signs of any bus errors nor segmentation faults.
TLDR: Use numpy arrays instead of python lists when working with large datasets and python c extensions to avoid bus errors and segmentation faults.
cPython is not thread safe. That is the purpose of the GIL, which must be used whenever accessing or modifying the interpreter state.
If you need threading and python, then you will need to use an implementation other than cPython (the standard one), such as IronPython or Jython, both of which are perfectly robust in the case of threading. There are some modified cPython's such as Stackless python that might work better as well.
精彩评论