Linux process goes in D-state when I connect serial device via usb hub
I have a serial GPS connected to an embedded PC via serial<->USB adapter (Prolific PL2303). Every 5 minutes a shell script runs a Python script that reads GPS data via Pyserial then upload them to Internet. If I plug my GPS directly to the PC (via PL2303) everything is ok and my system runs forever BUT if I use a usb HUB between pl2303 and the PC I have a this problem: the Python script runs ok for about 3-6 hours then it goes in a D-state (uninterruptible sleep) and the shell script cannot run it again (I can only shutdown the system, no kill possible). I checked 开发者_JS百科my script and I used usb hubs from various vendors (powered and not) with the same result.
PS my embedded pc (from Embeddedarm) runs an updated Debian Lenny.
Ho can I fix it ?
A process in D state means the kernel (most probably a device driver), has put your process into uninterruptible sleep.
To be honest, there is probably quite little you can do about it as a user, unless you intend to debug the kernel USB stack and/or specific USB chipset device driver.
Here is what would do -
Make sure the kernel configuration of you embedded device has the kernel config option for the magic sysreq key and the run time configuration for it turned on. See: http://en.wikipedia.org/wiki/Magic_SysRq_key on how to do that.
Recreate the problem (have the process get stuck in D state).
Find out the PID of the stuck python script with ps and run strace -p PID on it. This will give you the specific system call that the process is sleeping in.
Send the magic sysreq key command 't', that lists all tasks and their kernel stack to console. Look for the specific task of the python script by PID, see at what part of the kernel code exactly are you stuck.
Open the kernel code and try to debug the problem if you can, or port it to the relevant mailing list if you don't.
One more suggestion would be to try and see if the problem goes away in a more recent kernel version then Debian ships. If so, you know it is a bug fixed in newer version of the kernel and you have the choice of either using the newer version and try to port the fix to the old version you care using.
Good luck! you'll need it...
Ubuntu launchpad has a bug filed that is suspiciously like yours. The workaround suggested is:
modprobe -r pl2303
modprobe pl2303
See if this works around the bug?
精彩评论