How can I prevent a semaphore lockup when a thread is terminated by a bus error?
I am developing a Linux device driver running on an embedded CPU. This device driver controls some external hardware. The external hardware has its own DDR controller and external DDR. The hardware's DDR is visible on the embedded CPU via a movable memory window (so I have paged access to the external DDR from the Linux driver). I'm using Linux kernel version 2.6.33.
My driver uses sysfs to allow control of the external hardware from userspace. As an example, the external hardware maintains a heartbeat counter that it increments at a specific address in external DDR. The driver reads this counter to detect whether the external hardware is still running.
If the external DDR is not working correctly then an access to the external DDR produces a bus error on the embedded CPU. To protect against simultaneous multi-thread access, the driver uses a semaphore.
Now to the problem. If a thread grabs the semaphore, then terminates with a bus error, the semaphore is still locked. All subsequent calls to grab the semaphore block indefinitely. What techniques can I use to avoid this hanging the driver forever?
An example sysfs function (simplified):
static ssize_t running_attr_show(struct device *dev, struct device_attribute *attr, char *buffer)
{
    struct my_device * const my_dev = container_of(dev, struct my_device, dev);
    int ret;
    if(down_interruptible(&my_dev->sem))
    {
        ret = -ERESTARTSYS;
    }
    else
    {
        u32 heartbeat;
        int running;
        // Following line could cause bus error
        heartbeat = mwindow_get_reg(&my_dev->mwindow, HEARTBEAT_COUNTER_ADDR);
        running = (heartbeat != my_dev->last_heartbeat) ? 1 : 0;
        my_dev->last_heartbeat = heartbeat;
        // Report the computed running flag
        ret = sprintf(buffer, "%d\n", running);
        /* unlock */
        up(&my_dev->sem);
    }
    return ret;
}
You'll need to modify mwindow_get_reg(), and possibly the architecture fault handler that's invoked on a bus error, so that mwindow_get_reg() can return an error rather than terminating the process. You can then handle that error gracefully by releasing the semaphore and returning an error to userspace.
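To illustrate the shape of that change, here is a minimal userspace sketch, not the actual driver code. The names `fake_dev`, `read_reg_checked`, and `show_running` are hypothetical, a plain counter stands in for the kernel semaphore, and the `ddr_ok` flag simulates a bus fault. The point is only that the read reports failure through its return value and the caller releases the lock on every path.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver state; not the real struct my_device. */
struct fake_dev {
    int sem_count;            /* 1 = free, 0 = held (stands in for the semaphore) */
    int ddr_ok;               /* 0 simulates a bus fault on access */
    uint32_t heartbeat_reg;   /* simulated heartbeat counter in external DDR */
    uint32_t last_heartbeat;
};

/* Returns 0 and stores the value on success, -EIO if the access "faulted". */
static int read_reg_checked(const struct fake_dev *dev, uint32_t *value)
{
    assert(dev != NULL && value != NULL);
    if (!dev->ddr_ok)
        return -EIO;
    *value = dev->heartbeat_reg;
    return 0;
}

/* Mirrors the shape of a sysfs show handler: lock, read, unlock on every path. */
static int show_running(struct fake_dev *dev, char *buffer, size_t len)
{
    uint32_t heartbeat;
    int ret;

    if (dev->sem_count == 0)
        return -EBUSY;        /* a real driver would block in down_interruptible() */
    dev->sem_count = 0;       /* "down" */

    ret = read_reg_checked(dev, &heartbeat);
    if (ret == 0) {
        int running = (heartbeat != dev->last_heartbeat);
        dev->last_heartbeat = heartbeat;
        ret = snprintf(buffer, len, "%d\n", running);
    }

    dev->sem_count = 1;       /* "up": reached on success and on error */
    return ret;
}
```

In the real driver the error code would simply be returned from the show function, so a read of the sysfs attribute fails in userspace instead of leaving the semaphore held.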
Thanks to @caf, here is the solution I've implemented.
I've converted part of mwindow_get_reg to assembly. For the read that can fault, I've added an entry to the __ex_table section containing the address of the faulting instruction and the address of the fixup code. If an exception occurs at that address, the exception handler jumps to the fixup code instead of terminating the thread. The fixup assembler sets a 'faulted' flag, which I can then test in my C code:
unsigned long ret = 0;
int faulted;
asm volatile(
" 1: lwi %0, %2, 0; " // ret = *window_addr
" 2: addik %1, r0, 0; " // faulted = 0
" 3: "
" .section .fixup, \"ax\"; " // fixup code executed if exception occurs
" 4: brid 3b; " // jump to next line of c code
" addik %1, r0, 1; " // faulted = 1 (in delay slot)
" .previous; "
" .section __ex_table,\"a\"; "
" .word 1b,4b; " // ex_table entry. Gives fault address and jump address if fault occurs
" .previous; "
: "=r" (ret), "=r" (faulted) // output registers
: "r" (window_addr) // input registers
);
if (faulted)
{
    /* KERN_ERR is the kernel's error log level */
    printk(KERN_ERR "%s: %s: FAULTED!\n", MODNAME, __FUNCTION__);
    ret = 0xdeadbeef;
}
I also had to modify my DBUS exception handler by adding the following:
const struct exception_table_entry *fixup;
/* Look for a registered fixup for the faulting instruction */
fixup = search_exception_tables(regs->pc);
if (fixup) {
    printk(KERN_ERR "DBUS exception: calling fixup\n");
    regs->pc = fixup->fixup;
    return;
}
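For anyone unfamiliar with the mechanism, the lookup the handler performs can be modelled in plain userspace C. This is only a sketch under simplifying assumptions: `ex_entry`, `fake_regs`, `find_fixup`, and `handle_bus_fault` are hypothetical names, addresses are plain integers, and the search is linear for clarity (the kernel's own lookup is more elaborate).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct exception_table_entry. */
struct ex_entry {
    unsigned long insn;   /* address of the instruction that may fault */
    unsigned long fixup;  /* address to resume at if it does */
};

/* Hypothetical register snapshot; only the program counter matters here. */
struct fake_regs {
    unsigned long pc;
};

/* Linear search for clarity; returns NULL if pc has no registered fixup. */
static const struct ex_entry *find_fixup(const struct ex_entry *table,
                                         size_t count, unsigned long pc)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (table[i].insn == pc)
            return &table[i];
    }
    return NULL;
}

/* Returns 1 if execution was redirected, 0 if the fault is unrecoverable. */
static int handle_bus_fault(struct fake_regs *regs,
                            const struct ex_entry *table, size_t count)
{
    const struct ex_entry *e = find_fixup(table, count, regs->pc);

    if (!e)
        return 0;          /* no fixup: fall through to the normal fatal handling */
    regs->pc = e->fixup;   /* resume in the fixup code instead */
    return 1;
}
```

Only instructions that were explicitly registered get redirected; a fault anywhere else still takes the normal fatal path, so genuine bugs are not silently hidden.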