
How do disk controllers handle concurrent writes to the same sector in the absence of write barriers?

When I open a file with O_DIRECT|O_ASYNC and issue two concurrent writes to the same disk sector, without an fsync or fdatasync in between, does the Linux disk subsystem or the hardware disk controller offer any guarantee that the final data on that sector will be from the second write?

While it's true that O_DIRECT bypasses the OS buffer cache, the data ultimately ends up in the low-level I/O queues (disk scheduler queue, disk driver's queue, hardware controller's cache/queues, etc.). I have traced the I/O stack all the way down to the elevator algorithm.

For example, if the following sequence of requests ends up in the disk scheduler queue:

write sector 1 from buffer 1  
write sector 2 from buffer 2  
write sector 1 from buffer 3 [not buffer 1!]  

the elevator code would do a "back merge" to coalesce sectors 1 and 2 from buffers 1 and 2 respectively, and then issue two disk I/Os. But I am not sure whether the final data on disk sector 1 comes from buffer 1 or buffer 3 (as I don't know the write-reordering semantics of drivers/controllers).
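To make the first scenario concrete, here is a toy model of back merging. It is only a sketch under simplifying assumptions, not the kernel's elevator code: `toy_queue`, `toy_request`, and `toy_submit` are hypothetical names, every write covers exactly one sector, and only back merges are modeled (no front merges, sorting, or dispatch).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REQS 16

/* One queued request: a contiguous run of sectors [start, start + count). */
struct toy_request {
    unsigned long start;
    unsigned long count;
};

/* Hypothetical stand-in for a scheduler queue; requests stay in arrival order. */
struct toy_queue {
    struct toy_request reqs[MAX_REQS];
    size_t n;
};

/*
 * Queue a one-sector write. If it starts exactly where an existing request
 * ends, extend that request (a "back merge"); otherwise append a new request.
 * Returns the index of the request that now covers the write, or -1 if full.
 */
static int toy_submit(struct toy_queue *q, unsigned long sector)
{
    for (size_t i = 0; i < q->n; i++) {
        if (q->reqs[i].start + q->reqs[i].count == sector) {
            q->reqs[i].count++;
            return (int)i;
        }
    }
    if (q->n == MAX_REQS)
        return -1;
    q->reqs[q->n].start = sector;
    q->reqs[q->n].count = 1;
    return (int)q->n++;
}
```

Submitting sectors 1, 2, 1 in that order leaves two requests in this model: one covering sectors 1 and 2, and a separate one-sector request for sector 1. Both cover sector 1, which is exactly the pair whose relative order on the device is in question.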

Scenario 2:

write sector 1 from buffer 1  
write sector 500 from buffer 2
write sector 1 from buffer 3

How will this scenario be handled? A more basic question: when doing writes in O_DIRECT mode with AIO, can this sequence of requests end up in the disk scheduler's queue in the absence of explicit write barriers?

If yes, is there any ordering guarantee like "multiple writes to the same sector will result in the last write being the final write"?

Or is that ordering non-deterministic (left at the mercy of the disk controller and its caches, which reorder writes within barriers to optimize seek time)?


Barriers are going away. If you require ordering among overlapping writes, you're supposed to wait for completion of the first before issuing the second.
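One possible way for an application to follow that rule is sketched below, assuming POSIX AIO (`aio_write`, `aio_suspend`, `aio_error`, `aio_return`) rather than the kernel-native AIO interface. The helper names `alloc_aligned_buf`, `aio_write_and_wait`, and `write_overlapping_in_order` and the 4096-byte alignment are assumptions made for illustration. The sketch only orders the two submissions; it does not by itself show that the data has left a volatile device cache.

```c
#define _GNU_SOURCE
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed alignment; O_DIRECT typically needs buffers, lengths, and offsets
 * aligned to the device's logical block size. */
#define BUF_ALIGN 4096

/* Allocate an aligned buffer filled with one byte value; NULL on failure. */
static void *alloc_aligned_buf(size_t len, int fill)
{
    void *p = NULL;
    if (posix_memalign(&p, BUF_ALIGN, len) != 0)
        return NULL;
    memset(p, fill, len);
    return p;
}

/* Submit one asynchronous write and block until it has completed.
 * Returns the number of bytes written, or -1 with errno set. */
static ssize_t aio_write_and_wait(int fd, const void *buf, size_t len, off_t off)
{
    assert(buf != NULL && len > 0);

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf = (void *)buf;
    cb.aio_nbytes = len;
    cb.aio_offset = off;

    if (aio_write(&cb) != 0)
        return -1;

    /* Wait for this one request, retrying if a signal interrupts the wait. */
    const struct aiocb *const list[1] = { &cb };
    while (aio_suspend(list, 1, NULL) != 0) {
        if (errno != EINTR)
            return -1;
    }

    int err = aio_error(&cb);
    ssize_t n = aio_return(&cb);   /* always reap the completed request */
    if (err != 0) {
        errno = err;
        return -1;
    }
    return n;
}

/* Two writes to the same offset: the second is not submitted until the
 * first has completed, so they are never in flight at the same time. */
static int write_overlapping_in_order(int fd, const void *first,
                                      const void *second, size_t len, off_t off)
{
    if (aio_write_and_wait(fd, first, len, off) != (ssize_t)len)
        return -1;
    if (aio_write_and_wait(fd, second, len, off) != (ssize_t)len)
        return -1;
    return 0;
}
```

The caller is expected to open `fd` itself (with O_DIRECT if desired) and pass buffers from `alloc_aligned_buf`. On older glibc versions the program may need to be linked with `-lrt`.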

In the general case I believe there is no guarantee. The final result is non-deterministic from the application perspective, depending on timing, state of the host and storage device, etc.

The request queue will merge requests in a predictable fashion, but hardware is not required to provide consistent results for writes that are in the drive's queue at the same time.

Depending on how fast the storage device is and how slow the host CPU is, you can't necessarily guarantee that merging will take place in the request queue before commands are sent to the storage device.

Unfortunately, how applications using O_DIRECT (as opposed to filesystems that directly construct bios) are supposed to wait for completion is not clear to me.


OK, write requests end up in a linear elevator queue. At this point it's not relevant whether they came from different threads. The same arrangement could be the result of a single thread issuing three sequential writes. Now, would you trust your files to an OS or to a controller that reorders sequential writes to the same sector in some arbitrary fashion? I wouldn't, but I might be wrong, of course :)
