Distributed File Access Strategies
I have a Windows service written in C# which monitors a folder for files to process. When a file is added to this folder, the service should pick it up and perform a task with it, consuming the file in the process.
I would like to distribute the work over multiple physical servers for fault tolerance. The files will be placed on a NAS accessible to all service instances.
The significant requirement is that I would like each service to pick up a file exclusively; a file should not be processed by more than one service.
Are there any good strategies for working with files in this way?
The simplest solution, it seems to me, would be to create a .lock file. So if ServiceA sees a file called myfile.dat, it would look for a myfile.dat.lock file. If it doesn't find one, it would create it; subsequent services would see the myfile.dat.lock file and skip over that file.
There's still the potential that two services would attempt to create the .lock file at the exact same time, but as long as the file is opened in a create-new mode that fails when the file already exists (FileMode.CreateNew in C#), one of those services would receive an exception for attempting to create a duplicate file. So you could handle that exception and either retry the .lock file check or just skip that data file, and continue from there.
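A rough sketch of that check-and-claim step, written in Python for brevity rather than C#. The names try_claim and process_folder are made up for illustration, and it assumes the NAS honors exclusive-create semantics across machines, which is worth verifying for your particular share. Note that it skips the separate "does the .lock file exist" check and goes straight to the exclusive create, since the create itself tells you whether someone else got there first.

```python
import os


def try_claim(data_path):
    """Return True if this process created data_path + '.lock' and so owns the file."""
    lock_path = data_path + ".lock"
    try:
        # O_CREAT | O_EXCL fails if the lock file already exists, so checking
        # for the lock and creating it happen in a single step.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        # Another service claimed this file first; skip it.
        return False
    with os.fdopen(fd, "w") as lock_file:
        # Record the owner so a person can see who holds the lock.
        lock_file.write(str(os.getpid()) + "\n")
    return True


def process_folder(folder, handler):
    """Call handler(path) for every data file this process manages to claim."""
    claimed = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if name.endswith(".lock") or not os.path.isfile(path):
            continue
        if try_claim(path):
            handler(path)
            claimed.append(name)
    return claimed
```

This does not deal with stale locks: if a service crashes after creating the .lock file, that data file stays claimed until something removes the lock.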
You can deploy Apache ZooKeeper. When a processing server wants to work on a file it creates and locks a "node", works on the file, and then unlocks the node. If once-and-only-once processing of the file is an important requirement I would not roll your own. It's harder to implement than it sounds, and ZooKeeper will handle it correctly.