using Kernel#fork for backgrounding processes, pros? cons?
I'd like some thoughts on whether using fork{} to 'background' a process from a rails app is such a good idea or not...
From what I gather fork{my_method; Process#setsid} does in fact do what it's supposed to do.
1) creates another processes with a different PID
2) doesn't interrupt the c开发者_如何学JAVAalling process (e.g. it continues w/o waiting for the fork to finish)
3) executes the child until it finishes
..which is cool, but is it a good idea? What exactly is fork doing? Does it create a duplicate instance of my entire rails mongrel/passenger instance in memory? If so that would be very bad. Or, does it somehow do it without consuming a huge swath of memory.
My ultimate goal was to do away with my background daemon/queue system in favor of forking these processes (primarily sending emails) -- but if this won't save memory then it's definitely a step in the wrong direction
The fork does make a copy of your entire process, and, depending on exactly how you are hooked up to the application server, a copy of that as well. As noted in the other discussion this is done with copy-on-write so it's tolerable. Unix is built around fork(2), after all, so it has to manage it fairly fast. Note that any partially buffered I/O, open files, and lots of other stuff are also copied, as well as the state of the program that is spring-loaded to write them out, which would be incorrect.
I have a few thoughts:
- Are you using Action Mailer? It seems like email would be easily done with AM or by
Process.popen
of something. (Popen will do a fork, but it is immediately followed by an exec.) - immediately get rid of all that state by executing
Process.exec
of another ruby interpreter plus your functionality. If there is too much state to transfer or you really need to use those duplicated file descriptors, you might do something likeIO#popen
instead so you can send the subprocess work to do. The system will share the pages containing the text of the Ruby interpreter of the subprocess with the parent automatically. - in addition to the above, you might want to consider the use of the
daemons
gem. While your rails process is already a daemon, using the gem might make it easier to keep one background task running as a batch job server, and make it easy to start, monitor, restart if it bombs, and shut down when you do... - if you do exit from a
fork(2)
ed subprocess, useexit!
instead ofexit
- having a message queue and a daemon already set up, like you do, kinda sounds like a good solution to me :-)
Be aware that it will prevent you from using JRuby on Rails as fork() is not implemented (yet).
The semantics of fork is to copy the entire memory space of the process into a new process, but many (most?) systems will do that by just making a copy of the virtual memory tables and marking it copy-on-write. That means that (at first, at least) it doesn't use that much more physical memory, just enough to make the new tables and other per-process data structures.
That said, I'm not sure how well Ruby, RoR, etc. interacts with copy-on-write forking. In particular garbage collection could be problematic if it touches many memory pages (causing them to be copied).
精彩评论