How to limit (or queue) calls to external processes in Node.JS?

Scenario

I have a Node.JS service (written using ExpressJS) that accepts image uploads via drag-and-drop. After an image is uploaded, I do a few things to it:

  1. Pull EXIF data from it
  2. Resize it

These calls are being handled via the node-imagemagick module at the moment and my code looks something like this:

app.post('/upload', function(req, res){
  ... <stuff here> ....

  im.readMetadata('./upload/image.jpg', function(err, meta) {
      // handle EXIF data.
  });

  im.resize(..., function(err, stdout, stderr) {
      // handle resize.
  });
});

Question

As some of you have already spotted, the problem is that if I get enough simultaneous uploads, every one of them will spawn an 'identify' call and then a resize operation (both from ImageMagick), effectively killing the server under high load.

Just testing with ab -c 100 -n 100 locks up my little 512 Linode dev server so badly that I have to force a reboot. I understand that my test may just be too much load for the server, but I would like a more robust approach to processing these requests so that I get a more graceful failure than total VM suicide.

In Java I solved this issue by creating a fixed-thread ExecutorService that queues up the work and executes it on at most X number of threads.

In Node.JS, I am not even sure where to start to solve a problem like this. I don't quite have my brain wrapped around the non-threaded nature and how I can create an async JavaScript function that queues up the work while another... (thread?) processes the queue.

Any pointers on how to think about this or how to approach this would be appreciated.

Addendum

This is not the same as this question about FFMpeg, although I imagine that person will have this exact same question as soon as his webapp is under load, as it boils down to the same problem (firing off too many simultaneous native processes in parallel).


The threads module should be just what you need:

https://github.com/robtweed/threads


Since Node does not allow threading, you can do the work in another process. You can use a background job system, like resque, where you queue up jobs in a datastore of some type and then run a process (or several processes) that pulls jobs from the datastore and does the processing; or you can use something like node-worker and queue your jobs in the worker's memory. Either way, your main application is freed from doing all the processing and can focus on serving web requests.

[Update] Another interesting library to check out is hook.io, especially if you like the idea of node-workers but want to run multiple background processes. [/Update]

[Edit]

Here's a quick and dirty example of pushing work that takes a while to run to a worker process using node-worker; the worker queues jobs and processes them one by one:

app.js

var Worker = require('worker').Worker;
var processor = new Worker('image_processor.js');

for(var i = 0; i <= 100; i++) {
  console.log("adding a new job");
  processor.postMessage({job: i});
}

processor.onmessage = function(msg) {
  console.log("worker done with job " + msg.job);
  console.log("result is " + msg.data.result);
};

image_processor.js

var worker = require('worker').worker;
var queue = [];

worker.onmessage = function(msg) {
  var job = msg.job;
  queue.push(job);
}

var process_job = function() {
  if(queue.length == 0) {
    setTimeout(process_job, 100);
    return;
  }

  var job = queue.shift();
  var data = {};

  data.result = job * 10;

  setTimeout(function() {
    worker.postMessage({job: job, data: data});
    process_job();
  }, 1000);
};

process_job();


For anyone who thought Brandon's quick-and-dirty might be too quick-and-dirty, here's a variation that is no longer and doesn't have the unnecessary busy-wait. I'm not in a position to test it but it should work.

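var enqueue = function() {
  var queue = [];
  var execImmediate = function(fImmediate) {
    // While the timer is running, later calls only queue their function.
    enqueue = function(fDelayed) {
      queue.push(fDelayed);
    };
    fImmediate();

    // Run one queued function per second until the queue drains.
    var ic = setInterval(function() {
      var fQueued = queue.shift();
      if (fQueued) {
        fQueued();
      } else {
        // Queue is empty: stop the timer so the next call runs immediately.
        clearInterval(ic);
        enqueue = execImmediate;
      }
    }, 1000);
  };
  return execImmediate;
}();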
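
If a fixed one-second spacing is not what you want, here is a rough sketch of a completion-driven alternative that is closer in spirit to the fixed-size ExecutorService mentioned in the question: at most `limit` jobs run at once and the rest wait in an array. The names `createLimiter` and `limitedEnqueue` are made up for illustration, and the `setTimeout` call only stands in for real asynchronous work such as the ImageMagick calls; treat it as a starting point under those assumptions, not a tested production queue.

```javascript
// Hypothetical helper (not part of any library): returns an enqueue function
// that runs at most `limit` jobs concurrently and queues the rest in memory.
function createLimiter(limit) {
  var queue = [];
  var active = 0;

  // Start queued jobs until the concurrency cap is reached.
  function next() {
    while (active < limit && queue.length > 0) {
      var job = queue.shift();
      active++;
      runJob(job);
    }
  }

  // Each job receives a `done` callback and must call it when it finishes.
  function runJob(job) {
    var finished = false;
    job(function done() {
      if (finished) return; // ignore duplicate done() calls
      finished = true;
      active--;
      next();
    });
  }

  return function enqueue(job) {
    queue.push(job);
    next();
  };
}

// Usage sketch: five simulated jobs, never more than two in flight.
var limitedEnqueue = createLimiter(2);
[1, 2, 3, 4, 5].forEach(function (id) {
  limitedEnqueue(function (done) {
    console.log('starting job ' + id);
    // Placeholder for real async work; call done() from its callback.
    setTimeout(function () {
      console.log('finished job ' + id);
      done();
    }, 10);
  });
});
```

In the upload handler, the idea would be to wrap each ImageMagick call in a job and call `done()` inside its callback, so a burst of uploads waits in the queue instead of spawning a process per request.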