Line-oriented streams in Node.js
I'm developing a multi-process application using Node.js. In this application, a parent process will spawn a child process and communicate with it using a JSON-based messaging protocol over a pipe. I've found that large JSON messages may get "cut off", such that a single "chunk" emitted to the data listener on the pipe does not contain the full JSON message. Furthermore, small JSON messages may be grouped in the same chunk. Each JSON message will be delimited by a newline character, and so I'm wondering if there is already a utility that will buffer the pipe read stream such that it emits one line at a time (and hence, for my application, one JSON document at a time). This seems like it would be a pretty common use case, so I'm wondering if it has already been done.
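Roughly, the setup looks like this (a minimal sketch of the problem; child.js is a hypothetical stand-in for the actual child script):

var spawn = require('child_process').spawn;

var child = spawn('node', ['child.js']);

child.stdout.on('data', function (chunk) {
  // 'chunk' is an arbitrary slice of the byte stream, not one message: a large
  // JSON document may span several chunks, and several small documents may
  // arrive together in a single chunk, so this parse can fail.
  var message = JSON.parse(chunk.toString());
  // ...
});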
I'd appreciate any guidance anyone can offer. Thanks.
Maybe Pedro's carrier can help you?
Carrier helps you implement new-line terminated protocols over node.js.
The client can send you chunks of lines and carrier will only notify you on each completed line.
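For example, wiring it to the child's stdout might look like this (a sketch, assuming carrier's carry(stream, listener) API and the spawned child from the question):

var carrier = require('carrier');

// carrier buffers the stream and invokes the callback once per complete line,
// which under the protocol above means once per JSON document.
carrier.carry(child.stdout, function (line) {
  var message = JSON.parse(line);
  // handle the message...
});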
My solution to this problem is to send JSON messages each terminated with some special Unicode character that you would never normally get in the JSON string. Call it TERM.
So the sender just does "JSON.stringify(message) + TERM" and writes it. The receiver then splits incoming data on TERM and parses the parts with JSON.parse(), which is pretty quick. The trick is that the last part may not parse, so we simply save that fragment and prepend it to the next chunk when it arrives. Receiving code goes like this:
s.on("data", function (data) {
var info = data.toString().split(TERM);
info[0] = fragment + info[0];
fragment = '';
for ( var index = 0; index < info.length; index++) {
if (info[index]) {
try {
var message = JSON.parse(info[index]);
self.emit('message', message);
} catch (error) {
fragment = info[index];
continue;
}
}
}
});
Where "fragment" is defined somwhere where it will persist between data chunks.
But what is TERM? I have used the unicode replacement character '\uFFFD'. One could also use the technique used by twitter where messages are separated by '\r\n' and tweets use '\n' for new lines and never contain '\r\n'
I find this to be a lot simpler than messing with including lengths and such like.
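For reference, the pieces the receiver above assumes can be sketched as follows (s is the writable end of the pipe; send is a hypothetical helper name):

var TERM = '\uFFFD';  // delimiter assumed never to appear in the serialized messages
var fragment = '';    // trailing partial message carried over between 'data' events

// Sender side: serialize the message and append the terminator.
function send(s, message) {
  s.write(JSON.stringify(message) + TERM);
}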
The simplest solution is to send the length of the JSON data before each message as a fixed-length prefix (4 bytes?) and have a simple de-framing parser which buffers small chunks and splits bigger ones.
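A hand-rolled version of that framing might look like this (a sketch, independent of the node-binary API mentioned below: a 4-byte big-endian length followed by the UTF-8 JSON payload; sendFramed is a hypothetical helper name):

var pending = Buffer.alloc(0);

// Sender: prefix each serialized message with its byte length.
function sendFramed(s, message) {
  var payload = Buffer.from(JSON.stringify(message), 'utf8');
  var header = Buffer.alloc(4);
  header.writeUInt32BE(payload.length, 0);
  s.write(Buffer.concat([header, payload]));
}

// Receiver: accumulate chunks and extract every complete frame.
s.on('data', function (chunk) {
  pending = Buffer.concat([pending, chunk]);
  while (pending.length >= 4) {
    var len = pending.readUInt32BE(0);
    if (pending.length < 4 + len) break; // wait for the rest of this frame
    var message = JSON.parse(pending.slice(4, 4 + len).toString('utf8'));
    pending = pending.slice(4 + len);
    // handle the message...
  }
});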
You can try node-binary to avoid writing the parser manually. Look at the scan(key, buffer) example in its documentation - it does exactly line-by-line reading.
As long as newlines (or whatever delimiter you use) only delimit the JSON messages and are never embedded in them, you can use the following pattern:
let buf = ''
s.on('data', data => {
buf += data.toString()
const idx = buf.indexOf('\n')
if (idx < 0) { return } // No '\n', no full message
let lines = buf.split('\n')
buf = lines.pop() // if ends in '\n' then buf will be empty
for (let line of lines) {
// Handle the line
}
})