
Finding framed data in a byte-array

I have a byte array containing data received from a WebSocket client. It holds either a single receive, or previously buffered data plus the latest receive, depending on whether any data was buffered.

There are really three things that can happen when data is received (note that this is after I've concatenated the received data with the buffer):

  1. A partial message has been received: The client started to transmit a message, and only part of it (the beginning) has gotten through so far. The data I've received needs to be buffered until the rest arrives.
  2. A whole message has been received: The client has transmitted a whole message. An event should be fired with the message.
  3. A close-signal has been received: The client has sent a request to close the connection.

Nothing prevents several of these from happening at once (for instance, one and a half messages are received, or one message plus the close-signal). Messages in WebSocket are framed with the bytes 0x00 and 0xFF; in other words, incoming messages look like this: 0x00,...binary UTF8 data,0xFF,0x00,...binary UTF8 data,0xFF, and the end signal looks like this: 0xFF,0x00.

What I need is an efficient way of splitting the incoming data stream into messages or the end signal. I've never worked with framed data like this before, so I'm not sure how to do it efficiently. What I would like is more or less a function that takes the binary data as an array and returns the messages (as binary data without the frame) or the close-flag, plus a byte array with the data to be buffered. The important thing is that it is fast and doesn't consume unnecessary memory. If you have links that might help me solve this, I'll gladly take them too.


Your close-signal is just an empty message, not really a special case.

So what you have is a mismatch between the segments that arrive over the wire and the messages you interpret from them.

It doesn't look that hard: you have to extract the sequences between 0x00 and 0xFF from a stream of bytes. You will need a buffer that is bigger than the biggest message. Scan all incoming bytes, extract the complete messages, and move the remainder 'down' to the start of the buffer.
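
The scan described above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the name `split_frames` is hypothetical, the input is the buffered data already concatenated with the latest receive, and instead of shifting the remainder down inside a fixed buffer it simply returns the leftover bytes for the caller to keep. Raising `ValueError` on an unexpected byte is also an assumption about how errors should be handled.

```python
from typing import List, Tuple

FRAME_START = 0x00
FRAME_END = 0xFF


def split_frames(data: bytes) -> Tuple[List[bytes], bool, bytes]:
    """Split data into (complete messages, close flag, bytes to keep buffered)."""
    messages: List[bytes] = []
    pos = 0
    n = len(data)
    while pos < n:
        first = data[pos]
        if first == FRAME_START:
            # Look for the terminating 0xFF of this message.
            end = data.find(b"\xff", pos + 1)
            if end == -1:
                break  # partial message: keep everything from pos onward
            messages.append(data[pos + 1:end])  # payload without the frame bytes
            pos = end + 1
        elif first == FRAME_END:
            if pos + 1 >= n:
                break  # possibly the first half of the close signal; wait for more
            if data[pos + 1] == FRAME_START:
                return messages, True, b""  # close signal 0xFF,0x00
            raise ValueError("0xFF at a frame boundary not followed by 0x00")
        else:
            raise ValueError("unexpected byte at a frame boundary")
    return messages, False, data[pos:]
```

A caller would prepend the returned leftover to the next receive, for example `split_frames(leftover + received)`, and fire one event per returned message.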

You will also need:

  • to verify that your UTF8 data cannot contain 0x00 or 0xFF; I'm not sure this is true
  • to handle the case where 0xFF is followed by something other than 0x00
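
On the first point: valid UTF-8 never contains the byte 0xFF, and the byte 0x00 appears only as the encoding of the NUL character (U+0000), so only text with embedded NULs collides with the framing. Both checks can be sketched in Python as below. The helper names `can_frame` and `classify_ff` are hypothetical, and treating "0xFF followed by something else" as a protocol error is an assumption.

```python
from typing import Optional


def can_frame(text: str) -> bool:
    """True if the UTF-8 encoding of text contains neither delimiter byte."""
    encoded = text.encode("utf-8")
    return 0x00 not in encoded and 0xFF not in encoded


def classify_ff(data: bytes, pos: int) -> Optional[bool]:
    """Classify a 0xFF found at data[pos] on a frame boundary.

    True  -> followed by 0x00 (close signal)
    False -> followed by some other byte (treated here as a protocol error)
    None  -> the next byte has not arrived yet, so keep buffering
    """
    if data[pos] != 0xFF:
        raise ValueError("data[pos] is not 0xFF")
    if pos + 1 >= len(data):
        return None
    return data[pos + 1] == 0x00
```

Returning `None` for the undecided case lets the caller leave the lone 0xFF in the buffer until the next receive, rather than guessing.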