Java: interspersing bytes and characters
I have a piece of test equipment, from which I can read data using an InputStream, which intersperses bytes and characters (organized into lines), e.g.:
TEST1
TEST2
500
{500 binary bytes follow here}
TEST3
TEST4
600
{600 binary bytes follow here}
I'd like to use BufferedReader so I can read a line at a time, but then switch to InputStream so I can read the binary bytes. But this neither seems to work nor seems like a good idea.
How can I do this? I can't get bytes from a BufferedReader, and if I use a BufferedReader on top of an InputStream, it seems like the BufferedReader "owns" the InputStream.
Edit: the alternative, just using an InputStream everywhere and having to convert bytes->characters and look for newlines, seems like it would definitely work but would also be a real pain.
When using BufferedReader, you can just use String#getBytes() to get the bytes out of a String line. Don't forget to take character encoding into account. I recommend using UTF-8 all the time.
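Just for your information: going the other way, if you only have bytes and you want to construct the characters, just use new String(bytes). Don't forget the character encoding here either: the one-argument constructor uses the platform default charset, so it is safer to pass the charset explicitly with new String(bytes, charset).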
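One thing worth checking before routing the binary sections through a String: whether the chosen encoding can represent arbitrary byte values at all. The sketch below is only an illustration (the class name RoundTrip and the sample bytes are made up for this example). It decodes a byte array and re-encodes it, then reports whether the bytes came back unchanged.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class RoundTrip {
    /** True if decoding raw with cs and re-encoding it yields the same bytes. */
    static boolean survives(byte[] raw, Charset cs) {
        return Arrays.equals(raw, new String(raw, cs).getBytes(cs));
    }

    public static void main(String[] args) {
        // Hypothetical payload: arbitrary binary that is not valid UTF-8.
        byte[] raw = {(byte) 0xFF, 0x00, (byte) 0x80};
        System.out.println(survives(raw, StandardCharsets.ISO_8859_1)); // true
        System.out.println(survives(raw, StandardCharsets.UTF_8));      // false
    }
}
```

ISO-8859-1 maps each of the 256 byte values to exactly one character, so the round trip is lossless. With UTF-8, malformed sequences are replaced by U+FFFD during decoding, so the original bytes cannot be recovered.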
[Edit] after all, it's a better idea to use BufferedInputStream and construct a byte buffer for a single line (fill until the byte matches the linebreak) and test if the character representation of its start matches with one of the predefined strings.
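A minimal sketch of that idea, assuming lines end in '\n' (an optional preceding '\r' is dropped) and the text encoding is ASCII-compatible (ASCII, ISO-8859-1, UTF-8), so that scanning for the byte 0x0A is safe. The class name MixedReader and its methods are invented for this example, not an existing API.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

final class MixedReader {
    private final InputStream in;
    private final Charset charset;

    MixedReader(InputStream in, Charset charset) {
        // One shared buffer: lines and binary blocks are both served from it.
        this.in = new BufferedInputStream(in);
        this.charset = charset;
    }

    /** Collects bytes up to '\n' and decodes them; returns null at end of stream. */
    String readLine() throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            line.write(b);
        }
        if (b == -1 && line.size() == 0) {
            return null;
        }
        String s = new String(line.toByteArray(), charset);
        return s.endsWith("\r") ? s.substring(0, s.length() - 1) : s;
    }

    /** Reads exactly n raw bytes; throws EOFException if the stream ends early. */
    byte[] readBytes(int n) throws IOException {
        byte[] buf = new byte[n];
        int off = 0;
        while (off < n) {
            int r = in.read(buf, off, n - off);
            if (r == -1) {
                throw new EOFException("expected " + n + " bytes, got " + off);
            }
            off += r;
        }
        return buf;
    }
}
```

Because both methods pull from the same BufferedInputStream, nothing is lost when switching between text and binary: read lines until the length line arrives, parse it, then call readBytes with that length.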
Instead of using a Reader and InputStream and attempting to switch back and forth between the two, try using a callback interface with one method for binary data and another for character data, e.g.
interface MixedProcessor {
    void processBinaryData(byte[] bytes, int off, int len);
    void processText(String line);
}
Then have another "splitter" class that:
- Decides which sections of the input are text and which are binary, and passes them to the corresponding processor method
- Converts bytes to characters when required (with the help of a CharsetDecoder)
The splitter class might look something like this:
class Splitter {
    public Splitter(Charset charset) { /* ... */ }
    public void readFully(InputStream is, MixedProcessor processor) throws IOException { /* ... */ }
}
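To make the outline concrete, here is one way the elided bodies could be filled in. This is a sketch under an assumption taken from the sample data in the question: a line consisting only of digits announces that many binary bytes immediately after its newline. The helper names decode and readExactly are invented here, and lines are assumed to end in '\n'.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

interface MixedProcessor {
    void processBinaryData(byte[] bytes, int off, int len);
    void processText(String line);
}

class Splitter {
    private final Charset charset;

    public Splitter(Charset charset) {
        this.charset = charset;
    }

    public void readFully(InputStream is, MixedProcessor processor) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = is.read()) != -1) {
            if (b != '\n') {
                line.write(b);
                continue;
            }
            String text = decode(line);
            line.reset();
            processor.processText(text);
            // ASSUMPTION: an all-digit line is the byte count of the block that follows.
            if (text.matches("[0-9]{1,9}")) {
                byte[] data = readExactly(is, Integer.parseInt(text));
                processor.processBinaryData(data, 0, data.length);
            }
        }
        if (line.size() > 0) {
            processor.processText(decode(line)); // unterminated final line
        }
    }

    private String decode(ByteArrayOutputStream raw) throws IOException {
        // A fresh CharsetDecoder reports malformed input as CharacterCodingException.
        return charset.newDecoder().decode(ByteBuffer.wrap(raw.toByteArray())).toString();
    }

    private static byte[] readExactly(InputStream is, int n) throws IOException {
        byte[] buf = new byte[n];
        int off = 0;
        while (off < n) {
            int r = is.read(buf, off, n - off);
            if (r == -1) {
                throw new EOFException("stream ended inside a binary block");
            }
            off += r;
        }
        return buf;
    }
}
```

Since this reads one byte at a time while scanning for newlines, passing in a BufferedInputStream is probably worthwhile. The binary block is never scanned for newlines, so payload bytes equal to 0x0A are passed through untouched.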
I think I'm going to take a stab at using java.nio.ByteBuffer and ByteBuffer.asCharBuffer, which looks promising. Still have to look for newlines manually but at least it looks like it will handle the character translation properly.
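A caveat and a sketch for the ByteBuffer route: asCharBuffer() views each pair of bytes as one 16-bit char in the buffer's byte order and applies no charset, so it only suits text that is already UTF-16. For other encodings, Charset.decode(ByteBuffer) does the translation. The helper below is an illustration only (BufferLines and nextLine are invented names) and assumes '\n'-terminated lines in an ASCII-compatible encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

final class BufferLines {
    /**
     * Decodes the next '\n'-terminated line starting at buf's position and
     * moves the position past the newline. Returns null (leaving buf
     * untouched) if no complete line remains.
     */
    static String nextLine(ByteBuffer buf, Charset charset) {
        for (int i = buf.position(); i < buf.limit(); i++) {
            if (buf.get(i) == '\n') {
                ByteBuffer slice = buf.duplicate(); // shares content, own position/limit
                slice.limit(i);                     // line bytes only, newline excluded
                buf.position(i + 1);
                return charset.decode(slice).toString();
            }
        }
        return null;
    }
}
```

After the length line has been parsed, the binary block can be taken from the same buffer with buf.get(byte[]), which advances the position by the array's length.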
Take a look at the source code of LineNumberInputStream. The class itself has been deprecated, but it looks like this is exactly what you need here.
This class allows you to read byte lines and then use regular InputStream read methods.
If you don't want to drag deprecated code into your system just borrow some implementation details from it.
I don't have a good answer for the general case (so other answers are welcome), but if I assume input is ISO-8859-1 (8-bit chars) the following works for me, although I guess casting an 8-bit byte as char doesn't necessarily guarantee ISO-8859-1 either.
The existing InputStream.read(byte[] b) and InputStream.read(byte[] b, int ofs, int len) allow me to read bytes.
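import java.io.IOException;
import java.io.InputStream;

public class OctetCharStream extends InputStream {
    private final InputStream in;
    private static final String charSet = "ISO-8859-1";

    public OctetCharStream(InputStream in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        return this.in.read();
    }

    /** Reads up to the next '\n'; returns null if the stream is already at its end. */
    public String readLine() throws IOException {
        StringBuilder sb = new StringBuilder();
        while (true) {
            int b = read();
            if (b == -1) {
                // End of stream: without this check the loop never terminates,
                // because (char) -1 is '\uFFFF', not '\n'.
                return sb.length() == 0 ? null : sb.toString();
            }
            if (b == '\n')
                break;
            /*
             * cast from byte to char:
             * fine for 8-bit character sets
             * but not good in general
             */
            sb.append((char) b);
        }
        return sb.toString();
    }

    /** Reads up to n bytes and decodes them; may return fewer than n characters. */
    public String readCharacters(int n) throws IOException {
        byte[] b = new byte[n];
        int i = read(b);
        if (i == -1) {
            return ""; // end of stream, nothing left to decode
        }
        return new String(b, 0, i, charSet);
    }
}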
Interestingly, when I tried using InputStreamReader alone rather than wrapping BufferedReader around it, the InputStreamReader.read() still buffers to some extent, by reading "greedily" more than one character even if you just want to pull out one character. So I couldn't use InputStreamReader to wrap an InputStream and try to use both the InputStream and InputStreamReader to read bytes/characters according to which one I needed at the moment.
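The read-ahead is easy to observe. The small check below (GreedyReaderDemo is a made-up name) reads a single char through an InputStreamReader and then asks the underlying ByteArrayInputStream how many bytes it still has. The InputStreamReader documentation only says that more bytes may be read ahead than the current read needs, so the exact number is implementation-dependent; treat the result as an observation, not a guarantee.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class GreedyReaderDemo {
    /** Bytes still available in the raw stream after reading ONE char via a reader. */
    static int bytesLeftAfterOneChar(byte[] data) throws IOException {
        ByteArrayInputStream raw = new ByteArrayInputStream(data);
        InputStreamReader reader = new InputStreamReader(raw, StandardCharsets.ISO_8859_1);
        reader.read();          // ask for a single character only
        return raw.available(); // would be data.length - 1 without read-ahead
    }
}
```

With a 10-byte input, a reader that consumed exactly one byte would leave 9. On typical JDKs the value is lower because the decoder has already pulled the remaining bytes into its own buffer, which is why the raw stream and the reader cannot be used alternately.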
BufferedReader has read(char[] cbuf, int off, int len). Can't you use that, convert chars to bytes and wrap it with ByteArrayInputStream?
EDIT: why would someone downvote that? Give a comment please. This works perfectly fine:
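// needs java.io.* plus java.nio.charset.Charset and java.nio.charset.StandardCharsets
ByteArrayOutputStream bos = new ByteArrayOutputStream();
try {
    // ISO-8859-1 maps every byte value to exactly one char, so the
    // char-to-byte casts below are lossless. A multi-byte default charset
    // such as UTF-8 would corrupt payload bytes >= 0x80.
    Charset cs = StandardCharsets.ISO_8859_1;
    bos.write("TEST1\n".getBytes(cs));
    bos.write("10\n".getBytes(cs));
    for (int i = 0; i < 10; i++)
        bos.write(i);
    bos.write("TEST2\n".getBytes(cs));
    bos.write("1\n".getBytes(cs));
    bos.write(25);
    ByteArrayInputStream bis = new ByteArrayInputStream(bos.toByteArray());
    BufferedReader br = new BufferedReader(new InputStreamReader(bis, cs));
    while (br.ready()) {
        String s = br.readLine();
        String num = br.readLine();
        int len = Integer.parseInt(num);
        System.out.println(s + ", reading " + len + " bytes");
        char[] cbuf = new char[len];
        // read() may return fewer chars than requested, so loop until full
        int got = 0;
        while (got < len) {
            int r = br.read(cbuf, got, len - got);
            if (r == -1)
                throw new EOFException("stream ended inside binary block");
            got += r;
        }
        byte[] bbuf = new byte[len];
        for (int i = 0; i < len; i++)
            bbuf[i] = (byte) cbuf[i];
        for (byte b : bbuf)
            System.out.print(b + " ");
        System.out.println();
    }
} catch (IOException e) {
    e.printStackTrace();
}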
Output:
TEST1, reading 10 bytes
0 1 2 3 4 5 6 7 8 9
TEST2, reading 1 bytes
25