How to identify whether the contents of a byte[] are a JPEG?
I have a small byte array (under 25K) that I receive and decode as part of a larger message envelope. Sometimes this is an image, furthermore it is a JPG. I have no context information other than the byte array, and need to identify both if it IS an image, and if the image is of type JPG.
Is there some magic number, or magic bytes that exist at the beginning, end or at some offset that I can look at to identify it?
An example of my code looks like this (from memory, not c/p):
byte[] messageBytesAfterDecode = retrieveBytesFromEnvelope();
if(null != messageBytesAfterDecode && messageBytesAfterDecode.length > 0){
    if(areTheseBytesAJpeg(messageBytesAfterDecode)){
        doSomethingWithAJpeg(messageBytesAfterDecode);
    }else{
        flagEnvelopeAsHavingBadContentInTheField();
    }
}
I really need what would go into the
areTheseBytesAJpeg(byte[] mBytes){}
method, or even a pointer to a spec that details it. I'm hoping there is a very quick way to make this determination, since I don't really want to read them into an Image, etc.
From wikipedia:
JPEG image files begin with FF D8 and end with FF D9.
http://en.wikipedia.org/wiki/Magic_number_(programming)
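A minimal sketch of that check, applied directly to the byte array from the question. The class name JpegMagic is made up for illustration, and the sketch assumes the array holds the complete file with nothing appended after the final FF D9.

```java
final class JpegMagic {
    private JpegMagic() {}

    /** Returns true if the bytes start with FF D8 (start of image) and end with FF D9 (end of image). */
    static boolean areTheseBytesAJpeg(byte[] mBytes) {
        if (mBytes == null || mBytes.length < 4) {
            return false; // too short to hold both markers
        }
        int n = mBytes.length;
        // Java bytes are signed, so mask with 0xFF before comparing to hex values.
        boolean startsWithSoi = (mBytes[0] & 0xFF) == 0xFF && (mBytes[1] & 0xFF) == 0xD8;
        boolean endsWithEoi = (mBytes[n - 2] & 0xFF) == 0xFF && (mBytes[n - 1] & 0xFF) == 0xD9;
        return startsWithSoi && endsWithEoi;
    }
}
```

This only inspects four bytes, so it is cheap, but it is a plausibility check rather than a validation: it does not prove the data decodes as an image. If the sender might pad or append data after the image, the trailing FF D9 test would need to be relaxed.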
Some extra information about other file formats alongside JPEG: the start of the file contains these bytes
BMP : 42 4D
JPG : FF D8 FF E0 (the first two bytes are always the same)
PNG : 89 50 4E 47
GIF : 47 49 46 38
When a JPG file uses JFIF or EXIF, the signature is different:
Raw : FF D8 FF DB
JFIF : FF D8 FF E0
EXIF : FF D8 FF E1
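The signatures listed above could be turned into a small lookup along these lines. This is only a sketch: the class and method names (ImageSignatures, sniff) are hypothetical, and JPEG is matched on its first three bytes (FF D8 FF) so that the raw, JFIF and EXIF variants are all covered.

```java
final class ImageSignatures {
    private ImageSignatures() {}

    /** Returns "BMP", "JPG", "PNG", "GIF", or "UNKNOWN" based on the leading bytes. */
    static String sniff(byte[] data) {
        if (startsWith(data, 0x42, 0x4D)) return "BMP";
        if (startsWith(data, 0xFF, 0xD8, 0xFF)) return "JPG";
        if (startsWith(data, 0x89, 0x50, 0x4E, 0x47)) return "PNG";
        if (startsWith(data, 0x47, 0x49, 0x46, 0x38)) return "GIF";
        return "UNKNOWN";
    }

    /** True if data begins with the given unsigned byte values. */
    private static boolean startsWith(byte[] data, int... signature) {
        if (data == null || data.length < signature.length) return false;
        for (int i = 0; i < signature.length; i++) {
            // Mask to an unsigned value because Java bytes are signed.
            if ((data[i] & 0xFF) != signature[i]) return false;
        }
        return true;
    }
}
```

For the question's use case, the check would be something like "JPG".equals(ImageSignatures.sniff(messageBytesAfterDecode)).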
some code:
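import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

// Throws EOFException (an IOException) if the file is shorter than 4 bytes.
private static boolean isJPEG(File file) throws IOException {
    try (DataInputStream ins = new DataInputStream(new BufferedInputStream(new FileInputStream(file)))) {
        // Compare only the first three bytes (FF D8 FF); the fourth differs
        // between raw (DB), JFIF (E0) and EXIF (E1) files.
        return (ins.readInt() & 0xFFFFFF00) == 0xFFD8FF00;
    }
}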
Another source of "knowledge" about magic numbers (including for JPEG files) is the magic file used by the GNU/Linux file command.
If you have the file command installed, then file --version will tell you where the magic file lives, and you can read it using a text editor ... and careful reading of man 5 magic.
(And the magic file contents confirm the details of other answers.)
Quoting this wikipedia article:
JPEG image files begin with FF D8 and end with FF D9. JPEG/JFIF files contain the ASCII code for "JFIF" (4A 46 49 46) as a null terminated string. JPEG/Exif files contain the ASCII code for "Exif" (45 78 69 66) also as a null terminated string, followed by more metadata about the file.
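As a rough sketch of how the JFIF and Exif identifiers could be tested in a byte array: the code below assumes the APP0 or APP1 segment comes immediately after the FF D8 marker, which puts the identifier at offset 6 (after the 2-byte marker and 2-byte segment length). The class name JpegFlavor and the returned labels are invented for this example.

```java
import java.nio.charset.StandardCharsets;

final class JpegFlavor {
    private JpegFlavor() {}

    /** Returns "JFIF", "EXIF", "OTHER_JPEG", "NOT_JPEG", or "UNKNOWN" (too short to tell). */
    static String flavor(byte[] data) {
        if (data == null || data.length < 11) return "UNKNOWN";
        if ((data[0] & 0xFF) != 0xFF || (data[1] & 0xFF) != 0xD8) return "NOT_JPEG";
        // Bytes 2-3: segment marker, bytes 4-5: segment length, bytes 6-9: identifier.
        String id = new String(data, 6, 4, StandardCharsets.US_ASCII);
        boolean nullTerminated = data[10] == 0;
        if (nullTerminated && id.equals("JFIF")) return "JFIF";
        if (nullTerminated && id.equals("Exif")) return "EXIF";
        return "OTHER_JPEG";
    }
}
```

A JPEG that starts with some other segment (for example a quantization table, FF DB) would come back as OTHER_JPEG here, so this is a refinement on top of the FF D8 check rather than a replacement for it.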
A lot of formats are identified by so-called magic numbers. These are byte sequences usually in the front of the file to identify whether the following binary data is really what you think it is. A quick google search returned: http://www.linfo.org/magic_number.html and specifically the citation:
"Similarly, a commonly used magic number for JPEG (Joint Photographic Experts Group) image files is 0x4A464946, which is the ASCII equivalent of JFIF (JPEG File Interchange Format). However, JPEG magic numbers are not the first bytes in the file; rather, they begin with the seventh byte. Additional examples include 0x4D546864 for MIDI (Musical Instrument Digital Interface) files and 0x425a6831415925 for bzip2 compressed files."
A JPG file does have a specific header that you could use to determine a very good likelihood that it is a JPG file. However, it's not clear if you will have the entire file in the byte array.
Anyway, here's specifics on the header: http://www.fastgraph.com/help/jpeg_header_format.html