Are the 2nd and 3rd bytes of a JPEG image always the APP0 or APP1 marker?
I have a few different JPEG images I've been testing with. As far as I've seen, the 0th and 1st bytes are always 0xFF and 0xD8.
The 2nd and 3rd are usually either 0xFF and 0xE0 (APP0), indicating a JFIF segment or JFIF extension segment, or 0xFF and 0xE1 (APP1), indicating an EXIF segment.
My question: is this always the case? Are the 2nd and 3rd bytes always APP0 or APP1?
No. Several cameras, for example, create JPEGs without these markers, or with other APP markers. The only thing you can rely on is the SOI sequence, FF D8; not even EOI is produced by all cameras. Also be aware that JPEGs with embedded JPEGs exist: you can have nested SOI/EOI within an image.
If you need to deal with embedded JPEG data in raw camera images, several models produce JPEG-like data that can only be parsed by being a bit slack with the JPEG spec, especially in relation to escaped FF bytes in data. And then you have cameras that produce proprietary data that at first glance looks like JPEG data (e.g. some of Sony's "encrypted" raw formats).
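As a minimal sketch of "rely only on SOI", the check below accepts a buffer as JPEG when it starts with FF D8 and merely reports whether a trailing EOI (FF D9) is present instead of requiring it. The function name `sniffJpeg` and the shape of its return value are assumptions made for illustration; `bytes` is assumed to be a `Uint8Array`.

```javascript
// Hypothetical helper: decide "is this a JPEG?" from the SOI marker alone.
// A missing EOI is reported but never treated as an error.
function sniffJpeg(bytes) {
  const n = bytes.length;
  // SOI: the only marker that can be relied on.
  const hasSoi = n >= 2 && bytes[0] === 0xFF && bytes[1] === 0xD8;
  // EOI at the very end: informational only, some writers omit it.
  const endsWithEoi = n >= 4 && bytes[n - 2] === 0xFF && bytes[n - 1] === 0xD9;
  return { isJpeg: hasSoi, hasTrailingEoi: hasSoi && endsWithEoi };
}
```

This deliberately ignores nested SOI/EOI pairs and proprietary look-alike formats; telling those apart requires parsing beyond the first two bytes.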
Things are complicated here. Since I am currently writing a javascript file identifier, I'll try to answer with my javascript object for JPEG. Especially because the question had a "javascript" tag.
The basic answer is already given (the accepted one) but this is more detailed about how to check the different App markers (with fallback).
There are special APPn markers so far for JFIF, EXIF, Adobe, Canon and Samsung (but we don't know about the future). So the logic for the JS object is:
If one of the SPECS[x].regex is matched it wins (first one wins). But if nothing is matched, the parent object (only FFd8) wins.
The SPECS object provides the corresponding PRONOM identifiers. You can look them up like so:
'http://apps.nationalarchives.gov.uk/pronom/fmt/'.concat(PUID) [official]
'http://apps.nationalarchives.gov.uk/pronom/x-fmt/'.concat(xPUID) [experimental]
// The regexes run against the start of the file as an uppercase hex string
// (two hex digits per byte). A JPEG segment length is two bytes, so the
// capture group for it must span four hex digits.
_FFD8: {
  SPECS: [
    {
      // FFE8 = APP8, identifier "SPIFF\0"
      PUID: 112,
      regex: /^FFD8FFE8(.{4})53504946460001/,
      desc: 'jpeg: Still Picture Interchange Format file (SPIF)',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        version: '1.00'
      }
    },
    {
      // FFE0 = APP0, identifier "JFIF\0" followed by the version bytes
      PUID: 44,
      regex: /^FFD8FFE0(.{4})4A464946000102/,
      desc: 'jpeg: JPEG File Interchange Format file (JFIF), v. 1.02',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        version: '1.02',
      }
    },
    {
      PUID: 43,
      regex: /^FFD8FFE0(.{4})4A464946000101/,
      desc: 'jpeg: JPEG File Interchange Format file (JFIF), v. 1.01',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        version: '1.01',
      }
    },
    {
      PUID: 42,
      regex: /^FFD8FFE0(.{4})4A464946000100/,
      desc: 'jpeg: JPEG File Interchange Format file (JFIF), v. 1.00',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        version: '1.00',
      }
    },
    {
      // FFE1 = APP1, identifier "Exif\0\0", TIFF header "II*\0" (little endian),
      // then the ExifVersion tag (0x9000) somewhere later in the segment
      PUID: 41,
      xPUID: 398,
      regex: /^FFD8FFE1(.{4})45786966000049492A00(.+)009007000400000030323030/,
      desc: 'jpeg: JPG Image File, using Exchangeable Image File Format (Exif), little endian, v. 2.0',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        endian: 'little',
        version: '2.0',
      }
    },
    {
      // Same as above with TIFF header "MM\0*" (big endian)
      PUID: 41,
      xPUID: 398,
      regex: /^FFD8FFE1(.{4})4578696600004D4D002A(.+)900000070000000430323030/,
      desc: 'jpeg: JPG Image File, using Exchangeable Image File Format (Exif), big endian, v. 2.0',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        endian: 'big',
        version: '2.0',
      }
    },
    {
      PUID: 41,
      xPUID: 390,
      regex: /^FFD8FFE1(.{4})45786966000049492A00(.+)009007000400000030323130/,
      desc: 'jpeg: JPG Image File, using Exchangeable Image File Format (Exif), little endian, v. 2.1',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        endian: 'little',
        version: '2.1',
      }
    },
    {
      PUID: 41,
      xPUID: 390,
      regex: /^FFD8FFE1(.{4})4578696600004D4D002A(.+)900000070000000430323130/,
      desc: 'jpeg: JPG Image File, using Exchangeable Image File Format (Exif), big endian, v. 2.1',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        endian: 'big',
        version: '2.1',
      }
    },
    {
      PUID: 41,
      xPUID: 391,
      regex: /^FFD8FFE1(.{4})45786966000049492A00(.+)009007000400000030323230/,
      desc: 'jpeg: JPG Image File, using Exchangeable Image File Format (Exif), little endian, v. 2.2',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        endian: 'little',
        version: '2.2',
      }
    },
    {
      PUID: 41,
      xPUID: 391,
      regex: /^FFD8FFE1(.{4})4578696600004D4D002A(.+)900000070000000430323230/,
      desc: 'jpeg: JPG Image File, using Exchangeable Image File Format (Exif), big endian, v. 2.2',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        endian: 'big',
        version: '2.2',
      }
    },
    // specific JPEG (all begin with FFD8FF, map them to PUID 41)
    {
      // FFED = APP13
      PUID: 41,
      regex: /^FFD8FFED/,
      desc: 'jpeg: JPG Image File, Adobe JPEG, Photoshop CMYK buffer'
    },
    {
      // FFE2 = APP2
      PUID: 41,
      regex: /^FFD8FFE2/,
      desc: 'jpeg: JPG Image File, Canon JPEG, Canon EOS-1D'
    },
    {
      // FFE3 = APP3
      PUID: 41,
      regex: /^FFD8FFE3/,
      desc: 'jpeg: JPG Image File, Samsung JPEG, e.g. Samsung D500'
    },
    {
      // FFDB = DQT (quantization table), i.e. no APPn segment after SOI at all
      PUID: 41,
      regex: /^FFD8FFDB/,
      desc: 'jpeg: JPG Image File, Samsung JPEG, e.g. Samsung D807'
    }
  ],
  ext: ['JPG', 'JPE', 'JPEG', 'SPF', 'SPIFF'],
  signature: [ 255, 216 ],
  desc: 'jpeg: JPEG File Interchange Format file, App0 marker not known',
  mime: 'image/jpeg',
  specifications: [
    { text:'Specification for the JFIF file format', href:'http://www.w3.org/Graphics/JPEG/jfif3.pdf', type:'W3', format:'pdf' },
    { text:'The JPEG compression specification', href:'http://www.w3.org/Graphics/JPEG/itu-t81.pdf', type:'W3', format:'pdf' },
    { text:'Exchangeable image file format for digital still cameras', href:'http://home.jeita.or.jp/tsc/std-pdf/CP3451C.pdf', type:'vendor', format:'pdf' }
  ],
  references: [
    { text:'JPEG JFIF W3 Info', href:'http://www.w3.org/Graphics/JPEG/', type:'W3', format:'html' },
    { text:'JPEG.org', href:'http://www.jpeg.org/', type:'info', format:'html' },
    { text:'JPEG Exif App markers', href:'http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/JPEG.html', type:'info', format:'html'}
  ]
}
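The "first match wins, otherwise the parent wins" logic described above could be sketched roughly as follows. This is not the answerer's actual matcher: the helper names `toHex` and `identify` and the trimmed-down `entry` object are assumptions for illustration. The sketch also assumes the regexes run against an uppercase hex string with two digits per byte, which is why the two-byte segment length is captured as four hex digits here.

```javascript
// Hypothetical helper: bytes -> uppercase hex string, two digits per byte.
function toHex(bytes) {
  return Array.from(bytes, b => b.toString(16).padStart(2, '0'))
    .join('')
    .toUpperCase();
}

// Hypothetical matcher: returns the desc of the first matching spec,
// the parent desc if only the FF D8 signature matches, or null otherwise.
function identify(entry, bytes) {
  const hex = toHex(bytes);
  if (!hex.startsWith('FFD8')) return null;   // not a JPEG at all
  for (const spec of entry.SPECS) {
    if (spec.regex.test(hex)) return spec.desc; // first match wins
  }
  return entry.desc;                            // fallback: SOI only
}

// Trimmed-down stand-in for the _FFD8 object (illustrative values only).
const entry = {
  SPECS: [
    { regex: /^FFD8FFE0(.{4})4A464946000101/, desc: 'JFIF v. 1.01' },
    { regex: /^FFD8FFDB/, desc: 'starts with a DQT segment' }
  ],
  desc: 'JPEG, marker after SOI not recognised'
};
```

In practice only the first few hundred bytes of a file would need converting to hex, since every regex is anchored at the start.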
No, it certainly doesn't have to be that way. Reading Wikipedia.
As far as I can tell, the APPn segments are just ways for applications to embed arbitrary data into the image file. Obviously, applications commonly take advantage of this and write 0xFF 0xE0 or 0xFF 0xE1 bytes into the header, but it would be entirely plausible for an application to not do this and just go on with the image data. The first two bytes (0xFF and 0xD8) are mandatory, as they are the SOI (start-of-image) marker.
Those first two bytes are the JPEG SOI marker so are always present.
The 2nd and 3rd bytes appear to store metadata, which probably isn't present in every JPEG.
Further Reading.
Typically yes; however, my understanding of JPEG is that any segment type can follow the header.
For a JFIF file, a JFIF header should follow immediately after FFD8. The JFIF header is contained within an APP0 marker. The specification doesn't say anything about padding though.
Without the JFIF header we can only guess what color format is used.
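A minimal sketch of checking for that JFIF header, assuming the layout from the JFIF specification: SOI, then the APP0 marker FF E0, a two-byte length, the identifier "JFIF\0", and the major and minor version bytes. The function name `readJfifVersion` is made up for this example, and `bytes` is assumed to be a `Uint8Array`.

```javascript
// Hypothetical helper: returns { major, minor } if a JFIF APP0 header
// directly follows SOI, otherwise null.
function readJfifVersion(bytes) {
  const identifier = [0x4A, 0x46, 0x49, 0x46, 0x00]; // "JFIF\0"
  if (bytes.length < 13) return null;
  if (bytes[0] !== 0xFF || bytes[1] !== 0xD8) return null; // SOI
  if (bytes[2] !== 0xFF || bytes[3] !== 0xE0) return null; // APP0
  // bytes[4..5] hold the segment length; the identifier starts at offset 6.
  for (let i = 0; i < identifier.length; i++) {
    if (bytes[6 + i] !== identifier[i]) return null;
  }
  return { major: bytes[11], minor: bytes[12] };
}
```

A `null` result only means the file does not start with a JFIF header; it can still be a valid JPEG.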
In theory, yes. According to the JFIF spec (pdf), its APP0 section should come first in the file. And the Exif spec (pdf) requires the same for its APP1 section.
But you shouldn't count on the order (or even the existence) of APPn sections; there are crazy JPEG writers out there. Start with the SOI and read the sections as they come by.
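"Start with the SOI and read the sections as they come by" could look roughly like the walker below. It is a sketch under simplifying assumptions: it stops at SOS (FF DA) or EOI (FF D9) instead of decoding image data, it gives up quietly when it is not positioned at a marker, and it does not handle standalone markers that carry no length field. The function name `listSegments` is hypothetical; `bytes` is assumed to be a `Uint8Array`.

```javascript
// Hypothetical helper: list the header segments that follow SOI,
// in file order, without assuming which marker comes first.
function listSegments(bytes) {
  if (bytes.length < 2 || bytes[0] !== 0xFF || bytes[1] !== 0xD8) {
    throw new Error('missing SOI marker');
  }
  const segments = [];
  let pos = 2;
  while (pos + 4 <= bytes.length) {
    if (bytes[pos] !== 0xFF) break;                 // not at a marker: stop
    const marker = bytes[pos + 1];
    if (marker === 0xFF) { pos += 1; continue; }    // fill byte before a marker
    if (marker === 0xDA || marker === 0xD9) break;  // SOS or EOI: headers are over
    // Big-endian length; it counts its own two bytes but not the marker.
    const length = (bytes[pos + 2] << 8) | bytes[pos + 3];
    if (length < 2) break;                          // malformed length
    segments.push({
      marker: marker,
      isApp: marker >= 0xE0 && marker <= 0xEF,      // APP0..APP15
      offset: pos,
      length: length
    });
    pos += 2 + length;
  }
  return segments;
}
```

The caller can then look through the returned list for an APP0 or APP1 entry wherever it happens to sit, or find that there is none.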
No. There are programs that remove some markers; ImageOptim is one of them. You only need some of the markers. ImageOptim will also optimise the Huffman tables.