Detect MPEG4/H264 I-Frame (IDR) in RTP stream
I need to detect an MPEG4 I-Frame in an RTP packet. I know how to remove the RTP header and get the MPEG4 frame in it, but I can't figure out how to identify the I-Frame.
Does it have a specific signature/header?
Ok so I figured it out for h264 stream.
How to detect I-Frame:
- remove RTP header
- check the value of the first byte in h264 payload
- if the value is 124 (0x7C) it is an I-Frame
I can't figure it out for the MPEG4-ES stream... any suggestions?
EDIT: H264 IDR
This works for my h264 stream (fmtp:96 packetization-mode=1; profile-level-id=420029;). You just pass the byte array that represents the h264 fragment received through RTP. If you want to pass the whole RTP packet, just correct the RTPHeaderBytes value to skip the RTP header. In my stream I always get the I-Frame this way, because it is the only frame that gets fragmented, see here. (RFC 3984 allows any NAL unit to be fragmented, which is why the code also checks the NAL type in the FU header.) I use this (simplified) piece of code in my server, and it works like a charm! If the I-Frame (IDR) is not fragmented, the fragment_type would be 5, so this code returns true for both fragmented and unfragmented IDRs.
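public static bool isH264iFrame(byte[] paket)
{
    // Offset of the h264 payload; set it to the RTP header length when passing a whole RTP packet.
    int RTPHeaderBytes = 0;
    // Two bytes are read below, so reject anything shorter.
    if (paket == null || paket.Length < RTPHeaderBytes + 2)
    {
        return false;
    }
    // Low 5 bits of the first byte: 28 = FU-A, 29 = FU-B, 5 = unfragmented IDR.
    int fragment_type = paket[RTPHeaderBytes + 0] & 0x1F;
    // FU header (second byte): low 5 bits = type of the fragmented NAL unit, top bit = start flag.
    int nal_type = paket[RTPHeaderBytes + 1] & 0x1F;
    int start_bit = paket[RTPHeaderBytes + 1] & 0x80;
    return ((fragment_type == 28 || fragment_type == 29) && nal_type == 5 && start_bit == 128) || fragment_type == 5;
}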
Here's the table of NAL unit types:
Type Name
0 [unspecified]
1 Coded slice
2 Data Partition A
3 Data Partition B
4 Data Partition C
5 IDR (Instantaneous Decoding Refresh) Picture
6 SEI (Supplemental Enhancement Information)
7 SPS (Sequence Parameter Set)
8 PPS (Picture Parameter Set)
9 Access Unit Delimiter
10 EoS (End of Sequence)
11 EoS (End of Stream)
12 Filler Data
13-23 [extended]
24-31 [unspecified by H.264; RFC 3984 uses 24-29 for aggregation and fragmentation packets, e.g. 28 = FU-A, 29 = FU-B]
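For the case where you pass the whole RTP packet, the header length can be computed instead of hard-coded. Below is a minimal Python sketch, assuming the standard RFC 3550 header layout (12 fixed bytes, 4 bytes per CSRC entry, optional header extension). The names `rtp_header_length` and `is_h264_idr` are made up for illustration, and the IDR test mirrors the C# check above.

```python
import struct


def rtp_header_length(packet: bytes) -> int:
    """Return the RTP header size in bytes (RFC 3550 layout assumed)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    first = packet[0]
    csrc_count = first & 0x0F           # CC field: number of 32-bit CSRC entries
    has_extension = bool(first & 0x10)  # X bit
    length = 12 + 4 * csrc_count
    if has_extension:
        if len(packet) < length + 4:
            raise ValueError("truncated RTP header extension")
        # Extension header: 16-bit profile id, then 16-bit length in 32-bit words.
        (ext_words,) = struct.unpack_from("!H", packet, length + 2)
        length += 4 + 4 * ext_words
    return length


def is_h264_idr(packet: bytes) -> bool:
    """True for an unfragmented IDR NAL unit or the first FU fragment of one."""
    payload = packet[rtp_header_length(packet):]
    if not payload:
        return False
    nal_type = payload[0] & 0x1F
    if nal_type == 5:
        return True
    if nal_type in (28, 29) and len(payload) >= 2:
        fu_header = payload[1]
        # Low 5 bits: type of the fragmented NAL unit; top bit: start flag.
        return (fu_header & 0x1F) == 5 and bool(fu_header & 0x80)
    return False
```

Like the C# version, this only reports the first fragment of an IDR; drop the start-bit test if you want every fragment flagged.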
EDIT 2: MPEG4 I-VOP
I forgot to update this... Thanks to Che and the ISO/IEC 14496-2 document, I managed to work this out! Che was right, but not very precise in his answer, so here is how to find I, P and B frames (I-VOP, P-VOP, B-VOP) in short:
- A VOP (Video Object Plane, i.e. a frame) starts with the code 000001B6 (hex). It is the same for all MPEG4 frames (I, P, B). Much more information follows that I am not going to describe here (see the ISO/IEC doc), but (as Che said) we only need the upper 2 bits of the following byte (the two bits right after the byte with the value B6). Those 2 bits give you the VOP_CODING_TYPE, see the table:
VOP_CODING_TYPE (binary) Coding method
00 intra-coded (I)
01 predictive-coded (P)
10 bidirectionally-predictive-coded (B)
11 sprite (S)
So, to find an I-Frame, find the packet starting with the four bytes 000001B6 and having the upper two bits of the next byte equal to 00. This will find I-Frames in an MPEG4 stream with the simple video object type (not sure about advanced simple).
For any other problems, you can check the document provided (ISO/IEC 14496-2); everything you want to know about MPEG4 is in there. :)
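The VOP check described above can be sketched in a few lines of Python. The function name `vop_coding_type` is hypothetical. One deviation from the description: it searches for the first 000001B6 anywhere in the buffer rather than only at offset 0, on the assumption that other headers may precede the VOP in the same payload.

```python
from typing import Optional

VOP_START_CODE = b"\x00\x00\x01\xb6"
# VOP_CODING_TYPE values from the table above.
VOP_TYPES = {0b00: "I", 0b01: "P", 0b10: "B", 0b11: "S"}


def vop_coding_type(data: bytes) -> Optional[str]:
    """Return 'I', 'P', 'B' or 'S' for the first VOP in data, or None."""
    pos = data.find(VOP_START_CODE)
    if pos < 0 or pos + 4 >= len(data):
        # No VOP start code, or no byte after it to read the type from.
        return None
    # The coding type is the upper two bits of the byte after B6.
    return VOP_TYPES[data[pos + 4] >> 6]
```

An I-Frame test is then `vop_coding_type(payload) == "I"`.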
As far as I know, MPEG4-ES stream fragments in RTP payload usually start with MPEG4 startcode, which can be one of these:
- 0x000001b0: visual_object_sequence_start_code (probably keyframe)
- 0x000001b6: vop_start_code (keyframe, if the next two bits are zero)
- 0x000001b3: group_of_vop_start_code, which contains three bytes and then hopefully a vop_start_code that may or may not belong to a keyframe (see above)
- 0x00000120: video_object_layer_start_code (probably keyframe)
- 0x00000100-0x0000011f: video_object_start_code (those look like keyframes as well)
- something else (probably not a keyframe)
I'm afraid that you'll need to parse the stream to be sure :-/
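As a starting point for that parsing, here is a small Python sketch that names the start code at the beginning of a payload. It covers only the codes listed above, and `classify_mpeg4_start` is a hypothetical helper name, not part of any library.

```python
def classify_mpeg4_start(payload: bytes) -> str:
    """Name the MPEG4 start code at the very beginning of payload."""
    if len(payload) < 4 or payload[:3] != b"\x00\x00\x01":
        return "no start code"
    code = payload[3]
    if code <= 0x1F:
        # 0x00000100 through 0x0000011f
        return "video_object_start_code"
    names = {
        0x20: "video_object_layer_start_code",
        0xB0: "visual_object_sequence_start_code",
        0xB3: "group_of_vop_start_code",
        0xB6: "vop_start_code",
    }
    return names.get(code, "other start code")
```

A result of "vop_start_code" still needs the two-bit coding type check before the packet can be called a keyframe.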
Actually, you were correct for the h264 stream: if the first byte of the payload is 0x7C, the packet is an FU-A fragment (type 28 with nal_ref_idc 3). I-Frames are usually the frames large enough to need fragmentation, so with packetization-mode=1 in the SDP (which permits FU-A fragmentation), reading 0x7C as the first byte usually means an I-Frame. Note that RFC 3984 lets any NAL unit be fragmented, including large P and B frames, so check the NAL type in the FU header (the next byte) to be sure. Read more here: http://www.rfc-editor.org/rfc/rfc3984.txt.
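To see what 0x7C actually encodes, it helps to split the byte into its three fields. This is a small Python sketch based on the NAL unit header layout in RFC 3984 (1 forbidden bit, 2 bits nal_ref_idc, 5 bits type); `parse_nal_header` is a name chosen for illustration.

```python
from typing import Tuple


def parse_nal_header(byte: int) -> Tuple[int, int, int]:
    """Split a NAL header byte into (forbidden_zero_bit, nal_ref_idc, nal_unit_type)."""
    forbidden_zero_bit = (byte >> 7) & 0x01
    nal_ref_idc = (byte >> 5) & 0x03
    nal_unit_type = byte & 0x1F
    return forbidden_zero_bit, nal_ref_idc, nal_unit_type


# 0x7C = 0b0_11_11100: nal_ref_idc 3, type 28 (FU-A)
print(parse_nal_header(0x7C))
# 0x5C = 0b0_10_11100: also FU-A, but with nal_ref_idc 2
print(parse_nal_header(0x5C))
```

Since nal_ref_idc can vary, comparing the whole byte with 0x7C can miss FU-A packets; masking with 0x1F and comparing with 28 is the safer test.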
This worked for me:
- Figure out the "payload type", for example: Payload type: DynamicRTP-Type-96 (96)
- Tell Wireshark which stream is H264: Edit->Preferences->Protocols->H264. Enter 96 as payload type.
- Filter on slice_type: "h264.slice_type eq 7"
For H264:
- Remove RTP header.
- If the chunk's NAL type (in the first byte) is SPS (7) or PPS (8), mark the frame as an I-Frame (many cameras, Axis included, do not use SPS and PPS).
- If the chunk's NAL type is 28, FU-A (fragmentation unit A), check the FU header (the next byte): if its NAL type is IDR (5) (Instantaneous Decoding Refresh picture), it is an I-Frame.
Examples:
nal_ref_idc: 3, nal type: 7 (0x07) description: 7 (SPS)
00000000 24 00 00 2B 80 60 22 ED 96 57 3E 68 57 F3 22 B5 $..+.`"í.W>hWó"µ
00000010 67 64 00 1E AD 84 01 0C 20 08 61 00 43 08 02 18 gd..... .a.C...
00000020 40 10 C2 00 84 2B 50 5A 09 34 DC 04 04 04 08 @.Â..+PZ.4Ü....
nal_ref_idc: 3, nal type: 8 (0x08) description: 8 (PPS)
00000000 24 00 00 10 80 60 22 EE 96 57 3E 68 57 F3 22 B5 $....`"î.W>hWó"µ
00000010 68 EE 3C B0 hî<°
FU_A (fragmentation unit A)
nal_ref_idc: 3, nal type: 5 (0x05) description: 5 (IDR (Instantaneous Decoding Refresh) Picture)
00000000 24 00 05 96 80 60 22 F1 96 57 3E 68 57 F3 22 B5 $....`"ñ.W>hWó"µ
00000010 7C 05 A0 AA 2F 81 92 AB CA FE 9E 34 D8 06 AD 74 |. ª/..«Êþ.4Ø.t
...
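The dumps above can be decoded with a short Python sketch. It assumes each dump is an RTSP interleaved frame (a leading `$` byte, channel, 16-bit length) followed by a 12-byte RTP header with no CSRC entries or extension, which is consistent with the 0x24 and 0x80 bytes shown. `describe_interleaved_frame` is a hypothetical helper.

```python
def describe_interleaved_frame(frame: bytes) -> dict:
    """Report the NAL fields of an RTSP-interleaved RTP/H264 frame."""
    # Assumed layout: 4 bytes interleaved framing + 12 bytes RTP header.
    if len(frame) < 17 or frame[0] != 0x24:
        raise ValueError("not an interleaved frame with an H264 payload")
    payload = frame[16:]
    nal = payload[0]
    info = {"nal_ref_idc": (nal >> 5) & 0x03, "nal_type": nal & 0x1F}
    if info["nal_type"] == 28 and len(payload) >= 2:
        fu_header = payload[1]
        info["fu_nal_type"] = fu_header & 0x1F     # type of the fragmented NAL unit
        info["fu_start"] = bool(fu_header & 0x80)  # set only on the first fragment
    return info


# First 18 bytes of the FU-A dump above.
fu_sample = bytes.fromhex("24000596 806022F1 96573E68 57F322B5 7C05")
print(describe_interleaved_frame(fu_sample))
```

For the FU-A dump this reports fu_nal_type 5 with fu_start False, because the FU header 0x05 has the start bit clear: the packet shown is a continuation fragment of an IDR, not its first fragment.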
0x000001b6: vop_start_code (keyframe, if the next two bits are zero): this is the correct way for MPEG-4.