Hex dump parsing in perl

2023-01-10 12:44 问答作者：

I have a hex dump of a message in a file which i want to get it in an array so i can perform the decoding logic on it.

I was wondering if that was a easier way to parse a message which looks like this.

37 39 30 35 32 34 35 34 3B 32 31 36 39 33 34 35
3B 32 31 36 39 33 34 36 00 00 01 08 40 00 00 15
6C 71 34 34 73 69 6D 31 5F 33 30 33 31 00 00 00
00 00 01开发者_如何学编程 28 40 00 00 15 74 65 6C 63 6F 72 64 69
74 65 6C 63 6F 72 64 69

Note that the data can be max 16 bytes on any row. But any row can contain fewer bytes too (minimum :1 )

Is there a nice and elegant way rather than to read 2 chars at a time in perl ?

Perl has a hex operator that performs the decoding logic for you.

hex EXPR

hex

Interprets EXPR as a hex string and returns the corresponding value. (To convert strings that might start with either 0, 0x, or 0b, see oct.) If EXPR is omitted, uses $_.
print hex '0xAf'; # prints '175'
print hex 'aF'; # same

Remember that the default behavior of split chops up a string at whitespace separators, so for example

$ perl -le '$_ = "a b c"; print for split'
a
b
c

For every line of the input, separate it into hex values, convert the values to numbers, and push them onto an array for later processing.

#! /usr/bin/perl

use warnings;
use strict;

my @values;
while (<>) {
  push @values => map hex($_), split;
}

# for example
my $sum = 0;
$sum += $_ for @values;
print $sum, "\n";

Sample run:

$ ./sumhex mtanish-input 
4196

I would read a line at a time, strip the whitespace, and use pack 'H*' to convert it. It's hard to be more specific without knowing what kind of "decoding logic" you're trying to apply. For example, here's a version that converts each byte to decimal:

while (<>) {
  s/\s+//g;
  my @bytes = unpack('C*', pack('H*', $_));
  print "@bytes\n";
}

Output from your sample file:

55 57 48 53 50 52 53 52 59 50 49 54 57 51 52 53
59 50 49 54 57 51 52 54 0 0 1 8 64 0 0 21
108 113 52 52 115 105 109 49 95 51 48 51 49 0 0 0
0 0 1 40 64 0 0 21 116 101 108 99 111 114 100 105
116 101 108 99 111 114 100 105

I think reading in two characters at a time is the appropriate way to parse a stream whose logical tokens are two-character units.

Is there some reason you think that's ugly?

If you're trying to extract a particular sequence, you could do that with whitespace-insensitive regular expressions.

继续阅读：hexdump parsing perl

Hex dump parsing in perl

`hex EXPR`

`hex`

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

hex EXPR

hex