Can I use Perl's unpack to break up a string into vars?

2022-12-08 12:33 Q&A author:

I have an image file name that consists of four parts:

$Directory (the directory where the image exists)
$Name (for an art site, this is the painting's name reference #)
$File (the image's file name minus extension)
$Extension (the image's extension)

$example 100020003000.png

Which I desire to be broken down accordingly:

$dir=1000 $name=2000 $file=3000 $ext=.png

I was wondering if substr was the best option in breaking up the incoming $example so I can do stuff with the 4 variables like validation/error checking, grabbing the verbose name from its $Name assignment or whatever. I found this post:

"is unpack faster than substr?" So, in my beginner's "stone tool" approach:

my $example = "100020003000.png";
my $dir = substr($example, 0,4);
my $name = substr($example, 4,4);   # offsets are zero-based
my $file = substr($example, 8,4);
my $ext = substr($example, 13,3); # skips the "."; will add the "." later #

So, can I use unpack, or maybe even another approach that would be more efficient?

I would also like to avoid loading any modules unless doing so would use fewer resources for some reason. Mods are great tools and I luv 'em, but I think they are not necessary here.

I realize I should probably push the vars into an array/hash, but I am really a beginner here and I would need further instruction on how to do that and how to pull them back out.

Thanks to everyone at stackoverflow.com!

Absolutely:

my $example = "100020003000.png";
my ($dir, $name, $file, $ext) = unpack 'A4' x 4, $example;

print "$dir\t$name\t$file\t$ext\n";

Output:

1000    2000    3000    .png
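Since the question also asks how to put the parts into a hash and get them back out, here is a minimal sketch of one way to do it with a hash slice. The hash name %part and its keys are only illustrative choices, not something from the question.

```perl
use strict;
use warnings;

my $example = "100020003000.png";

# Hash slice: assign the four unpacked fields to four named keys at once.
# (%part and the key names are illustrative.)
my %part;
@part{qw(dir name file ext)} = unpack 'A4' x 4, $example;

# Pull a single value back out by key.
print "name is $part{name}\n";

# Or walk the keys in a fixed order.
for my $key (qw(dir name file ext)) {
    print "$key => $part{$key}\n";
}
```

No modules are needed for this; hash slices are core Perl syntax.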

I'd just use a regex for that:

my ($dir, $name, $file, $ext) = $path =~ m:(.*)/(.*)/(.*)\.(.*):;

Or, to match your specific example:

my ($dir, $name, $file, $ext) = $example =~ m:^(\d{4})(\d{4})(\d{4})\.(.{3})$:;
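A side benefit of the anchored regex is that it doubles as validation: in list context a failed match returns an empty list, so you can test the assignment itself. A rough sketch, where parse_image_name is a made-up helper name and the pattern is the one above written with / delimiters:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the four parts in list context, or an empty
# list if the name does not fit the 4+4+4 digits plus 3-character extension.
sub parse_image_name {
    my ($filename) = @_;
    return $filename =~ /^(\d{4})(\d{4})(\d{4})\.(.{3})$/;
}

for my $candidate ('100020003000.png', '1000200030.png') {
    if (my ($dir, $name, $file, $ext) = parse_image_name($candidate)) {
        print "$candidate: dir=$dir name=$name file=$file ext=$ext\n";
    }
    else {
        print "$candidate: does not match the expected layout\n";
    }
}
```

The second candidate has only ten digits, so it takes the else branch.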

Using unpack is good, but since the elements are all the same width, the regex is very simple as well:

my $example = "100020003000.png";
my ($dir, $name, $file, $ext) = $example =~ /(.{4})/g;
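One thing to keep in mind with this form: it only returns complete groups of four characters, so anything left over at the end is silently dropped. A small sketch of that behavior, using a .jpeg name purely as an illustration:

```perl
use strict;
use warnings;

# Each /(.{4})/g match grabs the next four characters, so a leftover of
# fewer than four characters at the end of the string is not returned.
my @png  = '100020003000.png'  =~ /(.{4})/g;   # 16 characters: four full groups
my @jpeg = '100020003000.jpeg' =~ /(.{4})/g;   # 17 characters: the final 'g' is left over

print join('|', @png),  "\n";   # 1000|2000|3000|.png
print join('|', @jpeg), "\n";   # 1000|2000|3000|.jpe
```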

It isn't unpack, but since you have groups of 4 characters, you could use a limited split, with a capture:

my ($dir, $name, $file, $ext) = grep length, split /(....)/, $filename, 4;

This is pretty obfuscated, so I probably wouldn't use it, but the capture in a split is an often-overlooked ability.

So, here's an explanation of what this code does:

Step 1. split with capturing parentheses adds the values captured by the pattern to its output stream. The stream contains a mix of fields and delimiters.

qw( a 1 b 2 c 3 ) == split /(\d)/, 'a1b2c3';

Step 2. split with 3 args limits how many times the string is split.

qw( a b2c3 ) == split /\d/, 'a1b2c3', 2;

Step 3. Now, when we use a delimiter pattern that matches pretty much anything /(....)/, we get a bunch of empty (0 length) strings. I've marked delimiters with D characters, and fields with F:

 ( '', 'a', '', '1', '', 'b', '', '2' ) == split /(.)/, 'a1b2';
   F    D   F    D   F    D   F    D

Step 4. So if we limit the number of fields to 3 we get:

 ( '', 'a', '', '1', 'b2' ) == split /(.)/, 'a1b2', 3;
   F    D   F    D   F

Step 5. Putting it all together we can do this (I used a .jpeg extension so that the extension would be longer than 4 characters):

 ( '', 1000, '', 2000, '', 3000, '.jpeg' ) == split /(....)/, '100020003000.jpeg', 4;
   F   D     F   D     F   D     F

Step 6. Step 5 is almost perfect, all we need to do is strip out the null strings and we're good:

( 1000, 2000, 3000, '.jpeg' ) == grep length, split /(....)/, '100020003000.jpeg', 4;
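The "==" lines in the steps are shorthand rather than runnable Perl. If you want to try the steps yourself, something like the following should print the same lists; quoted is just a throwaway helper name that makes the empty strings visible.

```perl
use strict;
use warnings;

# Throwaway helper: quote every element so that empty strings show up.
sub quoted { return join ', ', map { "'$_'" } @_ }

print quoted( split /(\d)/, 'a1b2c3' ),    "\n";   # 'a', '1', 'b', '2', 'c', '3'
print quoted( split /\d/,   'a1b2c3', 2 ), "\n";   # 'a', 'b2c3'
print quoted( split /(.)/,  'a1b2' ),      "\n";   # '', 'a', '', '1', '', 'b', '', '2'
print quoted( split /(.)/,  'a1b2', 3 ),   "\n";   # '', 'a', '', '1', 'b2'

# Steps 5 and 6: split on four-character "delimiters", then drop the empty fields.
print quoted( grep length, split /(....)/, '100020003000.jpeg', 4 ), "\n";
                                                   # '1000', '2000', '3000', '.jpeg'
```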

This code works, and it is interesting. But it's not any more compact than any of the other solutions. I haven't benchmarked it, but I'd be very surprised if it wins any speed or memory efficiency prizes.

But the real issue is that it is too tricky to be good for real code. Using split to capture delimiters (and maybe one final field), while throwing out the field data is just too weird. It's also fragile: if one field changes length the code is broken and has to be rewritten.

So, don't actually do this.

At least it provided an opportunity to explore some lesser known features of split.

Both substr and unpack bias your thinking toward fixed-layout, while regex solutions are more oriented toward flexible layouts with delimiters.

The example you gave appeared to be fixed layout, but directories are usually separated from file names by a delimiter (e.g. slash for POSIX-style file systems, backslash for MS-DOS, etc.). So you might actually have a case for both: a regex solution to split directory and file name apart (or even directory/name/extension), and then a fixed-length approach for the name part by itself.
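A rough sketch of that combination, assuming a made-up slash-delimited path whose base name packs two four-character codes (the path layout and variable names here are hypothetical, not taken from the question):

```perl
use strict;
use warnings;

# Hypothetical input: a delimited path whose base name holds two
# fixed-width, four-character codes.
my $path = 'images/1000/20003000.png';

# Delimited part: everything up to the last slash, the base name, the extension.
my ($dir, $base, $ext) = $path =~ m{^(.*)/([^/]+)\.([^./]+)$}
    or die "Unrecognized path: $path\n";

# Fixed-width part: two four-character fields inside the base name.
my ($name, $file) = unpack 'A4 A4', $base;

print "dir=$dir name=$name file=$file ext=$ext\n";
```

The regex handles the parts whose length can vary, and unpack handles the part whose layout is fixed.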
