Deleting and adding Hash and Array references in Perl
Continuing on from my previous question, I ran into another problem down the road. I realized not only should there be hashes within hashes, there can also be arrays within hashes. So the paths would be something like
one/two/three
one/two[]/three
one/two/four
one/two[]/four
i.e. the hash is supposed to contain the array will always have a []
as its suffix. According to the script I'm using (slightly modified version of the answer开发者_StackOverflow of my previous question), the above paths would result in:
one => {
two => {
three => "",
four => "",
}
two[] => [
{
three => "",
four => "",
}
]
}
The script I'm using is:
# !/usr/bin/perl
use Data::Dumper;
sub insert {
my ($ref, $head, @tail) = @_;
if ( @tail ) {
if( $head !~ /^(.*)(\[\])$/ ) {
insert( \%{$ref->{$head}}, @tail );
} else {
my %newhash = ();
unshift(@{$ref->{$1 . $2}}, %newhash);
insert( \%{$ref->{$1 . $2}[0]}, @tail );
}
} else {
$ref->{$head} = '';
}
}
my %hash;
chomp and insert \%hash, split( '/', $_ ) while <>;
print Dumper %hash;
What I'd like to do, is once I find two[]
, I'd like to delete two
and add it to the array of two[]
(if two
exists) and then rename key two[]
to two
.
So the end result would look something like:
one => {
two => [
{
three => "",
four => "",
},
{
three => "",
four => "",
}
]
}
So I tried adding checks within the if else
for checking keys with or without []
suffixes but I got a range or errors like [$variable] is not a valid HASH reference
, etc. How would one go about checking the type of the variable (eg $ref->{$head} is array?
) and deleting and renaming keys of the hash efficiently?
Thanks.
Okay, by all rights, this should suck AND not do what you want - But I spent the last hour trying to get it somewhat right, so I'll be damned. Each 'anything[]' is an array of two elements, each a hashref: One for elements that appear after a bare 'anything', and the second for elements appearing after a 'anything[]'. I probably should have used a closure instead of relying on that crappy $is_non_bracket variable -- I'll take another look in the morning when I'm less retarded and more ashamed of writing this.
I think that it's tail-call optimized (the goto &SUB part). It also makes (small) use of named captures.
use strict;
use warnings;
use 5.010;
use Data::Dumper;
sub construct {
my $node = shift;
return unless @_;
my $next = shift;
my $is_non_bracket = 1;
$next .= '[]' and $is_non_bracket-- if exists $node->{ $next . '[]' };
if ( $next =~ / (?<node>[^\[\]]+) \Q[]/x ) {
if ( exists $node->{ $+{node} } or not defined( $node->{$next} ) ) {
push @{ $node->{$next} }, (delete $node->{ $+{node} } // {}); #/
}
unshift @_, $node->{$next}->[$is_non_bracket] ||= {};
}
else {
$node->{$next} ||= @_ ? {} : $node->{$next};
unshift @_, $node->{$next} //= @_ ? {} : ''; #/
}
goto &construct;
}
my %hash;
while (<DATA>) {
chomp;
construct( \%hash, split m!/! );
}
say Dumper \%hash;
__DATA__
one/two/three
one/two[]/three
one/two[]/three/four
one/two[]/three/four/five[]
one/two[]/three/four/whatever
one/two/ELEVAN
one/three/sixteen
one/three[]/whygodwhy
one/three/mrtest/mruho
one/three/mrtest/mruho[]/GAHAHAH
EDIT: Regex had an extra space after the quotemeta that made it break down; My bad.
EDIT2: Okay, it's the morning, edited in a version that isn't so stupid. No need for the ref, as we always pass a hashref; The #/ are there to stop the //'s from borking the highlighting.
EDIT3: Just noticed you DON'T want those [] to show up in the data structure, so here's a version that doesn't show them:
sub construct {
my $node = shift;
return unless @_;
my $is_bracket = (my $next = shift) =~ s/\Q[]// || 0;
if (ref $node->{$next} eq 'ARRAY' or $is_bracket) {
if ( ref $node->{ $next } ne 'ARRAY' ) {
my $temp = delete $node->{ $next } || {};
push @{ $node->{$next} = [] }, $temp;
}
unshift @_, $node->{$next}->[$is_bracket] ||= {};
}
else {
$node->{$next} ||= @_ ? {} : $node->{$next};
unshift @_, $node->{$next} //= @_ ? {} : ''; #/
}
goto &construct;
}
EDITNaN: Here's the gist of what it does: If there are enough arguments, we shift for a second time and put the value in the $next, which is promptly pulled into a substitution, which takes away its [], should it have any: If it does, the substitution returns 1, otherwise, s/// returns undef (or the empty string, I forget), so we use the logical-or to set the return value to 0; Either way, we set $is_bracket to this.
Afterwards, if $node->{$next} is an arrayref or $next had brackets: If $node->{$next} wasn't an arrayref (so we got here because $next had brackets, and it was the first time this has happened), it' either undef, the empty string, or a hashref; We delete whatever it is, and store it in $temp. We then set the now-empty $node->{$next} to an arrayref, and set(push) $temp as its first element - Meaning that, for instance, if 'two' had existed previous, and $next was originally 'two[]', then 'two' will now point to an arrayref, and its old value will be stored in [0]. Once $node->{$next} is an arrayref (or if it already was), we unshift the hashref in the index pointed by $is_backet - 0 if $next didn't have brackets, and 1 if it did - to @_. If the hashref doesn't exist (either because it's undef, for both, or possibly the empty string, for 0), we assign it an brand new hashref with the logical-or.
If it wasn't an arrayref, it's a hashref, so we do the same thing as before, and unshift the resulting value to @_.
We do all this unshifting because the magic goto passes our current @_ to the function that will replace us.
I'm not sure by what logic you reach your expected output, but I can clarify the following:
- You can check the type of the reference by using the
ref
function. - In your current code, $ref is treated as a hash reference in every single case. I can tell because you dereference it as a hash using the $ref->{...} syntax in every single clause of your if statements.
- If I read you correctly,
$1 . $2
should be the same thing as$head
. I find it more clear as just$head
. - Your
unshift
line vivifies$ref->{$1 . $2}
as an array reference (to an explicit empty array). And the next line vivifies its first element as a hash reference. The first line seems superfluous to me, you'd get the same results with the soleinsert
line; so I'm not sure of the intent.
精彩评论