Does using undef as hash values save any memory in Perl?
A Perl idiom for removing duplicate values from an array:
@uniq = keys %{{map{$_=>1}@list}}
Is it cheaper to use this version:
@uniq = keys %{{map{$_=>undef}@list}}
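As a quick sanity check, both forms should yield the same set of keys, since only the keys are ever read back. A minimal sketch, where @list holds made-up sample data and the sort is only there to make the two results easy to compare:

```perl
use strict;
use warnings;

# Hypothetical sample data; any list with repeats would do.
my @list = qw(apple pear apple fig pear);

# The hash values are never read back, only the keys are.
my @uniq_one   = sort keys %{ { map { $_ => 1 } @list } };
my @uniq_undef = sort keys %{ { map { $_ => undef } @list } };

print "@uniq_one\n";
print "@uniq_undef\n";
```

Note that keys returns the elements in no particular order, so the original order of @list is lost either way.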
I tested the memory use with these one-liners, and the undef version does seem to be cheaper on some versions of Perl:
perl -e 'my %x; $x{$_} = 1 for 0..1000_000; system "ps -ovsz $$"'
perl -e 'my %x; $x{$_} = undef for 0..1000_000; system "ps -ovsz $$"'
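The ps figure is the virtual size of the whole process, so it also reflects whatever the allocator has reserved. A rough way to look at the hashes themselves might be the Devel::Size module. This is only a sketch under the assumption that Devel::Size is installed from CPAN (it is not a core module); the hash names and the smaller key range are chosen here for illustration:

```perl
use strict;
use warnings;

# Assumption: Devel::Size is available; it is a CPAN module, not core.
my $have_size = eval { require Devel::Size; 1 };

my (%ones, %undefs);
$ones{$_}   = 1     for 0 .. 100_000;
$undefs{$_} = undef for 0 .. 100_000;

if ($have_size) {
    # total_size follows the reference and counts keys and values too.
    printf "values of 1:     %d bytes\n", Devel::Size::total_size(\%ones);
    printf "values of undef: %d bytes\n", Devel::Size::total_size(\%undefs);
}
else {
    print "Devel::Size is not available, so no comparison was made\n";
}
```

The numbers depend on the Perl version and build, so they are best read as a comparison between the two hashes rather than as absolute costs.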
Well, undef is supposed to be a flyweight value, meaning that all references to it point to the same datum. You don't get that for other literals. You still need the overhead of the slot that references it, though. However, I'm not seeing it save any memory for me on Perl 5.10 or 5.11 on Mac OS X. While perl may not be using more memory in the undef case, I bet it's anticipating using more memory, so it grabs it anyway. However, I'm not keen on investigating memory use in the internals right now.
Devel::Peek is pretty handy for showing these sorts of things:
#!perl
use Devel::Peek;
my $a = undef;
my $b = undef;
Dump( $a );
Dump( $b );
my %hash = map { $_, undef } 1 .. 3;
$hash{4} = 'Hello';
Dump( \%hash );
The output looks a bit scary at first, but you can see that the undef values are NULL(0x0) instead of individual string values (PV), although each one still sits at its own address:
SV = NULL(0x0) at 0x100208708
  REFCNT = 1
  FLAGS = (PADMY)
SV = NULL(0x0) at 0x100208738
  REFCNT = 1
  FLAGS = (PADMY)
SV = RV(0x100805018) at 0x100805008
  REFCNT = 1
  FLAGS = (TEMP,ROK)
  RV = 0x100208780
  SV = PVHV(0x100809ed8) at 0x100208780
    REFCNT = 2
    FLAGS = (PADMY,SHAREKEYS)
    ARRAY = 0x100202200 (0:5, 1:2, 2:1)
    hash quality = 91.7%
    KEYS = 4
    FILL = 3
    MAX = 7
    RITER = -1
    EITER = 0x0
    Elt "4" HASH = 0xb803eff9
    SV = PV(0x100801c78) at 0x100804ed0
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x100202a30 "Hello"\0
      CUR = 5
      LEN = 8
    Elt "1" HASH = 0x806b80c9
    SV = NULL(0x0) at 0x100820db0
      REFCNT = 1
      FLAGS = ()
    Elt "3" HASH = 0xa400c7f3
    SV = NULL(0x0) at 0x100820df8
      REFCNT = 1
      FLAGS = ()
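The distinct addresses in the dump can also be checked without reading Devel::Peek output. A small sketch using refaddr from the core Scalar::Util module; the %seen_addr name is made up for this example:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my %hash = map { $_, undef } 1 .. 3;

# Take a reference to each value slot and record its address.
my %seen_addr;
$seen_addr{ refaddr(\$hash{$_}) }++ for keys %hash;

printf "%d values, %d distinct addresses\n",
    scalar(keys %hash), scalar(keys %seen_addr);
```

If the two counts match, every undef value occupies its own slot, which fits the separate "at 0x..." addresses shown in the dump.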