What's wrong with this alternative mechanism to make DBI queries?
In the DBI documentation, this is the recommended code for executing a query many times:
$sth = $dbh->prepare_cached($statement);
$sth->execute(@bind);
$data = $sth->fetchall_arrayref(@attrs);
$sth->finish;
However, I see that many* query methods allow passing a prepared and cached statement handle in place of a query string, which makes this possible:
$sth = $dbh->prepare_cached($statement);
$data = $dbh->selectall_arrayref($sth, \%attrs, @bind);
Is there anything wrong with this approach? I haven't seen it used in the wild.
FWIW, I have benchmarked these two implementations, and the second approach appears marginally (4%) faster when querying for two consecutive rows using fetchall_arrayref in the first implementation vs selectall_arrayref in the second.
* The full list of query methods that support this is:
- selectrow_arrayref - the normal method with prepared statements is fetchrow_arrayref
- selectrow_hashref - the normal method is fetchrow_hashref
- selectall_arrayref - the normal method is fetchall_arrayref
- selectall_hashref - the normal method is fetchall_hashref
- selectcol_arrayref (doesn't really count, as it has no parallel method using the first code path as described above - so the only way to use prepared statements with this method is to use the second code path above)
There's nothing wrong with it, as long as you were planning to do only one fetch. When you use the select*_* methods, all the data comes back in one chunk. My DBI code more often looks like this:
$sth = $dbh->prepare_cached($statement);
$sth->execute(@bind);
while (my $row = $sth->fetch) { # alias for fetchrow_arrayref
    # do something with @$row here
}
There's no equivalent to this using a select*_* method.
If you're going to call fetchall_* (or you're only fetching 1 row), then go ahead and use a select*_* method with a statement handle.
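To make the difference concrete, here is a minimal sketch of both styles against a throwaway database. It assumes DBD::SQLite is installed; the in-memory database, the "items" table and its contents are made up purely for illustration.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available. The in-memory database and the
# "items" table are hypothetical and exist only for this example.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, PrintError => 0 });
$dbh->do('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)');
$dbh->do('INSERT INTO items (id, name) VALUES (?, ?)', undef, @$_)
    for [1, 'apple'], [2, 'banana'], [3, 'cherry'];

my $sth = $dbh->prepare_cached(
    'SELECT id, name FROM items WHERE id >= ? ORDER BY id');

# One chunk: hand the prepared handle to a select*_* method.
my $rows = $dbh->selectall_arrayref($sth, undef, 2);

# Row by row: only the explicit execute/fetch loop can do this.
my @names;
$sth->execute(2);
while (my $row = $sth->fetch) {
    push @names, $row->[1];    # copy the value; DBI reuses the row array
}
```

The one-chunk call suits small result sets. The loop is the one to reach for when rows should be processed as they arrive.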
No, there's nothing wrong with that approach. There is something wrong with your benchmark or its analysis, though.
You've claimed that
$sth->execute(@bind);
$data = $sth->fetchall_arrayref(@attrs);
$sth->finish;
is slower than a call to
sub selectall_arrayref {
    my ($dbh, $stmt, $attr, @bind) = @_;
    my $sth = (ref $stmt) ? $stmt : $dbh->prepare($stmt, $attr)
        or return;
    $sth->execute(@bind) || return;
    my $slice = $attr->{Slice}; # typically undef, else hash or array ref
    if (!$slice and $slice=$attr->{Columns}) {
        if (ref $slice eq 'ARRAY') { # map col idx to perl array idx
            $slice = [ @{$attr->{Columns}} ]; # take a copy
            for (@$slice) { $_-- }
        }
    }
    my $rows = $sth->fetchall_arrayref($slice, my $MaxRows = $attr->{MaxRows});
    $sth->finish if defined $MaxRows;
    return $rows;
}
Maybe if you got rid of the useless call to finish, you'd find the first faster? Note that benchmarks with less than 5% difference are not very telling; the accuracy isn't that high.
Update: s/faster than/slower than/
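If you want to re-run the comparison with and without the finish call, a sketch along these lines might do. It assumes DBD::SQLite is installed; the table "t", its contents and the variant names are hypothetical, and the numbers will differ with your driver and data.

```perl
use strict;
use warnings;
use DBI;
use Benchmark qw(cmpthese);

# Assumption: DBD::SQLite is available; table "t" is made up for the test.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, PrintError => 0 });
$dbh->do('CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)');
$dbh->do('INSERT INTO t (id, val) VALUES (?, ?)', undef, $_, "row $_")
    for 1 .. 100;

my $sql = 'SELECT id, val FROM t WHERE id BETWEEN ? AND ?';

my %variant = (
    # execute + fetchall_arrayref, then an explicit finish
    fetchall_with_finish => sub {
        my $sth = $dbh->prepare_cached($sql);
        $sth->execute(10, 11);
        my $data = $sth->fetchall_arrayref;
        $sth->finish;
        return $data;
    },
    # the same, minus the finish call
    fetchall_no_finish => sub {
        my $sth = $dbh->prepare_cached($sql);
        $sth->execute(10, 11);
        return $sth->fetchall_arrayref;
    },
    # prepared handle passed straight to selectall_arrayref
    selectall_with_sth => sub {
        my $sth = $dbh->prepare_cached($sql);
        return $dbh->selectall_arrayref($sth, undef, 10, 11);
    },
);

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, \%variant);
```

Treat differences of a few percent between runs as noise rather than a result.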
The performance difference should not be between selectall_arrayref() and fetchall_arrayref() but between fetchall_arrayref() and doing a fetch() in a loop yourself. fetchall_arrayref() may be faster as it is hand optimized in C.
The docs for fetchall_arrayref discuss performance...
If $max_rows is defined and greater than or equal to zero then it is
used to limit the number of rows fetched before returning.
fetchall_arrayref() can then be called again to fetch more rows. This
is especially useful when you need the better performance of
fetchall_arrayref() but don't have enough memory to fetch and return
all the rows in one go.
Here's an example (assumes RaiseError is enabled):
my $rows = []; # cache for batches of rows
while( my $row = ( shift(@$rows) || # get row from cache, or reload cache:
                   shift(@{$rows=$sth->fetchall_arrayref(undef,10_000)||[]}) )
) {
    ...
}
That might be the fastest way to fetch and process lots of rows using
the DBI, but it depends on the relative cost of method calls vs memory
allocation.
A standard "while" loop with column binding is often faster because the
cost of allocating memory for the batch of rows is greater than the
saving by reducing method calls. It's possible that the DBI may provide
a way to reuse the memory of a previous batch in future, which would
then shift the balance back towards fetchall_arrayref().
So that's a definitive "maybe". :-)
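For comparison, the 'standard "while" loop with column binding' that the docs mention could look roughly like this. It is only a sketch: it assumes DBD::SQLite is installed, and the table "t" and its rows are invented for the example.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; table "t" is hypothetical.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, PrintError => 0 });
$dbh->do('CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)');
$dbh->do('INSERT INTO t (id, val) VALUES (?, ?)', undef, $_, "row $_")
    for 1 .. 5;

my $sth = $dbh->prepare_cached('SELECT id, val FROM t ORDER BY id');
$sth->execute;

# Bind output columns to plain scalars (done after execute for portability).
my ($id, $val);
$sth->bind_columns(\$id, \$val);

my @seen;
while ($sth->fetch) {          # each fetch refreshes $id and $val in place
    push @seen, "$id=$val";
}
```

No per-row array is handed back to your code here, which is where the saving the docs describe would come from. Whether it beats batched fetchall_arrayref for your driver is something to measure.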
I don't think there's really any advantage to using one over the other, other than that the first uses three lines and the second uses one (less possibility for bugs with the second method). The first might be more commonly used because the documentation states that the "typical method call sequence for a SELECT statement is prepare, execute, fetch, fetch, ... execute, fetch, fetch, ..." and gives this example:
$sth = $dbh->prepare("SELECT foo, bar FROM table WHERE baz=?");
$sth->execute( $baz );
while ( @row = $sth->fetchrow_array ) {
    print "@row\n";
}
Now, I'm not suggesting that programmers actually read the documentation (heaven forbid!) but given its prominence near the top of the documentation, in a section designed to show you how to use the module, I would suspect that the more verbose method is slightly preferred by the module's author. As to why, your guess is as good as mine.