MySQL optimization: Perform Maths operation inside or outside of a query?
I have a strong feeling that all mathematical operations unnecessary to the query itself ought to be preformed outside of the query. For example:
$result = mysql_query(SELECT a, a*b/c as score FROM table)
while ($row = mysq开发者_StackOverflow中文版l_fetch_assoc($result))
{
echo $row['a'].' score: '.$row['score'].<br>;
}
vs:
$result = mysql_query(SELECT a, b, c FROM table)
while ($row = mysql_fetch_assoc($result))
{
echo $row['a'].' score: '.$row['a']*$row['b']/$row['c'].<br>;
}
the second option would usually be better, especially with complex table joins & such. This is my suspicion, I only lack confirmation . . .
Faster depends on the machines involved, if you're talking about faster for one user. If you're talking about faster for a million users hitting a website, then it's more efficient to do these calculations in PHP.
The load of a webserver running PHP is very easily distributed over a large number of machines. These machines can run in parallel, handling requests from visitors and fetching necessary information from the database. The database, however, is not easy to run in parallel. Issues such as replication or sharding are complex and can require specialty software and properly organized data to function well. These are expensive solutions compared to adding another PHP installation to a server array.
Because of this, the value of a CPU cycle on the database machine is far more valuable than one on the webserver. So you should perform these math functions on the webserver where CPU cycles are cheaper and significantly more easy to parallelize.
This also assumes that the database isn't holding open any sort of data lock while performing the calculation. If so, then you're not just using precious CPU cycles, you're locking data from other users directly.
My feeling would be that doing the maths in the database would be slightly more efficient in the long run, given your query setup. With the select a,b,c
version, PHP has to create 3 elements and populate them for each row fetched.
With the in-database version, only 2 elements are created, so you've cut creation time by 33%. Either way, the calculation has to be done, so there's not much in the way of savings there.
Now, if you actually needed the b
and c
values to be exposed to your code, then there'd be no point in doing the calculation in the database, you'd be adding more fields to the result set with their attendant creation/processing/populating overhead.
Regardless, though, you should benchmark both version. What works in one situation may be worse than useless in another, and only some testing will show which is better.
I'd agree in general. Pull data from source in your query, manipulate data in the calling/scripting environment.
I wouldn't worry too much about efficiency/speed unless your queries get really complex, but it still seems like the right thing to do.
Math in the query is generally not a problem, UNLESS it is in the WHERE clause. Example:
SELECT a, b, c FROM table WHERE a*b=c
This makes it rather impossible to use an index.
SELECT a*b/c FROM table
Is fine.
If there is any performance advantage of one way over the other it is likely going to be very negligible making it more a matter of preference than optimization.
I prefer it in the query, personally because I feel it encapsulates the calculation in the data tier.
Also, although it doesn't apply to your specific example, the more information you give the DB engine about what you are ultimately trying to do, the more information it has to feed the query optimizer. It seems theoretically possible that the query might actually run faster if you put the calculation in the SQL.
Do it in the database is better because you can run the application in one machine and the database in another, that said, I will balance your overall performance. Specially in cheap hosting services, they generally do that, application in one machine database in another.
I doubt it could be a bottleneck.
especially with complex table joins & such, where one filesort will outcome these maths by factor of 1000s
However, you can always perpend your query with BENCHMARK keyword and take some measurements
BENCHMARK 1000 SELECT a, a*b/c as score FROM table
精彩评论