Need help with a nested loop of queries in PHP and MySQL?
I am trying to do this:
<?php
$good_customer = 0;

$q = mysql_query("SELECT user FROM users WHERE activated = '1'"); // this gives me about 40k users

while ($r = mysql_fetch_assoc($q)) {
    $money_spent = 0;
    $user = $r['user'];

    // Do queries on another 20 tables
    for ($i = 1; $i <= 20; $i++) {
        $tbl_name = 'data' . $i;
        $q2 = mysql_query("SELECT money_spent FROM $tbl_name WHERE user = '{$user}'");
        while ($r2 = mysql_fetch_assoc($q2)) {
            $money_spent += $r2['money_spent'];
        }
    }

    // check the total across all 20 tables once per user
    if ($money_spent > 1000000) {
        $good_customer += 1;
    }
}
This is just an example. I am testing on localhost; for a single user it returns very fast, but when I try 1000 users it takes forever, not to mention 40k users.
Any way to optimise/improve this code?
EDIT: By the way, each of the other 20 tables has ~20-40k records.
EDIT2:
Okay, drop the "money spent" idea. This is my current structure:
user table => user is PK
logs_week_1 table => user is FK.
logs_week_2 table => user is FK
logs_week_3 table => user is FK
... will have more logs tables in future.
I want to find the "average time" they spend on my site, where the time is stored in each of the logs tables.
So you guys were saying storing the logs weekly is a bad idea? I should merge them into one table?
Sounds like you have a problem with your model. Why do you have 20 data tables instead of one table with a week column?
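A minimal sketch of what that single table could look like (column names and types are assumptions based on your example):

CREATE TABLE data (
    id          INT AUTO_INCREMENT PRIMARY KEY,
    user        INT NOT NULL,            -- references users.user
    week        INT NOT NULL,            -- replaces the data1..data20 split
    money_spent DECIMAL(10,2) NOT NULL,
    INDEX (user, week)                   -- supports per-user and per-week lookups
);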
Then you could do a
Select user, Sum( money_spent ) As total_money_spent
From data
Group By user
or even
Select Count(*) As good_customer_count
From
( Select user
  From data
  Group By user
  Having Sum( money_spent ) > 1000000
) t
With your current structure you can only do something like this:
Select u.user, d1.money_spent + d2.money_spent + ...
From users u
Join data1 d1 On ( d1.user = u.user )
Join data2 d2 On ( d2.user = u.user )
...
or
Select Count(*) As good_customer_count
From
( Select u.user, d1.money_spent + d2.money_spent + ... As total_money_spent
  From users u
  Join data1 d1 On ( d1.user = u.user )
  Join data2 d2 On ( d2.user = u.user )
  ...
) t
Where total_money_spent > 1000000
This will certainly be faster than your current solution.
And the time spent on a page should be stored in a numeric field.
As Peter already gave a good answer, I will only post how the query would look with a proper design (all log data in one table):
SELECT user, AVG(TIMEDIFF(end_time, start_time)) AS average_time
FROM logs
GROUP BY user
You can apply further WHERE conditions to the above to get stats for only a certain period (week, month, etc.), or you can group by another level.
You can also get MAX and COUNT in the same query (as well as standard deviation and other aggregate functions) in an effective manner.
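For instance, assuming the table has a start_time datetime column (the period boundaries here are just placeholders), a sketch of filtering to one month and adding MAX and COUNT in the same pass:

SELECT user,
       AVG(TIMEDIFF(end_time, start_time)) AS average_time,
       MAX(TIMEDIFF(end_time, start_time)) AS longest_visit,
       COUNT(*)                            AS visit_count
FROM logs
WHERE start_time >= '2010-01-01' AND start_time < '2010-02-01'
GROUP BY user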
Of course, take care of your indexes for best performance with larger data sets.
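A sketch of the kind of indexes that would help here, assuming the merged logs table (index names are made up):

ALTER TABLE logs ADD INDEX idx_logs_user (user);             -- speeds up GROUP BY user
ALTER TABLE logs ADD INDEX idx_logs_start_time (start_time); -- speeds up period filters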
EDIT:
Just as I was giving Peter +1 I noticed that he did not mention the UNION ALL option.
So you could do this (it is not optimal, and it does not contradict the design warnings given by others):
SELECT user, AVG(TIMEDIFF(end_time, start_time)) AS average_time
FROM (
    SELECT * FROM log_week_1
    UNION ALL
    SELECT * FROM log_week_2
    UNION ALL
    SELECT * FROM log_week_3
    ...
) U
GROUP BY user
You can also create a VIEW for this union.
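A sketch of such a view, covering only the three weekly tables shown above (it would have to be recreated whenever a new weekly table is added):

CREATE VIEW logs AS
    SELECT * FROM log_week_1
    UNION ALL
    SELECT * FROM log_week_2
    UNION ALL
    SELECT * FROM log_week_3;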
You should store the time spent on your site as a number (in minutes or seconds), not as a time value. Then you can calculate the averages and sums on this value. And keep your logs in one table.
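For example, assuming a single logs table with a numeric duration_seconds column (both names are assumptions), the aggregates become plain:

SELECT user,
       AVG(duration_seconds) AS average_seconds,
       SUM(duration_seconds) AS total_seconds
FROM logs
GROUP BY user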
For 40k users you are creating 1 + 20*40k queries, which will be slow in any case. Stop keeping the logs in 20 tables; you should design your database another way. On a properly designed database (one data table) this can all be done with one query:
SELECT COUNT(*) AS good_customers
FROM (
    SELECT users.user
    FROM users
    JOIN data ON users.user = data.user
    WHERE users.activated = '1'
    GROUP BY users.user
    HAVING SUM(data.money_spent) > 1000000
) t
In the worst case, with your current structure, you can at least do this with one query per table:
SELECT users.user, SUM(money_spent) AS money_spent
FROM users
JOIN $tbl_name ON users.user = {$tbl_name}.user
WHERE users.activated = '1'
GROUP BY users.user
Then sum the 20 per-table money_spent results and you have your answer.