Doing some calculations in mysql, numbers off when using GROUP BY
Im running the following query to get the stats for a user, based on which I pay them.
SELECT hit_paylevel, sum(hit_uniques) as day_unique_hits
, (sum(hit_uniques)/1000)*hit_paylevel as day_earnings
, hit_date
FROM daily_hits
WHERE hit_user = 'xxx' AND hit_date >= '2011-05-01' AND hit_date < '2011-06-01'
GROUP BY hit_user
The table in question looks like this:
CREATE TABLE IF NOT EXISTS `daily_hits` (
`hit_itemid` varchar(255) NOT NULL,
`hit_mainid` int(11) NOT NULL,
`hit_user` int(11) NOT NULL,
`hit_date` date NOT NULL,
`hit_hits` int(11) NOT NULL DEFAULT '0',
`hit_uniques` int(11) NOT NULL,
`hit_embed` int(11) NOT NULL,
`hit_paylevel` int(1) NOT NULL D开发者_开发知识库EFAULT '1',
PRIMARY KEY (`hit_itemid`,`hit_date`),
KEY `hit_user` (`hit_user`),
KEY `hit_mainid` (`hit_mainid`,`hit_date`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The problem in the calculation has to do with the hit_paylevel which acts as a multiplier. Default is one, the other option is 2 or 3, which essentially doubles or triples the earnings for that day.
If I loop through the days, the daily day_earnings
is correct, its just that when I group them, it calculates everything as paylevel 1. This happens if the user was paylevel 1 in the beginning, and was later upgraded to a higher level. if user is pay level 2 from the start, it also calculates everything correctly.
Shouldn't this be sum(hit_uniques * hit_paylevel) / 1000
?
Like @Denis said:
Change the query to
SELECT hit_paylevel, sum(hit_uniques) as day_unique_hits
, sum(hit_uniques * hit_paylevel) / 1000 as day_earnings
, hit_date
FROM daily_hits
WHERE hit_user = 'xxx' AND hit_date >= '2011-05-01' AND hit_date < '2011-06-01'
GROUP BY hit_user;
Why this fixes the problem
Doing the hit_paylevel
outside the sum, first sums all hit_uniques
and then picks a random hit_paylevel
to multiply it by.
Not what you want. If you do both columns inside the sum MySQL will pair up the correct hit_uniques
and hit_paylevels
.
The dangers of group by
This is an important thing to remember on MySQL.
The group by
clause works different from other databases.
On MSSQL *(or Oracle or PostgreSQL) you would have gotten an error
non-aggregate expression must appear in group by clause
Or words to that effect.
In your original query hit_paylevel
is not in an aggregate (sum) and it's also not in the group by
clause, so MySQL just picks a value at random.
精彩评论