开发者

How do I properly join tables and use a group by to summarize data in MySQL?

I am working in MySQL and having a bit of trouble with building a query that will summarize columns from two tables. I want to comp开发者_运维知识库are the quantity of requests per day, for a table containing hourly records and a table containing daily aggregation, per day. Ideally the sums of each would be identical.

Here is the schema:

Hourly Table:

CREATE TABLE requests_hourly (
 customer_id INT,
 date DATETIME,
 requests BIGINT,
 req_type SMALLINT );

Daily Table

CREATE TABLE requests_daily (
 customer_id INT,
 date DATE,
 requests BIGINT,
 req_type SMALLINT );

Not working SQL to get me all the requests, by req_type across both tables for June 2010

SELECT
 SUM(h.requests),
 SUM(d.requests),
 h.req_type
FROM requests_hourly h
LEFT OUTER JOIN requests_daily d ON d.req_type = h.req_type
WHERE h.date >= '2010-06-01 00:00:00'
 AND h.date < '2010-07-01 00:00:00'
 AND d.date >= '2010-06-01 00:00:00'
 AND d.date < '2010-07-01 00:00:00'
GROUP BY h.req_type;

I have a feeling the error is in the JOIN. Thank you in advance for your help!

Answer

I gave credit to Peter for for the answer, but it did require a little modification. So here is the MySQL SQL code:

SELECT *
FROM
  (SELECT SUM(requests) AS 'Daily Request Sum', req_type
   FROM requests_daily
   WHERE date BETWEEN '2010-06-01 00:00:00' AND '2010-07-01 00:00:00'
   GROUP BY req_type, date) d
  INNER JOIN
  (SELECT SUM(requests) AS 'Hourly Request Sum', req_type
   FROM requests_hourly
   WHERE date BETWEEN '2010-06-01 00:00:00' AND '2010-07-01 00:00:00'
   GROUP BY req_type, DATE(date)) h
USING (req_type, date)


According to my query analyzer this is the fastest / most efficient method presented thus far:

SELECT *
FROM
  (SELECT SUM(requests) AS 'Daily Request Sum', req_type
   FROM requests_daily
   WHERE date BETWEEN '2010-06-01 00:00:00' AND '2010-07-01 00:00:00'
   GROUP BY req_type)
  INNER JOIN
  (SELECT SUM(requests) AS 'Hourly Request Sum', req_type
   FROM requests_hourly
   WHERE date BETWEEN '2010-06-01 00:00:00' AND '2010-07-01 00:00:00'
   GROUP BY req_type)
USING (req_type, date)

It fallows simple logic that by restricting the size and prearranging the indexes in the tables being joined before the operation takes place significantly reduces the relatively "expensive" overheard incurring by matching the indexes of the two tables (req_type) against each other especially since you do not have unique indexes (req_type) which will force the database to perform searches rather than seeks.

Hope this was helpful.


I'd use a subquery.

SELECT `req_type`
     , `date`
     , COUNT(0) AS `daily_data_count`
     , (
           SELECT COUNT(0)
             FROM `requests_hourly`
            WHERE CAST(`date` AS DATE)
                = `requests_daily`.`date`
       ) AS `hourly_data_count`
  FROM `requests_daily`
 WHERE `date` BETWEEN '2010-06-01' AND '2010-06-30'
 GROUP BY `req_type`, `date`


Have you considered approaching the problem with a union all.

select
  sum(requests) total,
  req_type
from
  requests_daily
where
 requests_daily.date >= '2010-06-01 00:00:00' AND requests_daily.date < '2010-07-01 00:00:00' 
group by requests_daily.req_type
union all
select
  sum(requests) total,
  req_type
from
  requests_hourly
where
 requests_hourly.date >= '2010-06-01 00:00:00' AND requests_hourly.date < '2010-07-01 00:00:00' 
group by requests_hourly.req_type

Enjoy!

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜