开发者

Get Total Associated Record Count Across Multiple MySQL Tables

I have the following query that UNIONs two tables which contains tags associated with two separate entities: cat and dog. I am trying to create a table that lists tags and the number of times their appear across two tables. This is a normalized tagging system, so only the tag ID is referenced in the cat_tags and dog_tags tables, and that's why I am doing another INNER JOIN to get the actual tag value.

SELECT x.tag_id, (COUNT(y.tag_id) + COUNT(z.tag_id)) AS num, tag_name AS tag
FROM (SELECT dt.tag_id FROM dog_tags dt UNION SELECT st.tag_id FROM cat_tags st) x
LEFT JOIN dog_tags y ON y.tag_id = x.tag_id
LEFT JOIN cat_tags z ON z.tag_id = x.tag_id
INNER JOIN tags t ON x.tag_id = t.tag_id
GROUP BY x.tag_id ORDER BY num DESC LIMIT 0,100

The problem is that the num counts are not correct for tags that appear multiple times across the two tables. For example tag ID number 5 ("collar") appears twice in cat_tags and twice in dog_tags, however the above query gives the total count as 8 instead of 4. Another tag that ap开发者_运维技巧pears three times comes out as 6. Something is multiplying them by 2. What is it?


I think your query is really complicated and you could try something like this:

  SELECT tag_id
       , tag_name
       , sum(num) as num
    FROM tags
    join
      (
          SELECT tag_id, count(*) as num FROM dog_tags GROUP BY tag_id
          union all
          SELECT tag_id, count(*) as num FROM cat_tags GROUP BY tag_id
      ) as AnimalsCount on AnimalsCount.tag_id = tags.tag_id
GROUP BY tag_id
       , tag_name -- you can remove this if you are 100% sure is not necessary

BTW check the difference between union and union all: http://dev.mysql.com/doc/refman/5.0/en/union.html.


Try this:

select t.tag_id, t.tag_name as tag,
    ifnull(dc.dog_total, 0) + ifnull(cc.cat_total, 0) as num
from
    tags t
    left join (
        select tag_id, count(*) as dog_total
        from dog_tags
        group by tag_id
    ) as dc on t.tag_id = dc.tag_id
    left join (
        select tag_id, count(*) as cat_total
        from cat_tags
        group by tag_id
    ) as cc on t.tag_id = cc.tag_id
order by num desc
limit 0, 100

Well, the problem was the joins you have in your query. Because you weren't grouping by tag_id for each source table. So, if collar appeared 2 times in dog_tags and 3 in cat_tags the join would result in 6 rows that share the same tag_id, so the counts would be wrong. Remember that a join is a Cartesian product of the rows of each joined table with a certain criteria. So, by grouping first by tag_id for each source table, we ensure that the tag_id appears only once in each table or derived table. And when we join the tables each tag_id will produce a single row.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜