
Very difficult MySQL query: random order on two tables

Consider this classical setup:

entry table:

id (int, PK)

title (varchar 255)

entry_category table:

entry_id (int)

category_id (int)

category table:

id (int, PK)

title (varchar 255)

This means each entry can belong to one or more categories; the entry_category table serves as the many-to-many (MM) join table.

Now I need to query 6 unique categories, each with 1 unique entry from that category, at RANDOM!

EDIT: To clarify: the purpose of this is to display 6 random categories with 1 random entry per category.

A correct result set would look like this:

category_id   entry_id     
10            200 
20            300
30            400
40            500
50            600
60            700

This would be incorrect as there are duplicates in the entry_id column:

category_id  entry_id
10           300
20           300
...

And this is incorrect as there are duplicates in the category_id column:

category_id  entry_id     
20           300
20           400
...
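
To make the requirement concrete, here is a small Python check of a result set. It is only a sketch: the function name is_valid_result and the expected_count parameter are made up for illustration and are not part of any query.

```python
def is_valid_result(rows, expected_count=6):
    """Return True if rows holds expected_count (category_id, entry_id)
    pairs with no repeated category_id and no repeated entry_id."""
    categories = [category_id for category_id, _ in rows]
    entries = [entry_id for _, entry_id in rows]
    return (
        len(rows) == expected_count
        and len(set(categories)) == len(categories)
        and len(set(entries)) == len(entries)
    )


# The three example result sets shown above.
good = [(10, 200), (20, 300), (30, 400), (40, 500), (50, 600), (60, 700)]
dup_entry = [(10, 300), (20, 300)]      # entry 300 appears twice
dup_category = [(20, 300), (20, 400)]   # category 20 appears twice
```

Only the first of the three sample sets passes this check.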

How can I query this?

If I use this simple query with order by rand, the result contains duplicated rows:

select c.id, e.id
from category c    
inner join entry_category ec on ec.category_id = c.id
inner join entry e on e.id = ec.entry_id
group by c.id
order by rand()

Performance is not the most important factor at the moment, but I need a query that works reliably, and the one above is pretty much useless: it does not do what I want at all.

EDIT: As an aside, the above query is no better with select distinct ... in place of the group by. The result still repeats values within a column, because distinct only ensures that each combination of c.id and e.id is unique.
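
The point about distinct can be illustrated outside the database. In this sketch the sample rows are invented, and a Python set stands in for what select distinct c.id, e.id would keep:

```python
# Joined rows as (category_id, entry_id); the last row is an exact duplicate.
rows = [(20, 300), (20, 400), (10, 300), (20, 300)]

# DISTINCT over both columns removes only the exact duplicate pair.
distinct_rows = sorted(set(rows))

category_ids = [c for c, _ in distinct_rows]
entry_ids = [e for _, e in distinct_rows]

# Each column on its own still contains repeated values.
has_duplicate_category = len(set(category_ids)) < len(category_ids)
has_duplicate_entry = len(set(entry_ids)) < len(entry_ids)
```

Three pairs survive, and both category 20 and entry 300 still appear twice.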


EDIT: one solution I found, but probably slow as hell on larger datasets:

select t1.e_id, t2.c_id
from (select e.id as e_id from entry e order by rand()) t1
inner join (select ec.entry_id as e_id, ec.category_id as c_id from entry_category ec group by e_id order by rand()) t2 on t2.e_id = t1.e_id
group by t2.c_id
order by rand()


SELECT  category_id, entity_id
FROM    (
        SELECT  category_id,
                @ce :=
                (
                SELECT  entity_id
                FROM    category_entity cei
                WHERE   cei.category_id = ced.category_id
                        AND NOT FIND_IN_SET(entity_id, @r)
                ORDER BY
                        RAND()
                LIMIT 1
                ) AS entity_id,
                (
                SELECT  @r := CAST(CONCAT_WS(',', @r, @ce) AS CHAR)
                )
        FROM    (
                SELECT  @r := ''
                ) vars,
                (
                SELECT  DISTINCT category_id
                FROM    category_entity
                ORDER BY
                        RAND()
                LIMIT 15
                ) ced
       ) q
WHERE  entity_id IS NOT NULL
LIMIT  6

This solution is not a piece of code I'd be proud of, since it relies on the black magic of MySQL session variables to carry state from row to row (@r accumulates the entities already picked). However, it works. Note that category_entity and entity_id here correspond to entry_category and entry_id in the question.

Also, it's not perfectly random and can in fact yield fewer than 6 rows (if the same entity_id appears in too many of the sampled categories). In that case, you can increase the LIMIT of 15 in the innermost query.

Create a unique index or a PRIMARY KEY on category_entity (category_id, entity_id) for this to work fast.
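
For readers who find the session-variable trick hard to follow, the procedure the query approximates can be sketched in Python. This is an illustration under assumptions, not a translation of the SQL: the function name pick_random_pairs and its parameters are made up, and the join table is passed in as a list of (category_id, entry_id) tuples.

```python
import random


def pick_random_pairs(links, wanted=6, candidates=15, rng=random):
    """links: iterable of (category_id, entry_id) rows from the join table."""
    entries_by_category = {}
    for category_id, entry_id in links:
        entries_by_category.setdefault(category_id, set()).add(entry_id)

    # Like the innermost subquery: a random sample of distinct categories.
    category_ids = list(entries_by_category)
    rng.shuffle(category_ids)
    category_ids = category_ids[:candidates]

    used_entries = set()  # plays the role of @r
    result = []
    for category_id in category_ids:
        free = sorted(entries_by_category[category_id] - used_entries)
        if not free:
            continue  # like the rows dropped by "entity_id IS NOT NULL"
        entry_id = rng.choice(free)
        used_entries.add(entry_id)
        result.append((category_id, entry_id))
        if len(result) == wanted:
            break  # like the outer LIMIT 6
    return result
```

As with the query, a category whose entries are all taken is skipped, so the result can hold fewer than wanted pairs.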


It seems to me that a good way to do this is to pick 6 distinct values from each set, shuffle each list of values (each list individually), and then glue the lists together into a two-column result.

To randomize which six you get, shuffle the entire list of each type of value, and grab the first six.
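
Read literally, that suggestion could look like the following Python sketch. The function name shuffle_and_glue and its parameters are made up for illustration.

```python
import random


def shuffle_and_glue(category_ids, entry_ids, wanted=6, rng=random):
    """Shuffle each list of distinct ids on its own, then pair by position."""
    categories = list(set(category_ids))
    entries = list(set(entry_ids))
    rng.shuffle(categories)
    rng.shuffle(entries)
    # zip glues the first `wanted` values of each list into two columns.
    return list(zip(categories[:wanted], entries[:wanted]))
```

Both columns come out free of duplicates, but the pairing is purely positional: this sketch never consults entry_category, so a glued pair is not guaranteed to be an entry that actually belongs to that category.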
