Complex GROUP BY MySQL query

I have a table with the following rows:

id. user_id, type  - link 

 1. 555, image - http://1
 2. 555, image - http://2
 3. 654, image  - http://3
 4. 245, video - http://..
 5. 555, image - http://..
 6. 878, text  - http://..

I want to group rows of type image by date, so that they show as a single row. In this example, the first two images are merged and the output looks like the following. Note that a row from a different user is not merged into that user's group.

output

1. 555, image - http://1, http://2  ** GROUPED BY DATE, if they are the same type and no row of another type breaks the run.
2. 654, image - http://3
3. 245, video - http://..
4. 555, image - http://..
5. 878, text  - http://.

By the way, I'm trying to build a Facebook-like news feed. If someone has a better idea, please share.


SELECT `date`, `user_id`, `type`, GROUP_CONCAT(`link`)
FROM `table`
GROUP BY `date`, `user_id`, `type`


Is there any reason why you NEED to do this in SQL? I try to do data collection in SQL and push formatting into the code from which the SQL is called.

I would keep the SQL query simple: retrieve all of the image rows, ordered by user_id and date. Then my code would loop through the results, concatenating each image link onto the same output line until the user id or date changes, at which point we know to start a new line for the new user id/date.

This also makes it easier to do any funky formatting around the text, particularly if you're intending to emit HTML.
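A minimal Python sketch of that loop, assuming the rows have already been fetched in user_id/date order. The tuple layout, the dates and the `merge_links` name are made up for illustration; the question does not show a date column.

```python
from itertools import groupby

# Hypothetical result set for the image rows, already ordered by user_id, date.
# Layout: (user_id, date, link). The dates are invented for this example.
rows = [
    (555, "2011-01-01", "http://1"),
    (555, "2011-01-01", "http://2"),
    (555, "2011-01-02", "http://.."),
    (654, "2011-01-01", "http://3"),
]

def merge_links(rows):
    """Collapse consecutive rows sharing (user_id, date) into one output line."""
    merged = []
    # groupby only merges adjacent rows, which is why the ORDER BY matters.
    for (user_id, date), group in groupby(rows, key=lambda r: (r[0], r[1])):
        merged.append((user_id, date, [link for _, _, link in group]))
    return merged

for user_id, date, links in merge_links(rows):
    print(user_id, date, ", ".join(links))
```

Because the formatting happens in code, swapping the `print` for HTML output does not touch the query.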


This is not so easy to do in SQL, since it is order-dependent, which SQL is not well suited to.

The query is quite unwieldy, so I'll give it in full first, followed by a breakdown showing how it's put together.

SELECT @rownum:=@rownum+1 AS id, t.user_id, type, date, urls FROM
 (SELECT MIN(ID) AS original_id, user_id, type, date, GROUP_CONCAT(url) urls FROM
  (SELECT i1.*, 
   IF(i1.type='image', 
     IFNULL((SELECT MIN(i2.ID)-1 FROM Items i2
             WHERE i2.ID>i1.ID AND 
                  (i2.type!=i1.type OR i1.user_id!=i2.user_id OR i1.date!=i2.date)), 
            (SELECT MAX(id) FROM Items)), 
     i1.ID) AS lastRow,
   IF(i1.type='image', 
     IFNULL((SELECT MAX(i3.ID)+1 FROM Items i3 
             WHERE i3.ID<=i1.ID AND 
                  (i3.type!=i1.type OR i1.user_id!=i3.user_id OR i1.date!=i3.date)), 
            (SELECT MIN(id) FROM Items)), 
      i1.ID) AS firstRow
   FROM Items i1) AS groupItems
  GROUP BY user_id, type, date, firstRow, lastRow) t, (SELECT @rownum:=0) r
  ORDER BY t.original_id; 

The query uses a correlated subquery to find the start and end IDs of each image group. The group boundary is an item that is not the same type, user or date.

SELECT i1.ID, 
  IF(i1.type='image', 
     IFNULL((SELECT MIN(i2.ID)-1 FROM Items i2
             WHERE i2.ID>i1.ID AND 
                  (i2.type!=i1.type OR i1.user_id!=i2.user_id OR i1.date!=i2.date)), 
            (SELECT MAX(id) FROM Items)), 
     i1.ID) AS lastRow,
  IF(i1.type='image', 
     IFNULL((SELECT MAX(i3.ID)+1 FROM Items i3 
             WHERE i3.ID<=i1.ID AND 
                  (i3.type!=i1.type OR i1.user_id!=i3.user_id OR i1.date!=i3.date)), 
            (SELECT MIN(id) FROM Items)), 
      i1.ID) AS firstRow
FROM Items i1
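To make the boundary logic concrete, here is a rough Python sketch that mirrors the two subqueries on the question's sample rows. The question gives no dates, so every row is assumed to share one date, and the `differs` and `group_bounds` names are made up for illustration.

```python
# Sample rows as (id, user_id, type, date). The date value is an assumption.
items = [
    (1, 555, "image", "2011-01-01"),
    (2, 555, "image", "2011-01-01"),
    (3, 654, "image", "2011-01-01"),
    (4, 245, "video", "2011-01-01"),
    (5, 555, "image", "2011-01-01"),
    (6, 878, "text", "2011-01-01"),
]

def differs(a, b):
    """A boundary row differs in user, type or date."""
    return a[1:] != b[1:]

def group_bounds(items):
    """Return {id: (first_row, last_row)}, computed like the correlated subqueries."""
    ids = [row[0] for row in items]
    bounds = {}
    for row in items:
        if row[2] != "image":
            # Non-image rows are never merged: the group is the row itself.
            bounds[row[0]] = (row[0], row[0])
            continue
        later = [o[0] for o in items if o[0] > row[0] and differs(o, row)]
        earlier = [o[0] for o in items if o[0] < row[0] and differs(o, row)]
        last_row = min(later) - 1 if later else max(ids)       # IFNULL fallback
        first_row = max(earlier) + 1 if earlier else min(ids)  # IFNULL fallback
        bounds[row[0]] = (first_row, last_row)
    return bounds

print(group_bounds(items))
# {1: (1, 2), 2: (1, 2), 3: (3, 3), 4: (4, 4), 5: (5, 5), 6: (6, 6)}
```

Rows 1 and 2 end up with the same (first, last) pair, which is what lets a later GROUP BY on those two columns merge them while row 5 stays separate.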

For each item, the firstRow/lastRow columns give the start and end of the group. We can then use GROUP_CONCAT to concatenate all the URLs. To preserve order, MIN(id) is output, giving the first ID of each group.

SELECT MIN(id) AS original_id, user_id, type, date, GROUP_CONCAT(url) urls FROM
 (SELECT i1.*, 
  IF(i1.type='image', 
     IFNULL((SELECT MIN(i2.ID)-1 FROM Items i2
             WHERE i2.ID>i1.ID AND 
                  (i2.type!=i1.type OR i1.user_id!=i2.user_id OR i1.date!=i2.date)), 
            (SELECT MAX(id) FROM Items)), 
     i1.ID) AS lastRow,
  IF(i1.type='image', 
     IFNULL((SELECT MAX(i3.ID)+1 FROM Items i3 
             WHERE i3.ID<=i1.ID AND 
                  (i3.type!=i1.type OR i1.user_id!=i3.user_id OR i1.date!=i3.date)), 
            (SELECT MIN(id) FROM Items)), 
      i1.ID) AS firstRow
  FROM Items i1) AS groupItems
 GROUP BY user_id, type, date, firstRow, lastRow

Finally, to get consecutive IDs for the new result set, use a user variable to number the rows:

SELECT @rownum:=@rownum+1 AS id, user_id, type, date, urls FROM
 (SELECT MIN(ID) AS original_id, user_id, type, date, GROUP_CONCAT(url) urls FROM
  (SELECT i1.*, 
   IF(i1.type='image', 
     IFNULL((SELECT MIN(i2.ID)-1 FROM Items i2
             WHERE i2.ID>i1.ID AND 
                  (i2.type!=i1.type OR i1.user_id!=i2.user_id OR i1.date!=i2.date)), 
            (SELECT MAX(id) FROM Items)), 
     i1.ID) AS lastRow,
   IF(i1.type='image', 
     IFNULL((SELECT MAX(i3.ID)+1 FROM Items i3 
             WHERE i3.ID<=i1.ID AND 
                  (i3.type!=i1.type OR i1.user_id!=i3.user_id OR i1.date!=i3.date)), 
            (SELECT MIN(id) FROM Items)), 
      i1.ID) AS firstRow
   FROM Items i1) AS groupItems
  GROUP BY user_id, type, date, firstRow, lastRow) t, (SELECT @rownum:=0) r
  ORDER BY t.original_id; 

SQL is best suited to working with unordered sets of data, rather than sequences, as here. If you can do this in the presentation code, or possibly better in your application layer, I imagine that will be faster and more flexible. A hand-coded solution would find the start and end of each group in a single pass through the data. I doubt the SQL query will execute as efficiently as that.
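As a rough illustration of such a single pass, here is a Python sketch that walks the rows in id order and merges an image into the previous feed entry only when user, type and date all match. The dates are assumed (the question does not list them) and `merge_consecutive` is a hypothetical name.

```python
# Rows in id order as (user_id, type, date, link); links copied from the question,
# the date value is an assumption.
rows = [
    (555, "image", "2011-01-01", "http://1"),
    (555, "image", "2011-01-01", "http://2"),
    (654, "image", "2011-01-01", "http://3"),
    (245, "video", "2011-01-01", "http://.."),
    (555, "image", "2011-01-01", "http://.."),
    (878, "text", "2011-01-01", "http://.."),
]

def merge_consecutive(rows):
    """Merge runs of adjacent image rows with the same user and date, in one pass."""
    feed = []
    for user_id, type_, date, link in rows:
        prev = feed[-1] if feed else None
        if type_ == "image" and prev is not None and prev[:3] == (user_id, type_, date):
            prev[3].append(link)  # extend the current run
        else:
            feed.append((user_id, type_, date, [link]))  # start a new feed entry
    return feed

for number, (user_id, type_, date, links) in enumerate(merge_consecutive(rows), start=1):
    print(number, user_id, type_, ", ".join(links))
```

The `enumerate` call plays the role of the `@rownum` counter, and each row is visited once instead of triggering two subqueries.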


This query would do the trick:

select min(id) as id, user_id, `type`, group_concat(link) from images
group by user_id, `type`, date
order by id


I would possibly consider a different datastore such as CouchDB or RavenDB. This would be much better at serving up this type of content and wouldn't have to deal with foreign keys or joins.

Simply use continuous replication to the databases of all the friends.

Ultimately you're going to have to pre-query or abandon the relational model in order to gain back your speed.
