SQL Query For Most Popular Combination

Suppose I have a grocery store application with a table of purchases:

customerId int
itemId int

Four customers come into the store:

Bob buys a banana, lemonade, and a cookie 
Kevin buys a banana, lemonade, and a donut
Sam buys a banana, orange juice, and a cupcake 
Susie buys a banana

I am trying to write a query which would return which combinations of items are most popular. In this case, the results of this query should be:

banana and lemonade - 2

I have already written a query which tells me a list of all items which were in a multi-item purchase (we exclude sales of one item - it cannot form a "combination"). It returns:

banana - 3
lemonade - 2
cookie - 1
donut - 1
cupcake - 1
orange juice - 1

Here is the query:

SELECT itemId, count( * )
FROM grocery_store
INNER JOIN (
SELECT customerId
FROM grocery_store
GROUP BY customerId
HAVING count( itemId ) > 1
)subQuery ON subQuery.customerId = grocery_store.customerId
GROUP BY itemId;
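
To make the counts above reproducible, here is a minimal sketch that loads the four purchases into an in-memory SQLite database through Python's sqlite3 module and runs the same query. The integer ids (customers 1 to 4, items 1 to 6) and the ITEMS lookup are made up for this illustration, since the table only stores ids; the ORDER BY is added only to make the output order stable.

```python
import sqlite3

# Hypothetical ids for illustration; the real table only stores integers.
ITEMS = {1: "banana", 2: "lemonade", 3: "cookie",
         4: "donut", 5: "orange juice", 6: "cupcake"}
PURCHASES = [
    (1, 1), (1, 2), (1, 3),  # Bob: banana, lemonade, cookie
    (2, 1), (2, 2), (2, 4),  # Kevin: banana, lemonade, donut
    (3, 1), (3, 5), (3, 6),  # Sam: banana, orange juice, cupcake
    (4, 1),                  # Susie: banana only
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grocery_store (customerId INT, itemId INT)")
conn.executemany("INSERT INTO grocery_store VALUES (?, ?)", PURCHASES)

# Same query as above: count items, ignoring single-item purchases.
rows = conn.execute("""
    SELECT grocery_store.itemId, COUNT(*) AS n
    FROM grocery_store
    INNER JOIN (
        SELECT customerId
        FROM grocery_store
        GROUP BY customerId
        HAVING COUNT(itemId) > 1
    ) subQuery ON subQuery.customerId = grocery_store.customerId
    GROUP BY grocery_store.itemId
    ORDER BY n DESC, grocery_store.itemId
""").fetchall()

item_counts = {ITEMS[item_id]: n for item_id, n in rows}
print(item_counts)
```

Susie's banana is not counted because her purchase has only one item, which is why banana shows 3 rather than 4.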

Could I get a pointer about how to expand my existing query to get the desired output?


select a.itemID, b.itemID, COUNT(*) countForCombination
from grocery_store a
inner join grocery_store b
on a.customerId = b.customerId
and a.itemID < b.itemID
group by a.itemID, b.itemID
order by countForCombination desc

Assumed:

grocery_store = sales records
customerId    = unique sale

  • This query takes all the grocery_store records and, for each sales transaction, creates all the possible combinations (a.itemID, b.itemID) in a specific order (a.itemID < b.itemID).
  • This specific order eliminates duplicates: (apple, orange) is kept, whereas (orange, apple) is not needed. It also keeps an item from being paired with itself.
  • After producing all the combinations from all sales, a simple group by and a sort by count show the most popular combinations at the top.