开发者

Faster CROSS JOIN alternative - PostgreSQL

I am trying to CROSS JOIN two tables, customers and items, so I can then create a sales by customer by item report. I have 2000 customer and 2000 items.

SELECT customer_name FRO开发者_JS百科M customers; --Takes 100ms

SELECT item_number FROM items; --Takes 50ms

SELECT customer_name, item_number FROM customers CROSS JOIN items; Takes 200000ms

I know this is 4 million rows, but is it possible to get this to run any faster? I want to eventually join this with a sales table like this:

SELECT customer_name, item_number, sales_total FROM customers CROSS JOIN items LEFT JOIN sales ON (customer.customer_name = sales.customer_name, item.item_number=sales.item_number);

The sales table will obviously not have all customers or all items, so the goal here is to have a report that shows all customers and all items along with what was sold and not sold.

I'm using PostgreSQL 8.4


To answer your question: No, you can't do a cross join faster than that - if you could then that would be how CROSS JOIN would be implemented.

But really you don't want a cross join. You probably want two separate queries, one which lists all customers, and another which lists all items and whether or not they were sold.


This really needs to be multiple reports. I can think of several off the top of my head that will yield more efficient packaging of information:

  1. Report: count of all purchases by customer/item (obvious).
  2. Report: list of all items not purchased, by customer.
  3. Report: Summary of Report #2 (count of items) in order to prioritize which customers to focus on.
  4. Report: list of all customer that have not bought an item by item.
  5. Report: Summary of Report #3 (count of customers) in order to identify both the most popular and unpopular items for further action.
  6. Report: List of all customers who purchased an item in the past, but did not purchase it his reporting period. This report is only relevant when the sales table has a date and the customers are expected to be regular buyers (i.e. disposable widgets). Won't work as well for things like service contracts.

The point here is that one should not insist that the tool process every possible outcome at once and generate more data and anyone could possibly digest manually. One should engage the end-users and consumers of the data as to what their needs are and tailor the output to meet those needs. It will make both sides' lives much easier in the long run.


If you wish to see all items for a given client (even if the cient has no items), i would rather try

SELECT c.customer_name, i.item_number, s.sales_total
FROM customers c LEFT JOIN 
    sales s ON c.customer_name = s.customer_name LEFT OIN
    items i on i.item_number=s.item_number

This should give you a list of all clients, and all items joined by sales.


Perhaps you want something like this?

select c.customer_name, i.item_number, count( s.customer_name ) as total_sales
from customers c full join sales s on s.customer_name = c.customer_name
full join items i on i.item_number = s.item_number
group by c.customer_name, i.item_number
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜