开发者

Do I have this LEFT JOIN right?

I need to find the total number of customers created after a given date that haven't yet ordered.

I have two tables:

| customers           |    // There's more to this but this is all
+----+------+---------+    // you need to answer this question
| ID | Name | Created |
+----+------+---------+

| orders           |    // There's more to this but this is all
+----开发者_如何学运维+-------------+    // you need to answer this question
| ID | customer_id |
+----+-------------+

I need to find the total number of customers created after a given date that haven't yet ordered.

This is what I did:

SELECT Count(*)
FROM customers
LEFT JOIN orders ON customers.ID = orders.customer_id
WHERE customers.Created > #arbitrary date#
AND orders.ID Is Null;

I feel certain this is right, however it doesn't seem to be producing the right results. By that I mean my boss is looking at the results and telling me that from experience the numbers of new customer signups who aren't ordering are far more than what this query is producing.

So am I doing this right? If I am, I clearly need to look at what else could be causing the problem. Thanks.

EDIT 1

A comment below suggested that the orders.id is not nullable. This may be the case, however I get the same results if I check if orders.customer_id Is Null.

EDIT 2

I didn't want to overcomplicate things but it might be useful if you know that I'm doing this in an Access database so I can't use all the goodies that standard SQL allows. (I've edited the tags)


Your boss must tell you what he considered 'good values'.

Maybe he used to have a report with others filter to count this customer. Maybe you have to check a price somewhere in orders table, and consider not a real orders if price = 0.

With the information you give us, your query is absolutely right, maybe I will add something :

SELECT Count(DISTINCT customers.ID)
FROM customers
    LEFT JOIN orders 
        ON customers.ID = orders.customer_id
WHERE customers.Created > #arbitrary date#
AND orders.ID Is Null;

For easier maintenance and understanding (the query should be a little slower, but not that much)


I'm wondering if their is a subtle bug-waiting-to-happen in your code.

The spec states, "find the total number of customers". By joining to the orders table you are actually counting on the join. Of course, in the case of customers with no orders you are indeed counting customers. However, for customers with orders you would be counting their orders.

I'd prefer a query construct that reflected the count on the customers so that I (or indeed a more casual user) could switch the antijoin to a semijoin and obtain the expected results (with your construct they would obtain a misleading count) e.g.

SELECT COUNT(*)
  FROM customers
 WHERE customers.Created > #specific date#
       AND NOT EXISTS (
                       SELECT *
                         FROM orders
                        WHERE orders.customer_id = customers.ID
                      );

As for the accepted answer, I offer this "For easier maintenance and understanding" (the query may be a little slower but hopefully not by much)."


Perhaps somewhere your code allows orders to be recorded for customers (but not completed). You can count customers that have ordered only orders with 0 amount.

SELECT COUNT(*)
FROM
  ( SELECT customers.id
    FROM customers
      INNER JOIN orders ON customers.ID = orders.customer_id
    WHERE customers.Created > #arbitrary date# 
    GROUP BY customers.id
    HAVING COUNT(IIF(orders.amount > 0, 1, NULL)) = 0  
  ) AS grp

Access has not CASE statement but it has an IIF() function.

And INNER JOIN is required instead of JOIN.


No this is not right.

Try this instead:

SELECT
SUM(CASE WHEN (SELECT COUNT(*) FROM orders WHERE customers.ID = orders.customer_id) = 0 
  THEN 1
  ELSE 0
END) 
FROM customers
WHERE customers.Created > #arbitrary date#


have a look at this website, it explains all the differences between the sql joins

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜