开发者

Problem with joining to an empty table

I use the following query:

select * from A LEFT JOIN B on ( A.t_id != B.t_id)

to get all the records in A that are not in B.

The results are fi开发者_如何学编程ne except when table B is completely empty, but then I do not get any records, even from table A.

Later

It won't work yet!

Table structure for T1

CREATE TABLE IF NOT EXISTS `T1` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `title` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `t_id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=3;

Dumping data for table T1

INSERT INTO `T1` (`id`, `title`, `t_id`) VALUES
(1, 'apple', 1),
(2, 'orange', 2);

Table structure for table T2

CREATE TABLE IF NOT EXISTS `T2` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `title` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `t_id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=2;

Dumping data for table T2

INSERT INTO `T2` (`id`, `title`, `t_id`) VALUES
(1, 'dad', 2);

Operations

Now I want to get all records in T1 that do not have a corresponding records in T2.

I try

SELECT *
FROM T1
LEFT OUTER JOIN T2 ON T1.t_id != T2.t_id

and it won't work.


select * from A where t_id not in (select t_id from B)


Your have two problems

1) Wrong diagnosis

Given two tables

mysql> CREATE TABLE T1 (ID INT);
Query OK, 0 rows affected (0.10 sec)
mysql> CREATE TABLE T2 (ID INT);
Query OK, 0 rows affected (0.10 sec)
mysql> INSERT INTO T1 VALUES (1);
Query OK, 1 row affected (0.00 sec)

the query

mysql> SELECT * FROM T1 LEFT OUTER JOIN T2 ON T1.ID != T2.ID;
+------+------+
| ID   | ID   |
+------+------+
|    1 | NULL |
+------+------+
1 row in set (0.00 sec)

returns the non-matched rows even though the T2 is empty (so your problem is actually elsewhere; you probably simplified the query for us here).

2) Query performance

There are a few ways to write a query that gets records from table A not in table B.

However, they are not all equivalent, for example mysql creates different plans for

mysql> EXPLAIN SELECT * FROM T1 LEFT OUTER JOIN T2 ON T1.ID != T2.ID;
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
| id | select_type | table | type   | possible_keys | key  | key_len | ref  | rows | Extra               |
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
|  1 | SIMPLE      | T1    | system | NULL          | NULL | NULL    | NULL |    1 |                     |
|  1 | SIMPLE      | T2    | system | NULL          | NULL | NULL    | NULL |    0 | const row not found |
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
2 rows in set (0.00 sec)

mysql> EXPLAIN SELECT * FROM T1 WHERE ID NOT IN (SELECT ID FROM T2);
+----+--------------------+-------+--------+---------------+------+---------+------+------+--------------------------------+
| id | select_type        | table | type   | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+--------------------+-------+--------+---------------+------+---------+------+------+--------------------------------+
|  1 | PRIMARY            | T1    | system | NULL          | NULL | NULL    | NULL |    1 |                                |
|  2 | DEPENDENT SUBQUERY | NULL  | NULL   | NULL          | NULL | NULL    | NULL | NULL | no matching row in const table |
+----+--------------------+-------+--------+---------------+------+---------+------+------+--------------------------------+
2 rows in set (0.02 sec)


mysql> EXPLAIN SELECT * FROM T1 LEFT OUTER JOIN T2 ON T1.ID = T2.ID WHERE T2.ID IS NULL;
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
| id | select_type | table | type   | possible_keys | key  | key_len | ref  | rows | Extra               |
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
|  1 | SIMPLE      | T1    | system | NULL          | NULL | NULL    | NULL |    1 |                     |
|  1 | SIMPLE      | T2    | system | NULL          | NULL | NULL    | NULL |    0 | const row not found |
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
2 rows in set (0.00 sec)

mysql> EXPLAIN SELECT * FROM T1 WHERE NOT EXISTS (SELECT ID FROM T2 WHERE T2.ID = T1.ID);
+----+--------------------+-------+--------+---------------+------+---------+------+------+--------------------------------+
| id | select_type        | table | type   | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+--------------------+-------+--------+---------------+------+---------+------+------+--------------------------------+
|  1 | PRIMARY            | T1    | system | NULL          | NULL | NULL    | NULL |    1 |                                |
|  2 | DEPENDENT SUBQUERY | NULL  | NULL   | NULL          | NULL | NULL    | NULL | NULL | no matching row in const table |
+----+--------------------+-------+--------+---------------+------+---------+------+------+--------------------------------+
2 rows in set (0.00 sec)

So, for the 4 different versions of the same query (they should all be equivalent if I didn't make stupid errors) you get two different execution plans.

Judging the plans from this example should not be done since the tables T1 and T2 contain only one and zero rows, respectively, and the plans do depend on the size, indexes and statistics. For production systems do compare the plans since I don't think mysql recognizes the equality of these queries.

EDIT Ups, I did make a stupid error rushing on the interpretation of your query...

SELECT * FROM T1 LEFT OUTER JOIN T2 ON T1.t_id != T2.t_id

Facts 1) Left join

SELECT * FROM T1 LEFT OUTER JOIN T2 ...

will get all records from T1 regardless of the join criteria if there are no additional WHERE conditions (regardless of the join condition!)

2) the join condition determines which records are going to get joined to the records from T1 (determines cardinality and values of the attributes from T2)

so for a particular record from T1 it will

2a) fill attributes from T2 with NULLS only if and only if the join condition T1.t_id != T2.t_id is FALSE for ALL the records from T2. This occurs only if for a particular record in T1 every record in T2 has the same value of t_id as the record in T1.

2b) it will return ALL records from T2 if the join condition is TRUE, and this all is definitively not what you want

Here's an example continuing on your sample data

mysql> INSERT INTO `T2` (`id`, `title`, `t_id`) VALUES (8, 'xdad', 3);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM T1 LEFT OUTER JOIN T2 ON T1.ID != T2.ID;
+----+--------+------+------+-------+------+
| id | title  | t_id | id   | title | t_id |
+----+--------+------+------+-------+------+
|  1 | apple  |    1 |    8 | xdad  |    3 |
|  2 | orange |    2 |    1 | dad   |    2 |
|  2 | orange |    2 |    8 | xdad  |    3 |
+----+--------+------+------+-------+------+
3 rows in set (0.00 sec)

The other three versions should indeed be equivalent

SELECT * FROM T1 LEFT OUTER JOIN T2 ON T1.t_id = T2.t_id WHERE T2.t_id IS NULL;
SELECT * FROM T1 WHERE NOT EXISTS (SELECT t_id FROM T2 WHERE T2.t_id = T1.t_id);
SELECT * FROM T1 WHERE t_id NOT IN (SELECT t_id FROM T2);

The last version can also be written as

SELECT * FROM T1 WHERE t_id NOT IN (SELECT t_id FROM T2 WHERE T2.t_id = T1.t_id);

Sorry for my de-concentration on the first try..


Tried

SELECT * FROM A LEFT OUTER JOIN B ON ( A.t_id != B.t_id)

?

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜