Select parent row only if it has no children

2023-03-22 00:37 问答作者：

I have a MySQL database in which table A has a one-to-many relation to table B, and I would like to select all rows in table B that have no children in table A. I have tried using

SELECT id FROM A WHERE NOT EXISTS (SELECT * FROM B WHERE B.id=A.id)

and

SELECT id FROM A LEFT JOIN B ON A.id=B.id WHERE B.id IS NULL

Both of these seem slow. Is there a faster query to achieve the same thing?

In case this is relevant, in my database table A has about 500,000 rows and table B has about 3 to 4 million rows.

Edit: For the actual tables in my database, explain gives me:

+----+--------------------+------------------+-------+---------------+---------------------------+---------+------+---------+--------------------------+
| id | select_type        | table            | type  | possible_keys | key                       | key_len | ref  | rows    | Extra                    |
+----+--------------------+------------------+-------+---------------+---------------------------+---------+------+---------+--------------------------+
|  1 | PRIMARY            | frontend_form471 | index | NULL          | frontend_form471_61a633e8 | 32      | NULL |  671927 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | SchoolData       | index | PRIMARY       | PRIMARY                   | 49      | NULL | 3121110 | Using where; Using index |
+----+--------------------+------------------+-------+---------------+---------------------------+---------+------+---------+--------------------------+

for

select number from frontend_form471 where not exists (select * from SchoolData where SchoolData.`f471 Application Number`=frontend_form471.number)

and

+----+-------------+------------------+-------+---------------+---------------------------+---------+------+---------+------------------------------------------------+
| id | select_type | table            | type  | possible_keys | key                       | key_len | ref  | rows    | Extra                                          |
+----+-------------+------------------+-------+---------------+---------------------------+---------+------+---------+------------------------------------------------+
|  1 | SIMPLE      | frontend_form471 | index | NULL          | frontend_form471_61a633e8 | 32      | NULL |  671927 | Using index; Using temporary                   |
|  1 | SIMPLE      | SchoolData       | index | PRIMARY       | PRIMARY                   | 49      | NULL | 3121110 | Using where; Using index; Not exists; Distinct |
+----+-------------+------------------+-------+---------------+---------------------------+---------+------+---------+------------------------------------------------+

for

select distinct number from frontend_form471 left join SchoolData on frontend_form471.number=SchoolData.`f471 Application Number` where SchoolData.`f471 Application Number` is NULL

where in my case frontend_form471 is table A and SchoolData is table B

Edit2: In table B (SchoolData) in my database, id is the first part of a two part primary key, so it is indexed and ther开发者_C百科e are still multiple entries in B with the same id.

SELECT id FROM A LEFT OUTER JOIN B ON A.id=B.id WHERE B.id IS NULL

you can do this. the outer join should bring a little performance, but not much.

new database systems will probably optimize your query anyway so that there wont be any difference.

the correct way here is caching! try the query cacher and application level caching if possible.

of course you need proper indexes.

and by proper i mean on both tables and preferably a hash index as it will have static lookup time in comparision to any tree that has logarithmic

Try putting an explain before the query to see what really slows this down.

if you really need this to be fast you may re-facture your data structure.

you could possibly create a trigger to mark a flag in table A whether there is a corresponding entry in table be. of course this id data redundancy, but sometimes its worth it. just think of it as caching.

one last thought: you could try SELECT id FROM A WHERE id NOT IN (SELECT id FROM B) it may be a little faster because no actual joining is necessary, however it may also be slower because the lookup in the set of be will be a full scan. I am not really sure how this will be processed but it may be worth a try.

It's going to be slow no matter how you look at it. Worst case performance is going to be a full cross join creating 2 trillion potential matches (4 mill * 500k).

The second one will most likely perform faster, since it's a single query.

Your indexing is poor.

For all forms (EXISTS, IN, LEFT JOIN) you should have indexes on id in both tables

You could try

SELECT id FROM A WHERE A.id NOT IN (SELECT id FROM B)

but i don't know if this will be any faster. I would have tried the left join first. I think your problem is more to do with indexes. Do you have indexes on both id fields.

Be sure to have an index on A.id and another one on B.id.

What seems a like bit strange is that you join A.id with B.id. Is B.id the foreign key to A or is it the primary key of B?

If your schema is something like this:

CREATE TABLE b(
    id int,
    value varchar(255)
)

CREATE TABLE a(
    id int,
    father_id int,
    value varchar(255)
)

If you want all the rows of table A that don't have child in table A why you don't try something like:

SELECT * FROM B WHERE id NOT IN (SELECT father_id FROM A GROUP BY father_id)

I haven't tested but i think that it's fester. Remember to put an index over id

Hope this helps

Why not try empty value instead of NULL. In SQL, the NULL value is never true in comparison to any other value, even NULL. An expression that contains NULL always produces a NULL value unless otherwise indicated in the documentation for the operators and functions involved in the expression.

继续阅读：sql

Select parent row only if it has no children

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？