LEFT JOIN returning more rows than expected

Using the following query

select *
from table1
left join table2 on table1.name = table2.name

table1 returns 16 rows and table2 returns 35 rows.

I was expecting the above query to return 16 rows because of the left join, but it is returning 35 rows. A right join also returns 35 rows.

Why is this happening and how do I get it to return 16 rows?


A LEFT JOIN can return multiple copies of a table1 row if the join key of that row (here, name) is matched by multiple rows in table2.

If you want it to return only 16 rows, one for each table1 row, with an arbitrary matching row from table2, you can use a plain GROUP BY:

select *
from table1
left join table2 on table1.name = table2.name
group by table1.name

GROUP BY aggregates rows based on a column, so this collapses all the duplicated table1 rows into one row per name. Generally, you specify aggregate functions to say how the rows should collapse (for example, a numeric column could be collapsed with SUM() so that the one row holds the total). If you just want one arbitrary row, though, don't specify any aggregate functions and MySQL will choose one row for you. Note that this is specific to MySQL, and it only works when the ONLY_FULL_GROUP_BY SQL mode is disabled (it is enabled by default since MySQL 5.7.5); most databases require you to specify aggregates when you group. The row it chooses is not technically "random", but it is not necessarily predictable to you. I guess by "random" you really just mean "any row will do".


Let's assume you have the following tables:

tbl1:
|Name |
-------
|Name1|
|Name2|

tbl2:
|Name |Value |
--------------
|Name1|Value1|
|Name1|Value2|
|Name3|Value1|

For your LEFT JOIN you'll get:

|tbl1.Name|tbl2.Name|Value |
----------------------------
|Name1    | Name1   |Value1|
|Name1    | Name1   |Value2|
|Name2    | NULL    | NULL |

So, LEFT JOIN means that all records from the LEFT (first) table will be returned regardless of whether they have a match in the right table, and a left record is repeated once for every matching right record.
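The example tables above can be reproduced with a small script. This is only a sketch: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption made so the example is self-contained), and the variable names are illustrative.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (Name TEXT);
    CREATE TABLE tbl2 (Name TEXT, Value TEXT);
    INSERT INTO tbl1 VALUES ('Name1'), ('Name2');
    INSERT INTO tbl2 VALUES ('Name1', 'Value1'), ('Name1', 'Value2'), ('Name3', 'Value1');
""")

# Name1 matches two tbl2 rows, so it appears twice; Name2 has no match
# and is padded with NULLs (None in Python).
join_rows = conn.execute("""
    SELECT tbl1.Name, tbl2.Name, tbl2.Value
    FROM tbl1
    LEFT JOIN tbl2 ON tbl1.Name = tbl2.Name
    ORDER BY tbl1.Name, tbl2.Value
""").fetchall()

for row in join_rows:
    print(row)
```

tbl1 has two rows, but the join produces three, which is the same effect as 16 rows growing to 35 in the question.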

For your question, you need to select specific fields instead of using "*" and add GROUP BY tbl1.Name, so your query will look like

select tbl1.Name, SOME_AGGREGATE_FUNCTION(tbl2.specific_field), ...
from table1 tbl1
left join table2 tbl2 on tbl1.Name = tbl2.Name
GROUP BY tbl1.Name
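As a concrete instance of the query shape above, here is a minimal sketch that picks MIN() and COUNT() as the aggregate functions. Those two choices are assumptions for illustration only (any aggregate that suits your data would do), and SQLite via Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (Name TEXT);
    CREATE TABLE tbl2 (Name TEXT, Value TEXT);
    INSERT INTO tbl1 VALUES ('Name1'), ('Name2');
    INSERT INTO tbl2 VALUES ('Name1', 'Value1'), ('Name1', 'Value2'), ('Name3', 'Value1');
""")

# One output row per tbl1.Name. MIN() picks a single Value per group;
# COUNT(tbl2.Name) counts matches and ignores the NULLs of unmatched rows.
grouped_rows = conn.execute("""
    SELECT tbl1.Name, MIN(tbl2.Value) AS min_value, COUNT(tbl2.Name) AS match_count
    FROM tbl1
    LEFT JOIN tbl2 ON tbl1.Name = tbl2.Name
    GROUP BY tbl1.Name
    ORDER BY tbl1.Name
""").fetchall()

for row in grouped_rows:
    print(row)
```

Because every selected column is either the grouping column or an aggregate, this form does not depend on MySQL's permissive GROUP BY behaviour.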


One way to handle this is to use SQL's DISTINCT.

select distinct tbl1.id, tbl1.*, tbl2.*
from table1 tbl1
left join table2 tbl2 on tbl2.name = tbl1.name

where
 ....................

Please note that I am also using aliasing.
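One caveat worth checking before relying on DISTINCT: it compares the whole selected row, so it only removes rows that are identical in every selected column. The sketch below (SQLite through Python's sqlite3 as a stand-in for MySQL, with illustrative table contents) suggests that DISTINCT reduces the result to one row per left-table row only when the select list is limited to left-table columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (Name TEXT);
    CREATE TABLE tbl2 (Name TEXT, Value TEXT);
    INSERT INTO tbl1 VALUES ('Name1'), ('Name2');
    INSERT INTO tbl2 VALUES ('Name1', 'Value1'), ('Name1', 'Value2'), ('Name3', 'Value1');
""")

# Columns from both tables: the two Name1 rows differ in Value, so both survive.
distinct_both = conn.execute("""
    SELECT DISTINCT t1.Name, t2.Value
    FROM tbl1 t1
    LEFT JOIN tbl2 t2 ON t2.Name = t1.Name
    ORDER BY t1.Name, t2.Value
""").fetchall()

# Columns from the left table only: the duplicated Name1 rows collapse into one.
distinct_left = conn.execute("""
    SELECT DISTINCT t1.Name
    FROM tbl1 t1
    LEFT JOIN tbl2 t2 ON t2.Name = t1.Name
    ORDER BY t1.Name
""").fetchall()

print(len(distinct_both), len(distinct_left))
```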


Duplication may be the reason. See the example in this post:
https://alexpetralia.com/posts/2017/7/19/more-dangerous-subtleties-of-joins-in-sql


If the name column is not unique in the tables, then you may simply have duplicates in table2.

Try running:

select * from table2 where name not in (select name from table1);

If you get no results back, then duplicates in the name column are the reason for the extra rows coming back.
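A complementary check is to list the duplicated names directly with GROUP BY ... HAVING. The sketch below runs both checks on small illustrative tables, using SQLite via Python's sqlite3 as a stand-in for MySQL; the table contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (Name TEXT);
    CREATE TABLE tbl2 (Name TEXT, Value TEXT);
    INSERT INTO tbl1 VALUES ('Name1'), ('Name2');
    INSERT INTO tbl2 VALUES ('Name1', 'Value1'), ('Name1', 'Value2'), ('Name3', 'Value1');
""")

# Rows of tbl2 whose name has no counterpart in tbl1 (the check from the answer).
unmatched = conn.execute("""
    SELECT Name, Value FROM tbl2
    WHERE Name NOT IN (SELECT Name FROM tbl1)
""").fetchall()

# Names that occur more than once in tbl2; each one multiplies
# the matching tbl1 row in the join result.
duplicated = conn.execute("""
    SELECT Name, COUNT(*) AS occurrences
    FROM tbl2
    GROUP BY Name
    HAVING COUNT(*) > 1
    ORDER BY Name
""").fetchall()

print(unmatched)
print(duplicated)
```

Note that NOT IN returns no rows at all if the subquery yields a NULL, so this check assumes name is never NULL in the first table.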


If you want to join only the single latest/earliest related row from the right table, you can restrict the joined data using the min/max of the primary key and reduce it to one row per name with GROUP BY, like this:

SELECT * FROM table1
    LEFT JOIN (SELECT name, max(tbl2_primary_col), {table2.etc} FROM table2 GROUP BY name) AS tbl2
    ON table1.name = tbl2.name
WHERE {condition_for_table1}

And remember: don't use * for the left join, because it will disable min/max and always return the first row.
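One portable way to spell out the "latest row per name" idea is to compute the maximum primary key per name in a derived table and then join back to the right table on that key. This is a sketch under assumptions: the id column is a hypothetical auto-incrementing primary key, "latest" is taken to mean "highest id", and SQLite via Python's sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (Name TEXT);
    -- id is a hypothetical primary key; a higher id is treated as "later".
    CREATE TABLE tbl2 (id INTEGER PRIMARY KEY, Name TEXT, Value TEXT);
    INSERT INTO tbl1 VALUES ('Name1'), ('Name2');
    INSERT INTO tbl2 VALUES (1, 'Name1', 'Value1'), (2, 'Name1', 'Value2'), (3, 'Name3', 'Value1');
""")

# Step 1 (derived table "latest"): the highest id per name.
# Step 2: join back to tbl2 on that id to fetch the rest of the row.
latest_rows = conn.execute("""
    SELECT t1.Name, t2.Value
    FROM tbl1 t1
    LEFT JOIN (
        SELECT Name, MAX(id) AS max_id
        FROM tbl2
        GROUP BY Name
    ) latest ON latest.Name = t1.Name
    LEFT JOIN tbl2 t2 ON t2.id = latest.max_id
    ORDER BY t1.Name
""").fetchall()

for row in latest_rows:
    print(row)
```

Joining back on the key avoids selecting non-aggregated columns inside the grouped subquery, so the other columns always come from the same row as the maximum id.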


As per your comment "A random row from table2, as long as name from table1 matches name from table2", you can use the following:

select table1.name, (select top 1 somecolumn from table2 where table2.name = table1.name)
from table1

Note that top 1 is SQL Server syntax, not MySQL; in MySQL, put LIMIT 1 at the end of the subquery instead.
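The same correlated-subquery idea written with LIMIT 1 can be sketched as follows. SQLite via Python's sqlite3 is used as a stand-in (it accepts LIMIT like MySQL does), and the ORDER BY inside the subquery is an addition so that the chosen row is deterministic rather than arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (Name TEXT);
    CREATE TABLE tbl2 (Name TEXT, Value TEXT);
    INSERT INTO tbl1 VALUES ('Name1'), ('Name2');
    INSERT INTO tbl2 VALUES ('Name1', 'Value1'), ('Name1', 'Value2'), ('Name3', 'Value1');
""")

# The scalar subquery runs once per tbl1 row and yields at most one value,
# so the result always has exactly one row per tbl1 row.
picked_rows = conn.execute("""
    SELECT tbl1.Name,
           (SELECT tbl2.Value
            FROM tbl2
            WHERE tbl2.Name = tbl1.Name
            ORDER BY tbl2.Value
            LIMIT 1) AS some_value
    FROM tbl1
    ORDER BY tbl1.Name
""").fetchall()

for row in picked_rows:
    print(row)
```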
