开发者

SQL set operation with different number of columns in each set

Let say I have set 1:

1   30  60
2   45  90
3   120 240
4   30  60
5   20  40

and set 2

30  60
20  40

I would like to do some sort of union where I only keep rows 1,4,5 from set 1 because the latter 2 columns of set 1 can be found in set 2.

My problem is that set based operations insist on the same 开发者_StackOverflownumnber of columns.

I've thought of concatenating the columns contents, but it feels dirty to me.

Is there a 'proper' way to accomplish this?

I'm on SQL Server 2008 R2

In the end, I would like to end up with

1   30  60
4   30  60
5   20  40

CLEARLY I need to go sleep as a simple join on 2 columns worked.... Thanks!


You are literally asking for

give me the rows in t1 where the 2 columns match in T2

So if the output is only rows 1, 4 and 5 from table 1 then it is a set based operation and can be done with EXISTS or INTERSECT or JOIN. For the "same number of column", then you simply set 2 conditions with an AND. This is evaluated per row

EXISTS is the most portable and compatible way and allows any column from table1

select  id, val1, val2
from    table1 t1
WHERE EXISTS (SELECT * FROM table2 t2
                  WHERE t1.val1 = t2.val1 AND t1.val2 = t2.val2)

INTERSECT requires the same columns in each clause and not all engines support this (SQL Server does since 2005+)

select  val1, val2
from    table1
INTERSECT
select  val1, val2
from    table2

With an INNER JOIN, if you have duplicate values for val1, val2 in table2 then you'll get more rows than expected. The internals of this usually makes it slower then EXISTS

select  t1.id, t1.val1, t1.val2
from    table1 t1
        JOIN
        table2 t2 ON t1.val1 = t2.val1 AND t1.val2 = t2.val2

Some RBDMS support IN on multiple columns: this isn't portable and SQL Server doesn't support it

Edit: some background

Relationally, it's a semi-join (One, Two).

SQL Server does it as a "left semi join"

INTERSECT and EXISTS in SQL Server usually give the same execution plan. The join type is a "left semi join" whereas INNER JOIN is a full "equi-join".


You could use union which, as opposed to union all, eliminates duplicates:

select  val1, val2
from    table1
union
select  val1, val2
from    table1

EDIT: Based on your edited question, you can exclude rows that match the second table using a not exists subquery:

select  id, col1, col2
from    table1 t1
where   not exists
        (
        select  *
        from    table2 t2
        where   t1.col1 = t2.col1
                and t1.col2 = t2.col2
        )
union all
select  null, col1, col2
from    table2

If you'd like to exclude rows from table2, omit union all and everything below it.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜