Comparing two large SQL queries

2023-03-22 10:02 问答作者：

I am replacing an old untidy SQL query with a new version, because it was failing to include several rows it should have.

The new query certainl开发者_JAVA百科y includes these missing rows, but I want to be entirely sure that it also includes all of the rows from the original query.

The two queries use entirely different tables. Each query is ~14000 rows.

Is there a query I can write that checks whether QueryA contains any rows that QueryB does not have?

You could do something like this.

Select  * FROM 
(
QUERY A GOES HERE 
) AS A
LEFT JOIN
(
QUERY B GOES HERE 
) AS B
ON A.Col1=B.Col1 AND A.Col2=B.Col2 ....
WHERE B.Col1 IS NULL

You can either include all the columns in the "on clause" or you can just include the columns you need to ensure the rows are the same, such as the primary key.

Assuming that both queries returns the primary key column from the same table:

select *
from (QueryA) a
where a.PK not in (select PK from (QueryB) b)

Please note that the parantheses means that these are subqueries.

Could you use a LEFT OUTER JOIN?

SELECT
  *
FROM
  ( < put query 1 here > ) AS Query1
LEFT JOIN
  ( < put query 2 here > ) AS Query2
    ON  Query1.Field1 = Query2.Field1
    AND Query1.Field2 = Query2.Field2
    AND Query1.Field3 = Query2.Field3
    etc, etc
WHERE
  Query2.Field1 IS NULL

Select the results from the old query into a temp table.
Select the results from the new query into another temp table.
Outer join from one table to the other, equating all the columns.
Add a where clause to only return rows where there are nulls on one side or the other.

For example, if the results looked like (theoretically):

UserId | FirstName | LastName | Email

Then run each query into a separate temp table with the same structure, say #resOld and '#resNew'.

Then:

SELECT
    *

FROM #resOld OLD

LEFT OUTER JOIN #redNew NEW -- LEFT OUTER JOIN, so we still retain rows which don't match
    ON  NEW.UserId    = OLD.UserId
    AND NEW.FirstName = OLD.FirstName
    AND NEW.LastName  = OLD.LastName
    AND NEW.Email     = OLD.Email

WHERE OLD.UserID IS NULL -- Only retain rows where we didn't match. Can use any field from OLD which cannot be null by design.

This query will only return rows when there are rows from one results set which don't match rows in the other.

EDIT: That is more complicated than it needs to be, you can just invert the join criteria and remove the WHERE like so:

SELECT
    *

FROM #resOld OLD

INNER JOIN #redNew NEW -- Inner join where rows are different.
    ON  NEW.UserId    != OLD.UserId
    AND NEW.FirstName != OLD.FirstName
    AND NEW.LastName  != OLD.LastName
    AND NEW.Email     != OLD.Email

If you're able to get it into SQL Server, you can just do:

<Query 1>
EXCEPT
<Query 2>

<Query 2>
EXCEPT
<Query 1>

This will only output records that don't exist in the other query. It checks all fields in the result sets.

I included both directions since if there are more records in the lower part of the EXCEPT statement those don't get shown as exceptions.

The easy way is to use a Union of QueryA and QueryB and Group By.

See an example of that here: http://weblogs.sqlteam.com/jeffs/archive/2004/11/10/2737.aspx on Jeff's SQL server blog.

继续阅读：sql

Comparing two large SQL queries

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？