开发者

Convert this SQL query to Linq (Not Exists + sub query)

I would like this SQL to be converted to LINQ. (it shouldl select rows from input which do not exist in table production based on 3 columns. If a column in both tables c开发者_StackOverflow中文版ontains NULL, it should be considered as having the same value)

SELECT i.* FROM INPUT AS i
WHERE NOT EXISTS
(SELECT p.Agent FROM Production AS p
WHERE ISNULL(i.CustID,'') <> ISNULL(p.CustID,'')
AND ISNULL(i.CustName,'') <> ISNULL(p.CustName,'')
AND ISNULL(i.household,'') <> ISNULL(p.Household,''))


First of all - this is not a good SQL query. Every column is wrapped in a non-sargable function which means that the engine won't be able to take advantage of any indexes on any of those columns (assuming you have any).

Let's start by rewriting this as a semi-decent SQL query:

SELECT i.*
FROM Input i
LEFT JOIN Production p
    ON (p.CustID = i.CustID OR (p.CustID IS NULL AND i.CustID IS NULL))
    AND (p.CustName = i.CustName OR (p.CustName IS NULL AND i.CustName IS NULL))
    AND (p.Household = i.Household OR
        (p.Household IS NULL AND i.Household IS NULL))
WHERE p.CustID IS NULL

Now having said this, LEFT JOIN / IS NULL is not great for efficiency either, but we don't have much choice here because we're comparing on multiple columns. Based on your column names, I'm starting to wonder if the schema is properly normalized. A CustID should most likely be associated with one and only one CustName - the fact that you have to compare both of these seems a bit odd. And Household - I'm not sure what that is, but if it's a varchar(x)/nvarchar(x) column then I wonder if it might also have a 1:1 relationship with the customer.

If I'm speculating too much here then feel free to dismiss this paragraph; but just in case, I want to say that if this data isn't properly normalized, normalizing it would make it much easier and faster to query on:

SELECT *
FROM Input
WHERE CustID NOT IN (SELECT CustID FROM Production)

Anyway, going back to the first query, since that's what we have to work with for now. Unfortunately it's impossible to create a join on those specific conditions in Linq, so we need to rewrite the SQL query as something slightly worse (because we now have to read from Input twice):

SELECT *
FROM Input
WHERE <Primary Key> NOT IN
(
    SELECT i.<Primary Key>
    FROM Input i
    INNER JOIN Production p
    ON (p.CustID = i.CustID OR (p.CustID IS NULL AND i.CustID IS NULL))
    AND (p.CustName = i.CustName OR (p.CustName IS NULL AND i.CustName IS NULL))
    AND (p.Household = i.Household OR
        (p.Household IS NULL AND i.Household IS NULL))
)

Now we have something we can finally translate to Linq syntax. We still can't do the join explicitly, which would be best, but we go old-school, start from the cartesian join and toss the join conditions into the WHERE segment, and the server will still be able to sort it out:

var excluded =
    from i in input
    from p in production
    where
        ((p.CustID == i.CustID) || ((p.CustID == null) && (i.CustID == null))) &&
        ((p.CustName == i.CustName) || 
            ((p.CustName == null) && (i.CustName == null))) &&
        ((p.Household == i.Household) ||
            ((p.Household == null) && (i.Household == null)));
    select i.PrimaryKey;

var results =
    from i in input
    where !excluded.Contains(i.PrimaryKey)
    select i;

I'm assuming here that you have some sort of primary key on the table. If you don't, you've got other problems, but you can get around this particular problem using EXCEPT:

var excluded =
    from i in input
    from p in production
    where
        ((p.CustID == i.CustID) || ((p.CustID == null) && (i.CustID == null))) &&
        ((p.CustName == i.CustName) || 
            ((p.CustName == null) && (i.CustName == null))) &&
        ((p.Household == i.Household) ||
            ((p.Household == null) && (i.Household == null)));
    select i;

var results = input.Except(excluded);
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜