开发者

SQL: How to query items from one table that are not in another when the syntax is not the same?

I have a question because I'm really bad at SQL. I understand basic functions but when

it gets a bit more complex, I'm completly lost.

here is what I have:

tables: tA, tB

columns: tA: refA tB: refB

basically refA and refB represent the same thing (some id of a form like xxx-xxx-xxx), but refB can have information appended (like xxx-xxx-xxx_Zxxx or xxx-xxx-xxx Zxxx)

here is what I know how to开发者_运维技巧 do:

querying items that are in a table but not in another (when they are exactly the same)

select refA
from tA
where not exists (select *
from tB
where tB.refB = tA.refA
)

What i want to do:

I want a query that will list items from refA that are not in refB. BUT, Problem is if I run a "simple" query with a NOT EXISTS like I just showed, it will return everything, because of the appends. so I thought about using some syntax like this:

SELECT refA
FROM tA
WHERE NOT EXISTS (SELECT * 
FROM tB
WHERE tB.refB LIKE CONCAT(tA.refA,'%'))

but... of course, it doesn't work.

Could someone show me how it should be done, and also explain how it works, so I can learn ?

Thanks in advance !

edit: additional info

I can't use a left() or something alike, because the ref format is similar but not always the same (varies in number of characters).

The only way to detect the end of the id before the append, is that there is either a blank space or an underscore.

edit 2: data sample causing problems (MON, Jan. 10th)

here is some actual data from the tables, which makes most answers people have given here miss some results :/

in tA:

B20-60-04-6A-1

B20-60-04-6A-11

B20-60-04-6A-12

B20-60-04-6A-13

in tB:

B20-60-04-6A-11_XX

B20-60-04-6A-12_XX

B20-60-04-6A-13_XX

problem with mid(), left(), etc. is that if we check "B20-60-04-6A-1" (14 chars) against the 14 first chars, it will return 3 positives, while in fact it is not in tB...

so, how can we proceed ?

Examples of data patterns in tA are like this:

(X, XYZ: charaters. x: alphanumerical)

Xxx-xx-xx-x

Xxx-xx-xx-xx

Xxx-xx-xx-xx-xx

Xxx-xx-xx-xx-xx-x

etc

examples of data patterns in tB:

Xxx-xx-xx-xx-xx-XYZ-xx Z xxx_XX

Xxx-xx-xx-xx-xx-XYZZxxx_XX

Xxx-xx-xx-xx-xx Z xxx_XX

XYZ are always the same 3 characters. When we do not have XYZ, there is always a blank space or an underscore.

so the string of data we compare should be trimmed according to this:

- from start to -XYZ string

- or, if no -XYZ in the string, from start to the first " " or "_"

I'd write that lightning fast in VBA, but in SQL... well, I'll give it a shot, but I'm really bad at it :D


So, first off, you need a function that will change refB to not have the appended information, so it can be compared properly with refA. There will be several approaches, but something like this should work:

Left(tb.RefB, InStr(Replace(tb.RefB+"_", " ", "_"), "_") -1)

That will convert any refB like "123-456 123 EXTRA STUFF" or "123-456_123_EXTRA_STUFF" into "123-456". That result should then be okay to compare directly with a refA.

EDIT: A short explanation of the expression above. What I'm doing is:

  1. Adding an underscore to the end of refB, so that there's always at least one underscore (this copes for the case where refB is the same as refA, e.g. "123" becomes "123_")
  2. Replacing all spaces in refB with underscores (the Replace function). Now we know that the separator is always an underscore, and we also know from step 1 that there will be at least one underscore.
  3. Finding the location of the first underscore (the InStr function). This is the position where refB is split between refA and the additional stuff.
  4. Grabbing all the characters between the start of the string and this first underscore, i.e. the part before the separator.

So, that gives you something like this:

select refA
from tA
where not exists (select *
from tB
where Left(tb.RefB, InStr(Replace(tb.RefB+"_", " ", "_"), "_") -1) = tA.refA
)

I would use this approach rather than comparing with wildcards, or trimming refB to match the length of refA, because of this scenario:

refA
====
123
123-456
123-456-789

refB
====
123-456-789_This_is_a_test

In this case, trimming or wildcard matching refA with refB will result in success for all refAs, because "123*", "123-456*" and "123-456-789*" all match "123-456-789_This_is_a_test".


So you want everything from A where not in B, but where only the start of B's id matches?

select refA
from tA
left outer join tB 
    on tA.refA = left( tB.refB, len(tA.refA)) --trim B's id to the length of A's
where tB.refB is null


Maybe use a left() function, if one exists in access? Like this:

SELECT refA
FROM tA
WHERE NOT EXISTS (SELECT * 
FROM tB
WHERE Left(tB.refB, Len(tA.refA)) = tA.refA)

If, as you said, you have to look for a space or underscore in the refA, you can use this:

SELECT refA
FROM tA
WHERE NOT EXISTS (SELECT * 
FROM tB
WHERE Left(tB.refB, Max(Instr(tA.refA, ' '), Instr(tA.refA, '_'))) = tA.refA)


I'd change the schema. Your second table should have two columns, one containing the first part of the identifier, the other containing the second; if the column was the primary key first, just create a unique multi-column index and disallow NULL values.

You can also add a foreign key constraint this way, and/or optimize the comparisons by introducing a surrogate key in the first table and referencing that from the second.

If you do not have an index on the substring you are trying to match, you will end up with a full scan for each value you are looking for, this is hideously expensive.


I think your suggestion will work in a slightly different format, generally the wild card in Access is *, unless you have set ANSI 92 mode, however you can use ALIKE with % in 'ordinary' mode.

EDIT : DIFFERENT IDEA

SELECT tA.refA
FROM tA
WHERE (((tA.refA) 
   Not In (SELECT Mid(tb.RefB,1,Len(ta.RefA)) FROM tb)));


This is valid syntax and close to the syntax you say you want to write:

SELECT refA 
  FROM tA
 WHERE NOT EXISTS (
                   SELECT *
                     FROM tB
                    WHERE tB.refB ALIKE tA.refA & '%'
                  );
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜