开发者

SQL: How to update multiple fields so empty field content is moved to the logically last columns - lose blank address lines

I have three address line columns, alin开发者_如何学编程e1, aline2, aline3 for a street address. As staged from inconsistent data, any or all of them can be blank. I want to move the first non-blank to addrline1, 2nd non-blank to addrline2, and clear line 3 if there aren't three non blank lines, else leave it. ("First" means aline1 is first unless it's blank, aline2 is first if aline1 is blank, aline3 is first if aline1 and 2 are both blank)

The rows in this staging table do not have a key and there could be duplicate rows. I could add a key.

Not counting a big case statement that enumerates the possible combination of blank and non blank and moves the fields around, how can I update the table? (This same problem comes up with a lot more than 3 lines, so that's why I don't want to use a case statement)

I'm using Microsoft SQL Server 2008


Another alternative. It uses the undocumented %%physloc%% function to work without a key. You would be much better off adding a key to the table.

CREATE TABLE #t
(
aline1 VARCHAR(100), 
aline2 VARCHAR(100), 
aline3  VARCHAR(100)
)

INSERT INTO #t VALUES(NULL, NULL, 'a1')
INSERT INTO #t VALUES('a2', NULL, 'b2')

;WITH cte
     AS (SELECT *,
                 MAX(CASE WHEN RN=1 THEN value END) OVER (PARTITION BY %%physloc%%) AS new_aline1,
                 MAX(CASE WHEN RN=2 THEN value END) OVER (PARTITION BY %%physloc%%) AS new_aline2,
                 MAX(CASE WHEN RN=3 THEN value END) OVER (PARTITION BY %%physloc%%) AS new_aline3
         FROM   #t
                OUTER APPLY (SELECT ROW_NUMBER() OVER (ORDER BY CASE WHEN value IS NULL THEN 1 ELSE 0 END, idx) AS
                                   RN, idx, value
                             FROM   (VALUES(1,aline1),
                                           (2,aline2),
                                           (3,aline3)) t (idx, value)) d)
UPDATE cte
SET    aline1 = new_aline1,
       aline2 = new_aline2,
       aline3 = new_aline3  


SELECT *
FROM #t

DROP TABLE #t


Here's an alternative

Sample table for discussion, don't worry about the nonsensical data, they just need to be null or not

create table taddress (id int,a varchar(10),b varchar(10),c varchar(10));
insert taddress
select 1,1,2,3 union all
select 2,1, null, 3 union all
select 3,null, 1, 2 union all
select 4,null,null,2 union all
select 5,1, null, null union all
select 6,null, 4, null

The query, which really just normalizes the data

;with tmp as (
    select *, rn=ROW_NUMBER() over (partition by t.id order by sort)
    from taddress t
    outer apply
    (
        select 1, t.a where t.a is not null union all
        select 2, t.b where t.b is not null union all
        select 3, t.c where t.c is not null
            --- EXPAND HERE
    ) u(sort, line)
)
select t0.id, t1.line, t2.line, t3.line
from taddress t0
left join tmp t1 on t1.id = t0.id and t1.rn=1
left join tmp t2 on t2.id = t0.id and t2.rn=2
left join tmp t3 on t3.id = t0.id and t3.rn=3
--- AND HERE
order by t0.id

EDIT - for the update back into table

;with tmp as (
    select *, rn=ROW_NUMBER() over (partition by t.id order by sort)
    from taddress t
    outer apply
    (
        select 1, t.a where t.a is not null union all
        select 2, t.b where t.b is not null union all
        select 3, t.c where t.c is not null
            --- EXPAND HERE
    ) u(sort, line)
)
UPDATE taddress
set a = t1.line,
    b = t2.line,
    c = t3.line
from taddress t0
left join tmp t1 on t1.id = t0.id and t1.rn=1
left join tmp t2 on t2.id = t0.id and t2.rn=2
left join tmp t3 on t3.id = t0.id and t3.rn=3


Update - Changed statement to an Update statement. Removed Case statement solution

With this solution, you will need a unique key in the staging table.

With Inputs As
    (
    Select PK, 1 As LineNum, aline1 As Value
    From StagingTable
    Where aline1 Is Not Null
    Union All
    Select PK, 2, aline2
    From StagingTable
    Where aline2 Is Not Null
    Union All
    Select PK, 3, aline3
    From StagingTable
    Where aline3 Is Not Null
    )
    , ResequencedInputs As
    (
    Select PK, Value
        , Row_Number() Over( Order By LineNum ) As LineNum
    From Inputs
    )
    , NewValues As
    (
    Select S.PK
        , Min( Case When R.LineNum = 1 Then R.addrline1 End ) As addrline1
        , Min( Case When R.LineNum = 2 Then R.addrline1 End ) As addrline2
        , Min( Case When R.LineNum = 3 Then R.addrline1 End ) As addrline3
    From StagingTable As S
        Left Join ResequencedInputs As R
            On R.PK = S.PK
    Group By S.PK
    )
Update OtherTable
Set addrline1 = T2.addrline1
    , addrline2 = T2.addrline2
    , addrline3 = T2.addrline3
From OtherTable As T
    Left Join NewValues As T2
        On T2.PK = T.PK


R. A. Cyberkiwi, Thomas, and Martin, thanks very much - these were very generous responses by each of you. All of these answers were the type of spoonfeeding I was looking for. I'd say they all rely on a key-like device and work by dividing addresses into lines, some of which are empty and some of which aren't, excluding the empties. In the case of lines of addresses, in my opinion this is semantically a gimmick to make the problem fit what SQL does well, and it's not a natural way to conceptualize the problem. Address lines are not "really" separate rows in a table that just got denormalized for a report. But that's debatable and whether you agree or not, I (a rank beginner) think each of your alternatives are idiomatic solutions worth elaborating on and studying.

I also get lots of similar cases where there really is normalization to be done - e.g., collatDesc1, collatCode1, collatLastAppraisal1, ... collatLastAppraisal5, with more complex criteria about what in excludeand how to order than with addresses, and I think techniques from your answers will be helpful.

%%phsloc%% is fun - since I'm able to create a key in this case I won't use it (as Martin advises). There was other stuff in Martin's stuff I wasn't familiar with too, and I'm still tossing them all around.

FWIW, here's the trigger I tried out, I don't know that I'll actually use it for the problem at hand. I think this qualifies a "bubble sort", with the swapping expressed in a peculiar way.

create trigger fixit on lines 
instead of insert as
declare @maybeblank1 as varchar(max)
declare @maybeblank2 as varchar(max)
declare @maybeblank3 as varchar(max)


set @maybeBlank1 = (select line1 from inserted)
set @maybeBlank2 = (select line2 from inserted)
set @maybeBlank3 = (select line3 from inserted)

declare @counter int
set @counter = 0 

while @counter < 3
begin
    set @counter = @counter + 1
    if @maybeBlank2 = '' 
        begin
            set @maybeBlank2  =@maybeblank3
            set @maybeBlank3 = ''
        end
    if @maybeBlank1 = ''
        begin
            set @maybeBlank1 = @maybeBlank2
            set @maybeBlank2 = ''
        end
end
select * into #kludge from inserted
update #kludge
    set line1 = @maybeBlank1,
    line2 = @maybeBlank2,
    line3 = @maybeBlank3
insert into lines 
    select * from #kludge


You could make an insert and update trigger that check if the fields are empty and then move them.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜