Filtering bad data in SQL Server 2005
Using SQL Server 2005
I have a table with the following columns
id name date value
I would like to select all rows from the table where there are not four consecutive zeroes by date. How would I do that? Below is an example of what I mean.
id name date value
1 a 1/1/2010 5
2 a 1/2/2010 3
3 a 1/3/2010 5
4 a 1/4/2010 0
5 a 1/7/2010 0
6 a 1/8/2010 0
7 a 1/9/2010 2
8 a 1/10/2010 3
9 a 1/11/2010 0
10 a 1/15/2010 0
11 a 1/16/2010 0
12 开发者_开发百科 a 1/17/2010 0
13 a 1/20/2010 4
14 a 1/21/2010 4
I would like the result of the query to include all rows except id 9-12.
This is assuming you ordered the rows by ID but you can simply change the ORDER BY id
to something else and it should still work.
Using the T-SQL CTE found on this Kodyaz Development Resources site I was able to create the below code. I have it working so it deletes the rows where there are two consecutive zeroes, not 4, as I tested it on my code and just changed the table/row names.
WITH CTE as (
SELECT
RN = ROW_NUMBER() OVER (ORDER BY id),
*
FROM tablename
)
SELECT
[Current Row].*
FROM CTE [Current Row]
LEFT JOIN CTE [Previous Row] ON
[Previous Row].RN = [Current Row].RN - 1
LEFT JOIN CTE [Next Row] ON
[Next Row].RN = [Current Row].RN + 1
WHERE
not([Current Row].value = 0 AND [Next Row].value = 0) AND
// this deletes the row where value is zero and the next rows value is zero
not([Previous Row].value = 0 AND [Current Row].value = 0)
// this deletes the row where value is zero and the previous rows value is zero
All you have to do to make it work for your case is putting each possible combination within the WHERE
statement. For example dealing with this row and the 3 next rows equaling 0 OR this row the previous and the 2 next rows.
You don't mention how the name is involved, so I'm assuming that you want this done by name. I'm going to further assume that when you talk about "consecutive" that you mean in date order, not in id order. Finally, I'm also going to assume that you would also exclude 5 zeroes in a row, 6 zeroes in a row, etc.
There may be a simpler way, but this should work:
;WITH Transitions_To_CTE AS
(
SELECT
T1.id,
T1.name,
T1.date,
T1.value
FROM
My_Table T1
LEFT OUTER JOIN My_Table T2 ON
T2.name = T1.name AND
T2.date < T1.date AND
T2.value <> 0
LEFT OUTER JOIN My_Table T3 ON
T3.name = T1.name AND
T3.date > COALESCE(T2.date, '1900-01-01') AND
T3.date < T1.date
WHERE
T1.value = 0 AND
T3.id IS NULL
),
Transitions_From_CTE AS
(
SELECT
T1.id,
T1.name,
T1.date,
T1.value
FROM
My_Table T1
LEFT OUTER JOIN My_Table T2 ON
T2.name = T1.name AND
T2.date > T1.date AND
T2.value <> 0
LEFT OUTER JOIN My_Table T3 ON
T3.name = T1.name AND
T3.date < COALESCE(T2.date, '9999-12-31') AND
T3.date > T1.date
WHERE
T1.value = 0 AND
T3.id IS NULL
),
Range_Exclusions AS
(
SELECT
S.name,
S.date AS start_date,
E.date AS end_date
FROM
Transitions_To_CTE S
INNER JOIN Transitions_From_CTE E ON
E.name = S.name AND
E.date > S.date
LEFT OUTER JOIN Transitions_From_CTE E2 ON
E2.name = S.name AND
E2.date > S.date AND
E2.date < E.date
WHERE
E2.id IS NULL AND
(SELECT COUNT(*) FROM dbo.My_Table T WHERE T.name = S.name AND T.date BETWEEN S.date AND E.date) >= 4
)
SELECT
T.id,
T.name,
T.date,
T.value
FROM
dbo.My_Table T
WHERE
NOT EXISTS (SELECT * FROM Range_Exclusions RE WHERE RE.name = T.name AND T.date BETWEEN RE.start_date AND RE.end_date)
Here's my attempt, use a recursive cte to work out the number of consecutive zeros, then create a sequence of IDs using the level > 4 and then simply do a not in clause on the ids.
with trend --work out number of consecutive zeros using level
as
(Select 1 as level, id, value, id as startid
from IdsAndValues
Union All
Select [Level]+1, P.ID, p.value, t.startid
From IdsAndValues as p
Inner Join trend as t on p.id = t.id+1
Where t.value =0 and p.value=0
)
,IDs --create sequence of ids using startid and id, this allows us to do the not in
as
(
Select startid as ExcludeID ,id
from trend as t--
Where level>=4
Union All
Select ExcludeID +1, id
From ids
where ExcludeID <id
)
Select *
from IdsAndValues
Where id Not in
(Select ExcludeID from IDs)
精彩评论