T-SQL Query to identify varchar fields that consist of a single repeating char/digit?
I need to clean phone numbers stored as varchar. There is bad data where unknown phone numbers are stored as a sequence of a single digit. Eventually more complex (area code & prefix matching) will be done but I want a simply query to obviously bad records.
So for examp开发者_Go百科le:
Valid Phone Number: 3289903829
Invalid Phone Number: 1111111111
Now if the bogus Product ID are the proper length ( 10 digits) it is easy to parse and clean.
SELECT phone
FROM customers
SET phone = NULL
WHERE phone IN ('0000000000','9999999999',''8888888888','7777777777','6666666666','5555555555','4444444444','3333333333','2222222222','1111111111')
However sometimes the bogus phones are of arbitrary length (likely due to typos) so 11 ones or 9 ones, or n ones.
How can I ID strings that consists of all of the same char/digit?1111111 - match
4444 - match
1112 - no match
4445555 - no match
You can get the first character and replicate it:
where phone = replicate(left(phone,1), len(phone))
and phone is not null
Depending on how fast you need it to run, your other option is to populate a temp table and then join your phone number on it. If you are doing it multiple times, you could even create a real table so you don't have to re-create it each run. To make it faster you could also index the field. Your may mileage may vary on fast you need to it to be compared to the number of records you have to compare.
CREATE TABLE #Numbers
(
PhoneNumber VARCHAR(13) NOT NULL
)
DECLARE @run BIT
SET @run = 1
DECLARE @number INT
SET @number = 1
DECLARE @Counter INT
SET @Counter = 1
WHILE(@run = 1)
BEGIN
WHILE(@Counter < 13)
BEGIN
INSERT INTO #Numbers
SELECT REPLICATE(@number,@counter)
SET @Counter = @Counter + 1
END
SET @Counter = 1
SET @number = @number + 1
IF(@number > 9)
BEGIN
SET @run = 0
END
END
SELECT * FROM Phone p JOIN #numbers n ON p.PhoneNumber = n.PhoneNumber
This way you don't have to recalculate the field you are comparing the number to each time.
Maybe you could create a SQL function to do this.
I think the guts of it would look something like this:
DECLARE @field varchar(10)
SET @field = '11111'
DECLARE @len INT
SET @len = LEN(@field)
DECLARE @counter INT
SET @counter = 1
DECLARE @firstChar VARCHAR(1)
SET @firstChar = NULL
DECLARE @currentChar VARCHAR(1)
SET @currentChar = NULL
DECLARE @allSameNumber BIT
SET @allSameNumber = 1
WHILE @counter <= @len AND @allSameNumber = 1 BEGIN
SET @currentChar = SUBSTRING(@field,@counter,1)
IF @firstChar IS NULL BEGIN
SET @firstChar = @currentChar
END
IF NOT ISNUMERIC(@currentChar) = 1 OR NOT @currentChar = @firstChar BEGIN
SET @allSameNumber = 0
END
SET @counter = @counter + 1
END
SELECT @allSameNumber
精彩评论