oracle regex replace (keep only a-z)
I have data in 开发者_如何学JAVAlast_name
column.
This data comes from online and at times users are copy pasting the last name from a word document. This is a problem when a last name has a single quote. Somehow the single quote from a word document is weird.
I want to write an oracle regex replace in my select query such a way that it replaces everything in the last_name
column but just keeps (a-z or A-Z).
Is this doable?
finally I went with this:
REGEXP_REPLACE(mbr_last_name,'[^a-zA-Z'']','') replaced_last_name
I'm keeping a to z A to Z and a single quote
By 'weird' do you mean that it is not the normal single quote? Word has some really stupid characters that it uses by default (called 'Smart quotes (google)'. which look a bit like the standard quotes but have different behaviour when converting to and from ASCII and for example UTF-8
To identify them in SQL do a select using the function ASCIISTR and look for a sub-string in the form <backslash><4 digits>
(This works with a DB with UTF-8 and simialr characters, I'm unsure about its return in a database with ASCII NLS parameters)
select asciistr(COLUMN) from table
I have used the following code to remove them in some recent work I have been doing
update jiraissue
set summary = replace(asciistr(summary), '\2013','-')
where asciistr(summary) like '%\2013%';`)
HTH
You can use the regexp_replace operator for instance:
select regexp_replace('foobar1000!!!!','[[:cntrl:]]|[[[:digit:]]|[[:punct:]]') from dual;
REGEXP_REPLACE('FOOBAR1000!!!!','[[:CNTRL:]]|[[[:DIGIT:]]|[[:PUNCT:]]')
-----------------------------------------------------------------------
foobar
Worthwhile looking at the CONVERT function, which will do a conversion between character sets. You can convert to 'plain' 7-bit ASCII characters to remove anomalies.
For example
select convert('êê','US7ASCII') from dual;
精彩评论