Case-insensitive REPLACE in MySQL?

MySQL runs pretty much all string comparisons under the default collation... except the REPLACE command. I have a case-insensitive collation and need to run a case-insensitive REPLACE. Is there any way to force REPLACE to use the current collation rather than always doing case-sensitive comparisons? I'm willing to upgrade my MySQL (currently running 5.1) to get added functionality...

mysql> charset utf8 collation utf8_unicode_ci;
Charset changed

mysql> select 'abc' like '%B%';
+------------------+
| 'abc' like '%B%' |
+------------------+
|                1 |
+------------------+

mysql> select replace('aAbBcC', 'a', 'f');
+-----------------------------+
| replace('aAbBcC', 'a', 'f') |
+-----------------------------+
| fAbBcC                      |   <--- *NOT* 'ffbBcC'
+-----------------------------+


If replace(lower()) doesn't work, you'll need to create another function.
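
As a quick illustration of why replace(lower()) may not be enough: lower-casing the input before calling REPLACE makes the match case-insensitive, but it also lower-cases every character that was not replaced. A minimal sketch, using the string from the question:

```sql
-- LOWER() turns 'aAbBcC' into 'aabbcc' before REPLACE runs,
-- so both 'a' and 'A' are replaced, but 'B' and 'C' lose their case too.
SELECT REPLACE(LOWER('aAbBcC'), 'a', 'f');
-- returns 'ffbbcc', not 'ffbBcC'
```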


My 2 cents.

Since many people have migrated from MySQL to MariaDB, those people will have available a new function called REGEXP_REPLACE. Use it as you would a normal replace, but the pattern is a regular expression.

This is a working example:

UPDATE `myTable`
SET `myField` = REGEXP_REPLACE(`myField`, '(?i)my insensitive string', 'new string') 
WHERE `myField` REGEXP '(?i)my insensitive string'

The (?i) option makes everything after it in the pattern case-insensitive; placed at the start of the pattern, as here, it makes the whole match case-insensitive.

See here for more information: https://mariadb.com/kb/en/mariadb/pcre/

Edit: as of MySQL 8.0 you can now use the regexp_replace function too, see documentation: https://dev.mysql.com/doc/refman/8.0/en/regexp.html
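
For MySQL 8.0, a minimal sketch of the same idea using the optional match_type argument of REGEXP_REPLACE ('i' requests case-insensitive matching). The input string is the one from the question. Note that the search text is treated as a regular expression, so any metacharacters in it would need escaping.

```sql
-- Arguments: expr, pattern, replacement, start position, occurrence (0 = all), match_type
SELECT REGEXP_REPLACE('aAbBcC', 'a', 'f', 1, 0, 'i');
-- returns 'ffbBcC'
```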


An alternative to the function mentioned by fvox.

DELIMITER |
CREATE FUNCTION case_insensitive_replace ( REPLACE_WHERE text, REPLACE_THIS text, REPLACE_WITH text )
RETURNS text
DETERMINISTIC 
BEGIN
    DECLARE last_occurency int DEFAULT 1;

    -- CHAR_LENGTH counts characters (LENGTH counts bytes), matching what Locate() and Insert() expect
    IF LCASE(REPLACE_THIS) = LCASE(REPLACE_WITH) OR CHAR_LENGTH(REPLACE_THIS) < 1 THEN
        RETURN REPLACE_WHERE;
    END IF;

    WHILE Locate( LCASE(REPLACE_THIS), LCASE(REPLACE_WHERE), last_occurency ) > 0  DO
      BEGIN
        SET last_occurency = Locate(LCASE(REPLACE_THIS), LCASE(REPLACE_WHERE));
        SET REPLACE_WHERE = Insert( REPLACE_WHERE, last_occurency, CHAR_LENGTH(REPLACE_THIS), REPLACE_WITH);
        SET last_occurency = last_occurency + CHAR_LENGTH(REPLACE_WITH);
      END;
    END WHILE;
    RETURN REPLACE_WHERE;
END;
|
DELIMITER ;

Small test:

SET @str = BINARY 'New York';
SELECT case_insensitive_replace(@str, 'y', 'K');

Returns: New Kork


This modification of Luist's answer allows the needle to be replaced with a differently cased version of itself (two lines changed).

DELIMITER |
CREATE FUNCTION case_insensitive_replace ( REPLACE_WHERE text, REPLACE_THIS text, REPLACE_WITH text )
RETURNS text
DETERMINISTIC 
BEGIN
  DECLARE last_occurency int DEFAULT 1;

  -- CHAR_LENGTH counts characters (LENGTH counts bytes), matching what Locate() and Insert() expect
  IF CHAR_LENGTH(REPLACE_THIS) < 1 THEN
    RETURN REPLACE_WHERE;
  END IF;

  WHILE Locate( LCASE(REPLACE_THIS), LCASE(REPLACE_WHERE), last_occurency ) > 0  DO
    BEGIN
      -- search from last_occurency so text that was just inserted is never matched again
      SET last_occurency = Locate(LCASE(REPLACE_THIS), LCASE(REPLACE_WHERE), last_occurency);
      SET REPLACE_WHERE = Insert( REPLACE_WHERE, last_occurency, CHAR_LENGTH(REPLACE_THIS), REPLACE_WITH);
      SET last_occurency = last_occurency + CHAR_LENGTH(REPLACE_WITH);
    END;
  END WHILE;
  RETURN REPLACE_WHERE;
END;
|
DELIMITER ;
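
A short usage sketch for the function above, assuming it has been created as shown. The first call uses the string from the question; the second replaces the needle with a differently cased version of itself.

```sql
SELECT case_insensitive_replace('aAbBcC', 'a', 'f');
-- returns 'ffbBcC'

SELECT case_insensitive_replace('New York', 'new', 'NEW');
-- returns 'NEW York'
```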


I went with http://pento.net/2009/02/15/case-insensitive-replace-for-mysql/ (in fvox's answer), which performs a case-insensitive search with a case-sensitive replacement and does not change the case of characters elsewhere in the searched string that should be unaffected.

N.B. A comment further down that same page says that CHAR(255) should be changed to VARCHAR(255); this was required for me as well.


In the previous answers, and the pento.net link, the arguments to LOCATE() are lower-cased.

This is a waste of resources, as LOCATE is already case-insensitive for non-binary strings under a case-insensitive collation:

mysql> select locate('el', 'HELLo');
+-----------------------+
| locate('el', 'HELLo') |
+-----------------------+
|                     2 |
+-----------------------+

You can replace

WHILE Locate( LCASE(REPLACE_THIS), LCASE(REPLACE_WHERE), last_occurency ) > 0 DO

with

WHILE Locate(REPLACE_THIS, REPLACE_WHERE, last_occurency ) > 0 DO

etc.
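
One caveat worth illustrating: LOCATE becomes case-sensitive as soon as one argument is a binary string, so dropping LCASE() is only safe for non-binary strings under a case-insensitive collation. A minimal sketch:

```sql
-- Non-binary strings under a case-insensitive collation: match found
SELECT LOCATE('el', 'HELLo');         -- returns 2

-- One binary argument forces a byte-by-byte, case-sensitive comparison
SELECT LOCATE('el', BINARY 'HELLo');  -- returns 0
```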


In case of 'special' characters there is unexpected behaviour:

SELECT case_insensitive_replace('A', 'Ã', 'a')

Gives

a

This is unexpected, since we only want to replace the Ã, not the A.

Even stranger:

SELECT LOCATE('Ã', 'A');

gives

0

This is the correct result. The difference seems to be related to the encoding of the stored function's parameters.


I like to use a search-and-replace function I created for cases where I need to replace without worrying about the case of the original or search strings. The routine bails out quickly, returning the incoming string unaltered, if you pass in an empty or null search string or a null replacement string. I also added a safety countdown in case the search somehow keeps looping, so we never get stuck in a loop forever. Raise the starting number if you think it is too low.

delimiter //

DROP FUNCTION IF EXISTS `replace_nocase`//

CREATE FUNCTION `replace_nocase`(raw text, find_str varchar(1000), replace_str varchar(1000)) RETURNS text
CHARACTER SET utf8
DETERMINISTIC
BEGIN
    declare ret text;
    declare len int;
    declare hit int;
    declare safe int;
    
    if find_str is null or find_str='' or replace_str is null then
        return raw;
    end if;

    set safe=10000;
    set ret=raw;
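    -- character count, not byte count: substring() and LOCATE() work in characters
    set len=char_length(find_str);
    
    set hit=LOCATE(find_str,ret);
    while hit>0 and safe>0 do
        set ret=concat(substring(ret,1,hit-1),replace_str,substring(ret,hit+len));
        -- resume after the inserted text so a replacement containing find_str is not matched again
        set hit=LOCATE(find_str,ret,hit+char_length(replace_str));
        set safe=safe-1;
    end while;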
    

    return ret;
END//

delimiter ;


This question is a bit old but I ran into the same problem and the answers given didn't allow me to solve it entirely.

I wanted the result to retain the case of the original string.

So I made a small modification to the replace_ci function proposed by fvox:

DELIMITER $$

DROP FUNCTION IF EXISTS `replace_ci`$$

CREATE FUNCTION `replace_ci` (str TEXT, needle CHAR(255), str_rep CHAR(255))
RETURNS TEXT
DETERMINISTIC
BEGIN
DECLARE return_str TEXT DEFAULT '';
DECLARE lower_str TEXT;
DECLARE lower_needle TEXT;
DECLARE tmp_needle TEXT;
DECLARE str_origin_char CHAR(1);
DECLARE str_rep_char CHAR(1);
DECLARE final_str_rep TEXT DEFAULT '';
DECLARE pos INT DEFAULT 1;
DECLARE old_pos INT DEFAULT 1;
DECLARE needle_pos INT DEFAULT 1;

IF needle = '' THEN
    RETURN str;
END IF;

SELECT LOWER(str) INTO lower_str;
SELECT LOWER(needle) INTO lower_needle;
SELECT LOCATE(lower_needle, lower_str, pos) INTO pos;
WHILE pos > 0 DO
    SELECT substr(str, pos, char_length(needle)) INTO tmp_needle;
    SELECT '' INTO final_str_rep;
    SELECT 1 INTO needle_pos;
    WHILE needle_pos <= char_length(tmp_needle) DO
        SELECT substr(tmp_needle, needle_pos, 1) INTO str_origin_char;
        SELECT SUBSTR(str_rep, needle_pos, 1) INTO str_rep_char;
        SELECT CONCAT(final_str_rep, IF(BINARY str_origin_char = LOWER(str_origin_char), LOWER(str_rep_char), IF(BINARY str_origin_char = UPPER(str_origin_char), UPPER(str_rep_char), str_rep_char))) INTO final_str_rep;
        SELECT (needle_pos + 1) INTO needle_pos;
    END WHILE;
    SELECT CONCAT(return_str, SUBSTR(str, old_pos, pos - old_pos), final_str_rep) INTO return_str;
    SELECT pos + CHAR_LENGTH(needle) INTO pos;
    SELECT pos INTO old_pos;
    SELECT LOCATE(lower_needle, lower_str, pos) INTO pos;
END WHILE;
SELECT CONCAT(return_str, SUBSTR(str, old_pos, CHAR_LENGTH(str))) INTO return_str;
RETURN return_str;
END$$

DELIMITER ;

Example of use:

SELECT replace_ci( 'MySQL', 'm', 'e' ) as replaced;

Will return:

+----------+
| replaced |
+----------+
| EySQL    |
+----------+
