Find and replace string in MySQL using data from another table
I have two MySQL tables, and I want to find and replace text strings in one using data in another.
Table texts
:
+---------------------+
| messages |
+---------------------+
| 'thx guys' |
| 'i think u r great' |
| 'thx again' |
| ' u rock' |
+---------------------+
Table dictionary
:
+--------------+---------------+
| bad_spelling | good_spelling |
+--------------+---------------+
| 'thx' | 'thanks'开发者_C百科 |
| ' u ' | ' you ' |
| ' r ' | ' are ' |
+--------------+---------------+
I want SQL to go through and look at every row in messages and replace every instance of bad_spelling with good_spelling, and to do this for all the pairs of bad_spelling and good_spelling.
The closest I have gotten is this:
update texts, dictionary
set texts.message = replace(texts.message,
dictionary.bad_spelling,
dictionary.good_spelling)
But this only changes "thx" to "thanks" (in two rows) and does not go on to replace " u " with " you" or " r " with " are ."
Any ideas how to make it use all the rows in dictionary in the replace statement?
PS forgot to mention that this is a small example and in the real thing I will have a lot of find/replace pairs, which may get added to over time.
I've never used MySql, so this is just a theory based on my other database work. When reading the other answers, trying to use REPLACE(), I thought I could post this and get someone with MySql syntax experience a few ideas to make a set base solution.
here is some SQL Server code to that does most of the work for you:
DECLARE @Source table (Texts varchar(50))
INSERT @Source VALUES ('thx guys')
INSERT @Source VALUES ('i think u r great')
INSERT @Source VALUES ('thx again')
INSERT @Source VALUES ('u rock')
DECLARE @Dictionary table (bad_spelling varchar(50), good_spelling varchar(50))
INSERT @Dictionary VALUES ('thx', 'thanks')
INSERT @Dictionary VALUES ('u', 'you')
INSERT @Dictionary VALUES ('r', 'are')
SELECT
t.Texts,COALESCE(d.good_spelling,c.ListValue) AS WordToUse
FROM @Source t
CROSS APPLY dbo.FN_ListToTable(' ',t.Texts) c
LEFT OUTER JOIN @Dictionary d ON c.ListValue=d.bad_spelling
OUTPUT:
Texts WordToUse
------------------ ---------
thx guys thanks
thx guys guys
i think u r great i
i think u r great think
i think u r great you
i think u r great are
i think u r great great
thx again thanks
thx again again
u rock you
u rock rock
(11 row(s) affected)
It would be better to use a "real" PK than the actual "Texts" in the query above, but the OP doesn't list many columns in that table, so I use "Texts".
Using SQL Server you need to use a some funky XML syntax to join the rows back together (so I won't show that code, as it doesn't matter), but using MySql's GROUP_CONCAT() you should be able to concatenate the word rows back together into phrase rows.
the code for the (SQL Server) split function and how it works can be found here: SQL Server: Split operation
It does not go all the way because even though the replace had been run x times (where x is the number of rows in dictionary) only one update is retained (the last one).
Transactions don't write down intermediate results and therefore can't see them as input values for the next batch of replacements.
As (AFAIK) MySQL does not support recursive queries you'll have to resort to procedural approach.
You need to execute your query many times anyways. Since this is the operation of clean-up
type, which you usually do occasionally, i suggest you perform the following query until there was something updated. I do not know how to do it with MySql
, but in SQL Server
it would be to check the number of rows updated (which is result of this UPDATE
query execution), and run the UPDATE
again, until no rows are updated.
update texts,
dictionary
set texts.message = replace(texts.message, dictionary.bad_spelling, dictionary.good_spelling)
where texts.message <> replace(texts.message, dictionary.bad_spelling, dictionary.good_spelling)
You have to call Replace multiple times on the text:
Update ...
Set texts.message = Replace(
Replace(
Replace( texts.message, 'thx ', 'thanks ' )
, ' u ', ' you ')
, ' r ', ' are ')
EDIT Given that you said you had numerous replacements, you would need to do this in a cursor with multiple UPDATE statement calls. Something like (I haven't tested this at all, so beware):
Create Temporary Table ReplaceValues
(
BeforeText varchar(100) not null
, AfterText varchar(100) not null
)
Insert ReplaceValues(BeforeText, AfterText) Values('thx ', 'thanks ')
Insert ReplaceValues(BeforeText, AfterText) Values(' u ', ' you ')
Insert ReplaceValues(BeforeText, AfterText) Values(' r ', ' are ')
DECLARE done int DEFAULT(0)
DECLARE BeforeValue varchar(100);
DECLARE AfterValue varchar(100);
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
DECLARE ReplaceList CURSOR FOR Select BeforeText, AfterText From ReplaceValues;
OPEN ReplaceList;
REPEAT
If NOT done THEN
FETCH ReplaceList INTO BeforeValue, AfterValue;
Update texts
Set texts.message = REPLACE(texts.message, BeforeValue, AfterValue);
END IF
UNTIL done END REPEAT;
CLOSE ReplaceList;
You could wrap all this up into a procedure so that you can call it again later.
精彩评论