Best way to remove duplicate words from each row in a MySQL table
I have a MySQL table, each row of which can have an arbitrary number of comma-delimited words. For example:
TABLE words
"test, dog, fun, yellow, quick, yellow"
"jogging, academic, fun, jogging, shoulder"
"shoulder, happy, flower, shoulder, shoulder"
I would like to remove the duplicate words in each row. So for example, the above would become:
TABLE words
"test, dog, fun, yellow, quick"
"jogging, academic, fun, shoulder"
"shoulder, happy, flower"
(Note that I only need to remove the duplicates in each row, by itself. I do not need to remove the duplicates between all rows.)
Any suggestions on the best way to accomplish this? Is there a better way than SELECTing and then UPDATEing through the table one row at a time?
Thanks, in advance, for your help.
This is better done outside of SQL. It's not going to be pretty if you try to pick strings apart using a query. I recommend:
SELECTing each row, performing an $val = explode(', ',$column); on it, then $val = array_unique($val);, then UPDATEing the table with implode(', ',$val);.
Note: you can save yourself some time by doing a strcmp($orig,$new) and only UPDATEing if necessary.
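The steps above can be sketched roughly as follows. This is a minimal illustration, not the answerer's exact code: the helper name dedupe_words is hypothetical, the sample rows are copied from the question in place of a real SELECT, and the database calls are left out so the snippet runs on its own. It splits on a bare comma and trims each word, an assumption that makes it tolerant of inconsistent spacing.

```php
<?php
// Hypothetical helper (not from the original answer): removes repeated words
// from one comma-delimited string, keeping the first occurrence of each.
function dedupe_words(string $column): string
{
    // Split on commas and trim each piece so "a,b" and "a, b" behave alike.
    $words = array_map('trim', explode(',', $column));
    // array_unique keeps the first occurrence of each value and preserves order.
    $words = array_unique($words);
    return implode(', ', $words);
}

// Sample rows from the question; in practice these would come from a SELECT.
$rows = [
    'test, dog, fun, yellow, quick, yellow',
    'jogging, academic, fun, jogging, shoulder',
    'shoulder, happy, flower, shoulder, shoulder',
];

foreach ($rows as $orig) {
    $new = dedupe_words($orig);
    // Only rows that actually changed would need an UPDATE.
    if ($new !== $orig) {
        echo $new, PHP_EOL;
    }
}
```

The comparison before printing plays the same role as the strcmp check: rows that already contain no duplicates are skipped, so a real script would issue an UPDATE only where the string changed.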
I don't think there's a better way than SELECTing and then UPDATEing through the table one row at a time. As far as I know, SQL just doesn't support manipulating strings like that. You have to pull each string out, remove the duplicates, and then write it back to the table.
Here is a pure MySQL version. You CROSS JOIN against a set of numbers to pick out each word by its position, then GROUP_CONCAT the DISTINCT words back together. It would help to have a primary or unique key to group on, in case some rows are identical.
SELECT GROUP_CONCAT(DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(t.col, ', ', x.cifre), ', ', -1) SEPARATOR ', ') AS words
FROM t
INNER JOIN
(
SELECT 1 + a.i + b.i * 10 cifre, b.i + a.i * 10 sute
FROM (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) a
CROSS JOIN (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) b
) x
-- word count = number of ', ' delimiters (two characters each) + 1
ON (LENGTH(t.col) - LENGTH(REPLACE(t.col, ', ', ''))) / 2 + 1 >= x.cifre
GROUP BY col