MySQL Query problem (duplicate results)
I'm having a problem finding duplicate results in a MySQL database (a cocktail recipe website). Here's the setup:
Table 1: 'cocktail'
[cid,c_name] (cid = unique cocktail id, c_name = cocktail name)
Table 2: 'ingredients'
[iid,i_name] (iid = unique ingredient id, i_name = ingredient name)
Table 3: 'cocktail_ingredients' (the linking table)
[ciid,cid,iid] (ciid = unique row identifier, cid = cocktail cid, iid = ingredient iid)
So one cocktail can have multiple rows in the 'cocktail_ingredients' table (1 to many).
Setup is fine. The problem I'm having now is finding out whether there are duplicate cocktails in my database.
For instance if the cocktail_ingredients table had these entries:
cid | iid
1 | 56
1 | 78
1 | 101
. . .
9 | 56
9 | 78
9 | 101
The cocktail is the same (for theoretical purposes here anyway).
If the 'cocktail_ingredients' table had one more row ...
9 | 103
Then it wouldn't be the same, as cocktail number 9 includes an extra ingredient.
So the MySQL query has to do two checks: first, that the ingredient count is the same, and second, that every ingredient id (iid) is the same for the corresponding cocktails (cid).
I'm stumped on this one; any help is much appreciated. I'm thinking I might have to head down the PHP route as well to code something more complex, but I'm struggling there too, so I thought this would be a good place to stop and ask.
Thanks a ton
Nick
You may recall from a distant math class that two sets A and B are equal exactly when each is a (non-strict) subset of the other. So just create a view or procedure that checks whether everything that is in A is also in B, then check that the two cocktails are both subsets of one another. This is far from a complete answer, but it may be enough to get you going ;)
It will probably be easier to do the negation: look for an ingredient in A that is not in B. If none exists, then A must be a (non-strict) subset of B (assuming A and B can't both be empty).
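Here is a rough sketch of that negation idea as a doubly nested NOT EXISTS query. It is only an illustration under some assumptions: it runs against an in-memory SQLite database from Python as a stand-in for MySQL, the sample rows (including cocktail 4) are made up, and the SELECT sticks to standard SQL that should carry over to MySQL.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cocktail_ingredients (cid INTEGER, iid INTEGER)")
conn.executemany(
    "INSERT INTO cocktail_ingredients (cid, iid) VALUES (?, ?)",
    [
        (1, 56), (1, 78), (1, 101),
        (9, 56), (9, 78), (9, 101),
        (4, 56), (4, 78),  # made-up cocktail: a strict subset of 1 and 9
    ],
)

# a and b are duplicates when no ingredient of a is missing from b
# and no ingredient of b is missing from a.
DUPLICATE_PAIRS = """
SELECT a.cid AS cid1, b.cid AS cid2
FROM (SELECT DISTINCT cid FROM cocktail_ingredients) AS a
JOIN (SELECT DISTINCT cid FROM cocktail_ingredients) AS b
  ON a.cid < b.cid
WHERE NOT EXISTS (
        SELECT 1 FROM cocktail_ingredients AS x
        WHERE x.cid = a.cid
          AND NOT EXISTS (SELECT 1 FROM cocktail_ingredients AS y
                          WHERE y.cid = b.cid AND y.iid = x.iid))
  AND NOT EXISTS (
        SELECT 1 FROM cocktail_ingredients AS x
        WHERE x.cid = b.cid
          AND NOT EXISTS (SELECT 1 FROM cocktail_ingredients AS y
                          WHERE y.cid = a.cid AND y.iid = x.iid))
ORDER BY cid1, cid2
"""

duplicate_pairs = conn.execute(DUPLICATE_PAIRS).fetchall()
print(duplicate_pairs)  # [(1, 9)]
```

Cocktail 4 passes the first NOT EXISTS against 1 and 9 (everything in 4 is in them) but fails the second, which is why both directions are needed.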
Alternatively, count the ingredients in A, the ingredients in B, and the ingredients that are in both A and B; if the three counts are equal, the cocktails are equivalent.
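-- Number of ingredients in each cocktail
-- (assumes an ingredient is listed at most once per cocktail)
CREATE VIEW ingredient_count AS
SELECT cid, COUNT(*) AS ingredients
FROM cocktail_ingredients
GROUP BY cid;
-- Number of ingredients that each pair of distinct cocktails has in common
CREATE VIEW shared_ingredients AS
SELECT c1.cid AS cid1, c2.cid AS cid2, COUNT(*) AS ingredients
FROM cocktail_ingredients AS c1 INNER JOIN cocktail_ingredients AS c2
ON (c1.cid != c2.cid AND c1.iid = c2.iid)
GROUP BY c1.cid, c2.cid;
-- Pairs where both totals equal the shared count
-- (each pair shows up twice, once in each order)
CREATE VIEW duplicates AS
SELECT si.cid1, si.cid2
FROM (ingredient_count AS ic1 INNER JOIN shared_ingredients AS si
ON ic1.cid = si.cid1) INNER JOIN ingredient_count AS ic2
ON ic2.cid = si.cid2
WHERE ic1.ingredients = ic2.ingredients
AND si.ingredients = ic1.ingredients;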
Note that this may be much faster in MySQL using subselects with sensible WHERE clauses rather than views, but the views are easier to understand.
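For what it's worth, the same three counts can be folded into one statement with subselects instead of views. This is just a sketch, not a benchmark: it is run here from Python against in-memory SQLite as a stand-in for MySQL (an assumption), cocktail 12 is made-up sample data, and whether it is actually faster on your data would need measuring.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cocktail_ingredients (cid INTEGER, iid INTEGER)")
conn.executemany(
    "INSERT INTO cocktail_ingredients (cid, iid) VALUES (?, ?)",
    [
        (1, 56), (1, 78), (1, 101),
        (9, 56), (9, 78), (9, 101),
        (12, 56), (12, 78), (12, 101), (12, 103),  # made-up: one extra ingredient
    ],
)

# s   = shared-ingredient count per pair (cid1 < cid2, so each pair appears once)
# ic1 = ingredient count of the first cocktail, ic2 = of the second
DUPLICATES = """
SELECT s.cid1, s.cid2
FROM (SELECT c1.cid AS cid1, c2.cid AS cid2, COUNT(*) AS shared
      FROM cocktail_ingredients AS c1
      JOIN cocktail_ingredients AS c2
        ON c1.iid = c2.iid AND c1.cid < c2.cid
      GROUP BY c1.cid, c2.cid) AS s
JOIN (SELECT cid, COUNT(*) AS n
      FROM cocktail_ingredients GROUP BY cid) AS ic1
  ON ic1.cid = s.cid1
JOIN (SELECT cid, COUNT(*) AS n
      FROM cocktail_ingredients GROUP BY cid) AS ic2
  ON ic2.cid = s.cid2
WHERE s.shared = ic1.n
  AND s.shared = ic2.n
ORDER BY s.cid1, s.cid2
"""

duplicate_pairs = conn.execute(DUPLICATES).fetchall()
print(duplicate_pairs)  # [(1, 9)]
```

Like the views, this assumes an ingredient is never listed twice for the same cocktail; a unique key on (cid, iid) would guarantee that.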
You can enforce such a check using a TRIGGER. However, there is a conceptual problem.
Say you have two cocktails, {1 | 56, 78, 101}
and {9 | 56, 78, 101, 103},
and assume that you have implemented the check.
Now, you start inserting data for 1:
cid | iid
----------
1 | 56
Then add the remaining two ingredients:
cid | iid
----------
1 | 56
1 | 78
1 | 101
Fine. Now you start adding 9:
cid | iid
----------
1 | 56
1 | 78
1 | 101
9 | 56
You have three more ingredients, so continue adding them:
cid | iid
----------
1 | 56
1 | 78
1 | 101
9 | 56
9 | 78
Two more remain (101, 103). But alas! You cannot add 101! If you try to add 101, then 9 would become identical to 1, so your trigger will reject the row.
When one cocktail's ingredients are a subset of another's, you have to add the subset cocktail after the larger one. I hope that makes the problem clear.
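To make the ordering problem concrete, here is a small simulation of a per-row check. It is not a real MySQL trigger: `insert_row` and `load` are hypothetical helpers that only mimic "reject the row if it makes this cocktail identical to another one", using plain Python sets.

```python
class DuplicateCocktail(Exception):
    """Raised when a row would make two cocktails identical."""


def insert_row(table, cid, iid):
    """Mimic a per-row trigger on cocktail_ingredients.

    `table` maps cid -> set of iid. The row is rejected if, after adding it,
    cocktail `cid` would have exactly the same ingredients as another cocktail.
    """
    candidate = table.get(cid, set()) | {iid}
    for other_cid, other_iids in table.items():
        if other_cid != cid and other_iids == candidate:
            raise DuplicateCocktail(f"{cid} would duplicate {other_cid}")
    table[cid] = candidate


def load(rows):
    """Insert rows one at a time; return the final table and the rejected rows."""
    table, rejected = {}, []
    for cid, iid in rows:
        try:
            insert_row(table, cid, iid)
        except DuplicateCocktail:
            rejected.append((cid, iid))
    return table, rejected


cocktail_1 = [(1, 56), (1, 78), (1, 101)]
cocktail_9 = [(9, 56), (9, 78), (9, 101), (9, 103)]

# Subset cocktail first: 9 | 101 is rejected because 9 would briefly equal 1.
table_a, rejected_a = load(cocktail_1 + cocktail_9)

# Larger cocktail first: every row goes in.
table_b, rejected_b = load(cocktail_9 + cocktail_1)

print(rejected_a, rejected_b)  # [(9, 101)] []
```

In the first run cocktail 9 ends up stored without ingredient 101, even though the finished recipe would not have been a duplicate of 1.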
You should not put this restriction in the database. What I would do in the web application is:
In the cocktail entry/update interface, take the user's input without inserting or updating anything in the DB yet.
When the user clicks the save button (I would add a save button), check whether the new/updated cocktail would be a copy of another (maybe I would write a stored procedure, but it can be done with just a SELECT query).
If the new/updated cocktail is not a duplicate of another, insert/update the database. If
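A minimal sketch of that save-time check might look like the following. Assumptions: Python with in-memory SQLite stands in for the PHP/MySQL application, `find_duplicate` and `save_cocktail` are hypothetical names, only the insert case is covered (not updates), and each ingredient is stored at most once per cocktail.

```python
import sqlite3


def find_duplicate(conn, ingredient_ids):
    """Return the cid of an existing cocktail with exactly these ingredients, or None."""
    wanted = sorted(set(ingredient_ids))
    if not wanted:
        return None
    placeholders = ", ".join("?" for _ in wanted)
    # Same row count AND every row is one of the wanted ingredients => same set.
    row = conn.execute(
        f"""
        SELECT cid
        FROM cocktail_ingredients
        GROUP BY cid
        HAVING COUNT(*) = ?
           AND SUM(CASE WHEN iid IN ({placeholders}) THEN 1 ELSE 0 END) = ?
        ORDER BY cid
        LIMIT 1
        """,
        [len(wanted), *wanted, len(wanted)],
    ).fetchone()
    return row[0] if row else None


def save_cocktail(conn, name, ingredient_ids):
    """Insert the whole cocktail at once, unless it duplicates an existing one.

    Returns (new_cid, None) on success or (None, existing_cid) on a duplicate.
    """
    existing = find_duplicate(conn, ingredient_ids)
    if existing is not None:
        return None, existing
    with conn:  # commit on success, roll back on error
        cid = conn.execute(
            "INSERT INTO cocktail (c_name) VALUES (?)", (name,)
        ).lastrowid
        conn.executemany(
            "INSERT INTO cocktail_ingredients (cid, iid) VALUES (?, ?)",
            [(cid, iid) for iid in sorted(set(ingredient_ids))],
        )
    return cid, None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cocktail (cid INTEGER PRIMARY KEY, c_name TEXT)")
conn.execute(
    "CREATE TABLE cocktail_ingredients "
    "(ciid INTEGER PRIMARY KEY, cid INTEGER, iid INTEGER)"
)

first = save_cocktail(conn, "Cocktail A", [56, 78, 101])
second = save_cocktail(conn, "Cocktail B", [101, 78, 56])       # same set, other order
third = save_cocktail(conn, "Cocktail C", [56, 78, 101, 103])   # one extra ingredient
print(first, second, third)  # (1, None) (None, 1) (2, None)
```

Because the complete ingredient list is compared in one go, the half-entered states that trip up a per-row trigger never reach the database.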