开发者

Checking that a recipe contains an ingredient - MYSQL

Hey everyone. I'm having a bit of trouble running a query / php combination efficiently. I seem to be just looping over too many result sets in inner开发者_高级运维 loops in my php. I'm sure there is a more efficient way of doing this. Any help very much appreciated.

I've got a table that holds 3500 recipes ([recipe]):

rid | recipe_name

And another table that holds 600 different ingredients ([ingredients])

iid | i_name

Each recipe has x number of ingredients associated to it, and I use a nice joining table to create the association ([recipe_ingredients])

uid | rid | iid

(where uid is just a unique id for the table)

For example:

rid: 1 | recipe_name: Lemon Tart
.....
iid: 99 | i_name: lemon curd
iid: 154 | i_name: flour
.....
1 | 1 | 99
2 | 1 | 154

The query I'm trying to run, allows the user to enter what ingredients they have, and it will tell you anything you can make with those ingredients. It doesn;t have to use all ingredients, but you do need to have all the ingredients for the recipe.

For instance if I had flour, egg, salt, milk and lemon curd I could make 'Pancakes', and 'Lemon Tart' (if we assume lemon tart has no other ingredients:)), but couldn't make 'Risotto' (as I didnt have any rice, or anything else thats needed in it).

In my PHP I have an array containing all the ingredients the user has. At the moment they way I'm running this is going through every recipe (loop 1) and then checking all ingredients in that recipe to see if each ingredient is contained in my ingredients array (loop 2). As soon as it finds an ingredient in the recipe, that isnt in my array, it says "no" and goes onto the next recipe. If it does, it stores the rid in a new array, that I use later to display the results.

But if we look at the efficiency of that, if I assume 3500 recipes, and Ive got 40 ingredients in my array, the worst case scenario is it running through 3500 x 40n, where n = number of ingredients in the recipe. The best case is still 3500 x 40 (doesn't find an ingredient first time for every recipe so exits).

I think my whole approach to this is wrong, and I think there must be some clever sql that I'm missing here. Any thoughts? I can always build up an sql statement from the ingredient array I have ......

Thanks a lot in advance, much appreciated


I'd suggest storing the count of the number of ingredients for the recipe in the recipe table, just for efficiency's sake (it will make the query quicker if it doesn't have to calculate this information every time). This is denormalization, which is bad for data integrity but good for performance. You should be aware that this can cause data inconsistencies if recipes are updated and you are not careful to make sure the number is updated in every relevant place. I've assumed you've done this with the new column set as ing_count in the recipe table.

Make sure you escape the values in for NAME1, NAME2, etc if they are provided via user input - otherwise you are at risk for SQL injection.

select recipe.rid, recipe.recipe_name, recipe.ing_count, count(ri) as ing_match_count
from recipe_ingredients ri 
inner join (select iid from ingredients where i.name='NAME1' or i.name='NAME2' or i.NAME='NAME3') ing
on ri.iid = ing.iid
inner join recipe 
on recipe.rid = ri.rid
group by recipe.rid, recipe.recipe_name, recipe.ing_count
having ing_match_count = recipe.ing_count

If you don't want to store the recipe count, you could do something like this:

select recipe.rid, recipe.recipe_name, count(*) as ing_count, count(ing.iid) as ing_match_count
from recipe_ingredients ri 
inner join (select iid from ingredients where i.name='NAME1' or i.name='NAME2' or i.NAME='NAME3') ing
on ri.iid = ing.iid
right outer join recipe 
on recipe.rid = ri.rid
group by recipe.rid, recipe.recipe_name
having ing_match_count = ing_count


You could an "IN ANY" type query:

select recipes.rid, count(recipe_ingredients.iid) as cnt
from recipes
left join recipe_ingredients on recipes.rid = recipe_ingredients.rid
where recipes_ingredients in any (the,list,of,ingredients,the,user,hash)
group by recipes.rid
having cnt > some_threshold_amount
order by cnt desc

Doing this off the top of my head, but basically pull out any recipes where at least one of the user-provided ingredients are listed, sort by the total ingredient count, and then only return the recipes where more than a threshold amount of ingredients are present.

I've probably got the threshold bit wrong - sneaky suspicion it'll count the recipes's ingredients, and not the user-provided ones, but the rest of the query should be a good start for what you need.


Question: why isn't your query directly sql? You can optimize by eliminating the wrong recipes:

  • firstly eliminate the recipes that have more ingridients than you user ingredients
  • make a recursive greedy by:
    • pick the first rid|iid
    • if it's in the user ingredients, continue,
    • if not, eliminate from the Recipe_Ingredients table all the rows with rid => new_table
    • restart using the new_table | stop new_table count = 0

It should have the best statistical results.

Hope it helped


Something like this:

SELECT r.*, COUNT(ri.iid) AS count FROM recipe r
   INNER JOIN recipe_ingredient ri ON r.rid = ri.rid
   INNER JOIN ingredient i ON i.iid = ri.iid
   WHERE i.name IN ('milk', 'flour')
   GROUP BY r.rid
   HAVING count = 2

It's pretty easy to understand. count hold the number of ingredients within the list (milk, flour) that were matched for each recipe. If count matches the number of ingredients in the WHERE clause (in this case: 2), then return the recipe.


SELECT irl.ingredient_amount, r . * , i.thumbnail
FROM recipes r
LEFT JOIN recipe_images i ON ( i.recipe_id = r.recipe_id )
LEFT JOIN ingredients_recipes_link irl ON ( irl.recipe_id = r.recipe_id )
WHERE irl.recipe_id
IN (

SELECT recipe_id
FROM `ingredients_recipes_link`
WHERE ingredient_id
IN ( 24, 21, 22 )
HAVING count( * ) =3
)
GROUP BY r.recipe_id
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜