SQL find close matches
I'm trying to build a filtering system for products. Products have many attributes including price, size(cm) and (# of) sides. I want to construct an SQL query that always returns ALL the products but orders them on how closely they meet the search criteria.
For example lets say I have the following products:
Product A Price: 250 Size: 50 Sides: 4
Product B Price: 300 Size: 60 Sides: 3
Product C Price: 200 Size: 60 Sides: 5
And have a search criteria of:
Price: $210 - $260
I want it to return: Product A, Product C, Product B
A is first because it meets all the search criteria. C is ahead of B because $200 is closest to the $210-$260 range than $300.
That's a fairly simp开发者_JAVA百科le case but the search could involve any attribute(s).
The algorithm I have in my head is doing some sort of standard deviation calculation for each attribute which would give you a 'score' of how closely that attribute meets the search criteria, these scores could be summed up across a product to give a total score of how relavant the product is. Is that along the right track?
I realize this is a fairly complicated question, I'm really just looking for some guidance more than anything.
You basically have to come up with a 'distance' function for each row which returns 0 if all criteria match or some positive value indicating how close otherwise. Each of the different columns will have to have some weight as being off by $1 is not as far off as being off by 1 number of sides. For the price if it's below 210 the unweighted distance would be (210 - price), if the price is over 260 the unweighted distance for the price column would be (price - 260). Then you can add up the weighted distances w1 * price_distance + w2 * size_distance + s3 * sides_distance. Then you can do a order by distance.
How about calculating the average, then ordering by deviation from that average? I'm imagining something like this (obviously, your table/column names will be different):
SELECT
p.price,
p.size,
p.sides,
ABS(p.price - 235) AS distance
FROM
products p
ORDER BY
distance;
Of course, you might also want to throw in an IF function to evaluate whether or not the price is in the range, and I'd suggest using parameterized queries on the back in instead of feeding the average directly in.
精彩评论