How to map combinations of things to a relational database?
I have a table whose records represent certain objects. For the sake of simplicity I am going to assume that the table only has one column, and that is the unique ObjectId. Now I need a way to store combinations of objects from that table. The combinations have to be unique, but can be of arbitrary length. For example, if I have the ObjectIds
1,2,3,4
I want to store the following combinations:
{1,2}, {1,3,4}, {2,4}, {1,2,3,4}
The order of objects within a combination does not matter. My current implementation is to have a table Combinations that maps ObjectIds to CombinationIds. So every combination receives a unique Id:
ObjectId | CombinationId
------------------------
1 | 1
2 | 1
1 | 2
3 | 2
4 | 2
This is the mapping for the first two combinations of the example above. The problem is that the query for finding the CombinationId of a specific combination seems to be very complex. The two main usage scenarios for this table will be to iterate over all combinations and to retrieve a specific combination. The table will be created once and never be updated. I am using SQLite through JDBC. Is there a simpler way or a best practice to implement such a mapping?
The problem is, that the query for finding the CombinationId of a specific Combination seems to be very complex.
Shouldn't be too bad. If you want all combinations containing the selected items (with additional items allowed), it's just something like:
SELECT combinationID
FROM Combination
WHERE objectId IN (1, 3, 4)
GROUP BY combinationID
HAVING COUNT(*) = 3 -- The number of items you are searching for
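To try this lookup end to end, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table and column names follow the query above; the helper names build_db and find_supersets, the CombinationId values 1 to 4, and the primary key are assumptions made for illustration. The count trick relies on each (objectId, combinationID) pair being stored only once, which the primary key enforces here.

```python
import sqlite3

# Sample data: the four combinations from the question, keyed by an
# arbitrary CombinationId (the ids 1..4 are an assumption).
COMBINATIONS = {1: [1, 2], 2: [1, 3, 4], 3: [2, 4], 4: [1, 2, 3, 4]}


def build_db():
    """Create an in-memory table holding one row per (object, combination)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Combination ("
        " objectId INTEGER NOT NULL,"
        " combinationID INTEGER NOT NULL,"
        " PRIMARY KEY (combinationID, objectId))"
    )
    rows = [(o, c) for c, objs in COMBINATIONS.items() for o in objs]
    conn.executemany("INSERT INTO Combination VALUES (?, ?)", rows)
    return conn


def find_supersets(conn, object_ids):
    """Return ids of combinations that contain every id in object_ids."""
    ids = sorted(set(object_ids))
    placeholders = ",".join("?" * len(ids))
    sql = (
        "SELECT combinationID FROM Combination"
        f" WHERE objectId IN ({placeholders})"
        " GROUP BY combinationID"
        " HAVING COUNT(*) = ?"
        " ORDER BY combinationID"
    )
    return [row[0] for row in conn.execute(sql, ids + [len(ids)])]


conn = build_db()
print(find_supersets(conn, [1, 3, 4]))  # [2, 4]
print(find_supersets(conn, [1, 2]))     # [1, 4]
```

Building the placeholder list from the number of ids keeps the query parameterized, which should carry over to a JDBC PreparedStatement in much the same way.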
If you need only the specific combination (no extra items allowed), it can be more like:
SELECT candidates.combinationID FROM (
-- ... query from above goes here, this gives us all with those 3
) AS candidates
-- This bit gives us a row for each item in the candidates, including
-- the items we know about but also any 'extras'
INNER JOIN combination ON (candidates.combinationID = combination.combinationID)
GROUP BY candidates.combinationID
HAVING COUNT(*) = 3 -- Because we joined back on ALL, ones with extras will have > 3
You could also use NOT EXISTS here (or in the original query); the join just seemed easier to explain.
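For comparison, a NOT EXISTS version of the exact-match lookup might look like the sketch below. It is one possible formulation rather than the only one, run through Python's sqlite3 module against the four combinations from the question. The helper name find_exact and the CombinationId values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Combination (objectId INTEGER, combinationID INTEGER)")
conn.executemany(
    "INSERT INTO Combination VALUES (?, ?)",
    # {1,2} -> 1, {1,3,4} -> 2, {2,4} -> 3, {1,2,3,4} -> 4 (assumed ids)
    [(1, 1), (2, 1), (1, 2), (3, 2), (4, 2), (2, 3), (4, 3),
     (1, 4), (2, 4), (3, 4), (4, 4)],
)


def find_exact(conn, object_ids):
    """Return the id of the combination equal to object_ids, or None."""
    ids = sorted(set(object_ids))
    marks = ",".join("?" * len(ids))
    sql = f"""
        SELECT c.combinationID
        FROM Combination AS c
        WHERE c.objectId IN ({marks})
          -- reject combinations that hold any item outside the wanted set
          AND NOT EXISTS (
              SELECT 1 FROM Combination AS extra
              WHERE extra.combinationID = c.combinationID
                AND extra.objectId NOT IN ({marks}))
        GROUP BY c.combinationID
        HAVING COUNT(*) = ?
    """
    # Parameters: the ids for IN, the ids again for NOT IN, then the count.
    row = conn.execute(sql, ids + ids + [len(ids)]).fetchone()
    return None if row is None else row[0]


print(find_exact(conn, [1, 3, 4]))  # 2
print(find_exact(conn, [4, 2]))     # 3
print(find_exact(conn, [1, 3]))     # None
```

The WHERE clause drops every combination that contains an unwanted item, and the HAVING clause then checks that all wanted items are present.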
Finally you could also be fancy and have a single, simple query:
SELECT candidates.combinationID
FROM Combination AS candidates
INNER JOIN Combination AS allItems ON
(candidates.combinationID = allItems.combinationID)
WHERE candidates.objectId IN (1, 3, 4)
GROUP BY candidates.combinationID
HAVING COUNT(*) = 9 -- The number of items in the combination, squared
AND COUNT(DISTINCT candidates.objectId) = 3 -- All 3 items matched (rules out e.g. 1 match x 9 items)
So in other words, if we're looking for {1, 2}, and there's a combination with {1, 2, 3}, we'll have a {candidates, allItems} JOIN result of:
{1, 1}, {1, 2}, {1, 3}, {2, 1}, {2, 2}, {2, 3}
The extra 3 results in COUNT(*) being 6 rows after GROUPing, not 4, so we know that's not the combination we're after.
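The row counts can be checked with a small sketch using Python's sqlite3 module. The three combinations below are hypothetical data chosen for illustration: {1, 2}, {1, 2, 3}, and {1, 5, 6, 7}. The query searches for {1, 2} and reports, for each combination, the number of joined rows and the number of distinct wanted items that matched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Combination (objectId INTEGER, combinationID INTEGER)")
conn.executemany(
    "INSERT INTO Combination VALUES (?, ?)",
    # hypothetical data: 1 = {1,2}, 2 = {1,2,3}, 3 = {1,5,6,7}
    [(1, 1), (2, 1), (1, 2), (2, 2), (3, 2), (1, 3), (5, 3), (6, 3), (7, 3)],
)

sql = """
    SELECT candidates.combinationID,
           COUNT(*) AS joined_rows,
           COUNT(DISTINCT candidates.objectId) AS matched_items
    FROM Combination AS candidates
    INNER JOIN Combination AS allItems
        ON candidates.combinationID = allItems.combinationID
    WHERE candidates.objectId IN (1, 2)
    GROUP BY candidates.combinationID
    ORDER BY candidates.combinationID
"""
counts = conn.execute(sql).fetchall()
for combination_id, joined_rows, matched_items in counts:
    print(combination_id, joined_rows, matched_items)
# 1 4 2   <- {1,2}: 2 matched x 2 total, the exact match
# 2 6 2   <- {1,2,3}: 2 matched x 3 total, rejected by COUNT(*) = 4
# 3 4 1   <- {1,5,6,7}: 1 matched x 4 total, also 4 rows
```

On this data, COUNT(*) = 4 on its own would accept combination 3 as well as combination 1, because one matched item times four total items also gives four rows. Requiring the number of distinct matched items to equal the size of the wanted set rules that case out.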
This may be heresy, but for your usage scenarios it might work better to use a denormalized structure where you store the combinations themselves as some kind of composite (text) value:
CombinationId | Combination
---------------------------
1 | |1|2|
2 | |1|3|4|
If you make the rule that you always sort the ObjectIds when generating the composite value, it's easy to retrieve the Combination for a given set of Objects.
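A minimal sketch of that rule, using Python's sqlite3 module: the table layout follows the example above, while the helper names combination_key and find_combination_id and the UNIQUE constraint are assumptions added for illustration.

```python
import sqlite3


def combination_key(object_ids):
    """Build the canonical text key, e.g. [4, 1, 3] -> '|1|3|4|'."""
    ids = sorted(set(object_ids))
    return "|" + "|".join(str(i) for i in ids) + "|"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Combinations ("
    " CombinationId INTEGER PRIMARY KEY,"
    " Combination TEXT NOT NULL UNIQUE)"  # UNIQUE rejects duplicate sets
)
for objs in ([1, 2], [1, 3, 4], [2, 4], [1, 2, 3, 4]):
    conn.execute(
        "INSERT INTO Combinations (Combination) VALUES (?)",
        (combination_key(objs),),
    )


def find_combination_id(conn, object_ids):
    """Look up a combination by its canonical key; None if not stored."""
    row = conn.execute(
        "SELECT CombinationId FROM Combinations WHERE Combination = ?",
        (combination_key(object_ids),),
    ).fetchone()
    return None if row is None else row[0]


print(combination_key([4, 1, 3]))            # |1|3|4|
print(find_combination_id(conn, [4, 3, 1]))  # 2
print(find_combination_id(conn, [1, 3]))     # None
```

Because the key is always built from the sorted ids, the lookup becomes a single equality test that can use the index created by the UNIQUE constraint. The leading and trailing delimiters would also allow a membership search such as LIKE '%|3|%', though probably at the cost of a full table scan.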
Another option would be to use relation-valued attributes, which in SQL DBMSs are called multisets or nested tables.
Relation-valued attributes may make sense if there is no identifier for the set of objects other than the set itself. However, I don't think any SQL DBMS permits keys to be declared on columns of that type so that could be a problem if you don't have some alternative key you can use.
http://download.oracle.com/docs/cd/B10500_01/appdev.920/a96594/adobjbas.htm#458790