Efficient method of finding database rows that have *one or more* qualities from a list of seven qualities
For this question, I'm looking to see if anyone has a better idea of how to implement what I'm currently planning on implementing (below):
I'm keeping track of a set of images, using a database. Each image is represented by one row.
I want to be able to search for images, using a number of different search parameters. One of these parameters involves a search-by-color option. (The rest of the search stuff is currently working fine.)
Images in this database can contain up to seven colors:
-Red
-Orange
-Yellow
-Green
-Blue
-Indigo
-Violet
Here are some example user queries:"I want an image that contains red."
"I want an image that contains red and blue."
"I want an image that contains yellow and violet."
"I want an image that contains red, orange, yellow, green, blue, indigo and violet."
And so on. Users make this selection through the use of checkboxes in an html form. They can check zero checkboxes, all seven, and anything in between.I'm curious to hear what people think would be the most efficient way to perform this database search.
I have two possible options right now, but I feel like there must be something better that I'm not thinking of.
(Option 1)
-For each row, simply have seven additional fields in the database, one for each color. Each field holds a 1 or 0 (true/false) value, and I SELECT based on whatever the user has checked off. (I didn't like this solution so much, because it seemed kind of wasteful to add seven additional fields...especially since most pictures in this table will only have 3-4 colors max, though some could have up to 7. So that means I'm storing a lot of zeros.) Also, if I added more searchable colors later on (which I don't think I will, but it's always possible), I'd have to add more fields.(Option 2)
-For each image row, I could have a "colors" text field that stores space-separated color names (or numbers for the sake of compactness). Then I could do a fulltext match against search through the fields, selecting rows that contain "red yellow green" (or "1 3 4"). But I kind of didn't want to do fulltext searching because I already allow a keywor开发者_如何学Pythond search, and I didn't really want to do two fulltext searches per image search. Plus, if the database gets big, fulltext stuff might slow down.Any better options that I didn't think of?
Thanks!
Side Note: I'm using PHP to work with a MySQL database.
You can create a second table called colors.
colors = (color_id, name)
and a relationship table called image_colors.
image_colors = (image_id, color_id)
Then in the image_colors table you add a row for each color of each image.
image_colors
Image_id Color_id
1 1
2 3
2 4
So image 1 has a single color but image 2 has two colors.
To find an image with say color 4 and 5 you can then
select i.fileName, etc
from images i JOIN image_colors c ON
i.image_id = c.image_id
where
c.color_id = 4 OR
c.color_id = 5
The advantage of this solution is the ease of querying.
Assuming that the list of colors will never (or very very rarely) change, it is probably more efficient to add 7 additional columns to the images table. The join and de-duplication for each query that would come out of normalizing them would typically be more expensive than the extra I/O incurred by making the rows wider by 7 fields.
For what it's worth, if you ever get a chance to move to Oracle, bitmap indexes are built for exactly this kind of thing.
You could store an additional INT field and store 1 number representing a Flags bitwise definition of the colors associated. When searching you can just execute the same bitwise operation to query for a specific integer value.
Look at Mark Cain's comment on this MySql reference manual page for a good example of what I'm proposing:
http://dev.mysql.com/doc/refman/5.0/en/bit-functions.html
精彩评论