How to check uniqueness of many-to-many collection?
Here's the desired flow of my PHP app (yes, it's vague, but it's easier that way):
- User submits a set of, let's say, about 5 objects by integer IDs. (It'll really be more like 15, but let's say 5 for ease.)
- App checks if this collection has been submitted before, and saves it in a MySQL database if not
- App saves these objects in the database, if they haven't been saved before
(Objects and collections are many-to-many, so there is an objects table, a collections table, and a table relating the two.)
A couple sample flows:
- User submits 111, 112, 113, 114
  - This set is new! The collection is saved.
  - We've already seen objects 111 and 112, so we only fetch and save the data for 113 and 114.
- Another user submits 111, 112, 113, 114
  - We've seen this collection before. Don't bother saving.
  - Since we've seen the collection, we've obviously seen the objects. Don't bother saving.
Steps 1 and 3 are simple. Step 2 is where I'm not sure how to proceed. It seems unnecessarily database-heavy to be querying the relationship for sets containing those exact IDs, so I'm about to post a few obvious solutions such as a simple ID list and hashing, but I'd also like to know if there are more ideal solutions out there.
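For reference, the relational check I'd like to avoid could look roughly like the sketch below. The table name collection_objects and its columns collection_id and object_id are placeholders for my relating table, and the query assumes one row per (collection, object) pair and a non-empty ID list.

```php
<?php
// Hypothetical schema: collection_objects(collection_id, object_id),
// with at most one row per pair. Names are illustrative only.
function exactSetQuery(array $ids): array
{
    // Normalize the submitted IDs: integers only, no duplicates.
    $ids = array_values(array_unique(array_map('intval', $ids)));
    $placeholders = implode(',', array_fill(0, count($ids), '?'));

    // A collection matches when it has exactly count($ids) rows
    // and every one of them is in the submitted list.
    $sql = "SELECT collection_id FROM collection_objects"
         . " GROUP BY collection_id"
         . " HAVING COUNT(*) = ? AND SUM(object_id IN ($placeholders)) = ?";

    // Parameters in placeholder order: row count, the IDs, match count.
    $params = array_merge([count($ids)], $ids, [count($ids)]);

    return [$sql, $params];
}

[$sql, $params] = exactSetQuery([111, 112, 113, 114]);
```

The returned pair is meant to be passed to a prepared statement. The query has to aggregate the relating table for every check, which is the cost I'm hoping to sidestep.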
Thanks!
One obvious solution is to save a list of child IDs in the collection's database row, e.g. 1111,1112,1113,1114,1115. It's quick, easy, and guarantees uniqueness, but I imagine it would be unnecessarily CPU-heavy to compare lists like this, especially if they get long, not to mention that it's a fair amount of duplication when it comes to storage space.
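A minimal sketch of building such a list, assuming the order of submission doesn't matter and duplicate IDs should count once. The function name canonicalIdList is made up for illustration.

```php
<?php
// Hypothetical helper: builds one canonical string for a set of integer IDs.
function canonicalIdList(array $ids): string
{
    // Cast to int and drop duplicates so "111" and 111 are the same ID.
    $ids = array_unique(array_map('intval', $ids));
    // Sort numerically so the submission order never changes the result.
    sort($ids, SORT_NUMERIC);
    return implode(',', $ids);
}

$first  = canonicalIdList([113, 111, 114, 112]);
$second = canonicalIdList(['111', '112', '113', '114']);
// Both variables now hold the same string.
```

Without the sort, the same set submitted in a different order would produce a different string and be treated as new. If the column carries a UNIQUE index, MySQL does the comparison on insert instead of PHP.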
Another obvious answer is to take the same concatenated string and take the MD5 hash - quick and easy, but runs the (granted, small) risk of collision. Is hashing worth using by nature of shorter strings, even though I could easily have a 100% guarantee of never having collisions by not hashing?
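A rough sketch of the hashing variant, under the same assumptions (order doesn't matter, duplicates count once). collectionFingerprint is a made-up name, and the SHA-256 default is my own swap-in; passing 'md5' gives the MD5 version described above.

```php
<?php
// Hypothetical helper: fixed-length fingerprint of a set of integer IDs.
function collectionFingerprint(array $ids, string $algo = 'sha256'): string
{
    // Same canonical form as a plain ID list: ints, unique, sorted.
    $ids = array_unique(array_map('intval', $ids));
    sort($ids, SORT_NUMERIC);
    // hash() returns lowercase hex; the length depends only on the algorithm.
    return hash($algo, implode(',', $ids));
}

$a = collectionFingerprint([111, 112, 113, 114]);
$b = collectionFingerprint([114, 113, 112, 111]);
$c = collectionFingerprint([111, 112, 113, 115]);
// $a and $b are identical; $c differs.
```

The digest stays the same length no matter how many IDs are in the set, which is what makes it cheaper to index and compare. To keep the 100% guarantee, the hash could serve only as a fast lookup, with the actual IDs compared when a hash match is found.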
In my application, I do it this way:
- After the user submits, and before writing anything to the database, I fetch the existing data from the database into an array. In your example above, I would have
$collection = array('111', '112', '113', '114');
I then check the new user input in two steps. The first step checks whether each submitted value is already in the database. If it is not, insert it; otherwise ignore it:
foreach ($inputs as $input) {
    if (!in_array($input, $collection)) {
        // do insert here
    }
}
Then, in a second loop, I do the reverse and delete the stored data that the user did not select:
foreach ($collection as $data) {
    if (!in_array($data, $inputs)) {
        // do delete here
    }
}
In your case, you might or might not need the second loop. I needed it because my inputs are checkboxes that the user can activate or deactivate, so I translate those changes into inserts and deletes.
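The two loops can also be written with PHP's built-in array_diff. This is only a sketch: diffSelections is a made-up name, and the sample values are illustrative.

```php
<?php
// Hypothetical helper: splits a submission into IDs to insert and IDs to delete.
function diffSelections(array $stored, array $inputs): array
{
    return [
        // Submitted but not yet stored: candidates for insert.
        'insert' => array_values(array_diff($inputs, $stored)),
        // Stored but no longer submitted: candidates for delete.
        'delete' => array_values(array_diff($stored, $inputs)),
    ];
}

$collection = ['111', '112', '113', '114'];
$inputs     = ['113', '114', '115'];
$changes    = diffSelections($collection, $inputs);
// $changes['insert'] is ['115']; $changes['delete'] is ['111', '112']
```

array_diff compares values by their string representation, so '111' and 111 are treated as the same ID.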