How to improve speed of thousand of inserts in a Nested Set model?
I'm using MySQL. I want to insert 100.000 places who belong to a Nested Set Model with different levels (root_id, lft, rgt, level). I开发者_高级运维f I didn't have to worry about the tree, I would do a bulk insert; but in this case, I have to do 3 queries for each object on this order:
foreach ( $places as $place )
{
# get parent place from $ancestor_array ( woeid => ids )
$parent = $ancestors[:parent_woeid]
update place set lft = lft + 2 where root_id = :root_id and lft >= :obj_lft;
update place set rgt = rgt + 2 where root_id = :root_id and rgt >= :obj_lft;
insert into place (..., root_id, lft, rgt, level, ...) values (..., :obj_root_id, :obj_lft, :obj_rgt, :obj_level, ...);
...
}
That takes a long time... so it's time to try to be smarter. This is what I'm thinking:
foreach ( $places as $place )
{
# get parent place from $ancestor_array ( woeid => ids )
$parent = $ancestors[:parent_woeid]
$new_admins[] = array('parent_woeid' => :parent_woeid, ...data place....)
$woeids[] = :parent_woeid;
}
# lock + bulk insert of $new_admins with lft=rgt=level=root_id= null + unlock
insert into place (...) values (...), (...), (...), ...., (...)
# get ids of those places
select from place where woeid in(:array_woeids)
# lock + bulk update with 3 updates per place + unlock
update place set lft= lft + 2 where root_id = :root_id and lft >= :obj_lft;
update place set rgt = rgt + 2 where root_id = :root_id and rgt >= :obj_lft;
update place set lft=:obj_lft, rgt=:obj_rgt where id=:id
# i have to update the ancestors lft & rgt values on the $ancestors_array,
# because we have them "cached" in the array in order to avoid doing a
# select per inserted place.
What do you think? How would you do it? Would you save all the inserts and updates into a file and use the LOAD DATA INFILE
syntax instead of this option?
Is there any other option I'm missing?
thanks a lot!
Wouldn't it be more efficient to just insert the items and recalculate all the lft/rgt indexes only when all items are inserted?
However to do that you have to know parent ID of each item, so you have something to base the indexes on. I personally had success keeping lft + lrg + parentId with each data item, it makes maintenance much easier (you can do trivial recalculation/integrity check at any time).
精彩评论