开发者

Java Batch Insert into MySQL very slow [duplicate]

This question already has answers here: Closed 10 years ago.

Possible Duplicate:

JDBC batch insert performance

I have the following method in a class:

    public void insertShingleSets(Vector<ShingleSet> shingleSets)
    {
        String sql = "INSERT   INTO tblPostingsShingles("+
                "rowId, " +                    
                "shingle) " +                     
                "VALUES(?,?);";

        PreparedStatement statement = null;
        try {
            statement = conn.prepareStatement(sql);
            for (int i = 0; i < shingleSets.size(); i++)
            {   String id = shingleSets.get(i).getRowId();
                String shingle = shingleSets.get(i).getShingle();
                statement.setInt(1, Integer.parseInt(id));
                statement.setString(2, shingle);
                statement.addBatch();
                if ((i + 1) % 1000 == 0) {
                    System.out.println("doing a batch " + i);         //-------------
                    statement.execute开发者_如何学编程Batch(); // Execute every 1000 items.
                    System.out.println("done batch " + i);            //-------------
                             }
            }
            statement.executeBatch();
        }
       catch (SQLException ex)
            {
                // handle any errors
                System.out.println("SQLException: " + ex.getMessage());
                System.out.println("SQLState: " + ex.getSQLState());
                System.out.println("VendorError: " + ex.getErrorCode());
            }

        finally {
            if (statement != null)
                try { statement.close();
                } catch (SQLException logOrIgnore) {}
            if (conn != null)
                try { conn.close();
                } catch (SQLException logOrIgnore) {}
        }
    }

The time between executing: System.out.println("doing a batch " + i); and executing: System.out.println("done batch " + i); is about 30 seconds which seems rather a lot considering it's only inserting two columns into a three column table (the other column is an autonumber primary key and when starting/testing there were no rows in the table). The only thing I can think of is that before this method is called, another method uses a query that checks in the table tblPostingsShingles to check if certain rowIds are there. However, I would have thought that any locks would be released when that method finishes (it has the same finally clause as this method). Any advice would be greatly appreciated. Dave


You can experiment with batch sizes, but I've had processes that seemed quite fast at 500 a batch and 1000 is not overly large - there's no obvious reason that I can see that that would take so long - it's easily an order of magnitude too slow.

Other possible performance bottlenecks are your mysql configuration and network connection speed. Is this speed any faster than doing a single write at a time?


I would suggest experimenting with a smaller batch size. Generally, when I do batch inserts I limit the size to 20 - 50 rows at a time. Right now with 1000 row batch, your job is likely blocking on database and network I/O. If you cut down the batch size you may be able to speed up the process.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜