Parameters to tune when retrieving a lot of small BLOBs (2-10 KB) from Oracle DB?
We have a table in which we store millions of small BLOBs (2-10 KB). We need to access a portion of this table at a time, iterating through it while retrieving all of these BLOBs (such a "portion" is usually on the order of 10^5 rows).
We use Oracle 11g R1.
Right now I am thinking about tuning OracleConnection.setDefaultRowPrefetch() and giving the connection a hint that it is read-only.
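A minimal sketch of that connection-level tuning (the connection details and the prefetch value of 500 are placeholders, not recommendations):

import java.sql.DriverManager;

import oracle.jdbc.OracleConnection;

public class FetchTuning {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details -- substitute your own.
        OracleConnection conn = (OracleConnection) DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/SERVICE", "user", "password");
        try {
            // Fetch many rows per network round trip instead of the default 10.
            // 500 is an arbitrary starting point: each prefetched row also
            // buffers its inline LOB data, so tune this against your heap.
            conn.setDefaultRowPrefetch(500);

            // Standard JDBC hint that this connection will not modify data.
            conn.setReadOnly(true);

            // ... run the extraction query here ...
        } finally {
            conn.close();
        }
    }
}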
What other tuning is possible?
Also, does anyone have experience handling small BLOBs with Oracle 11g R2? I have been told that it is better optimized than R1 for handling small BLOBs, but I am wondering whether it is worth trying.
Any advice is appreciated!
EDIT: All rows in the "portion" will be used -- they will be processed into a special big binary file that will be consumed by another entity, so the iteration itself cannot be avoided.
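To make the processing concrete, here is a rough sketch of the kind of streaming loop involved (the table name BLOB_TABLE, key column ID, and batch bounds are illustrative, not our real schema; STUFF is the LOB column from the DDL below). Streaming each BLOB straight into the output file avoids materializing byte arrays:

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class BlobExporter {
    // Writes every BLOB in one "portion" into a single binary file.
    static void exportPortion(Connection conn, long fromId, long toId,
                              String outFile) throws Exception {
        String sql = "SELECT stuff FROM blob_table "
                   + "WHERE id BETWEEN ? AND ? ORDER BY id";
        PreparedStatement ps = conn.prepareStatement(sql);
        OutputStream out = new BufferedOutputStream(new FileOutputStream(outFile));
        try {
            ps.setLong(1, fromId);
            ps.setLong(2, toId);
            ResultSet rs = ps.executeQuery();
            byte[] buf = new byte[8192];
            while (rs.next()) {
                // getBinaryStream avoids loading the whole BLOB at once.
                InputStream in = rs.getBinaryStream(1);
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                in.close();
            }
            rs.close();
        } finally {
            out.close();
            ps.close();
        }
    }
}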
EDIT: Our current DDL (partial, modified):
TABLESPACE "BLOB_STUFF" LOB(
"STUFF"
)
STORE AS BASICFILE "BLOBS_FILE"(
TABLESPACE "BLOBS" ENABLE STORAGE IN ROW CHUNK 8192 NOCACHE LOGGING STORAGE (INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
);
Something to consider at the Oracle level:
Ensure the LOB column is created (or altered) with the CACHE and ENABLE STORAGE IN ROW clauses -- otherwise every read of every row will require two direct path reads, which will be slooooow. With ENABLE STORAGE IN ROW, smaller LOBs (up to roughly 4 KB) are stored inline with the row, and only larger LOBs are stored out-of-line.
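For example, something along these lines (the table name BLOB_TABLE is hypothetical; the column and segment names are taken from your DDL). Note that CACHE can be switched on in place, while ENABLE STORAGE IN ROW is fixed at creation time and can only be changed by rebuilding the segment:

-- Quick, in-place change from NOCACHE to CACHE:
ALTER TABLE blob_table MODIFY LOB ("STUFF") (CACHE);

-- Changing ENABLE STORAGE IN ROW means rebuilding the LOB segment
-- (the table is locked while the MOVE runs):
ALTER TABLE blob_table MOVE
  LOB ("STUFF") STORE AS BASICFILE "BLOBS_FILE" (
    TABLESPACE "BLOBS" ENABLE STORAGE IN ROW CACHE
  );

Your posted DDL already has ENABLE STORAGE IN ROW, so the NOCACHE clause is the one to revisit.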
"we need to access a portion of this table at a time, iterating through it while retrieving all these BLOBs"
There is a common anti-pattern among OO programmers: writing the application so that it grabs a slack handful of rows from the database and then iterates through them in the middle tier to winnow the required rows from the chaff. A more efficient approach is to write a focused query that retrieves just the precise rows, as illustrated below.
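To illustrate the difference (table and column names are hypothetical):

-- Anti-pattern: drag every row across the network, filter in the middle tier.
SELECT id, stuff FROM blob_table;

-- Focused query: let the database do the winnowing.
SELECT id, stuff
  FROM blob_table
 WHERE batch_id = :current_batch;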
If this is not what you're doing, you should edit your question to clarify your processing. If it is what you're doing, you should explain why you need to iterate through one hundred thousand records at a time.