Someone asked me to help identify a strange problem. The problem turned out to be a combination of two issues, ASSM and the BLOB loading software, which manifested as excessive CPU utilization. In this post we will see how different write buffer sizes can affect write efficiency.
Unfortunately, Tkprof and Statspack often fail to point in the proper direction when it comes to profiling activity performed by a Call-Less Cursor. In this case such a cursor was responsible for writing and reading BLOBs. The issue was not specific to the JDBC API; it was later reproduced with the DBMS_LOB API as well.
As an additional point on the topic mentioned above, here is an example of Tkprof and Statspack reports from the system where this issue was reproduced with the help of a single-threaded application in an otherwise idle environment.
Tkprof
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse       16      0.00       0.02          0          0          0          0
Execute     23      3.09       4.57          6      13894      18366         17
Fetch        3      0.00       0.00          0          6          0          3
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total       42      3.10       4.59          6      13900      18366         20
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse   230212      5.39       5.14          0          0          0          0
Execute 230215    176.87     173.89         47     692034     175390     176321
Fetch   153422      4.22       3.96          5     230821          3      77048
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total   613849    186.49     183.01         52     922855     175393     253369
Statspack
Load Profile
~~~~~~~~~~~~                          Per Second       Per Transaction
                                 ---------------       ---------------
           Redo size:               1,553,489.88        668,736,513.89
        Logical reads:                630,504.58        271,415,631.42
        Block changes:                    782.47            336,833.37
       Physical reads:                    171.33             73,752.42
      Physical writes:                    181.92             78,311.00
           User calls:                     42.80             18,424.58
               Parses:                     44.40             19,114.63
          Hard parses:                      0.00                  1.00
                Sorts:                      0.39                169.89
               Logons:                      0.00                  0.84
             Executes:                     44.94             19,344.89
         Transactions:                      0.00
Top 5 Timed Events                                                    Avg %Total
~~~~~~~~~~~~~~~~~~                                                   wait   Call
Event                                            Waits    Time (s)   (ms)   Time
----------------------------------------- ------------ ----------- ------ ------
CPU time                                                      7,764         95.9
log file parallel write                         85,190          147      2   1.8
log file switch (checkpoint incomplete)            246           88    357   1.1
log file switch completion                         502           76    151    .9
control file parallel write                      7,973            9      1    .1
-------------------------------------------------------------
Instance Activity Stats
Statistic                                      Total     per Second    per Trans
--------------------------------- ------------------ -------------- ------------
...
db block gets                          5,116,321,961      625,543.7 ############
db block gets direct                       1,442,703          176.4     75,931.7
db block gets from cache               5,114,879,258      625,367.3 ############
...
physical writes                            1,487,909          181.9     78,311.0
physical writes direct                     1,444,083          176.6     76,004.4
physical writes direct (lob)               1,442,684          176.4     75,930.7
session logical reads                  5,156,896,997      630,504.6 ############
...
Segments by Logical Reads
                                           Subobject          Logical   Pct
Owner      Tablespace Object Name          Name       Type      Reads  Total
---------- ---------- -------------------- ---------- ----- --------- -----
SYS        SYSTEM     SEG$                            TABLE   434,992  23.6
SYS        SYSTEM     TSQ$                            TABLE   432,816  23.5
TEST       TEST       SYS_LOB0000056448C00            LOB     289,808  15.7
SYS        SYSTEM     I_FILE#_BLOCK#                  INDEX   288,448  15.7
SYS        SYSTEM     FILE$                           TABLE   220,416  12.0
-------------------------------------------------------------
As can be seen, the number of session logical reads in the test does not align well with the figures in the Segments by Logical Reads section, and Tkprof does not even have a clue about the 7K CPU seconds.
But let’s get back to business. What could be happening that triggered such a vast amount of reads when writing BLOBs? A call to the vendor of the software revealed that the BLOB data is written in chunks, and for each chunk the BLOB is closed and reset with an updated position. So we duplicated this behavior and got the following statistics using “runstats” and DbmsOutput.java shared by Tom Kyte and the Java API shared by R. M. Menon on AskTom.
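To illustrate, here is a minimal JDBC sketch of the write pattern described above, where the locator is re-fetched and the output stream closed after every chunk. The table t, its columns and the buffer size are hypothetical stand-ins, error handling is omitted, and auto-commit is assumed to be disabled:

// A minimal sketch (assumed schema: table t(id NUMBER, data BLOB)) of the
// vendor's pattern: for every chunk the locator is re-selected and the
// output stream is closed, i.e. the BLOB is "closed and reset" each time.
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class ChunkedBlobWrite {

    static void load(Connection con, InputStream src, int bufSize)
            throws Exception {
        byte[] buf = new byte[bufSize];
        long pos = 1;                            // LOB positions are 1-based
        int n;
        while ((n = src.read(buf)) > 0) {
            Statement st = con.createStatement();
            ResultSet rs = st.executeQuery(
                    "SELECT data FROM t WHERE id = 1 FOR UPDATE");
            rs.next();
            java.sql.Blob blob = rs.getBlob(1);
            OutputStream out = blob.setBinaryStream(pos);
            out.write(buf, 0, n);
            out.close();                         // closed after every chunk
            rs.close();
            st.close();
            pos += n;                            // reopened at a new position
        }
        con.commit();
    }
}

The statistics below come from repeating this pattern with various chunk and write buffer sizes.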
BLOB...chunk size                    32,768 32,768 32,768 32,768 32,768 32,768 32,768 32,768 32,768
BLOB...write buffer size              2,048  4,096  8,132  8,192 16,264 16,384 24,396 32,528 32,768
STAT...consistent gets                  971    375    119    319     51    234     27    190    222
STAT...consistent gets direct           252    159     52     79     16     32      9      4     20
STAT...db block gets                  6,146    884    423    479    209    268    109    100    208
STAT...db block gets direct             268    140     64     76     32     44     32     16     36
STAT...physical reads direct (lob)      252    124     48     60     16     28     16      0     20
STAT...physical writes direct (lob)     268    140     64     76     32     44     32     16     36
STAT...session logical reads          7,117  1,259    542    798    260    502    136    290    430

BLOB...chunk size                    16,384 16,384 16,384 16,384 16,384 16,384 16,384 16,384 16,384
BLOB...write buffer size              2,048  4,096  8,132  8,192 16,264 16,384 24,396 32,528 32,768
STAT...consistent gets                  449    222     99    213     47     74     25     35     63
STAT...consistent gets direct           126     80     24     53      8     22      3      8     19
STAT...db block gets                  3,904    733    397    813    182    295    113    143    250
STAT...db block gets direct             142     78     32     46     16     30     22     16     26
STAT...physical reads direct (lob)      126     62     16     30      0     14      6      0     10
STAT...physical writes direct (lob)     142     78     32     46     16     30     22     16     26
STAT...session logical reads          4,353    955    496  1,026    229    369    138    178    313

BLOB...chunk size                     8,192  8,192  8,192  8,192  8,192  8,192  8,192  8,192  8,192
BLOB...write buffer size              2,048  4,096  8,132  8,192 16,264 16,384 24,396 32,528 32,768
STAT...consistent gets                  347    264     95    128     71     91     23     17     39
STAT...consistent gets direct            63     78     16     35     16     27      0      0      7
STAT...db block gets                  2,657    996    349    468    273    369    103     85    161
STAT...db block gets direct              79     47     16     31     16     23     16     16     21
STAT...physical reads direct (lob)       63     31      0     15      0      7      0      0      5
STAT...physical writes direct (lob)      79     47     16     31     16     23     16     16     21
STAT...session logical reads          3,004  1,260    444    596    344    460    126    102    200
Looking at the physical reads direct (lob) statistic, we can see that for each BLOB chunk size there are certain write buffer sizes that do not produce any additional reading. For our tests with chunk sizes of 8K, 16K and 32K, these local minimum points can be derived from the BLOB chunk size and a per-chunk overhead as N*(chunk size − K), where N is any natural number and K is an overhead that depends on the chunk size: 60 bytes for 8K chunks, 120 bytes for 16K and 240 bytes for 32K. At these points the amount of physical writes direct (lob) is minimal and, interestingly, depends only on the size of the incoming data and the block size, as in BLOB size/block size. From this we can see that each BLOB chunk is written only once.
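To make the arithmetic concrete, here is a small helper that prints the first few "good" buffer sizes N*(chunk size − K); the K overheads are the empirical values from our tests, not documented figures:

// Prints buffer sizes N * (chunkSize - K) that showed no extra
// physical reads direct (lob) in the tests above. The K overheads
// (60, 120 and 240 bytes) are empirical values from those tests.
public class GoodBufferSizes {
    public static void main(String[] args) {
        int[][] chunkAndOverhead = { { 8192, 60 }, { 16384, 120 }, { 32768, 240 } };
        for (int[] c : chunkAndOverhead) {
            System.out.printf("chunk %5d:", c[0]);
            for (int n = 1; n <= 4; n++) {
                System.out.printf(" %6d", n * (c[0] - c[1]));
            }
            System.out.println();
        }
    }
}

For the 8K chunk this prints 8132, 16264, 24396 and 32528, exactly the buffer sizes that show zero physical reads direct (lob) in the 8K table above.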
Here is one more interesting thing to mention, although it is not confirmed. The number of additional operations in the other cases exactly matches the following scenario, which can be described as rewriting of BLOB chunks: each ongoing operation reads back the entire chunk left incomplete by the previous write, appends the data from the current buffer, and stores all of the chunk’s blocks back into the database. This also causes a significant degradation of the ratio of LIO to the effective amount of written data, a ratio that improves as the write buffer size increases. This behavior was one of the culprits of the original issue. Of course, not closing the BLOB stream on each write would be a better way, as shown in the sketch below. However, sometimes it isn’t possible to rewrite the code; if an application can be configured to use a buffer of a certain length, the issue can be alleviated.
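For comparison, here is a minimal sketch of the better pattern: the locator is selected once and a single stream stays open for the whole load (same hypothetical table t as above):

// Better pattern: keep one OutputStream open for the entire load, so no
// partially filled chunk ever has to be re-read and rewritten.
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class StreamedBlobWrite {

    static void load(Connection con, InputStream src) throws Exception {
        Statement st = con.createStatement();
        ResultSet rs = st.executeQuery(
                "SELECT data FROM t WHERE id = 1 FOR UPDATE");
        rs.next();
        java.sql.Blob blob = rs.getBlob(1);
        OutputStream out = blob.setBinaryStream(1);   // opened once
        byte[] buf = new byte[32528];                 // one of the "good" sizes
        int n;
        while ((n = src.read(buf)) > 0) {
            out.write(buf, 0, n);                     // no close between chunks
        }
        out.close();
        rs.close();
        st.close();
        con.commit();
    }
}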
ASSM played far from the least role in this situation. It is known for causing process “spinning” behavior. In our case the session was sitting in the following stack:
#0 0x000000000159fa03 in kcbgcur ()
#1 0x0000000001007f3b in ktugusc ()
#2 0x0000000001019032 in ktugti () - KTU: Kernel Transaction Undo Get Transaction table entry Information
#3 0x0000000001047e66 in ktbIsItlEntryCommitted ()
#4 0x000000000104db76 in ktrIsItlEntryCommitted ()
#5 0x00000000027cfa2e in kdlxgs_init () - reclaim space from transaction freelist in index
#6 0x0000000000bfab81 in ktsplbfmb ()
#7 0x0000000000bfbd80 in ktsplbrecl ()
#8 0x0000000000bd36e9 in ktspgsp_cbk1 ()
#9 0x00000000027e740f in kdlgsp_init () - Space management batching
#10 0x00000000027e45cb in kdl_write1 ()
#11 0x0000000001d2ece3 in koklwrite () - KOK Lob WRITE
#12 0x00000000022f2094 in kpolob () - KPP Lob operations
#13 0x0000000000711ef8 in opiodr ()
#14 0x00000000039a973b in ttcpip ()
#15 0x000000000070df90 in opitsk ()
#16 0x0000000000710e36 in opiino ()
#17 0x0000000000711ef8 in opiodr ()
#18 0x000000000070bc23 in opidrv ()
#19 0x000000000070a0ce in sou2o ()
#20 0x00000000006d008b in opimai_real ()
#21 0x00000000006cffdc in main ()
Moving hot BLOBs to tablespaces with manual segment space management further improved the situation.
Here are a few papers on Metalink that could be helpful in resolving similar BLOB issues:
Note 162345.1 “LOBS – Storage, Read-consistency and Rollback”
Note 66431.1 “LOBS – Storage, Redo and Performance Issues”
Note 268476.1 “LOB Performance Guideline”
Bug 5253061 “LOB INSERT PERFORMANCE DIFFERS AFTER INSERT DEPENDING ON COMMIT”
Bug 6128525 “ELAPSED TIME IS TOO HIGHER THAN ‘WAITED TIME + CPU TIME’ IN ASSM LOB ACCESS”
Bug 5131464 “10.1.0.4 RDBMS 10.1.0.4 SPACE PRODID-5 PORTID-226”