Jul 9, 2012

Concurrency waits "library cache: mutex x" for sequence

Some background to start with –

We have n-tier J2EE web applications and we use spring jdbc templates. We use connection pool, however we don’t use statement caching. We have a statement like this which runs about 10 to 30 times per second at the moment and commits as well.

INSERT INTO lb_stat (id, lb_id, login_id, read_date, lb_ufid, lb_created) VALUES (lb_status_seq.NEXTVAL, :1, :2, :3, :4, :5);

The table where we insert, is a nested partitioned one (range- hash). We have one unique local index on that table. Currently the table contains around 51 million records.


Most of the times, I see some concurrency waits associated with this insert statement when I look at OEM Top Activity. The waits are on “library cache: mutex x”. Sometimes this problem goes very severe and database load spikes and connection pool exhausts. I made sure we did not do anything wrong which allure Oracle to behave like this. I have checked, as we are using bind variables properly, the statement is not producing many child cursors, and not invalidating those child. People have been saying, there might have mutex problems while there are many child cursors – which is not our case, anyways.

We have been using Oracle version 11.2.0.2.0 fresh - no patch yet. My guess is - probably we hit the bug on this - Bug 13524966 or Bug 12966939 or something similar, there are tons of mutex bug reported to Oracle support.

To analyze further, I looked at the web and found some really nice articles on mutex problem. Here is one -

http://andreynikolaev.wordpress.com/2011/05/01/divide-and-conquer-the-true-mutex-contention

To conclude that it's a bug - I exchanged a couple of mails with the writer who is an expert on mutex problem - thanks to Andrey! :-)

It finally revealed that, when there are many calls on that particular  sequence, the mutex holds on that and holding time sometimes over 1 ms - which is huge in mutex world. We planned to get rid of the sequence in next release. Also, will be upgrading to 11.2.0.3.0 soon.


Here is some handy scripts I collected to initiate the mutex waits investigation.

/* Library Cache Object heavily accessed */

col name format a20
col cursor format a12 noprint
col type format a7
col LOCKED_TOTAL heading Locked format 99999
col PINNED_TOTAL heading Pinned format 99999999
col EXECUTIONS heading Executed format 99999999
col NAMESPACE heading Nsp format 999
set wrap on
set linesize 200

set pagesize 1000
 

select * from (
select case when (kglhdadr =  kglhdpar) then 'Parent' else 'Child '||kglobt09 end cursor,
              kglhdadr ADDRESS,substr(kglnaobj,1,20) name, kglnahsh hash_value,kglobtyd type,kglobt23 LOCKED_TOTAL,kglobt24 PINNED_TOTAL,kglhdexc EXECUTIONS,kglhdnsp NAMESPACE
                from x$kglob
                order by kglobt24 desc)
    where rownum <= 10;


ADDRESS          NAME                 HASH_VALUE TYPE    Locked    Pinned  Executed  Nsp                                                                                                               
---------------- -------------------- ---------- ------- ------ --------- --------- ----                                                                                                               
00000005D1875E38 SELECT 'x' FROM DUAL 4261573303 CURSOR  ###### #########   5232320    0                                                                                                               
00000006119FB278 UA_SEQ               1535242682 SEQUENC ###### #########         0    1                                                                                                               
                                                 E                                                                                                                                                     
                                                                                                                                                                                                       
000000061190AF30 select login0_.id as  459939420 CURSOR  ######  65289824   5911243    0                                                                                                               
000000064F4643F0 MGMT_NOTIFY_Q        1523339315 QUEUE   ######  63487340       249   10                                                                                                               
00000005F66D2308 select login0_.id as  459939420 CURSOR  ######  58493935  12382999    0                                                                                                               
000000065FACDD90 select /*+ connect_b  487379649 CURSOR  ######  47798467   6782438    0                                                                                                               
00000005AE8F3F68 LB_STATUS_SEQ        3672626424 SEQUENC ######  40551649         0    1                                                                                                               
                                                 E                                                            
                                                            
/* Mutex ASH Wsits in last 30 min */

col sample_id noprint format 9999999
col sample_time format a8
col session_id heading "SID" format 9999
col session_serial# noprint format 9999
col event format a19
col blocking_sid heading "BLKS" format 9999
col shared_refcount noprint heading "RFC" format 99
col location_id heading "LOC" format 99
col sleeps  noprint format 99999
col mutex_object format a15
set pagesize 1000

set wrap off
 

SELECT sample_id,to_char(sample_time,'hh24:mi:ss') sample_time,session_id,session_serial#,sql_id,event,p1 IDN,
             FLOOR (p2/POWER(2,4*ws)) blocking_sid,MOD (p2,POWER(2,4*ws)) shared_refcount,
             FLOOR (p3/POWER (2,4*ws)) location_id,MOD (p3,POWER(2,4*ws)) sleeps,
             CASE WHEN (event LIKE 'library cache:%' AND p1 <= power(2,17)) THEN  'library cache bucket: '||p1
                 ELSE  (SELECT kglnaobj FROM x$kglob WHERE kglnahsh=p1 AND (kglhdadr = kglhdpar) and rownum=1) END mutex_object
            FROM (SELECT DECODE (INSTR (banner, '64'), 0, '4', '8') ws FROM v$version WHERE ROWNUM = 1) wordsize,
                   v$active_session_history
            WHERE p1text='idn' AND session_state='WAITING'
            AND sample_time BETWEEN (sysdate - 30/1440) AND sysdate
           ORDER BY sample_id;



SAMPLE_T   SID SQL_ID        EVENT                      IDN  BLKS LOC MUTEX_OBJECT                                                                                                                     
-------- ----- ------------- ------------------- ---------- ----- --- ---------------                  

07:26:55  1207 09h36susy1t4n library cache: mute 3672626424   162   0 LB_STATUS_SEQ                                                                                                                    
07:27:05  1487 09h36susy1t4n library cache: mute 3672626424  1476   0 LB_STATUS_SEQ                                                                                                                    
07:28:57  1480               library cache: mute 2321070391   658   0 PROD                                                                                                                           
07:28:57   658               library cache: mute 2321070391  1429   0 PROD                                                                                                                            
07:28:58  1429               library cache: mute 2924456787  1480   0 DATABASE                                                                                                                         
07:29:19   637 09h36susy1t4n library cache: mute 3672626424   435   0 LB_STATUS_SEQ                                                                                                                    
07:32:18   171 09h36susy1t4n library cache: mute 3672626424   373   0 LB_STATUS_SEQ                                                                                                                    
07:33:20  1426 bg7mbt0mq5j75 library cache: mute 2321070391   645   0 PROD                                                                                                                            
07:33:58   931               library cache: mute 2321070391  1003   0 PROD                                                                                                                            
07:33:58   803               library cache: mute 2321070391   931   0 PROD                                                                                                                            
07:34:26   236 09h36susy1t4n library cache: mute 3672626424   715   0 LB_STATUS_SEQ                                                                                                                    
07:38:57  1272               library cache: mute 2321070391   914   0 PROD                                                                                                                            
07:38:57   449               library cache: mute 2321070391   914   0 PROD                                                                                                                            
07:39:43   517 09h36susy1t4n library cache: mute 3672626424   804   0 LB_STATUS_SEQ                                                                                                                    
07:39:46   507 09h36susy1t4n library cache: mute 3672626424  1004   0 LB_STATUS_SEQ                                                                                                                    
07:39:53   665 09h36susy1t4n library cache: mute 3672626424   914   0 LB_STATUS_SEQ                                                                                                                    
07:40:40   592               library cache: mute 2924456787   720   0 DATABASE                                                                                                                         
07:41:06   990 09h36susy1t4n library cache: mute 3672626424   221   0 LB_STATUS_SEQ      
 

                                                                          
/* Mutex impact */

select event, p1,cnt,
    CASE WHEN (event LIKE 'library cache:%' AND p1 <= power(2,17)) THEN
    'library cache bucket: '||p1   ELSE  (SELECT kglnaobj FROM x$kglob WHERE
    kglnahsh=p1 AND (kglhdadr = kglhdpar) and rownum=1) END mutex_object
    from (select p1,event,count(*) cnt from v$active_session_history where
    p1text='idn' and session_state='WAITING'
    group by p1,event)
    order by cnt desc;


EVENT                       P1        CNT MUTEX_OBJECT                                                                                                                                                 
------------------- ---------- ---------- ---------------                                                                                                                                              
library cache: mute 2321070391        286 PROD                                                                                                                                                       
library cache: mute 2924456787        269 DATABASE                                                                                                                                                     
library cache: mute 3672626424        100 LB_STATUS_SEQ                                                                                                                                                
library cache: mute 1535242682         16 UA_SEQ                                                                                                                                                       
library cache: mute  370351507          5 select substrb(                                                                                                                                              
library cache: mute  950094862          4 select min(minb                                                                                                                                              
library cache: mute      57559          3 library cache b                                                                                                                                              
library cache: mute 1455413935          3 select min(minb                                                                                                                                              
library cache: mute  459016664          2 SYS_GROUP                                                                                    



And of course this mutex_stats.sql useful script by Andrey.

SQL> @mutex_stats.sql mutex_identifier=3672626424
Connected.

--------------------------------------------                                  
Statistics for "Library Cache" mutex                                           
idn: 3672626424 address  0x00000005AE8F4098                                    
Interval:   31.8 s,  gets: 1471, sleeps:4300
                                 

Requests rate:                  lambda=        46.3 Hz                         
Sleeps rate:                    omega=        135.3 Hz                         
Utilization:                    rho=    .040000                                
Exclusive Utilization:          rho_x=  .040000                                
Shared Utilization:             rho_s=  .000000                                
Avg. holding time:              S=     864.45 us                               
Service rate:                   mu=          1156.8 Hz                         
Spin inefficiency:              k=    #########                                

Secondary statistics:                                                         
Slps /Miss:               kappa=#########                                      
Spin_gets/miss:           sigma=#########                                      
correction coeff.         eta= 1.0                                             
Secondary sleeps ratio          .745                                           

--------------------------------------------                                  
Avg. number of  sessions waiting:         .02                                  

Library cache object related to  mutex idn 3672626424 :
                     

SEQUENCE: PROD.LB_STATUS_SEQ                                                  

ADDR:              TYPE        PIN/s:    LOCK/s:    EXEC/s: Namespc           
-------------------------------------------------------------------            
00000005AE8F3F68   Parent          15          0          0       1            

Mutex related parameters:
_kgl_mutex_wait_time=0 _wait_yield_hp_mode=yield    
_wait_yield_mode=yield _wait_yield_sleep_freq=100 _wait_yield_sleep_time_msecs=1
_wait_yield_yield_freq=20                                                      

PL/SQL procedure successfully completed.




***

No comments: