Thursday, April 6, 2023

Oracle Deadlock scenario

In todays post I will try to simulate a deadlock scenario in one of my test box and will generate the 10027 trace with level 2 to get more information. Level 2 will give you cached cursors, process state info, session wait history for all sessions and the system state which is not possible in case of level 1. I am going to try with the Level 4 here as I want to get system state dump and don’t want to complicate this scenario.

I am going to create two tables – TableB is child table of TableA and some supporting objects to simulate the deadlock case.
Will try to simulate a deadlock scenario when insert/update/delete happens on TableB we need to sum the amount(amt) and then update it in TableA.total_amt column.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
[oracle@oracleontario ~]$ sqlplus dixdroid/oracle90
 
SQL*Plus: Release 19.0.0.0.0 - Production on Thu Feb 16 03:58:12 2023
Version 19.15.0.0.0
 
Copyright (c) 1982, 2022, Oracle.  All rights reserved.
 
Last Successful login time: Thu Feb 16 2023 03:04:03 -05:00
 
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.15.0.0.0
 
SQL> create table tableA (pk_id number primary key, total_amt number);
 
Table created.
 
SQL> create table tableB (pk_id number primary key, fk_id number references tableA(pk_id) not null, amt number);
 
Table created.
 
SQL>
SQL> CREATE OR REPLACE PACKAGE global_pkg
IS
fk_id tableA.pk_id%TYPE;
END global_pkg; 
/
 
Package created.
 
SQL>
SQL> CREATE OR REPLACE TRIGGER tableB_ROW_TRG
BEFORE INSERT OR UPDATE OR DELETE
ON tableB
FOR EACH ROW
BEGIN
IF INSERTING OR UPDATING
THEN
global_pkg.fk_id := :new.fk_id;
ELSE
global_pkg.fk_id := :old.fk_id;
END IF;
END tableB_ROW_TRG;
/
 
Trigger created.
 
SQL>
SQL>
SQL>
SQL> CREATE OR REPLACE TRIGGER tableB_ST_trg
AFTER INSERT OR UPDATE OR DELETE
ON tableB
BEGIN
IF UPDATING OR INSERTING
THEN
UPDATE tableA
SET total_amt =
(SELECT SUM (amt)
FROM tableB
WHERE fk_id = global_pkg.fk_id)
WHERE pk_id = global_pkg.fk_id;
ELSE
UPDATE tableA
SET total_amt =
(SELECT SUM (amt)
FROM tableB
WHERE fk_id = global_pkg.fk_id)
WHERE pk_id = global_pkg.fk_id;
END IF;
END tableB_ST_trg;
 
Trigger created.
 
SQL>
SQL>
SQL> insert into tableA values (1, 0);
 
1 row created.
 
SQL> insert into tableA values (2, 0);
 
1 row created.
 
SQL> insert into tableB values (123, 1, 100);
 
1 row created.
 
SQL> insert into tableB values (456,1, 200);
 
1 row created.
 
SQL> insert into tableB values (789, 1, 100);
 
1 row created.
 
SQL> insert into tableB values (1011, 2, 50);
 
1 row created.
 
SQL> insert into tableB values (1213,2, 150);
 
1 row created.
 
SQL> insert into tableB values (1415, 2, 50);
 
1 row created.
 
SQL> commit;
 
Commit complete.
 
SQL>

Lets query the table and see the record count and next will delete an entry from tableB and won’t commit. At the same time will check the locks — TM lock on tableA.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
SQL>
SQL> select * from tableA;
 
     PK_ID  TOTAL_AMT
---------- ----------
         1        400
         2        250
 
SQL> delete tableB where pk_id = 1415;
 
1 row deleted.
 
SQL> SELECT sid,
(SELECT username
FROM v$session s
WHERE s.sid = v$lock.sid)
uname,
TYPE,
id1,
id2,
(SELECT object_name
FROM user_objects
WHERE object_id = v$lock.id1)
nm
FROM v$lock
WHERE sid IN (SELECT sid
FROM v$session
WHERE username IN (USER));
 
 
       SID UNAME                          TY        ID1        ID2 NM
---------- ------------------------------ -- ---------- ---------- ------------------------------
       440 DIXDROID                       AE        134          0
       440 DIXDROID                       TX     589853       3078
       440 DIXDROID                       TM      82284          0 TABLEB
       440 DIXDROID                       TM      82282          0 TABLEA

I will now connect to another session (sesion 2) and delete from tableB and that will induce a locking scenario in the database and session 2 will go into hung/waiting state.

1
2
3
4
5
6
7
---- from session 2:
SQL>
SQL>
SQL> delete tableB where pk_id = 1213;
.....
.........
............. <HUNG > <HUNG > <HUNG > <HUNG > <HUNG >

Lets see more stats on the blocking session.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
SQL>
SQL> SELECT (SELECT username
FROM v$session
WHERE sid = a.sid)
blocker,
a.sid,
' is blocking ',
(SELECT username
FROM v$session
WHERE sid = b.sid)
blockee,
b.sid
FROM v$lock a, v$lock b
WHERE a.block = 1 AND b.request > 0 AND a.id1 = b.id1 AND a.id2 = b.id2; 
 
 
BLOCKER                     SID 'ISBLOCKING'  BLOCKEE                               SID
-------------------- ---------- ------------- ------------------------------ ----------
DIXDROID                    440  is blocking  DIXDROID                              427

And as expected 440 SID is now blocking 427. Now lets go back to the original (session 1) and will try to delete from tableB again and this will snipe the session 2 which is still in hung/wait state and will throw a deadlock error (ORA-00060). Though we have already collected blocking information from v$lock + v$session but to get more clarify about blocking sessions, I will set the 10027 trace event with level 2.

This will increase our chance of finding the root cause of the deadlocks. If the setting shall persist across instance startups, you need to use the initialization parameter EVENT.

i.e. EVENT=”10027 trace name contex forever, level 2″ otherwise use ALTER SYSTEM version of it.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
------------------------------------------------------------
-- Execute below statement in session1
------------------------------------------------------------
delete tableB where pk_id = 1213;
 
 
 
 
 
---------------------------------------------------------------------------------------
-- Oracle is throwing deadlock error as below in session2
---------------------------------------------------------------------------------------
SQL>
SQL> ALTER SYSTEM SET EVENTS '10027 trace name context forever, level 2';
 
System altered.
 
SQL> delete tableB where pk_id = 1213;
delete tableB where pk_id = 1213
       *
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "DIXDROID.TABLEB_ST_TRG", line 11
ORA-04088: error during execution of trigger 'DIXDROID.TABLEB_ST_TRG'
 
-- same captured in alert log too.
2023-02-16T04:02:57.648156-05:00
Errors in file /u01/app/oracle/diag/rdbms/dixitdb/dixitdb/trace/dixitdb_ora_29389.trc:
2023-02-16T04:03:00.019794-05:00
ORA-00060: Deadlock detected. See Note 60.1 at My Oracle Support for Troubleshooting ORA-60 Errors. More info in file /u01/app/oracle/diag/rdbms/dixitdb/dixitdb/trace/dixitdb_ora_29389.trc.

And we done, we were able to simulate the deadlock case in the database. Now lets dig into the trace file generated by the deadlock along with information flushed by 10027 trace event. It has all crucial information associated with the deadlock.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
2023-02-16 04:13:34.115*:ksq.c@13192:ksqdld_hdr_dump():
DEADLOCK DETECTED ( ORA-00060 )
See Note 60.1 at My Oracle Support for Troubleshooting ORA-60 Errors
 
[Transaction Deadlock]
 
 
Deadlock graph:
                                          ------------Blocker(s)-----------  ------------Waiter(s)------------
Resource Name                             process session holds waits serial  process session holds waits serial
TX-00080009-00000C62-00000000-00000000         45     427     X        13193      24     440           X  22765
TX-0004001A-00000C1E-00000000-00000000         24     440     X        22765      45     427           X  13193
 
 
 
 
----- Information for waiting sessions -----
Session 427:
  sid: 427 ser: 13193 audsid: 3390369 user: 115/DIXDROID
    flags: (0x41) USR/- flags2: (0x40009) -/-/INC
    flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/-
  pid: 45 O/S info: user: oracle, term: UNKNOWN, ospid: 30174
    image: oracle@oracleontario.ontadomain (TNS V1-V3)
  client details:
    O/S info: user: oracle, term: pts/1, ospid: 30173
    machine: oracleontario.ontadomain program: sqlplus@oracleontario.ontadomain (TNS V1-V3)
    application name: SQL*Plus, hash value=3669949024
  current SQL:
  UPDATE TABLEA SET TOTAL_AMT = (SELECT SUM (AMT) FROM TABLEB WHERE FK_ID = :B1 ) WHERE PK_ID = :B1
 
Session 440:
  sid: 440 ser: 22765 audsid: 3380369 user: 115/DIXDROID
    flags: (0x41) USR/- flags2: (0x40009) -/-/INC
    flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/-
  pid: 24 O/S info: user: oracle, term: UNKNOWN, ospid: 29847
    image: oracle@oracleontario.ontadomain (TNS V1-V3)
  client details:
    O/S info: user: oracle, term: pts/2, ospid: 29846
    machine: oracleontario.ontadomain program: sqlplus@oracleontario.ontadomain (TNS V1-V3)
    application name: SQL*Plus, hash value=3669949024
  current SQL:
  delete tableB where pk_id = 1213
 
.....
.......
 
 
----- Current SQL Statement for this session (sql_id=duw5q5rpd5xvs) -----
UPDATE TABLEA SET TOTAL_AMT = (SELECT SUM (AMT) FROM TABLEB WHERE FK_ID = :B1 ) WHERE PK_ID = :B1
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0x8ed34c98        11  DIXDROID.TABLEB_ST_TRG
.......
.........
 
 
----- VKTM Time Drifts Circular Buffer -----
session 427: DID 0001-002D-000000D9     session 440: DID 0001-0018-0000001C
session 440: DID 0001-0018-0000001C     session 427: DID 0001-002D-000000D9
 
Rows waited on:
  Session 427: obj - rowid = 0001416A - AAAUFqAAHAAAMgvAAB
  (dictionary objn - 82282, file - 7, block - 51247, slot - 1)
  Session 440: obj - rowid = 0001416C - AAAUFsAAHAAAMg/AAE
  (dictionary objn - 82284, file - 7, block - 51263, slot - 4)
.....
.......
 
 
Current Wait Stack:
     0: waiting for 'enq: TX - row lock contention'
        name|mode=0x54580006, usn<<16 | slot=0x4001a, sequence=0xc1e
        wait_id=69 seq_num=70 snap_id=1
        wait times: snap=18.173838 sec, exc=18.173838 sec, total=18.173838 sec
        wait times: max=infinite, heur=18.173838 sec
        wait counts: calls=6 os=6
        in_wait=1 iflags=0x15a0
    There is at least one session blocking this session.
      Dumping 1 direct blocker(s):
        inst: 1, sid: 440, ser: 22765
      Dumping final blocker:
        inst: 1, sid: 440, ser: 22765
    There are 1 sessions blocked by this session.
    Dumping one waiter:
      inst: 1, sid: 440, ser: 22765
      wait event: 'enq: TX - row lock contention'
        p1: 'name|mode'=0x54580006
        p2: 'usn<<16 | slot'=0x80009
        p3: 'sequence'=0xc62
      row_wait_obj#: 82284, block#: 51263, row#: 4, file# 7
      min_blocked_time: 0 secs, waiter_cache_ver: 33942
    Wait State:
      fixed_waits=0 flags=0x22 boundary=(nil)/-1
    Session Wait History:
        elapsed time of 0.004677 sec since current wait
........
............
 
 
24: USER ospid 29847 sid 440 ser 22765, waiting for 'enq: TX - row lock contention'
          Cmd: DELETE
          Blocked by inst: 1, sid: 427, ser: 13193
          Final Blocker inst: 1, sid: 427, ser: 13193
.....
.............
 
 
45: USER ospid 30174 sid 427 ser 13193, waiting for 'enq: TX - row lock contention'
          Cmd: UPDATE
          Blocked by inst: 1, sid: 440, ser: 22765
          Final Blocker inst: 1, sid: 440, ser: 22765
.........
.............
 
 
 The history is displayed in reverse chronological order.
 
    sample interval: 1 sec, max history 120 sec
    ---------------------------------------------------
      [19 samples,                                         04:13:16 - 04:13:34]
        waited for 'enq: TX - row lock contention', seq_num: 70
          p1: 'name|mode'=0x54580006
          p2: 'usn<<16 | slot'=0x4001a
          p3: 'sequence'=0xc1e
      [14 samples,                                         04:13:01 - 04:13:15]
        idle wait at each sample
      [session created at: 04:13:01]
    ---------------------------------------------------
    Sampled Session History Summary:
      longest_non_idle_wait: 'enq: TX - row lock contention'
      [19 samples, 04:13:16 - 04:13:34]
    ---------------------------------------------------
.........
..........
 
 
      Virtual Thread:
      kgskvt: 0x9b0e7f10, sess: 0x9ccdd4b8, pdb: 0, sid: 427 ser: 13193
      vc: (nil), proc: 0x9db42028, idx: 427
      consumer group cur: OTHER_GROUPS (pdb 0) (upd? 0)
mapped: DEFAULT_CONSUMER_GROUP, orig:  (pdb 0)
      vt_state: 0x2, vt_flags: 0xE030, blkrun: 0, numa: 1
      inwait: 1, wait event: 307, posted_run: 0
      lastmodrngcnt: 0, lastmodrngcnt_loc: '(null)'
      lastmodrblcnt: 0, lastmodrblcnt_loc: '(null)'
      location where insched last set: kgskbwt
      location where insched last cleared: kgskbwt
      location where inwait last set: kgskbwt
      location where inwait last cleared: NULL
      is_assigned: 1, in_scheduler: 0, insched: 0
      vt_active: 0 (pending: 1)
      vt_pq_active: 0, dop: 0, pq_servers (cur: 0 cg: 0)
      ps_allocs: 0, pxstmts (act: 0, done: 0 cg: 0)
      used quanta (usecs):
      stmt: 57272, accum: 0, mapped: 0, tot: 57272
      exec start consumed time lapse: 164377 usec
      exec start elapsed time lapse: 18179246 usec
      idle time: 0 ms, active time: 57272 (cg: 57272) usec
      last updnumps: 0 usec, active time (pq: 0 ps: 0) ms
      cpu yields:       stmt: 0, accum: 0, mapped: 0, tot: 0
      cpu waits:       stmt: 0, accum: 0, mapped: 0, tot: 0
      cpu wait time (usec):       stmt: 0, accum: 0, mapped: 0, tot: 0
      ASL queued time outs: 0, time: 0 (cur 0, cg 0)
      PQQ queued time outs: 0, time: 0 (cur 0, cg 0)
      Queue timeout violation: 0
      calls aborted: 0, num est exec limit hit: 0
      KTU Session Commit Cache Dump for IDLs:
      KTU Session Commit Cache Dump for Non-IDLs:
      ----------------------------------------

 

No comments:

Post a Comment