Resolving inactive transaction branch

Recently we encountered wait event ‘invalid transaction branch’ and here is how we resolved it.

OS: AIX 7.1
DB: Oracle 11.2.0.3.11 RAC

— identify session waiting for ‘invalid transaction branch’

select sid, serial#, program, machine, sql_id, status, WAIT_TIME_MICRO, event, seconds_in_wait
from v$session
where username='ONL1'
and status='ACTIVE';

1551 35801 MasterMgmt@PRMBMTR01 (TNS V1-V3) PRMBMTR01 35y99s5k1rjh0 ACTIVE 113572 inactive transaction branch

— identify SQL text

set long 999999
select sql_fulltext
from v$sqlstats
where sql_id='35y99s5k1rjh0';

select count(*) into :b0 from TRDX_ENTITY_NTNG_DTLS ,TRDX_ENTITY_MASTER
where ((END_ENTY_SRNO=EMA_ENTY_SRNO and EMA_ENTY_ENCD_SHRT_DESC like :b1)
and END_NTNG_TYPE_INDC=:b2)

— Identify objects involved in the query

select owner, object_name, object_type
from dba_objects
where object_name in ('TRDX_ENTITY_NTNG_DTLS','TRDX_ENTITY_MASTER');

OWNER OBJECT_NAME OBJECT_TYPE
------------------------------ ---------------------------------------- -------------------
ONL1 TRDX_ENTITY_MASTER TABLE
ONL1 TRDX_ENTITY_NTNG_DTLS TABLE

select p.spid, p.tracefile
from v$session s, v$process p
where s.sid=1551
and s.paddr=p.addr;

20971966
/oraapp/oracle/diag/rdbms/rmbtodb/rmbtodb1/trace/rmbtodb1_ora_20971966.trc
SQL> oradebug setospid 20971966
Oracle pid: 330, Unix process pid: 20971966, image: oracle@PRMBODB01
SQL>
SQL> oradebug dump errorstack 3;
Statement processed.
SQL> oradebug tracefile_name
/oraapp/oracle/diag/rdbms/rmbtodb/rmbtodb1/trace/rmbtodb1_ora_20971966.trc

ksedsts()+360<-ksdxfstk()+44<-ksdxcb()+3384<-sspuser()+116<-48b4
<-sskgpwwait()+32<-skgpwwait()+180<-ksliwat()+11032<-kslwaitctx()+180<-kslwait()+112<-k2gInsert()+936 <-- This corresponds to bug description
<-k2lbeg()+1308<-k2sbeg()+388<-kpotxst()+3336<-kpotxse()+4500<-opiodr()+720<-ttcpip()+1028
<-opitsk()+1536<-opiino()+940<-opiodr()+720<-opidrv()+1132<-sou2o()+136<-opimai_real()+608
<-ssthrdmain()+268<-main()+204<-__start()+112
oradebug event 10046 trace name context forever, level 12;
oradebug event 10046 trace name context off;

SQL> select count(*) from dba_2pc_pending;

COUNT(*)
----------
0

SQL> select count(*) from dba_pending_transactions;

COUNT(*)
----------
0

— It seems that this session could wait forever

 https://www.ibm.com/support/knowledgecenter/ssw_aix_53/com.ibm.aix.basetechref/doc/basetrf2/thread_wait.htm
root@PRMBODB01:/.root>truss -p 20971966
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1
_thread_wait(1152921504606807520) = 1

— Collect systemstate and hanganalyze

conn / as sysdba
oradebug setmypid
oradebug unlimit
oradebug dump systemstate 266
-- take another systemstate after 1 minute

--oradebug -g all dump systemstate 266
oradebug hanganalyze 3
-- take another hanganalyze after 2 minutes

oradebug close_trace
oradebug flush

— session information

SO: 0x70000051507fda0, type: 4, owner: 0x700000514b53ab8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x700000514b53ab8, name=session, file=ksu.h LINE:12624 ID:, pg=0
(session) sid: 1551 ser: 35801 trans: 0x0, creator: 0x700000514b53ab8
flags: (0x100041) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40009) -/-/INC
DID: , short-term DID:
txn branch: 0x7000004fe8cf070
oct: 3, prv: 0, sql: 0x700000525f83ef0, psql: 0x700000525f83ef0, user: 41/ONL1
ksuxds FALSE at location: 0
service name: omktclu1
client details:
O/S info: user: rmbmtr1, term: , ospid: 11927894
machine: PRMBMTR01 program: MasterMgmt@PRMBMTR01 (TNS V1-V3)
application name: MasterMgmt@PRMBMTR01 (TNS V1-V3), hash value=1697589806
Current Wait Stack:
0: waiting for 'inactive transaction branch'
branch#=0x0, waited=0x0, =0x0
wait_id=39357 seq_num=39358 snap_id=1
wait times: snap=0.000934 sec, exc=0.000934 sec, total=0.000934 sec
wait times: max=1.000000 sec, heur=77 min 47 sec
wait counts: calls=1 os=1
in_wait=1 iflags=0x5a8
Wait State:
fixed_waits=0 flags=0x22 boundary=0x0/-1

— process information

SO: 0x700000514b53ab8, type: 2, owner: 0x0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x700000514b53ab8, name=process, file=ksu.h LINE:12616 ID:, pg=0
(process) Oracle pid:330, ser:183, calls cur/top: 0x7000004d90a2f98/0x7000004d90a2f98
flags : (0x0) -
flags2: (0x0), flags3: (0x10)
intr error: 0, call error: 0, sess error: 0, txn error 0
intr queue: empty
ksudlp FALSE at location: 0
(post info) last post received: 0 0 27
last post received-location: ksa2.h LINE:289 ID:ksasnr
last process to post me: 700000514b39b18 106 6
last post sent: 0 0 26
last post sent-location: ksa2.h LINE:285 ID:ksasnd
last process posted by me: 700000514b39b18 106 6
(latch info) wait_event=0 bits=0
Process Group: DEFAULT, pseudo proc: 0x700000534ce2b48
O/S info: user: grid, term: UNKNOWN, ospid: 20971966
OSD pid info: Unix process pid: 20971966, image: oracle@PRMBODB01
Short stack dump:
ksedsts()+360<-ksdxfstk()+44<-ksdxcb()+3384<-sspuser()+116<-48b4<-sskgpwwait()+32<-skgpwwait()+180<-ksliwat()+11032<-kslwaitctx()+180<-kslwait()+112<-k2gInsert()+936<-k2lbeg()+1308<-k2sbeg()+388<-kpotxst()+3336<-kpotxse()+4500<-opiodr()+720<-ttcpip()+1028<-opitsk()+1536<-opiino()+940<-opiodr()+720<-opidrv()+1132<-sou2o()+136<-opimai_real()+608<-ssthrdmain()+268<-main()+204<-__start()+112

— global transaction id and transaction branch information

SO: 0x7000004fe8cf070, type: 58, owner: 0x70000051507fda0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x700000514b53ab8, name=branch, file=ktccts.h LINE:420 ID:, pg=0
(branch) trn = 0x0, flg = 0x41080, state = 0x00 bno=0 ser=79 evt=79
creator = 0x70000051507fda0 uid = 0x29 serial# = 35801 ttl = 0x0 dtm = 0x0
2PCrole = (nch = 0x0 flg = 0x0) DTP svc = 0x0
gti.fmt: 004d514d gti.tln: 14 gti.bln: 4
gti.tid = 0x56ac2bd620f8cb074d5453514d30
gti.bid = 0x00000001
oradebug setospid 20971966
oradebug tracefile_name
oradebug event 24793 trace name context forever, level 48
oradebug event 24793 trace name context off
*** 2016-03-22 17:15:28.862
K2GTElock enter, id1=0xdc1c327d, id2=0xdbbf6194, tlk=0x0, tlm=00000000, act=1, mod=6
* K2GTElock k2qget, k2gtlm=00000000
k2qget(typ: DX id1: 0xdc1c327d, id2: 0x00000000 mod=6 timeout=6): entry
k2qget(): ksipget() returned 0
k2qget(typ: DX id1: 0xdc1c327d, id2: 0x00000000 mod=6 timeout=6): exit with 0

*** 2016-03-22 17:15:28.862
Dumping DX instance lock value (healthy):
-----------------------------------------------
mgc: QNXD id1: 0xdc1c327d id2: 0xdbbf6194 fmtid: 0x004d514d gtrid: 56ac2bd622f9fa03494e544643514d <-- This is the global transaction id
typ: 0x2 flg: 0x00000000 state: 0x01
tottx: 2 txncnt: 0 ptcnt: 0 ftcnt: 0 lastx: 255 ltxas: 255
totbr: 2 brncnt: 1 fbcnt: 0 bvec: 0x00000001
prep uba: 0x00000000.0000.00 glbl sscn: 0x0000.00000000 tscn: 0x0000.00000000
prep scn: 0xffff.ffffffff cmt scn: 0xffff.ffffffff
Local txn information:
Branch information:
id2: 0xdbbf6194 inst: 1 state: 0x0 uid: 0x00000029
* K2GTElock aft k2qget, k2gtlm=00000006, got lock

— dump distributed transaction (DX) lock info

oradebug lkdebug -O 0xdc1c327d 0xdbbf6194 DX

No new information found.

After comparing short_stack with bug 21354456 and according to the following doc:
Query Over DBLINK Waiting For “‘inactive transaction branch” (ID 2060008.1)

The solution is we need to apply patch 21354456 and the workaround is to kill the waiting session.

Note that in our scenario, DBLINK is not utilised. However, we lodged a SR to oracle and they confirm that the stack is a match with bug 21354456.

Reference:

Manually execute the query and check the execution plan.

alter session set current_schema=ONL1;
set autotrace on
variable b0 number;

select count(*) into :b0 from TRDX_ENTITY_NTNG_DTLS ,TRDX_ENTITY_MASTER where ((END_ENTY_SRNO=EMA_ENTY_SRNO and EMA_ENTY_ENCD_SHRT_DESC like '%j???%' ) and END_NTNG_TYPE_INDC=1)
COUNT(*)
----------
0
Execution Plan
----------------------------------------------------------
Plan hash value: 2544379060

----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 45 | 67 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 45 | | |
| 2 | NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 3 | 135 | 67 (0)| 00:00:01 |
|* 4 | INDEX SKIP SCAN | PK_END | 66 | 528 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_EMA | 1 | | 0 (0)| 00:00:01 |
|* 6 | TABLE ACCESS BY INDEX ROWID| TRDX_ENTITY_MASTER | 1 | 37 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("END_NTNG_TYPE_INDC"=1)
filter("END_NTNG_TYPE_INDC"=1)
5 - access("END_ENTY_SRNO"="EMA_ENTY_SRNO")
6 - filter("EMA_ENTY_ENCD_SHRT_DESC" LIKE '%j???%')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
97 consistent gets
0 physical reads
0 redo size
525 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s