Optimize SQL with subquery factoring

Recently, one of our clients reported one particular statement, which runs faster on one node (within 10 secs) while the response time is longer on the other node (about 1 mins or even longer).

Our Server Enviroment:

OS : AIX 7.1 using PowerVM
DB : Oracle 11.2.0.3.11 RAC

The original SQL statement:

SELECT distinct t1.EIR_ENTY_SRNO,
                EMA.EMA_ENTY_ENTY_CODE,
                EMA.EMA_ENTY_ENCD_SHRT_DESC
  FROM USR.TAB_ENTY_IDNTY_TB         t1,
       USR.TAB_ENTITY_REMASTE         EMA,
       USR.MYTB_USER_DTLS             UDT,
       USR.MYTB_USER_PRVLGE_MASTER    UPM,
       USR.TAB_BUSINESS_MASTER AMM
 WHERE UPM.UPM_USER_SRNO = UDT.UDT_USER_SRNO
   AND AMM.AMM_ACSS_LVL_SRNO = UDT.UDT_USER_SRNO
   AND UDT.UDT_ENTY_SRNO = EMA.EMA_ENTY_SRNO
   AND UPM.UPM_USER_CTGRY_SRNO = 10
   AND EIR_STATUS = 'A'
   AND EIR_IDNTY_INDC = 10
   AND t1.EIR_ENTY_SRNO = EMA.EMA_ENTY_SRNO
   AND SIGN(BITAND(TO_NUMBER(DBMS_LOB.SUBSTR(AMM.AMM_FNCTY_BIT_STREAM,
                                             1,
                                             CEIL(2740 / 8)),
                             'XX'),
                   POWER(2, MOD(2740 - 1, 8)))) = 0;

The first step would be to compare execution plans on the two nodes.

Execution Plan on the 1st node:

Plan hash value: 1299249194

------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                       | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                            |      1 |        |  5908 (100)|      9 |00:00:07.53 |    1355K|       |       |          |
|   1 |  HASH UNIQUE                      |                            |      1 |    286 |  5908   (1)|      9 |00:00:07.53 |    1355K|   843K|   843K| 1019K (0)|
|*  2 |   HASH JOIN                       |                            |      1 |    286 |  5907   (1)|     11 |00:00:07.53 |    1355K|   845K|   845K| 1014K (0)|
|*  3 |    HASH JOIN                      |                            |      1 |    286 |  1198   (1)|     13 |00:00:00.08 |    3311 |   827K|   827K| 1306K (0)|
|   4 |     NESTED LOOPS                  |                            |      1 |        |            |    257 |00:00:00.01 |     577 |       |       |          |
|   5 |      NESTED LOOPS                 |                            |      1 |    370 |   477   (0)|    257 |00:00:00.01 |     423 |       |       |          |
|   6 |       NESTED LOOPS                |                            |      1 |     72 |   129   (0)|     68 |00:00:00.01 |     366 |       |       |          |
|*  7 |        TABLE ACCESS FULL          | TAB_ENTY_IDNTY_TB          |      1 |     72 |    58   (0)|     72 |00:00:00.01 |     234 |       |       |          |
|   8 |        TABLE ACCESS BY INDEX ROWID| TAB_ENTITY_REMASTE         |     72 |      1 |     1   (0)|     68 |00:00:00.01 |     132 |       |       |          |
|*  9 |         INDEX UNIQUE SCAN         | PK_EMA                     |     72 |      1 |     0   (0)|     68 |00:00:00.01 |      64 |       |       |          |
|* 10 |       INDEX RANGE SCAN            | IND_UDT_114                |     68 |      7 |     1   (0)|    257 |00:00:00.01 |      57 |       |       |          |
|  11 |      TABLE ACCESS BY INDEX ROWID  | MYTB_USER_DTLS             |    257 |      5 |     6   (0)|    257 |00:00:00.01 |     154 |       |       |          |
|* 12 |     INDEX FAST FULL SCAN          | PK_UPM                     |      1 |  79216 |   720   (2)|     42 |00:00:00.07 |    2734 |       |       |          |
|* 13 |    TABLE ACCESS FULL              | TAB_BUSINESS_MASTER        |      1 |    102K|  4707   (1)|    102K|00:00:07.32 |    1352K|       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("AMM"."AMM_ACSS_LVL_SRNO"="UDT"."UDT_USER_SRNO")
   3 - access("UPM"."UPM_USER_SRNO"="UDT"."UDT_USER_SRNO")
   7 - filter(("EIR_STATUS"='A' AND "EIR_IDNTY_INDC"=10))
   9 - access("T1"."EIR_ENTY_SRNO"="EMA"."EMA_ENTY_SRNO")
  10 - access("UDT"."UDT_ENTY_SRNO"="EMA"."EMA_ENTY_SRNO")
  12 - filter("UPM"."UPM_USER_CTGRY_SRNO"=10)
  13 - filter(SIGN(BITAND(TO_NUMBER(RAWTOHEX("DBMS_LOB"."SUBSTR"(INTERNAL_FUNCTION("AMM"."AMM_FNCTY_BIT_STREAM"),1,343)),''),8))=0)

Note
-----
   - cardinality feedback used for this statement

Execution Plan on the 2nd node:

Plan hash value: 1299249194

------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                       | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                            |      1 |        |  5908 (100)|      9 |00:02:07.89 |    1355K|       |       |          |
|   1 |  HASH UNIQUE                      |                            |      1 |    286 |  5908   (1)|      9 |00:02:07.89 |    1355K|   843K|   843K|  999K (0)|
|*  2 |   HASH JOIN                       |                            |      1 |    286 |  5907   (1)|     11 |00:02:07.89 |    1355K|   845K|   845K| 1023K (0)|
|*  3 |    HASH JOIN                      |                            |      1 |    286 |  1198   (1)|     13 |00:00:00.10 |    3311 |   827K|   827K| 1257K (0)|
|   4 |     NESTED LOOPS                  |                            |      1 |        |            |    257 |00:00:00.01 |     577 |       |       |          |
|   5 |      NESTED LOOPS                 |                            |      1 |    370 |   477   (0)|    257 |00:00:00.01 |     423 |       |       |          |
|   6 |       NESTED LOOPS                |                            |      1 |     72 |   129   (0)|     68 |00:00:00.01 |     366 |       |       |          |
|*  7 |        TABLE ACCESS FULL          | TAB_ENTY_IDNTY_TB          |      1 |     72 |    58   (0)|     72 |00:00:00.01 |     234 |       |       |          |
|   8 |        TABLE ACCESS BY INDEX ROWID| TAB_ENTITY_REMASTE         |     72 |      1 |     1   (0)|     68 |00:00:00.01 |     132 |       |       |          |
|*  9 |         INDEX UNIQUE SCAN         | PK_EMA                     |     72 |      1 |     0   (0)|     68 |00:00:00.01 |      64 |       |       |          |
|* 10 |       INDEX RANGE SCAN            | IND_UDT_114                |     68 |      7 |     1   (0)|    257 |00:00:00.01 |      57 |       |       |          |
|  11 |      TABLE ACCESS BY INDEX ROWID  | MYTB_USER_DTLS             |    257 |      5 |     6   (0)|    257 |00:00:00.01 |     154 |       |       |          |
|* 12 |     INDEX FAST FULL SCAN          | PK_UPM                     |      1 |  79216 |   720   (2)|     42 |00:00:00.10 |    2734 |       |       |          |
|* 13 |    TABLE ACCESS FULL              | TAB_BUSINESS_MASTER        |      1 |    102K|  4707   (1)|    102K|00:02:07.57 |    1352K|       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("AMM"."AMM_ACSS_LVL_SRNO"="UDT"."UDT_USER_SRNO")
   3 - access("UPM"."UPM_USER_SRNO"="UDT"."UDT_USER_SRNO")
   7 - filter(("EIR_STATUS"='A' AND "EIR_IDNTY_INDC"=10))
   9 - access("T1"."EIR_ENTY_SRNO"="EMA"."EMA_ENTY_SRNO")
  10 - access("UDT"."UDT_ENTY_SRNO"="EMA"."EMA_ENTY_SRNO")
  12 - filter("UPM"."UPM_USER_CTGRY_SRNO"=10)
  13 - filter(SIGN(BITAND(TO_NUMBER(RAWTOHEX("DBMS_LOB"."SUBSTR"(INTERNAL_FUNCTION("AMM"."AMM_FNCTY_BIT_STREAM"),1,343)),''),8))=0)

Note
-----
   - cardinality feedback used for this statement

We can see that they are the same plan. Note id 13 in the plan i.e TABLE ACCESS FULL TAB_BUSINESS_MASTER. It took 7.32 seconds on the 1st node while the elapsed time was 2 mins and 7.57 seconds on the 2nd node. In this step, Oracle applies several convert functions (BITAND, TO_NUMBER, RAWTOHEX, etc) on 102K rows from table TAB_BUSINESS_MASTER. Apprently, this is the performance hog.

Since they are the same plan, the next step would be to collect 10046 trace to determine why id 13 took so long on the 2nd node.

trace exerpt from the 1st node:

FETCH #4573832368:c=5317037,e=9408759,p=0,cr=123269,cu=1232527,mis=0,r=1,dep=0,og=1,plh=1299249194,tim=18845139364346
WAIT #4573832368: nam='SQL*Net message from client' ela= 224 driver id=1650815232 #bytes=1 p3=0 obj#=23038 tim=18845139365185
WAIT #4573832368: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=23038 tim=18845139365221
FETCH #4573832368:c=53,e=87,p=0,cr=0,cu=0,mis=0,r=8,dep=0,og=1,plh=1299249194,tim=18845139365301
STAT #4573832368 id=1 cnt=9 pid=0 pos=1 obj=0 op='HASH UNIQUE (cr=123269 pr=0 pw=0 time=9408829 us cost=5908 size=318604 card=286)'
STAT #4573832368 id=2 cnt=11 pid=1 pos=1 obj=0 op='HASH JOIN (cr=123269 pr=0 pw=0 time=9408484 us cost=5907 size=318604 card=286)'
STAT #4573832368 id=3 cnt=13 pid=2 pos=1 obj=0 op='HASH JOIN (cr=3311 pr=0 pw=0 time=119295 us cost=1198 size=23738 card=286)'
STAT #4573832368 id=4 cnt=257 pid=3 pos=1 obj=0 op='NESTED LOOPS (cr=577 pr=0 pw=0 time=5143 us)'
STAT #4573832368 id=5 cnt=257 pid=4 pos=1 obj=0 op='NESTED LOOPS (cr=423 pr=0 pw=0 time=3835 us cost=477 size=27750 card=370)'
STAT #4573832368 id=6 cnt=68 pid=5 pos=1 obj=0 op='NESTED LOOPS (cr=366 pr=0 pw=0 time=3088 us cost=129 size=4680 card=72)'
STAT #4573832368 id=7 cnt=72 pid=6 pos=1 obj=23555 op='TABLE ACCESS FULL TAB_ENTY_IDNTY_TB (cr=234 pr=0 pw=0 time=2374 us cost=58 size=720 card=72)'
STAT #4573832368 id=8 cnt=68 pid=6 pos=2 obj=23485 op='TABLE ACCESS BY INDEX ROWID TAB_ENTITY_REMASTE (cr=132 pr=0 pw=0 time=648 us cost=1 size=55 card=1)'
STAT #4573832368 id=9 cnt=68 pid=8 pos=1 obj=24599 op='INDEX UNIQUE SCAN PK_EMA (cr=64 pr=0 pw=0 time=345 us cost=0 size=0 card=1)'
STAT #4573832368 id=10 cnt=257 pid=5 pos=2 obj=24571 op='INDEX RANGE SCAN IND_UDT_114 (cr=57 pr=0 pw=0 time=604 us cost=1 size=0 card=7)'
STAT #4573832368 id=11 cnt=257 pid=4 pos=2 obj=23471 op='TABLE ACCESS BY INDEX ROWID MYTB_USER_DTLS (cr=154 pr=0 pw=0 time=1052 us cost=6 size=50 card=5)'
STAT #4573832368 id=12 cnt=42 pid=3 pos=2 obj=24070 op='INDEX FAST FULL SCAN PK_UPM (cr=2734 pr=0 pw=0 time=112815 us cost=720 size=633728 card=79216)'
STAT #4573832368 id=13 cnt=102698 pid=2 pos=2 obj=23038 op='TABLE ACCESS FULL TAB_BUSINESS_MASTER (cr=119958 pr=0 pw=0 time=9142972 us cost=4707 size=105881638 card=102698)'

trace exerpt from the 2nd node:

FETCH #4575578224:c=22742761,e=62858539,p=0,cr=123269,cu=1232527,mis=0,r=1,dep=0,og=1,plh=1299249194,tim=14429311445421
WAIT #4575578224: nam='SQL*Net message from client' ela= 429 driver id=1650815232 #bytes=1 p3=0 obj#=23038 tim=14429311446825
WAIT #4575578224: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=23038 tim=14429311446887
FETCH #4575578224:c=20,e=115,p=0,cr=0,cu=0,mis=0,r=8,dep=0,og=1,plh=1299249194,tim=14429311446991
STAT #4575578224 id=1 cnt=9 pid=0 pos=1 obj=0 op='HASH UNIQUE (cr=123269 pr=0 pw=0 time=62858648 us cost=5908 size=318604 card=286)'
STAT #4575578224 id=2 cnt=11 pid=1 pos=1 obj=0 op='HASH JOIN (cr=123269 pr=0 pw=0 time=62858005 us cost=5907 size=318604 card=286)'
STAT #4575578224 id=3 cnt=13 pid=2 pos=1 obj=0 op='HASH JOIN (cr=3311 pr=0 pw=0 time=137297 us cost=1198 size=23738 card=286)'
STAT #4575578224 id=4 cnt=257 pid=3 pos=1 obj=0 op='NESTED LOOPS (cr=577 pr=0 pw=0 time=5434 us)'
STAT #4575578224 id=5 cnt=257 pid=4 pos=1 obj=0 op='NESTED LOOPS (cr=423 pr=0 pw=0 time=4188 us cost=477 size=27750 card=370)'
STAT #4575578224 id=6 cnt=68 pid=5 pos=1 obj=0 op='NESTED LOOPS (cr=366 pr=0 pw=0 time=3269 us cost=129 size=4680 card=72)'
STAT #4575578224 id=7 cnt=72 pid=6 pos=1 obj=23555 op='TABLE ACCESS FULL TAB_ENTY_IDNTY_TB (cr=234 pr=0 pw=0 time=2471 us cost=58 size=720 card=72)'
STAT #4575578224 id=8 cnt=68 pid=6 pos=2 obj=23485 op='TABLE ACCESS BY INDEX ROWID TAB_ENTITY_REMASTE (cr=132 pr=0 pw=0 time=708 us cost=1 size=55 card=1)'
STAT #4575578224 id=9 cnt=68 pid=8 pos=1 obj=24599 op='INDEX UNIQUE SCAN PK_EMA (cr=64 pr=0 pw=0 time=357 us cost=0 size=0 card=1)'
STAT #4575578224 id=10 cnt=257 pid=5 pos=2 obj=24571 op='INDEX RANGE SCAN IND_UDT_114 (cr=57 pr=0 pw=0 time=763 us cost=1 size=0 card=7)'
STAT #4575578224 id=11 cnt=257 pid=4 pos=2 obj=23471 op='TABLE ACCESS BY INDEX ROWID MYTB_USER_DTLS (cr=154 pr=0 pw=0 time=959 us cost=6 size=50 card=5)'
STAT #4575578224 id=12 cnt=42 pid=3 pos=2 obj=24070 op='INDEX FAST FULL SCAN PK_UPM (cr=2734 pr=0 pw=0 time=130178 us cost=720 size=633728 card=79216)'
STAT #4575578224 id=13 cnt=102698 pid=2 pos=2 obj=23038 op='TABLE ACCESS FULL TAB_BUSINESS_MASTER (cr=119958 pr=0 pw=0 time=62559987 us cost=4707 size=105881638 card=102698)'

The cr stat is the same for row source operation “TABLE ACCESS FULL TAB_BUSINESS_MASTER” and the difference lies in time ( 9142972 us vs 62559987 us ).

The 2nd observation is that the performance degrade was not the consequence of cache fushion : we didn’t see any gc related wait in the trace file. Further evidence from x$bh shows that buffers are already present in the 2nd nodes buffer cache.

After collecting ASH information, we confirmed that the SQL execution didn’t encounter any wait event : event is null for the session in the corresponding period.

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.04          0          4          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     22.74      62.85          0     123269    1232527           9
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     22.75      62.90          0     123273    1232527           9

Misses in library cache during parse: 1

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       3        0.00          0.00
  SQL*Net message from client                     3       61.52         73.70
  library cache lock                              5        0.00          0.00
  library cache pin                               5        0.00          0.00
  latch: cache buffers lru chain                  8        0.00          0.00
  latch: cache buffers chains                     2        0.00          0.00
  latch free                                      2        0.00          0.00
  gc current block 2-way                          1        0.00          0.00
  gc cr block 2-way                               1        0.00          0.00
  latch: object queue header operation            2        0.00          0.00

From tkprof output, we can see that CPU time is 22.75 seconds while elapsed time is 62.90 seconds. No disk reads. We don’t know for sure why the time difference is so dramatic.

Until now, our best guess is that there is CPU usage bottleneck on the 2nd node. One justification is that the overall CPU usage on the 2nd node was more than 60% most of the time. However, we cannot confirm this as yet and are still waiting for reply from MOS support.

In addition, after spotting the culprit, we can now design new strategies to reduce CPU usage. We found that the original query returns 9 rows and the BITAND operation only rules out 4 rows. In other words, the temporary result without the BITAND operation would return 13 rows. Applying functions to 13 rows would yield significant performance benefit than 102K rows.

Here is the better version of the SQL.

WITH AMMT AS (
	SELECT /*+ MATERIALIZE */ t1.EIR_ENTY_SRNO EIR_ENTY_SRNO,
			EMA.EMA_ENTY_ENTY_CODE EMA_ENTY_ENTY_CODE,
			EMA.EMA_ENTY_ENCD_SHRT_DESC EMA_ENTY_ENCD_SHRT_DESC,
			AMM.AMM_FNCTY_BIT_STREAM AMM_FNCTY_BIT_STREAM
		FROM USR.TAB_ENTY_IDNTY_TB         t1,
			USR.TAB_ENTITY_REMASTE         EMA,
			USR.MYTB_USER_DTLS             UDT,
			USR.MYTB_USER_PRVLGE_MASTER    UPM,
			USR.TAB_BUSINESS_MASTER AMM
		WHERE UPM.UPM_USER_SRNO = UDT.UDT_USER_SRNO
			AND AMM.AMM_ACSS_LVL_SRNO = UDT.UDT_USER_SRNO
			AND UDT.UDT_ENTY_SRNO = EMA.EMA_ENTY_SRNO
			AND UPM.UPM_USER_CTGRY_SRNO = 10
			AND EIR_STATUS = 'A'
			AND EIR_IDNTY_INDC = 10
			AND t1.EIR_ENTY_SRNO = EMA.EMA_ENTY_SRNO
)
SELECT  distinct EIR_ENTY_SRNO,
                EMA_ENTY_ENTY_CODE,
                EMA_ENTY_ENCD_SHRT_DESC
FROM	AMMT
WHERE SIGN(BITAND(TO_NUMBER(DBMS_LOB.SUBSTR(AMMT.AMM_FNCTY_BIT_STREAM,
                                             1,
                                             CEIL(2740 / 8)),
                             'XX'),
                   POWER(2, MOD(2740 - 1, 8)))) = 0;

And the new SQL could complete within 0.3 seconds on both nodes. Not bad, isn’t it?

Reference:
https://oracle-base.com/articles/misc/with-clause
WITH Clause : Subquery Factoring

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s