Question

Same query with same execution plan performs differently in different Oracle environments

I have the following query:

SELECT t.*
FROM 
    (SELECT t.id, t.transaction_date
     FROM transactions t 
     ORDER BY t.transaction_date DESC, t.id DESC
     FETCH NEXT 11 ROWS ONLY) transactions_table 
JOIN 
    transactions t ON transactions_table.id = t.id
ORDER BY 
    t.transaction_date DESC, t.id DESC;

The ID column is the table's PRIMARY KEY. I also created an index with the following CREATE INDEX statement:

CREATE INDEX transaction_date_idx ON transactions (transaction_date DESC, id);

The execution plan is as follows:

PLAN_TABLE_OUTPUT
Plan hash value: 3772986339
 
---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                              | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                                   |    11 |   167K|       | 35981   (2)| 00:00:02 |
|   1 |  SORT ORDER BY                |                                   |    11 |   167K|       | 35981   (2)| 00:00:02 |
|   2 |   NESTED LOOPS                |                                   |    11 |   167K|       | 35980   (2)| 00:00:02 |
|   3 |    NESTED LOOPS               |                                   |    11 |   167K|       | 35980   (2)| 00:00:02 |
|*  4 |     VIEW                      |                                   |    11 |   286 |       | 35958   (2)| 00:00:02 |
|*  5 |      WINDOW SORT PUSHED RANK  |                                   |  4345K|   107M|   150M| 35958   (2)| 00:00:02 |
|   6 |       INDEX FAST FULL SCAN    | TRANSACTIONS_TRANSACTION_DATE_IDX |  4345K|   107M|       |  3593   (2)| 00:00:01 |
|*  7 |     INDEX UNIQUE SCAN         | PK_TRANSACTIONS                   |     1 |       |       |     1   (0)| 00:00:01 |
|   8 |    TABLE ACCESS BY INDEX ROWID| TRANSACTIONS                      |     1 | 15582 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
"   4 - filter(""from$_subquery$_003"".""rowlimit_$$_rownumber""<=11)"
"   5 - filter(ROW_NUMBER() OVER ( ORDER BY SYS_OP_DESCEND(""TRANSACTION_DATE""),INTERNAL_FUNCTION(""T"".""ID"") DESC "
              )<=11)
"   7 - access(""T2"".""ID""=""from$_subquery$_003"".""ID"")"
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

I have two environments with the same data (~4 million rows in the transactions table). In the first environment the query runs as expected (~500 ms). In the second environment it takes about 30 seconds! The execution plans for both environments are exactly the same, except for the following note:

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

which is not printed in the fast-returning environment. When I compare the inner queries:

SELECT t.id, t.transaction_date
FROM transactions t 
ORDER BY t.transaction_date DESC, t.id DESC
FETCH NEXT 11 ROWS ONLY

they perform nearly the same in both environments (both fast). I believe things get slow when the JOIN part runs. Why?

Solution

The note "dynamic statistics used: dynamic sampling (level=2)" implies that statistics are missing from the table in one of your environments. Gather statistics with a PL/SQL block like this:

begin
    dbms_stats.gather_table_stats(ownname => user, tabname => 'TRANSACTIONS');
end;
/

If that fixes the problem, you'll want to investigate why the statistics weren't automatically gathered. By default, an Oracle database runs a daily task that gathers statistics on any table that is new or whose data has changed by more than 10%. Unfortunately, a lot of organizations like to disable the default autotask and implement their own custom solution.
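
To check whether the table has statistics at all and whether the automatic job is enabled, queries along these lines may help. This is only a sketch: it assumes the table lives in your own schema and that your account can read the DBA views.

```sql
-- Does the table have statistics, and are they considered stale?
-- LAST_ANALYZED is NULL when statistics have never been gathered.
SELECT table_name, num_rows, last_analyzed, stale_stats
FROM user_tab_statistics
WHERE table_name = 'TRANSACTIONS'
  AND object_type = 'TABLE';

-- Is the automatic statistics task enabled? (needs access to DBA views)
SELECT client_name, status
FROM dba_autotask_client
WHERE client_name = 'auto optimizer stats collection';
```

Comparing the output of the first query between the two environments should show quickly whether missing statistics are the difference.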

If that doesn't fix the problem, or if you're curious why it solved the problem, check the execution plan with actual numbers instead of relying on the explain plan estimates. Gather data using select dbms_sqltune.report_sql_monitor(sql_id => 'your SQL_ID') from dual; and add the results to your question. The SQL monitoring report will tell you which operation takes the most time, and what the waits are.
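
A rough sketch of that workflow follows. The LIKE pattern used to find the statement is an assumption, so adjust it to match your SQL text. Keep in mind that SQL Monitoring requires the Tuning Pack license and, by default, only captures statements that run for several seconds, run in parallel, or carry the MONITOR hint.

```sql
-- Step 1: find the SQL_ID of the slow statement in the shared pool.
SELECT sql_id, child_number, last_active_time, sql_text
FROM v$sql
WHERE sql_text LIKE 'SELECT t.*%transactions_table%'
  AND sql_text NOT LIKE '%v$sql%'   -- exclude this lookup query itself
ORDER BY last_active_time DESC;

-- Step 2: generate a text report for that SQL_ID.
SELECT dbms_sqltune.report_sql_monitor(
           sql_id => 'your SQL_ID',
           type   => 'TEXT') AS report
FROM dual;
```

If SQL Monitoring is not available, running the query with the GATHER_PLAN_STATISTICS hint and then calling dbms_xplan.display_cursor(format => 'ALLSTATS LAST') in the same session also gives actual row counts and timings per operation.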

For this problem, I would guess that your "Activity (%)" column will not add up to 100%. When that happens, it implies that the execution time was spent running a recursive query. In rare cases, a query meant to provide helpful information to the optimizer can take longer than the actual query itself. Specifically, it's possible that the dynamic sampling query is slow, and gathering statistics may avoid the need to run that expensive dynamic sampling query.

Also, I agree with Paul W's idea about rewriting the query. Replacing a self-join with a fancier query can often save a lot of time.

2024-07-19

Jon Heller

Solution

Most likely Oracle is doing a hash join in one environment and nested loops in the other. It's very common for a query to behave differently when executed in a different environment. There are a lot of factors that the optimizer considers when developing its plan, so you can't expect the same plan every time in two different databases.

Here are a couple of things to try:

Do you really want FETCH NEXT rather than FETCH FIRST?

You probably shouldn't be using that inner query block at all. Simplify:

SELECT t.*
FROM transactions t
ORDER BY t.transaction_date DESC, t.id DESC
FETCH FIRST 11 ROWS ONLY

For maximum speed, redefine the index as (transaction_date,id). That means dropping the DESC keyword from the index definition. Then hint:
```
SELECT /*+ INDEX_DESC(t) */ t.*
FROM transactions t
ORDER BY t.transaction_date DESC, t.id DESC
FETCH FIRST 11 ROWS ONLY
```

Check your plan and make sure it says INDEX FULL SCAN DESCENDING (not INDEX FAST FULL SCAN) and that a WINDOW NOSORT STOPKEY operation appears above it (not WINDOW SORT PUSHED RANK). You're trying to avoid the sort.
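
Put together, the index change and plan check could look like the sketch below. The index name is taken from the CREATE INDEX statement in the question and is an assumption here; the plan output shows TRANSACTIONS_TRANSACTION_DATE_IDX, so use whatever name exists in your schema.

```sql
-- Recreate the index without the DESC keyword.
DROP INDEX transaction_date_idx;
CREATE INDEX transaction_date_idx ON transactions (transaction_date, id);

-- Generate the plan for the hinted query.
EXPLAIN PLAN FOR
SELECT /*+ INDEX_DESC(t) */ t.*
FROM transactions t
ORDER BY t.transaction_date DESC, t.id DESC
FETCH FIRST 11 ROWS ONLY;

-- Display it and look for INDEX FULL SCAN DESCENDING
-- under WINDOW NOSORT STOPKEY.
SELECT * FROM TABLE(dbms_xplan.display);
```

Rebuilding an index on a table with about 4 million rows takes time and locks, so it is worth trying this in a test environment first.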

If you do need to do this within a nested query block because you require joining to other tables in the parent block, try something like this:

SELECT /*+ USE_NL(t t2 t3) */ t.*, t2.*, t3.*
  FROM (SELECT /*+ NO_MERGE INDEX_DESC(t) */ t.ROWID row_id, t.id, t.transaction_date
          FROM transactions t
         ORDER BY t.transaction_date DESC, t.id DESC
         FETCH FIRST 11 ROWS ONLY) transactions_table
       JOIN transactions t ON transactions_table.row_id = t.ROWID -- if self-joining
       LEFT OUTER JOIN transactions t2 ON t2.id = transactions_table.id -- stand-in for a join to another table
       LEFT OUTER JOIN transactions t3 ON t3.id = transactions_table.id -- stand-in for a join to another table
 ORDER BY transactions_table.transaction_date DESC, transactions_table.id DESC

Notice the USE_NL and NO_MERGE hints. NO_MERGE prevents Oracle from view merging, which would invalidate your INDEX_DESC hint, and USE_NL tells Oracle that it really is best to use nested loops rather than a hash join on those joins to the other tables. Note that I also optionally used ROWID for the self-join. That avoids hitting the index again, though it's likely that the index was the same one used in the inner query, so the leaf blocks are cached and the benefit of skipping it on the second pass might be imperceptible.

The self-join is really unnecessary, as you can return all the columns you want from the inner query. The only situation where I find it helpful to self-join like this is when I might disqualify rows based on criteria from other tables, and I want to avoid the cost of reading blocks from the table segment until I've done so. Restricting the inner query to reference only columns that are part of the index is what would enable that performance trick.

2024-07-18

Paul W