This is a case i was working on a few days ago.I was asked to tune a specific query on our production server, so here is a quick test case to model the problem (i only kept the interesting part of the original query which was much more longer)
Test case : (ORACLE 220.127.116.11.6)
DROP TABLE test_a1; DROP TABLE test_a2; CREATE TABLE test_a1 AS SELECT level a, level AS b FROM DUAL where level > 3000 CONNECT BY level <6000; CREATE TABLE test_a2 AS SELECT level AS a, level AS b FROM DUAL where LEVEL > 100 CONNECT BY level < 3100; exec dbms_stats.gather_table_stats(null,'TEST_A1'); exec dbms_stats.gather_table_stats(null,'TEST_A2'); create or replace function cpu_intensive_function(a number) return number deterministic as n NUMBER := 0; BEGIN dbms_application_info.set_client_info(userenv('client_info')+1 ); FOR f IN 1..100000 LOOP n := MOD (n,999999) + SQRT (f); END LOOP; return n+a; END;
SELECT t1.a a,t1.b b,t2.a a2,t2.b b2 FROM test_a1 t1, test_a2 t2 WHERE t1.a = t2.a and cpu_intensive_function(t1.b) = 1;
At first this may seem a good execution plan.We have two tables with 2999 rows each,we use the filter function “cpu_intensive_function”on column “b” to minimize throughway and eliminate the maximum number of rows as soon as possible.
But at runtime the execution of the deterministic function “cpu_intensive_function” 2999 times (for each row of table TEST_A1 as it was a unique value so the caching is not effective) ,which is cpu intensive , took a significant amount of time.
Although we are eliminating rows as soon as possible,this step is very costly so why not try to differ it knowing that the join between this two tables on column “a” will return only 99 rows (In this case this function will be executed only 99 times minimizing significantly the execution time).
The question is how to do it ?
To be able to differ the execution of the function i have rewritten the query this way :
SELECT h.a,h.b,h.a2,h.b2 FROM (SELECT /*+ NO_MERGE */ t1.a a,t1.b b,t2.a a2,t2.b b2, ROWNUM FROM test_a1 t1, test_a2 t2 WHERE t1.a = t2.a) h WHERE cpu_intensive_function (h.b) = 1;
I added the ROWNUM pseudo-column to prevent simple filter push-down from happening as it’s a no brainer transformation.(It’s also possible to use /*+ opt_param(‘_optimizer_filter_pushdown’,’false’) */ but i like the first alternative.For more info)
This query was mush faster, the execution time dropped from 5Min to 10Sec !
That’s it 😀
2 thoughts on “Deferred predicate evaluation”
that’s a great way to solve this particular issue!
Thank you for sharing.
Have you tried to associate the cpu_intensive_function with proper ASSOCIATE STATISTICS to tell the optimizer about the real cost of the function? This might help all queries using cpu_intensive_function – not only this specific query.
Hi Martin, thank you 🙂 yes the function used on production have the associated statistics correctly set but for this particular case it havent done any good.