As explained in my previous post, there are some issues with uprobes and recent kernel/Oracle versions. Based on the workaround that I described, I will show in this short blog post how we can put a probe point on an Oracle function using Linux perf. Sadly, I haven’t figured out a way to do that using systemtap (special thanks to Frank Ch. Eigler for his help).
If you are using dynamic tracing tools like systemtap/perf for user space probing (based on uprobes/uretprobes) with recent Oracle/kernel versions, you may have hit this issue. As stated by Luca Canali (Ref):
“Issues with uprobes and Oracle versions: uprobes works OK for tracing Oracle 11.2. However, for Oracle 12.1 I find that uprobes works OK on RHEL7.0 (kernel 3.10.0-123) and UEK (kernel 3.8.x), but does not work for kernels that ship with RHEL 7.1,7,2 and anything higher (including UEK4). When testing the easiest is to use Oracle 11.2 or if you want to test Oracle 12.1 use UEK3 or RHEL 7.0 kernel. More investigations are needed on this topic.”
Let’s check :
When troubleshooting a performance problem or investigating Oracle internals using dynamic tracing tools like systemtap, it’s often useful to have the session address at hand. With the session address we can access a lot of useful information, such as the wait event, the p1 and p2 values, the sql_id, and many other fields stored in X$KSUSE (the underlying table of V$SESSION). Luca Canali has already done great work here: he identified that when the function “kskthewt” is called at the end of a wait event, the register R13 (tested with Oracle 188.8.131.52 on RHEL6.5 and with Oracle 184.108.40.206 on OEL7 respectively) points to the session address with some offset, and he also managed to determine the offsets of the different columns of X$KSUSE using X$KQFCO and X$KQFTA, as shown here.
The question is : Can we determine the session address without probing any function call ?
One way to answer this question is to determine how the value stored in the register R13 was set in the function “kskthewt”. Time to disassemble !
NOTE : This post contains no disassembly code of the Oracle executable, just the findings !
For basic info on reverse engineering, please take a look at my previous post.
Using flame graphs to investigate a performance problem can be a valuable asset for quickly identifying and resolving the root cause. This type of analysis may be needed when the traditional Oracle instrumentation is not enough.
This post is based on and inspired by the awesome work of Brendan Gregg, Luca Canali and Frits Hoogland in this area. Please check the references at the end of the post for more info (worth reading !)
What I will cover here is a tiny script I have written for generating 3 types of extended flame graph using the built-in perf tool. I say extended because they actually include the Oracle wait events.
- Off cpu
- On cpu
- HOT/COLD flame graph
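To make the hot/cold idea concrete, here is a minimal sketch (not the script from the post) of how on-CPU and off-CPU data could be combined once both are in the folded-stack text format ("frame1;frame2 value") that flame graph tooling consumes. It assumes both inputs are already expressed in the same unit; the function names `parse_folded`, `merge_hot_cold` and `to_folded`, the `[on-cpu]`/`[off-cpu]` tag frames, and the sample stacks are all hypothetical.

```python
from collections import Counter


def parse_folded(text):
    """Parse folded-stack lines ("frame1;frame2 value") into a Counter."""
    stacks = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The value is the last space-separated token on the line.
        stack, _, value = line.rpartition(" ")
        stacks[stack] += int(value)
    return stacks


def merge_hot_cold(on_cpu, off_cpu):
    """Append a tag frame to every stack so both kinds can share one graph."""
    merged = Counter()
    for stack, value in on_cpu.items():
        merged[stack + ";[on-cpu]"] += value
    for stack, value in off_cpu.items():
        merged[stack + ";[off-cpu]"] += value
    return merged


def to_folded(stacks):
    """Serialize a Counter back to folded-stack text, sorted by stack."""
    return "\n".join(f"{stack} {value}" for stack, value in sorted(stacks.items()))


# Hypothetical sample data: frame names and values are made up.
ON_CPU = "proc;fetch_rows;scan_block 120\nproc;fetch_rows;compare_key 80\n"
OFF_CPU = "proc;fetch_rows;wait_for_io 300\n"

if __name__ == "__main__":
    merged = merge_hot_cold(parse_folded(ON_CPU), parse_folded(OFF_CPU))
    print(to_folded(merged))
```

Tagging each stack with a leaf frame keeps the two kinds of time distinguishable inside a single graph; a wait event name could be appended as an extra frame in the same way.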
As the blog post name suggests, this article is about writing a “mini” program for displaying and filtering statement executions issued from a specific IP address (parameter 1) on a specific database (parameter 2). This is heavily based on the great work done by Luca Canali here (must read !).
The introduction of adaptive-ticks CPUs (dynamic ticks) in UEK R4 represents a significant step toward getting rid of the timer tick/interrupt (full tickless operation).
“That interrupt is the CPU’s cue to reconsider which process should be running, catch up with read-copy-update (RCU) callbacks, and generally handle any necessary housekeeping.” Ref
This periodic timer interrupt “interference” (based on CONFIG_HZ) affects performance and power consumption, which makes developers wish to abolish it. In previous versions, a partial solution was used that consisted of disabling the timer tick for idle CPUs, controlled by the configuration option CONFIG_NO_HZ. This mode considerably reduced power usage, as it allowed idle CPUs to stay in deeper C-states. What the adaptive-ticks feature brings is the possibility of reducing the timer interrupt to 1 tick per second (1 Hz) for non-idle CPUs that have only one runnable task. This feature minimizes kernel overhead (up to 1%) and potential latency problems. It was primarily targeted at high-performance computing (HPC) and real-time applications, so it is not necessary for everyone.
For more info on nearly full tickless operation, please check these articles, as most of the information here comes from them :
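One way to observe the tick rate per CPU is to sample the `LOC:` (local timer interrupts) line of `/proc/interrupts` twice and divide the difference by the interval. The following is a minimal sketch of that idea, not the test procedure used in this post; the helper names `parse_loc`, `tick_rates` and `read_loc` and the 5-second interval are assumptions. With CONFIG_HZ=1000 (an assumed value), a busy CPU ticking normally would show roughly 1000 interrupts per second, while an adaptive-ticks CPU with a single runnable task should come close to 1.

```python
import time


def parse_loc(text):
    """Return per-CPU local timer interrupt counts from /proc/interrupts text."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "LOC:":
            counts = []
            for field in fields[1:]:
                # Counts come first; the trailing description is not numeric.
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    raise ValueError("no LOC line found")


def tick_rates(before, after, seconds):
    """Per-CPU interrupts per second between two samples."""
    return [(b - a) / seconds for a, b in zip(before, after)]


def read_loc(path="/proc/interrupts"):
    """Read and parse the LOC line from the live system (Linux only)."""
    with open(path) as f:
        return parse_loc(f.read())


if __name__ == "__main__":
    interval = 5.0  # sampling interval in seconds (arbitrary choice)
    first = read_loc()
    time.sleep(interval)
    second = read_loc()
    for cpu, rate in enumerate(tick_rates(first, second, interval)):
        print(f"CPU{cpu}: {rate:.1f} local timer interrupts/s")
```

Running it while a single CPU-bound task is pinned to one CPU makes the difference between a ticking CPU and an adaptive-ticks CPU easy to see.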
TIME TO TEST :
- ORACLE 220.127.116.11.6
- OEL : 4.1.12-32.1.2.el6uek.x86_64
- 4 CPU : 2 sockets / 2 cores per socket