In my last blog post i described a geeky way to trace back the responsible PL/SQL code for a particular memory request into the PGA. It was based essentially on a dynamic tracing tool “systemtap” to probe on specific functions entry such as KGHAL memory allocator functions (Based on Stefan Koehler dtrace script ) and relied on calling an internal oracle function “pfrln0lookup” using “oradebug call” to get the actual PL/SQL line number.It would have been safer if we reverse engineered the function “pfrln0lookup” to extract only the thing that matter and avoid calling it but this need time and is forbidden :p !
So here i will describe a safer and simpler approach relying only on collecting multiple errorstack dump samples and a little shell script to parse the trace file !
The last time i used systemtap to put a probe on KGHALF function (“non-recoverably allocate a freeable chunk of memory” using Frits Hoogland ora_functions) and checked the memory allocation reason which is passed in the sixth argument (register r9) (Based on Tanel Poder research) .Let’s take a deeper look :
A quick check reveled that this address 0x12239f80 was mapped to the .rodata section in the ELF executable (readelf -e oracle)
Let’s check the address content :
gdb /app/oracle/product/12.2.0/dbhome_1/bin/oracle -ex ‘printf “%s”,0x12239f80’ -ex ‘quit’
“frame segment” (Memory allocation reason)
We can do the same using objdump :
objdump -s -j .rodata /app/oracle/product/12.2.0/dbhome_1/bin/oracle | grep -i 12239f80
“12239f80 6672616d 65207365 676d656e 74000000 frame segment…”
But how the errorstack dump is useful to us here ?
Basically we can get the same info using the section Call Stack Trace in errorstack dump :
So here we have our function KGHALF and it’s sixth argument in hex “012F7BB6C” With it’s value set to :
gdb /app/oracle/product/12.2.0/dbhome_1/bin/oracle -ex ‘printf “%s”,0x012F7BB6C’ -ex ‘quit’
“pmuccst: adt/record” (Memory allocation reason)
We also have the required information about the particular PL/SQL code that was running at the moment we requested this memory allocation :
So basically using multiple errorstack samples (oradebug pdump) can help us trace back the responsible PL/SQL code for a particular memory request !
In the example described in my previous article i have identified “pmuccst: adt/record” as the top memory allocation reason using the script smem_detail.sql by Tanel Poder ! Let’s check where in memory this string is defined (.rodata section) :
objdump -s -j .rodata /app/oracle/product/12.2.0/dbhome_1/bin/oracle | grep “adt/record”
“12f7bb70 6373743a 20616474 2f726563 6f726400 cst: adt/record.”
Missing 4 char (4 Bytes “pmuc“) so the base address is “0x12f7bb6C” and that’s all we need !
The script plsql_memory_leak.sh take 4 arguments :
And here is an extract from the script output :
./plsql_memory_leak.sh ospid interval ndumbs allocation_reason_desc_address
Clearly we must focus our attention at line 7 and 9 !
That’s it 😀
DOWNLOAD : plsql_memory_leak.sh