Troubleshooting Latch Contention using SystemTap

The purpose of this blog post is to show how to troubleshoot contention on a specific latch using a SystemTap script. It is heavily inspired by the “latchprof” script developed by Tanel Poder and by his systematic approach to latch contention troubleshooting (for more info, please check latch-contention-troubleshooting).

This is what we are going to achieve:

Tested on: Oracle 11.2.0.4/OEL6/UEK4

stap -v monitor_latch.stp "latch_address" "latch#" "refresh_time"

[Image: capture-01]

The script shows a breakdown of latch holders by PID, session ID and sql_hash for the “cache buffers chains” latch at address “0x000000009F69FF60”.

Part 1 : Monitoring latch acquisition/release

To monitor latch activity I used a hardware breakpoint that fires whenever the latch address is modified. The number of hardware breakpoints available is limited because they rely on dedicated debug registers (usually four on x86), so we cannot monitor many latch addresses this way (I limited myself to one).

But how do we know whether a given modification is an acquisition or a release?

Whenever the latch is acquired or released, the first word at the latch address is modified, as stated by Andrey Nikolaev, to reflect the PID of the holding process or the number of holding processes, depending on the latch type and acquisition mode. Also, as demonstrated in my previous post, the number of gets is incremented at release time.

Assuming that the latch address is modified only when the latch is acquired or released, we can state that:

  • If the address is modified by a process X and the number of gets does not change => latch acquired
  • If the address is modified by a process X and the number of gets changes => latch released
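The rule above can be sketched in a few lines of Python. This is only a model of the decision logic, not the SystemTap script itself: the `LatchWrite` record, the `classify` and `replay` helpers and the sample trace are hypothetical names and data made up for illustration. It assumes each breakpoint hit gives us the PID of the writer and the value of “gets” read right after the write.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LatchWrite:
    """One breakpoint hit (hypothetical record, not part of monitor_latch.stp)."""
    pid: int   # process that modified the latch word
    gets: int  # value of the "gets" counter read right after the write


def classify(prev_gets: Optional[int], event: LatchWrite) -> str:
    """Apply the rule: gets unchanged => acquired, gets changed => released."""
    if prev_gets is None:
        # First hit: there is no previous counter value to compare against.
        return "unknown"
    return "released" if event.gets != prev_gets else "acquired"


def replay(events: List[LatchWrite]) -> List[Tuple[int, str]]:
    """Classify a sequence of writes, remembering the last seen gets value."""
    prev: Optional[int] = None
    result = []
    for ev in events:
        result.append((ev.pid, classify(prev, ev)))
        prev = ev.gets
    return result


if __name__ == "__main__":
    # Made-up trace: PID 22 and then PID 31 each take and release the latch.
    trace = [LatchWrite(22, 100), LatchWrite(22, 100), LatchWrite(22, 101),
             LatchWrite(31, 101), LatchWrite(31, 102)]
    for pid, state in replay(trace):
        print(pid, state)
```

The very first hit cannot be classified because there is no earlier “gets” value to compare with, so the sketch reports it as “unknown” rather than guessing.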

We can read the latch “gets” value at a specific offset from the latch address. This offset differs between shared and exclusive latches.

Exclusive latch memory layout :

oradebug peek 200222A0 24
[200222A0, 200222B8) = 00000016 00000001 000001D0 00000007
                       pid^     gets     latch#   level#

Shared latch memory layout :
oradebug peek 0x6000AEA8 24
[6000AEA8, 6000AEC0) = 00000002 00000000 00000001 00000007
                       ^Nproc   ^X flag  gets     latch#

Reference : Andrey Nikolaev
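To make the two layouts concrete, here is a small Python sketch that decodes the dump lines shown above. The field positions are read directly off those two dumps, and the 4-byte word size is an assumption taken from how the words are printed. The real offsets on a given instance may differ, and the helper names (`parse_peek`, `decode`, `gets_offset`) are hypothetical.

```python
# Word index of each field, as annotated in the two dumps above.
EXCLUSIVE_LAYOUT = {"pid": 0, "gets": 1, "latch#": 2, "level#": 3}
SHARED_LAYOUT = {"nproc": 0, "x_flag": 1, "gets": 2, "latch#": 3}


def parse_peek(line):
    """Return the hex words that follow '=' in an oradebug peek line as ints."""
    _, _, words = line.partition("=")
    return [int(word, 16) for word in words.split()]


def decode(line, layout):
    """Map each field name of the layout to the word found at its index."""
    words = parse_peek(line)
    return {name: words[index] for name, index in layout.items()}


def gets_offset(layout, word_size=4):
    """Byte offset of 'gets' from the latch address (word_size is assumed)."""
    return layout["gets"] * word_size


if __name__ == "__main__":
    exclusive = "[200222A0, 200222B8) = 00000016 00000001 000001D0 00000007"
    shared = "[6000AEA8, 6000AEC0) = 00000002 00000000 00000001 00000007"
    print(decode(exclusive, EXCLUSIVE_LAYOUT), gets_offset(EXCLUSIVE_LAYOUT))
    print(decode(shared, SHARED_LAYOUT), gets_offset(SHARED_LAYOUT))
```

With these dumps, “gets” is the second word for the exclusive latch and the third word for the shared latch, which is why the offset has to be chosen according to the latch type.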

I used the latch# to determine the offset of gets, which is why it is passed as a parameter to the script.

Part 2 : Getting session addr/SID/sql hash

To get the session address I used the technique described in my previous post: I extracted it from the global symbol “ksupga_” at some offset, and then used x$kqfco and x$kqfta to find the offsets of the other fields (SQL_HASH/SID).

DOWNLOAD : monitor_latch.stp

That’s it 😀

 
