Tracing PL/SQL subprogram calls with parameters values [Dynamic tracing]

The purpose of this blog post is to demonstrate, once again, the power of Linux dynamic tracing/instrumentation tools.

In my last blog post, Enhancing DBMS_OUTPUT using systemtap, I showed how we can track the parameter values passed to the “dbms_output.put_line” routine using systemtap. That was a very simple example because we already knew the type of the argument passed (a simple VARCHAR2) and because there was only ONE parameter.

Tracking the arguments of PL/SQL routine calls with a dynamic tracing utility like perf or systemtap can become quite complex, depending on many things, such as:

  • Argument types
  • Number of arguments
  • Whether arguments are passed by value or by reference
  • Subprogram type (nested/package/standalone subprogram)
  • Optimization level (e.g. inlining of procedure calls)

Time for the serious stuff with the dynamic tracing tool perf!


Enhancing DBMS_OUTPUT using systemtap

This is a short and quick note to show how we can enhance DBMS_OUTPUT capabilities using a small systemtap script, without modifying the source code. Basically, it allows us to display DBMS_OUTPUT messages incrementally (the program doesn’t need to finish its execution) by attaching to an already running session (no need to enable DBMS_OUTPUT). The output can also be easily redirected to a file.

The idea is to access the function’s parameters. This can become complex when the arguments vary in type and number, but in our case there is only one argument, of type VARCHAR2.


Playing with SLOB and hardware prefetchers ! Are they effective ?

Hardware prefetching can reduce the effective memory latency for data and instruction accesses, improving performance by reducing cache-miss exposure, but it can also degrade performance in some cases. (For more information see here.)
My current processor, an Intel Skylake i5-6500, supports 4 types of hardware prefetchers for prefetching data. There are 2 prefetchers associated with the L1 data cache (also known as the DCU) and 2 prefetchers associated with the L2 cache. These hardware prefetchers can be enabled/disabled using a Model Specific Register (MSR).
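As a rough illustration of how such an MSR value can be interpreted, here is a minimal Python sketch. It assumes the prefetcher control register is MSR 0x1A4 with the bit layout Intel has publicly disclosed for several Core generations (bit 0: L2 hardware prefetcher, bit 1: L2 adjacent cache line prefetcher, bit 2: DCU prefetcher, bit 3: DCU IP prefetcher, where a set bit means "disabled"). Whether that layout applies unchanged to the i5-6500 is an assumption, and the helper names `decode_prefetchers` and `with_prefetcher` are hypothetical.

```python
from typing import Dict

# ASSUMPTION (not taken from the post): prefetcher control lives in MSR 0x1A4
# and a SET bit DISABLES the corresponding prefetcher.
MSR_PREFETCH_CONTROL = 0x1A4
PREFETCHER_BITS = {
    0: "L2 hardware prefetcher",
    1: "L2 adjacent cache line prefetcher",
    2: "DCU prefetcher",
    3: "DCU IP prefetcher",
}


def decode_prefetchers(msr_value: int) -> Dict[str, bool]:
    """Map each prefetcher name to True if enabled (its bit is clear)."""
    return {
        name: ((msr_value >> bit) & 1) == 0
        for bit, name in PREFETCHER_BITS.items()
    }


def with_prefetcher(msr_value: int, bit: int, enabled: bool) -> int:
    """Return a new MSR value with one prefetcher enabled or disabled."""
    if bit not in PREFETCHER_BITS:
        raise ValueError(f"unknown prefetcher bit: {bit}")
    mask = 1 << bit
    # Clearing the bit enables the prefetcher; setting it disables it.
    return msr_value & ~mask if enabled else msr_value | mask


if __name__ == "__main__":
    # Start from "everything enabled" and disable only the DCU prefetcher.
    value = with_prefetcher(0x0, 2, enabled=False)
    print(hex(value))
    for name, enabled in decode_prefetchers(value).items():
        print(f"{name}: {'enabled' if enabled else 'disabled'}")
```

The sketch only computes and decodes values; it does not touch the hardware. On Linux, the actual register is typically read and written with the `rdmsr` and `wrmsr` utilities from msr-tools, which need root privileges and the `msr` kernel module.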
Let’s test how effective they are using SLOB!


Memory bandwidth vs latency response curve

Memory-bound applications are sensitive to memory latency and bandwidth, which is why it’s important to measure and monitor them. Even though these two concepts are often described independently, they are inherently interrelated.

According to Bruce Jacob in “The memory system: you can’t avoid it, you can’t ignore it, you can’t fake it”, the bandwidth vs latency response curve for a system has three regions.

  • Constant region: The latency response is fairly constant for the first 40% of the sustained bandwidth.
  • Linear region: Between 40% and 80% of the sustained bandwidth, the latency response increases almost linearly with the bandwidth demand of the system, due to contention overhead from numerous memory requests.
  • Exponential region: Between 80% and 100% of the sustained bandwidth, the memory latency is dominated by the contention latency, which can be as much as twice the idle latency or more.
  • Maximum sustained bandwidth: This is 65% to 75% of the theoretical maximum bandwidth.
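The thresholds listed above can be sketched as a small Python helper that tells which region a given bandwidth demand falls into. This is only an illustration of the rule of thumb: the function names are hypothetical, the 40% and 80% boundaries are treated as hard cut-offs (an arbitrary choice, since the real curve is smooth), and the numbers in the usage part are made up rather than measured.

```python
from typing import Tuple


def latency_region(demand_bw: float, sustained_bw: float) -> str:
    """Classify a bandwidth demand into one of the three latency regions.

    Both arguments must use the same unit (for example GB/s).
    """
    if sustained_bw <= 0:
        raise ValueError("sustained_bw must be positive")
    utilization = demand_bw / sustained_bw
    if utilization < 0.40:
        return "constant"
    if utilization < 0.80:
        return "linear"
    return "exponential"


def expected_sustained_range(theoretical_peak_bw: float) -> Tuple[float, float]:
    """Return the 65% to 75% band of the theoretical peak bandwidth."""
    return 0.65 * theoretical_peak_bw, 0.75 * theoretical_peak_bw


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    peak = 34.0       # theoretical peak, GB/s
    sustained = 24.0  # sustained bandwidth, GB/s
    low, high = expected_sustained_range(peak)
    print(f"expected sustained bandwidth: {low:.1f} to {high:.1f} GB/s")
    for demand in (5.0, 15.0, 22.0):
        print(f"{demand:.1f} GB/s -> {latency_region(demand, sustained)} region")
```

In practice the sustained bandwidth would come from a measurement tool rather than from a constant in the script.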
Armed with the Intel Memory Latency Checker (MLC), let’s check our current system!
