Mapping table blocks to NUMA nodes : Scattered or Clustered ?

Performance may be affected by where a table's blocks sit in the buffer cache (for example, blocks scattered across all the NUMA nodes versus clustered on one node). In this blog post I will quickly show how to map table blocks to NUMA nodes, building on my previous blog post (Mapping ORACLE SGA components to numa nodes using NUMA API).
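To make the idea concrete, here is a minimal Python sketch of the underlying query, assuming libnuma is installed and reachable through ctypes. It calls move_pages() with a NULL nodes array, which only reports the node currently holding each page, and then tallies the result. The helper names (query_page_nodes, placement_summary) and the "one node means clustered" rule are assumptions made for illustration, not the code used in the post.

```python
import ctypes
import ctypes.util
from collections import Counter


def query_page_nodes(addresses, pid=0):
    """Return the NUMA node of each page address, or None if the query fails.

    move_pages() with nodes=NULL moves nothing; it only fills `status`
    with the node of each page (or a negative errno for that page).
    """
    libname = ctypes.util.find_library("numa")
    if libname is None:  # libnuma is not installed on this host
        return None
    try:
        fn = ctypes.CDLL(libname, use_errno=True).move_pages
    except (OSError, AttributeError):
        return None
    count = len(addresses)
    fn.restype = ctypes.c_long
    fn.argtypes = [ctypes.c_int, ctypes.c_ulong,
                   ctypes.POINTER(ctypes.c_void_p),
                   ctypes.POINTER(ctypes.c_int),
                   ctypes.POINTER(ctypes.c_int), ctypes.c_int]
    pages = (ctypes.c_void_p * count)(*addresses)
    status = (ctypes.c_int * count)()
    rc = fn(pid, count, pages, None, status, 0)  # pid 0 = calling process
    return None if rc < 0 else list(status)


def placement_summary(status):
    """Count pages per node and label the placement (hypothetical rule)."""
    per_node = Counter(s for s in status if s >= 0)  # negative = per-page errno
    if not per_node:
        label = "unknown"
    elif len(per_node) == 1:
        label = "clustered"
    else:
        label = "scattered"
    return dict(per_node), label
```

For table blocks, the addresses would be the page-aligned buffer addresses of the blocks of interest; how to obtain them is outside this sketch.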

Manual balancing of SGA components across numa nodes [NUMA API:move_pages()]

In my previous blog post I showed how to display the distribution of memory components (buffer cache, shared pool, large pool, etc.) across the different NUMA nodes using the NUMA API. But what if we want more control? Can we, for example, isolate a specific SGA component on a specific set of nodes?

Suppose, for example, that you are using the In-Memory column store and only a few users rely heavily on it. Would it be useful to collocate them on a specific set of nodes to improve memory access latency? It depends, of course! But we can do it! Using the NUMA API, and specifically the function “move_pages”, we can distribute the memory pages across NUMA nodes as we want!
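As a rough illustration of the call, the sketch below builds the per-page target node array and hands it to move_pages() through ctypes. It assumes libnuma is installed. The helpers target_nodes and migrate are hypothetical names, and the code works on the calling process's own private pages. According to the move_pages man page, shared pages such as an SGA segment would need MPOL_MF_MOVE_ALL and the CAP_SYS_NICE privilege, so treat this as a sketch of the mechanism rather than a ready-made SGA balancer.

```python
import ctypes
import ctypes.util

MPOL_MF_MOVE = 1 << 1  # <linux/mempolicy.h>: move pages used only by this process


def target_nodes(n_pages, nodes):
    """Spread n_pages round-robin over `nodes`; a one-element list clusters them."""
    if not nodes:
        raise ValueError("need at least one target node")
    return [nodes[i % len(nodes)] for i in range(n_pages)]


def migrate(addresses, nodes, pid=0):
    """Ask the kernel to move each page to its target node.

    Returns the per-page status list (node number or negative errno),
    or None when libnuma is missing or the call fails outright.
    """
    libname = ctypes.util.find_library("numa")
    if libname is None:
        return None
    try:
        fn = ctypes.CDLL(libname, use_errno=True).move_pages
    except (OSError, AttributeError):
        return None
    count = len(addresses)
    fn.restype = ctypes.c_long
    fn.argtypes = [ctypes.c_int, ctypes.c_ulong,
                   ctypes.POINTER(ctypes.c_void_p),
                   ctypes.POINTER(ctypes.c_int),
                   ctypes.POINTER(ctypes.c_int), ctypes.c_int]
    pages = (ctypes.c_void_p * count)(*addresses)
    wanted = (ctypes.c_int * count)(*target_nodes(count, nodes))
    status = (ctypes.c_int * count)()
    rc = fn(pid, count, pages, wanted, status, MPOL_MF_MOVE)
    return None if rc < 0 else list(status)
```

Passing nodes=[2] would cluster every page on node 2, while nodes=[0, 1] would alternate pages between nodes 0 and 1.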

Automatic NUMA Balancing, which is enabled by default on UEK R4, relies on a similar mechanism to move memory pages closer to where the task is executing (for more info check this), but for now it does not support the migration of Huge Pages (hugetlbfs).

[root@svltest ~]# sysctl -a | grep numa_balancing
kernel.numa_balancing = 1
kernel.numa_balancing_scan_delay_ms = 1000
kernel.numa_balancing_scan_period_max_ms = 60000
kernel.numa_balancing_scan_period_min_ms = 1000
kernel.numa_balancing_scan_size_mb = 256

This is what we are going to achieve in this blog post :

Capture 0

Capture 20

Combining SQL TRACE & SYSTEMTAP Part 2: No more Unaccounted-for Time due to time spent on CPU run queue

In my previous post I showed how we can eliminate one of the causes of Unaccounted-for Time, CPU double-counting, from the SQL trace file using systemtap. But we can do more. The other important cause of missing data in an Extended SQL trace file is “Time Spent Not Executing” (Cary Millsap), which is time spent on the CPU run queue. So how do we measure it?

Here is an excerpt of what we are going to achieve :

Old trace file :

Capture 12

New trace file showing CPU consumption inside wait events and time spent on the CPU run queue :

Capture 11
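For readers who want a feel for the numbers involved, here is a small Python sketch. It is not the systemtap approach of the post: it assumes a Linux kernel that exposes /proc/<pid>/schedstat, whose three fields are time on CPU (ns), time waiting on the run queue (ns) and the number of timeslices run. The function names and the sample values are made up for illustration.

```python
def parse_schedstat(text):
    """Parse the content of /proc/<pid>/schedstat into seconds and a count."""
    on_cpu_ns, runq_ns, slices = (int(x) for x in text.split()[:3])
    return {
        "on_cpu_s": on_cpu_ns / 1e9,   # time actually spent running
        "runq_s": runq_ns / 1e9,       # time runnable but waiting for a CPU
        "timeslices": slices,
    }


def unaccounted_us(elapsed_us, cpu_us, wait_ela_us):
    """Unaccounted-for time of one database call: e - c - sum(ela).

    All values are in microseconds, as written by a 10046 trace.
    The result can be negative when CPU time is double-counted
    inside wait events.
    """
    return elapsed_us - cpu_us - sum(wait_ela_us)


def read_schedstat(pid):
    """Read and parse the schedstat file of a live process."""
    with open(f"/proc/{pid}/schedstat") as f:
        return parse_schedstat(f.read())
```

Sampling read_schedstat() for the server process before and after a call and subtracting the runq_s values gives a coarse estimate of how much of the unaccounted-for time was spent on the run queue.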

TrcExtProf.sql the raw trace file (10046) profiler based on external tables + regexp

UPDATE 21/09/2015 : For the new version of TrcExTprof click here.

There are already many great free trace file profilers, such as tkprof, Trace Analyzer, tvdxtat, parsetrc and OraSRP (for a description of some of them, see here). So why another profiler?
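To show what regexp-based profiling of a raw 10046 trace boils down to, here is a minimal Python sketch. It is not TrcExtProf, which works with external tables and SQL regular expressions. It only aggregates WAIT lines by event name, and it assumes the usual `WAIT #<cursor>: nam='<event>' ela= <microseconds> ...` line layout. The function name is hypothetical.

```python
import re
from collections import defaultdict

# Captures the event name and the elapsed time (microseconds) of a WAIT line.
WAIT_RE = re.compile(r"^WAIT #\d+: nam='([^']+)' ela= *(\d+)")


def profile_waits(lines):
    """Return (event, count, total_ela_us) tuples, largest total first."""
    totals = defaultdict(lambda: [0, 0])  # event -> [count, total ela]
    for line in lines:
        m = WAIT_RE.match(line)
        if m:
            entry = totals[m.group(1)]
            entry[0] += 1
            entry[1] += int(m.group(2))
    rows = [(name, cnt, ela) for name, (cnt, ela) in totals.items()]
    return sorted(rows, key=lambda r: r[2], reverse=True)
```

Feeding it the lines of a trace file, for example `profile_waits(open(path))` with `path` pointing at a trace of your own, yields a simple top-wait-events profile.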

SYSTEMTAP Oracle session perf (CPU + WAITS) Direct SGA access (StapOra V0.2)

UPDATE 26/05/2015 : For the new version of StapOra, including bug fixes and enhancements, please click here.

In the previous post I developed a systemtap script to monitor CPU usage (Oracle CPU monitor version 0.1). Here I am going to extend the script to include Oracle wait events and CPU usage from the point of view of the Oracle database, using direct SGA access.

Here is a quick overview of the systemtap script (renamed StapOra V0.2):

  • Top wait events
  • Time spent on the run queue
  • IO wait time
  • Top kernel functions
  • Top user functions
  • Consistent Read by object
  • Consistent Read elapsed time and cpu time
  • Number of context switches

I will explain here only the newly added part and how it was developed:

