Playing with adaptive-ticks CPUs [CONFIG_NO_HZ_FULL] in UEK 4 and ORACLE database

The introduction of adaptive-ticks CPUs (dynamic ticks) in UEK R4 represents a significant step toward getting rid of the timer tick/interrupt (full tickless operation).

“That interrupt is the CPU’s cue to reconsider which process should be running, catch up with read-copy-update (RCU) callbacks, and generally handle any necessary housekeeping.” Ref

This periodic timer interrupt “interference” (its frequency is set by CONFIG_HZ) affects both performance and power consumption, which is why developers would like to abolish it. Previous versions used a partial solution that disables the timer tick on idle CPUs, controlled by the configuration option CONFIG_NO_HZ. This mode considerably reduced power usage, as it allowed idle CPUs to stay in deeper C-states. What the adaptive-ticks feature brings is the possibility of reducing the timer interrupt to 1 tick/second (1 Hz) on a non-idle CPU that has only one runnable task. This feature minimizes kernel overhead (up to 1%) and potential latency problems. It was primarily targeted at high-performance computing (HPC) and real-time applications, so not everyone needs it.

For more info on nearly full tickless operation, please check these articles, as most of the information here comes from them.

TIME TO TEST :

  • ORACLE 12.1.0.2.6
  • OEL : 4.1.12-32.1.2.el6uek.x86_64
  • 4 CPU : 2 sockets / 2 cores per socket

Check kernel config parameters :

Capture 1
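A minimal sketch of this check, assuming the config of the running kernel is stored as /boot/config-$(uname -r), the usual location on Oracle Linux. The helper name show_nohz_config is only for illustration :

```shell
# Print the tick-related options from a kernel config file.
# Defaults to the config of the running kernel; pass a path to check another one.
show_nohz_config() {
    local cfg="${1:-/boot/config-$(uname -r)}"
    # Matches CONFIG_NO_HZ*, CONFIG_RCU_NOCB_CPU* and CONFIG_HZ*,
    # including options reported as "# ... is not set".
    grep -E '^(# )?CONFIG_(NO_HZ|RCU_NOCB_CPU|HZ)' "$cfg"
}
```

Calling show_nohz_config without arguments should list CONFIG_NO_HZ_FULL=y on a kernel built with adaptive-ticks support.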

Configure the kernel command line to enable adaptive ticks on CPU 3 (nohz_full, which is disabled by default) and to isolate CPUs 2 and 3 from user-space tasks for testing (isolcpus: isolate CPUs from the kernel scheduler) :

Capture 2
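For the setup described here, the parameters appended to the kernel line would be nohz_full=3 isolcpus=2,3 (on an el6 system the boot loader configuration is typically /boot/grub/grub.conf, but that location is an assumption). After a reboot, a small helper can confirm that the running kernel picked them up. The name show_tick_params is hypothetical :

```shell
# Show the nohz_full= and isolcpus= parameters found on a kernel command line.
# Reads /proc/cmdline by default; pass another file to check a saved copy.
show_tick_params() {
    # Put each parameter on its own line, then keep only the two of interest.
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -E '^(nohz_full|isolcpus)='
}
```

If nothing is printed, the parameters did not reach the kernel and the boot loader entry should be checked again.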

Move all RCU threads to a non-latency-sensitive core (CPU 0 in this case), as a CPU will exit adaptive-ticks mode when it enqueues an RCU callback :

Capture 10
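One possible way to do this, as a sketch only: it assumes the RCU kernel threads have names starting with “rcu” (typically rcu_sched, rcuos/3 and similar) and that it is run as root. The helper name pin_rcu_threads is hypothetical :

```shell
# Pin every process whose name starts with "rcu" to the given CPU.
# Usage: pin_rcu_threads <cpu>
pin_rcu_threads() {
    local pid
    for pid in $(pgrep '^rcu'); do
        # taskset -pc <cpu-list> <pid> changes the affinity of a running task
        taskset -pc "$1" "$pid"
    done
}
```

Running pin_rcu_threads 0 moves them to CPU 0. taskset may report an error for threads that the kernel has bound to a specific CPU; those cannot be moved.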

Open a new sqlplus session and then isolate the ORACLE process on CPU 3 using the command taskset :

Capture 4

As we can see, the previous CPU affinity was CPUs 0 and 1, because we used the “isolcpus” parameter to isolate the other processors from user-space tasks.
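The taskset step can be sketched as follows, assuming the operating system PID of the session's server process is already known (for example from the SPID column of v$process). The helper name pin_to_cpu is only illustrative :

```shell
# Show the current affinity of a process, then restrict it to one CPU.
# Usage: pin_to_cpu <pid> <cpu>
pin_to_cpu() {
    taskset -pc "$1"         # print the current affinity list
    taskset -pc "$2" "$1"    # set the new affinity: taskset -pc <cpu-list> <pid>
}
```

For this test the call would be pin_to_cpu <pid> 3, run as root or as the owner of the process.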

I used Julian Dyke’s CPU test script for testing; you can find it here.

Use the following perf command to count the timer ticks on CPU 3 while the CPU test runs in the sqlplus session :

perf stat -C 3 -e irq_vectors:local_timer_entry sleep 2

Capture 6

So there is clearly a significant reduction in the number of timer ticks per second. The execution time of the test script (after many executions) was 12.31 seconds.

There are two ticks per second :

  • “Timer tick still needs to happen at least once per second to keep the scheduler happy” ref
  • The other tick is caused by kernel parameter “vm.stat_interval” : The time interval between which vm statistics are updated.  The default is 1 second.

If we increase the value of “vm.stat_interval”, for example to 60, we obtain our 1 tick/second :

Capture 20
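A sketch of the change, assuming root privileges. The helper name set_stat_interval is hypothetical, and the setting is lost at reboot unless it is also added to /etc/sysctl.conf :

```shell
# Show the current vm.stat_interval, then set it to the given number of seconds.
# Usage: set_stat_interval <seconds>
set_stat_interval() {
    sysctl vm.stat_interval            # print the current value
    sysctl -w vm.stat_interval="$1"    # apply the new value immediately
}
```

Here the call would be set_stat_interval 60, followed by the same perf stat measurement as before.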

Let’s now isolate the oracle process on CPU 2 (isolated, but not in adaptive-ticks mode) using the command “taskset” and redo the same test :

Capture 7

The number of timer ticks is 1000 interrupts/second and the execution time of the test script is slightly higher at 12.40 seconds, about 0.7% slower than on the adaptive-ticks CPU.

This feature still has some limitations (e.g. the single-runnable-task requirement), but it’s a good step in the right direction.

That’s it 😀
