Assessing impact of Major Page Fault on ORACLE database [Systemtap in action]

“A more severe memory latency is a major page fault. These can occur when the system has to synchronize memory buffers with the disk, swap memory pages belonging to other processes, or undertake any other Input/Output activity to free memory. This occurs when the processor references a virtual memory address that has not had a physical page allocated to it. The reference to an empty page causes the processor to execute a fault, and instructs the kernel code to allocate a page and return, all of which increases latency dramatically.” Chapter 2. Memory allocation

As stated in the previous definition a high number of Major Page Fault can cause a serious degradation in server performance due to the added disk latency to the interrupted program  execution. This can occur in case of high memory utilization or when the parameter swappiness is set to a high value.

https://en.wikipedia.org/wiki/Swappiness

vm.swappiness = 0 The kernel will swap only to avoid an out of memory condition, when free memory will be below vm.min_free_kbytes limit. See the “VM Sysctl documentation”.
vm.swappiness = 1 Kernel version 3.5 and over, as well as kernel version 2.6.32-303 and over: Minimum amount of swapping without disabling it entirely.
vm.swappiness = 10 This value is sometimes recommended to improve performance when sufficient memory exists in a system.
vm.swappiness = 60 The default value.
vm.swappiness = 100 The kernel will swap aggressively.

So how to assess the impact of a Major Page Fault on an Oracle session ?

Test case :

Oracle 12.1.0.2.6
OEL6 Memory : 2GB
Extract from initialization parameter file :
*.compatible=’12.1.0.2.0′
*.db_block_size=8192
*.inmemory_size=309715200
*.pga_aggregate_target=200M
*.sga_target=1500M
*.processes=300

In this test case the SGA is not locked in memory (i am not using HugePage nor the parameter lock_sga)

This is the systemtap script that we will use for tracking Major Page Fault :

#! /usr/bin/env stap

global fault_entry_time, fault_address, fault_access,total_response_time

probe vm.pagefault {
 if (pid() == target() ) {
  fault_entry_time = gettimeofday_us()
  fault_address = address
  fault_access = write_access ? "w" : "r"
 }
}

probe vm.pagefault.return {
 if (pid() == target() ) {
          if (vm_fault_contains(fault_type,VM_FAULT_MAJOR)) {
                e = gettimeofday_us() - fault_entry_time
                total_response_time = total_response_time + e
                printf("Major fault on %x of type %s ,duration(us) :%d Total duration :%d\n",fault_address, fault_access,e,total_response_time)
          }
  }
}

Step 1 :
create a big table (big_table) and load it in the buffer cache.
Launch the systemtap script :
Capture 1

Then run select count(*) from big_tables;

Capture 2

More that 600MB of data was accessed in the buffer cache without triggering physical read nor major page fault.

Step 2 :
Let’s now run a pl/sql program that will consume a lot of memory forcing the paging out of many memory pages.

 declare
      TYPE T_REC_EMP IS RECORD (
       Num number,
       Nom varchar2(4000),
       Job varchar2(4000) );
     -- Table de records --
     TYPE TAB_T_REC_EMP IS TABLE OF T_REC_EMP index by binary_integer ;
      t_rec TAB_T_REC_EMP ; -- variable tableau d'enregistrements
   Begin

    for i in 1..1200000 loop
     t_rec(i).Num := i ;
     t_rec(i).Nom := lpad('Scott',4000,'*') ;
     t_rec(i).job := lpad('GASMAN',4000,'*') ;
	end loop ;
   End;
   /

Shared memory status before :
Capture 5. PNG
Shared memory status after :
Capture 6
vmstat output :

Capture 3

We can see that most of the SGA was paged out.

Step 3 :
Rerun the previous query and observe :

Capture 10

The same query is now taking 22 second to execute ! Using the traditional way v$ views or trace files will only reveal that this session is consuming a lot of CPU power.But what really happened ? this is the output of the systemtap script :

Capture 11

Major Page Fault have consumed 20 Second of execution time !

Output of vmstat and ipcs -um

Capture 14

Capture 12

The pages are now loaded again in memory so a second execution will not trigger a Major Page Fault unless they are paged out again.

We can locate the page using the address indicated in systemtap script output.Example using Tanel Poder “fcha.sql”

Capture 20

The address selected point clearly to a location in the SGA.

I hope that by now you see the power of dynamic tracing tools.

That’s it 😀

One thought on “Assessing impact of Major Page Fault on ORACLE database [Systemtap in action]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s