“A more severe memory latency is a major page fault. These can occur when the system has to synchronize memory buffers with the disk, swap memory pages belonging to other processes, or undertake any other Input/Output activity to free memory. This occurs when the processor references a virtual memory address that has not had a physical page allocated to it. The reference to an empty page causes the processor to execute a fault, and instructs the kernel code to allocate a page and return, all of which increases latency dramatically.” Chapter 2. Memory allocation
As stated in the previous definition a high number of Major Page Fault can cause a serious degradation in server performance due to the added disk latency to the interrupted program execution. This can occur in case of high memory utilization or when the parameter swappiness is set to a high value.
https://en.wikipedia.org/wiki/Swappiness
vm.swappiness = 0 |
The kernel will swap only to avoid an out of memory condition, when free memory will be below vm.min_free_kbytes limit. See the “VM Sysctl documentation”. |
vm.swappiness = 1 |
Kernel version 3.5 and over, as well as kernel version 2.6.32-303 and over: Minimum amount of swapping without disabling it entirely. |
vm.swappiness = 10 |
This value is sometimes recommended to improve performance when sufficient memory exists in a system. |
vm.swappiness = 60 |
The default value. |
vm.swappiness = 100 |
The kernel will swap aggressively. |
So how to assess the impact of a Major Page Fault on an Oracle session ?
Test case :
Oracle 12.1.0.2.6
OEL6 Memory : 2GB
Extract from initialization parameter file :
*.compatible=’12.1.0.2.0′
*.db_block_size=8192
*.inmemory_size=309715200
*.pga_aggregate_target=200M
*.sga_target=1500M
*.processes=300
In this test case the SGA is not locked in memory (i am not using HugePage nor the parameter lock_sga)
This is the systemtap script that we will use for tracking Major Page Fault :
#! /usr/bin/env stap global fault_entry_time, fault_address, fault_access,total_response_time probe vm.pagefault { if (pid() == target() ) { fault_entry_time = gettimeofday_us() fault_address = address fault_access = write_access ? "w" : "r" } } probe vm.pagefault.return { if (pid() == target() ) { if (vm_fault_contains(fault_type,VM_FAULT_MAJOR)) { e = gettimeofday_us() - fault_entry_time total_response_time = total_response_time + e printf("Major fault on %x of type %s ,duration(us) :%d Total duration :%d\n",fault_address, fault_access,e,total_response_time) } } }
Step 1 :
create a big table (big_table) and load it in the buffer cache.
Launch the systemtap script :
Then run select count(*) from big_tables;
More that 600MB of data was accessed in the buffer cache without triggering physical read nor major page fault.
Step 2 :
Let’s now run a pl/sql program that will consume a lot of memory forcing the paging out of many memory pages.
declare TYPE T_REC_EMP IS RECORD ( Num number, Nom varchar2(4000), Job varchar2(4000) ); -- Table de records -- TYPE TAB_T_REC_EMP IS TABLE OF T_REC_EMP index by binary_integer ; t_rec TAB_T_REC_EMP ; -- variable tableau d'enregistrements Begin for i in 1..1200000 loop t_rec(i).Num := i ; t_rec(i).Nom := lpad('Scott',4000,'*') ; t_rec(i).job := lpad('GASMAN',4000,'*') ; end loop ; End; /
Shared memory status before :
Shared memory status after :
vmstat output :
We can see that most of the SGA was paged out.
Step 3 :
Rerun the previous query and observe :
The same query is now taking 22 second to execute ! Using the traditional way v$ views or trace files will only reveal that this session is consuming a lot of CPU power.But what really happened ? this is the output of the systemtap script :
Major Page Fault have consumed 20 Second of execution time !
Output of vmstat and ipcs -um
The pages are now loaded again in memory so a second execution will not trigger a Major Page Fault unless they are paged out again.
We can locate the page using the address indicated in systemtap script output.Example using Tanel Poder “fcha.sql”
The address selected point clearly to a location in the SGA.
I hope that by now you see the power of dynamic tracing tools.
That’s it 😀
[…] Should you be interested by this subject, then do read Hatem Mahmoud post. […]