This is a follow-up to my previous post, Deeper look at CPU utilization : The power of PMU events.
So let’s go back to my previous example, using the General Exploration view of Intel VTune:
One of the highlighted metrics is “DTLB Overhead”. It is an estimate of the performance penalty paid for missing the first-level data TLB (DTLB), which includes hitting in the second-level data TLB (STLB) as well as performing a hardware page walk on an STLB miss.
One of the recommendations is to use the famous large page sizes! So let’s do it!
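To get a feel for why larger pages help here, a rough back-of-the-envelope sketch of "TLB reach" (number of entries times page size) can be useful. The entry count of 64 below is purely an assumption for illustration, not a figure from VTune or from any particular CPU, and real processors often have a different number of entries for each page size:

```python
# Rough sketch: how much memory a TLB can map without missing ("TLB reach").
# ASSUMED_DTLB_ENTRIES is a hypothetical value chosen for illustration only.

KIB = 1024
MIB = 1024 * KIB

ASSUMED_DTLB_ENTRIES = 64


def tlb_reach_bytes(entries, page_size_bytes):
    """Memory covered by a TLB when every entry maps one page."""
    return entries * page_size_bytes


small_page_reach = tlb_reach_bytes(ASSUMED_DTLB_ENTRIES, 4 * KIB)
large_page_reach = tlb_reach_bytes(ASSUMED_DTLB_ENTRIES, 2 * MIB)

print(f"4 KiB pages: {small_page_reach // KIB} KiB reach")
print(f"2 MiB pages: {large_page_reach // MIB} MiB reach")
print(f"ratio: {large_page_reach // small_page_reach}x")
```

With the same number of entries, 2 MiB pages cover 512 times more memory than 4 KiB pages, so a large working set should miss the DTLB far less often.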
After restarting my database instance with large pages this time and rerunning my program, this is what we get:
Using large pages has significantly reduced the “DTLB Overhead” metric, and it is no longer highlighted! We have also slightly improved our CPI from 0.553 to 0.532 and reduced our execution time from 5.37 to 5.20.
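To put those two numbers in perspective, here is a small sketch that turns them into relative improvements. It only uses the before/after values quoted above; the helper name `relative_reduction` is mine:

```python
# Relative improvement computed from the before/after values quoted in the post.

def relative_reduction(before, after):
    """Percentage by which a metric dropped from `before` to `after`."""
    return (before - after) / before * 100.0


cpi_gain = relative_reduction(0.553, 0.532)
time_gain = relative_reduction(5.37, 5.20)

print(f"CPI reduced by {cpi_gain:.1f}%")              # CPI reduced by 3.8%
print(f"Execution time reduced by {time_gain:.1f}%")  # Execution time reduced by 3.2%
```

So both metrics improve by roughly 3 to 4%, which fits the "slightly improved" description.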
This was just a very simple example of using TMAM (Top-down Microarchitecture Analysis Method)!
That’s it 😀