Memory-bound applications are sensitive to memory latency and bandwidth, which is why it is important to measure and monitor them. Even though these two concepts are often described independently, they are inherently interrelated.
According to Bruce Jacob in “The memory system: you can’t avoid it, you can’t ignore it, you can’t fake it”, the bandwidth vs. latency response curve for a system has three regions:
- Constant region: The latency response is fairly constant for the first 40% of the sustained bandwidth.
- Linear region: Between 40% and 80% of the sustained bandwidth, the latency response increases almost linearly with the bandwidth demand of the system, because of the contention overhead caused by numerous memory requests.
- Exponential region: Between 80% and 100% of the sustained bandwidth, the memory latency is dominated by the contention latency, which can be twice the idle latency or more.

Note that the maximum sustained bandwidth is itself only 65% to 75% of the theoretical maximum bandwidth.
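To make the three regions concrete, here is a minimal Python sketch that classifies a bandwidth demand using the thresholds quoted above (40% and 80% of the sustained bandwidth, with sustained bandwidth at 65% to 75% of the theoretical peak). The function names and the sample figures are my own, chosen for illustration; they do not come from the book or from any measurement tool, and real curves will not switch regions at such sharp boundaries.

```python
def sustained_bandwidth_range(theoretical_gbs):
    """Estimate the sustained bandwidth as 65% to 75% of the theoretical peak.

    Returns a (low, high) tuple in the same unit as the input (e.g. GB/s).
    """
    return 0.65 * theoretical_gbs, 0.75 * theoretical_gbs


def latency_region(demand_gbs, sustained_gbs):
    """Classify a bandwidth demand into one of the three latency regions.

    The 40% and 80% thresholds are the rule-of-thumb values quoted in the
    text, not measured values for a specific machine.
    """
    if demand_gbs < 0 or sustained_gbs <= 0:
        raise ValueError("demand must be >= 0 and sustained bandwidth > 0")
    utilization = demand_gbs / sustained_gbs
    if utilization <= 0.40:
        return "constant"      # latency stays close to the idle latency
    if utilization <= 0.80:
        return "linear"        # latency grows roughly linearly with demand
    return "exponential"       # contention latency dominates


if __name__ == "__main__":
    # Hypothetical system with a 100 GB/s theoretical peak.
    low, high = sustained_bandwidth_range(100.0)
    for demand in (10.0, 40.0, 60.0):
        print(demand, "GB/s ->", latency_region(demand, low))
```

Using the low end of the sustained range is the conservative choice: the same demand lands in a worse region sooner, which is usually what you want when deciding whether a workload is at risk of contention.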
This is a follow-up to my previous posts in the Deeper look at CPU utilization series:
- Deeper look at CPU utilization : The power of PMU events
- Deeper look at CPU utilization : TMAM Example
Following a comment from Kevin Closson, here is the hierarchical execution cycles breakdown based on the TMAM method, before and after enabling HUGEPAGES, when running SLOB to test logical I/O.
This will let us identify our micro-architectural bottlenecks and correctly characterize the SLOB workloads!
This is a follow-up to my previous post, Deeper look at CPU utilization : The power of PMU events.
So let’s go back to my previous example using the General Exploration View of Intel VTune :
Suppose we have a CPU-bound application, query, or program. How do we know what the CPU is really doing? What is the CPU bottleneck? How much of the time is the CPU stalled, and waiting for which resource? How do we characterize our workloads?
Answering these questions can help direct performance tuning!
Let’s take a sample program to analyze :
This is my second post on the theme of extending our capabilities to trace and profile PL/SQL code. This time it is motivated by a comment from Luca Canali on my previous post :
So based on my previous work on geeky PL/SQL tracer let’s see how we can obtain a geeky PL/SQL on-CPU Flame Graph !
This blog post is about how to extend our capabilities to trace and profile PL/SQL code. It’s primarily motivated by a few tweets from Franck Pachot, and of course because it’s FUN!
So in the first part of this series we are going to answer these questions: Can we map those underlying functions to the source PL/SQL object and line number? Can we obtain a full trace? Of course yes, otherwise there would be no blog post :p