Memory-bound applications are sensitive to memory latency and bandwidth, which is why it is important to measure and monitor both. Even though these two concepts are often described independently, they are inherently interrelated.
According to Bruce Jacob in "The memory system: you can't avoid it, you can't ignore it, you can't fake it", the bandwidth vs. latency response curve for a system has three regions.
- Constant region: the latency response is fairly constant for the first 40% of the sustained bandwidth.
- Linear region: between 40% and 80% of the sustained bandwidth, the latency response increases almost linearly with the bandwidth demand of the system, due to contention overhead from the numerous memory requests.
- Exponential region: between 80% and 100% of the sustained bandwidth, the memory latency is dominated by the contention latency, which can be as much as twice the idle latency or more.
- Maximum sustained bandwidth: 65% to 75% of the theoretical maximum bandwidth (see the sketch after this list).
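To make these boundaries concrete, here is a minimal Python sketch (my own illustration, not from Bruce Jacob's text) that classifies measured (bandwidth, latency) samples into the three regions, given an estimate of the maximum sustained bandwidth. The sample numbers are made up for the example.

```python
# Minimal sketch: classify measured samples against the three regions described
# above. The region boundaries (40% / 80% of sustained bandwidth) come from the
# text; the sample data below is purely illustrative.

def classify_region(bandwidth_mbps, sustained_bw_mbps):
    """Return the latency-response region for a given bandwidth demand."""
    ratio = bandwidth_mbps / sustained_bw_mbps
    if ratio <= 0.40:
        return "constant"      # latency stays close to the idle latency
    elif ratio <= 0.80:
        return "linear"        # latency grows roughly linearly with demand
    else:
        return "exponential"   # contention latency dominates

if __name__ == "__main__":
    sustained_bw = 25000.0  # MB/s, hypothetical sustained bandwidth of the system
    for bw, lat in [(5000, 75), (15000, 95), (23000, 180)]:  # made-up samples
        print(f"{bw:>6} MB/s, {lat:>4} ns -> {classify_region(bw, sustained_bw)} region")
```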
I am using the same system configuration as in my previous post Deeper look at CPU utilization: The power of PMU events.
Test env: OEL 7.0 / kernel 3.10 / Intel i5-6500 / 2*DDR3-1600 (4GB*2)
Matrix of idle memory latencies for requests originating from each of the sockets and addressed to each of the available sockets (I have only one socket in my system):
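For anyone who wants to reproduce this, the matrix above is the kind of output produced by Intel MLC's --latency_matrix option. Below is a small sketch that simply wraps the call from Python; the location of the mlc binary is an assumption, and MLC typically needs to run as root.

```python
# Sketch: collect the idle latency matrix with Intel MLC.
# Assumes the "mlc" binary is on PATH; MLC typically requires root privileges.
import subprocess

def idle_latency_matrix(mlc_path="mlc"):
    """Run 'mlc --latency_matrix' and return its raw text output."""
    result = subprocess.run([mlc_path, "--latency_matrix"],
                            capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    print(idle_latency_matrix())
```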
Latencies at different b/w points:
This graph clearly visualizes how memory latency is affected by increasing memory bandwidth consumption.
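The data behind such a graph comes from MLC's --loaded_latency mode, which reports a latency and a bandwidth value for each injection delay. The sketch below parses that output and plots latency against bandwidth; the exact output layout can differ between MLC versions, so the parsing here is an assumption to adapt as needed.

```python
# Sketch: turn 'mlc --loaded_latency' output into a latency-vs-bandwidth plot.
# Assumes data lines of the form "<inject_delay>  <latency_ns>  <bandwidth_MB/s>";
# adjust the regex if your MLC version formats its output differently.
import re
import subprocess
import matplotlib.pyplot as plt

def loaded_latency_points(mlc_path="mlc"):
    out = subprocess.run([mlc_path, "--loaded_latency"],
                         capture_output=True, text=True, check=True).stdout
    points = []
    for line in out.splitlines():
        m = re.match(r"\s*\d+\s+([\d.]+)\s+([\d.]+)\s*$", line)
        if m:
            latency_ns, bandwidth = float(m.group(1)), float(m.group(2))
            points.append((bandwidth, latency_ns))
    return sorted(points)

if __name__ == "__main__":
    pts = loaded_latency_points()
    plt.plot([bw for bw, _ in pts], [lat for _, lat in pts], marker="o")
    plt.xlabel("Bandwidth (MB/s)")
    plt.ylabel("Latency (ns)")
    plt.title("Loaded latency vs. bandwidth")
    plt.show()
```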
This was the result collected for a single-socket system. Sadly, I don't have a multi-socket system with Non-Uniform Memory Access (NUMA) for testing, so I am going to use the results obtained by Luca Canali here, using MLC on a dual-socket system with an Intel Xeon CPU E5-2630 v4 (16 DIMMs of 32GB DDR4 each).
Note 1: To understand how the latency vs. b/w data is collected, please take a look at the MLC readme.
That’s it 😀
Ref:
- Bruce Jacob, "The Memory System: You Can't Avoid It, You Can't Ignore It, You Can't Fake It", Synthesis Lectures on Computer Architecture.