Hardware prefetching can reduce the effective memory latency for data and instruction accesses, which improves performance by reducing exposure to cache misses, but it can also degrade performance in some cases. (For more information see here.)
My current processor, an Intel Skylake i5-6500, supports four types of hardware prefetchers for data. Two are associated with the L1 data cache (also known as the DCU) and two with the L2 cache. These hardware prefetchers can be enabled or disabled through a Model Specific Register (MSR).
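To make the MSR part more concrete, here is a minimal Python sketch that decodes the prefetcher control bits. It assumes the layout Intel has published for MSR 0x1A4 on recent Core processors (bits 0 to 3, where a set bit disables the prefetcher); the function names and the prefetcher labels are mine, not from any tool. The `read_msr` helper additionally assumes Linux with the `msr` kernel module loaded and root privileges.

```python
import os
import struct

# Assumed layout of MSR 0x1A4: a bit set to 1 means the prefetcher is DISABLED.
MSR_PREFETCH_CONTROL = 0x1A4
PREFETCHER_BITS = {
    "L2 hardware prefetcher": 0,
    "L2 adjacent cache line prefetcher": 1,
    "DCU (L1-D next-line) prefetcher": 2,
    "DCU IP (L1-D stride) prefetcher": 3,
}


def decode_prefetchers(msr_value):
    """Map each prefetcher name to True (enabled) or False (disabled)."""
    return {
        name: ((msr_value >> bit) & 1) == 0
        for name, bit in PREFETCHER_BITS.items()
    }


def read_msr(cpu, register=MSR_PREFETCH_CONTROL):
    """Read a 64-bit MSR through /dev/cpu/<cpu>/msr (needs root and `modprobe msr`)."""
    fd = os.open("/dev/cpu/%d/msr" % cpu, os.O_RDONLY)
    try:
        # The file offset selects the register; MSRs are 8 bytes, little-endian.
        return struct.unpack("<Q", os.pread(fd, 8, register))[0]
    finally:
        os.close(fd)
```

For example, `decode_prefetchers(read_msr(0))` would report the state of the four prefetchers on core 0, and a raw value of `0xF` would mean all four are disabled.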

Let’s test how effective they are using SLOB!
To control the hardware prefetching feature, we can use the likwid tool.
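As a rough sketch of how this could be scripted, the Python helper below builds `likwid-features` command lines. It assumes the `-c` (CPU list), `-l` (list), `-e` (enable) and `-d` (disable) options and the four feature names shown; check them against `likwid-features -l` on your own machine, since names can differ between likwid versions. The helper names are hypothetical.

```python
import subprocess

# Feature names assumed to match the output of `likwid-features -l`.
PREFETCHER_FEATURES = (
    "HW_PREFETCHER",   # L2 hardware prefetcher
    "CL_PREFETCHER",   # L2 adjacent cache line prefetcher
    "DCU_PREFETCHER",  # L1-D next-line prefetcher
    "IP_PREFETCHER",   # L1-D IP-based prefetcher
)

ACTION_FLAGS = {"list": "-l", "enable": "-e", "disable": "-d"}


def likwid_features_cmd(cpus, action, features=()):
    """Build the argument list for a likwid-features call."""
    if action not in ACTION_FLAGS:
        raise ValueError("unknown action: %r" % (action,))
    cmd = ["likwid-features", "-c", ",".join(str(c) for c in cpus),
           ACTION_FLAGS[action]]
    if action != "list":
        unknown = [f for f in features if f not in PREFETCHER_FEATURES]
        if not features or unknown:
            raise ValueError("expected known prefetcher features, got %r" % (list(features),))
        cmd.append(",".join(features))
    return cmd


def run(cmd):
    """Run the command (usually needs root) and return its standard output."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
```

For instance, `run(likwid_features_cmd([0, 1, 2, 3], "disable", PREFETCHER_FEATURES))` would attempt to switch off all four prefetchers on the four cores of the i5-6500 before a test run, and the same call with `"enable"` would restore them.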
Here is an extract from the SLOB config used for testing LIO (logical I/O):
And here is the result (there is no apparent benefit from disabling them):
For an example of a workload that does benefit from disabling some hardware prefetchers, see this document on Apache Spark:
“Disabling next-line L1-D and Adjacent Cache line L2 prefetchers can improve the performance by up-to 14% and 4% respectively.”
That’s it 😀