Tanel Poder has just shared an awesome tool, Linux Process Snapper 🙂 As he describes it, it is “a Linux /proc profiler that works by sampling Linux task states and other metrics from /proc/PID/task/TID pseudofiles”. What I like about the tool is its ease of use and the fact that it allows off-CPU analysis (for more info about off-CPU analysis, please take a look at Brendan Gregg's blog).
Suppose that our system is hit by a sudden slowdown and, after a quick check, we find many processes in “uninterruptible sleep” (in the middle of doing work). We may be tempted to conclude that those sleeps are caused by disk access, but that is not necessarily true, even if it is usually the case. As Brendan Gregg explained in his blog post, “TASK_UNINTERRUPTIBLE” matches more things today, such as waiting on uninterruptible locks.
Using pSnapper we can display the kernel function in which the process is sleeping, using wchan, which can give us some hints. (Using task_struct->in_iowait would tell us whether the sleep is related to I/O, but it seems that it is not exposed through the /proc pseudofiles.)
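To make the idea concrete, here is a minimal sketch of the same kind of sampling done by hand. It is not how pSnapper itself is implemented; it only illustrates the principle. It assumes a Linux system where /proc/PID/task/TID/stat and /proc/PID/task/TID/wchan are readable (on some kernels wchan shows 0 for unprivileged users), and the function names parse_state, sample_d_state and profile are made up for this example.

```python
import collections
import glob
import time


def parse_state(stat_line):
    """Return the one-letter task state from a /proc/PID/task/TID/stat line."""
    # The comm field is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' and take the next field.
    return stat_line.rsplit(")", 1)[1].split()[0]


def sample_d_state(counts):
    """Take one sample: count threads in state D, keyed by their wchan."""
    for task_dir in glob.glob("/proc/[0-9]*/task/[0-9]*"):
        try:
            with open(task_dir + "/stat") as f:
                state = parse_state(f.read())
            if state != "D":
                continue
            with open(task_dir + "/wchan") as f:
                wchan = f.read().strip() or "-"
        except (OSError, IndexError):
            # The thread exited (or is not readable) between listing and reading.
            continue
        counts[wchan] += 1


def profile(samples=20, interval=0.1):
    """Sample repeatedly and return a Counter of wchan -> number of hits."""
    counts = collections.Counter()
    for _ in range(samples):
        sample_d_state(counts)
        time.sleep(interval)
    return counts


if __name__ == "__main__":
    for wchan, hits in profile().most_common(10):
        print(f"{hits:6d}  {wchan}")
```

If most of the hits land on a single function, that function is a good starting point for the investigation, which is exactly the kind of hint pSnapper gives without any scripting.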
For example, in this great investigation done by Nikolay Savvinov, many processes were in uninterruptible sleep, waiting in the function “wait_on_page_bit” (“waiting for a page to unlock or stop being under writeback etc”; for more info check the ref). pSnapper would have been useful in this case!
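To figure out which code paths lead to “wait_on_page_bit”, including the user-space and kernel-space stacks, we could use the following perf command (the probe has to exist first; it can be created with perf probe --add wait_on_page_bit):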
perf record -g -a -e probe:wait_on_page_bit -- sleep 10
And then draw a FlameGraph.
In this particular case the problem was related to automatic NUMA balancing, as there were many threads waiting for pages that were in the process of being migrated.
That’s it and thank you Tanel for this great tool 🙂
By the way, pSnapper will also allow off-CPU/on-CPU kernel stack sampling!