Linux PSI
From a performance perspective, one of the simplest questions is: how well is your system handling its current load? You can try to answer it by looking at many performance indicators, such as CPU usage, load average, I/O queue size or memory usage. Most of them lead you to the conclusion that it depends. More generally, we would like to measure saturation, one of the four golden signals. How can we do that on modern Linux? Since kernel version 4.20 there is a mechanism called PSI (Pressure Stall Information), which reports how much your processes are being stalled on selected resources:
- cpu
- memory
- io
This information is exposed in procfs (/proc/pressure) and in cgroups v2, and is expressed in the following format:
$ cat /proc/pressure/memory
some avg10=0.00 avg60=0.00 avg300=0.00 total=0
full avg10=0.00 avg60=0.00 avg300=0.00 total=0
- some: at least some processes are stalled
- full: all non-idle processes are stalled
- avgX: share of time spent stalled, expressed as a percentage and averaged over the last X seconds
- total: total time spent stalled, in microseconds (us)
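The format above is easy to parse. Below is a minimal sketch in Python; the function name parse_pressure is my own choice for illustration and is not part of any library. It assumes every line looks like the sample output shown earlier.

```python
from typing import Dict


def parse_pressure(text: str) -> Dict[str, Dict[str, float]]:
    """Parse the contents of a PSI file into {"some": {...}, "full": {...}}."""
    result: Dict[str, Dict[str, float]] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        kind, pairs = fields[0], fields[1:]
        values: Dict[str, float] = {}
        for pair in pairs:
            key, _, raw = pair.partition("=")
            # avgX values are percentages, total is a counter in microseconds
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


sample = (
    "some avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)
print(parse_pressure(sample))
```

On a real system you would pass in the contents of /proc/pressure/memory (or cpu, io) instead of the sample string.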
On the kernel used here, cpu pressure has only the some line: as long as any task is runnable the CPU is executing one of them, so all non-idle tasks cannot be stalled on CPU at the same time. You can also monitor pressure by registering a threshold and being notified when this threshold is exceeded.
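The threshold mechanism works by writing a trigger to a pressure file and then polling it. Here is a rough sketch in Python that follows the C example from the kernel PSI documentation. The helper names build_trigger and wait_for_pressure are mine, the example values (150 ms of stall within a 1 s window) come from the kernel documentation, and depending on the kernel version registering a trigger may require root privileges.

```python
import os
import select


def build_trigger(kind: str, stall_us: int, window_us: int) -> bytes:
    """Build a trigger string: '<some|full> <stall us> <window us>'."""
    if kind not in ("some", "full"):
        raise ValueError("kind must be 'some' or 'full'")
    # The trailing NUL byte mirrors the kernel documentation example,
    # which writes strlen(trigger) + 1 bytes.
    return f"{kind} {stall_us} {window_us}".encode() + b"\0"


def wait_for_pressure(path: str, trigger: bytes, timeout_ms: int = 10_000) -> bool:
    """Register the trigger and block until it fires or the timeout expires.

    Returns True if the threshold was exceeded, False on timeout.
    """
    fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
    try:
        os.write(fd, trigger)  # registers the trigger on this descriptor
        poller = select.poll()
        poller.register(fd, select.POLLPRI)
        for _, event in poller.poll(timeout_ms):
            if event & select.POLLERR:
                raise OSError("pressure file reported an error")
            if event & select.POLLPRI:
                return True
        return False
    finally:
        os.close(fd)  # closing the descriptor removes the trigger


# Example (not executed here): wait up to 10 s for memory pressure.
# fired = wait_for_pressure("/proc/pressure/memory",
#                           build_trigger("some", 150_000, 1_000_000))
```

The trigger lives only as long as the file descriptor is open, so a long-running monitor would keep the descriptor and call poll in a loop instead of reopening the file each time.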
More detailed information:
- How to Monitor Server via PSI (Pressure Stall Information) and cgroupv2?
- PSI - Pressure Stall Information
Let’s test this out. I will start a container, generate some load that the system can handle, and then add extra workload to see how the pressure values change. Docker Engine supports cgroup v2 since version 20.10, but to make it work the host has to be switched to cgroup v2 first. To check whether Docker Engine is using cgroup v2:
# docker info | grep -i "cgroup version"
Cgroup Version: 2
then generate some load:
# nproc
1
# docker run --rm -ti ubuntu:latest bash
root@cc8d07a3da81:/# dd if=/dev/zero of=/dev/null
so one process generates 100% CPU usage and we have only one CPU in the system. Let’s see what the PSI metrics look like:
# mount | grep -i cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
cc8d07a3da81 ubuntu:latest "bash" 2 minutes ago Up 2 minutes compassionate_tharp
# cat /sys/fs/cgroup/system.slice/docker-cc8d07a3da81a308d842e9dedd0656cf4548f558642ffb2cdb5881742e5b1ec0.scope/cpu.pressure
some avg10=0.00 avg60=0.00 avg300=0.00 total=15190
almost no pressure, so I run another process and check again:
# cat /sys/fs/cgroup/system.slice/docker-cc8d07a3da81a308d842e9dedd0656cf4548f558642ffb2cdb5881742e5b1ec0.scope/cpu.pressure
some avg10=98.75 avg60=52.24 avg300=14.14 total=46233007
now we have almost 100% CPU stall over the last 10 seconds: only one process can run at a time and the other one has to wait, which is clearly too much for this system. PSI metrics can be collected by node_exporter; how to collect them from the container perspective I will try to figure out in the next blog post.
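The avg10 value can be roughly cross-checked from the total counter: sample total twice and divide the difference by the elapsed time. The sketch below assumes the systemd cgroup layout seen in this post (system.slice/docker-<full container id>.scope); other cgroup drivers may use a different path, and all function names are hypothetical helpers of mine.

```python
import time


def container_pressure_path(container_id: str, resource: str = "cpu") -> str:
    """Path of a container's pressure file, assuming the layout used in this post."""
    return (
        "/sys/fs/cgroup/system.slice/"
        f"docker-{container_id}.scope/{resource}.pressure"
    )


def read_some_total(path: str) -> int:
    """Return the 'total' counter (microseconds) from the 'some' line."""
    with open(path) as f:
        for line in f:
            if line.startswith("some "):
                for field in line.split()[1:]:
                    key, _, value = field.partition("=")
                    if key == "total":
                        return int(value)
    raise ValueError(f"no 'some' line with a total field in {path}")


def stall_percent(total_before_us: int, total_after_us: int, interval_s: float) -> float:
    """Share of the interval spent stalled, as a percentage."""
    return (total_after_us - total_before_us) / (interval_s * 1_000_000) * 100.0


def sample_stall(path: str, interval_s: float = 10.0) -> float:
    """Measure the stall percentage over interval_s seconds."""
    before = read_some_total(path)
    time.sleep(interval_s)
    after = read_some_total(path)
    return stall_percent(before, after, interval_s)


# Example (not executed here), with the full container ID as reported by
# docker inspect --format '{{.Id}}' <name>:
# print(sample_stall(container_pressure_path("<full container id>")))
```

Because avg10 is a running average maintained by the kernel, the number computed this way will be close to it but not identical.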