Linux PSI monitoring

Sun 19 September 2021 by admin

As I mentioned before, Linux PSI metrics are exposed in the cgroup v2 hierarchy. From the node perspective these metrics are gathered by tools such as node_exporter, but how can we collect them from the container perspective? As far as I could find, there is no such tool yet. In the Kubernetes world there is cAdvisor, which provides container metrics and is also integrated into the kubelet. cAdvisor itself supports cgroup v2, but it doesn't expose rich metrics from it. I suppose that sooner or later these metrics will appear; for now, cgroup v2 adoption in Kubernetes is at an early stage.

Kubernetes 1.22, the latest version, brings alpha support for cgroup v2 memory settings (the Memory QoS feature), and one of the significant changes is "true" memory allocation. Before this change, when we defined a memory request in a pod manifest, the kubelet didn't set any corresponding value in the cgroup v1 memory controller tree; it only counted how much memory had been requested in total. In my opinion, with richer settings in the memory controller, the kernel OOM killer is better informed about which process needs to be killed. Moreover, the cgroup v2 memory controller makes OOM handling more container-aware through memory.oom.group: when it is enabled and the OOM killer kills an overreaching process, it also kills the other processes in the container. One more metric worth collecting from the cgroup v2 tree is memory.events, which gives better insight into memory pressure.
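Until a ready-made tool exposes these values, they can be read straight from the container's cgroup v2 directory. Below is a minimal sketch in Python, not a finished exporter. The function names (parse_pressure, parse_events, read_container_metrics) are my own, and the cgroup directory passed in is an assumption: its location depends on the cgroup driver and container runtime. The parsers rely on the kernel's file formats: "some"/"full" lines with avg10, avg60, avg300 and total in the *.pressure files, and "name value" lines in memory.events.

```python
from pathlib import Path


def parse_pressure(text):
    """Parse a PSI file (e.g. memory.pressure) into {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        values = {}
        for pair in pairs:
            key, _, value = pair.partition("=")
            # "total" is a stall-time counter in microseconds;
            # the avg* fields are percentages over 10/60/300 seconds.
            values[key] = int(value) if key == "total" else float(value)
        result[kind] = values
    return result


def parse_events(text):
    """Parse a flat-keyed file such as memory.events into {name: counter}."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            result[fields[0]] = int(fields[1])
    return result


def read_container_metrics(cgroup_dir):
    """Collect PSI and memory.events data from one cgroup v2 directory.

    cgroup_dir is assumed to be the container's cgroup directory; files that
    are missing or unreadable (e.g. PSI disabled in the kernel) are skipped.
    """
    base = Path(cgroup_dir)
    metrics = {}
    for resource in ("cpu", "memory", "io"):
        name = f"{resource}.pressure"
        try:
            metrics[name] = parse_pressure((base / name).read_text())
        except OSError:
            continue
    try:
        metrics["memory.events"] = parse_events((base / "memory.events").read_text())
    except OSError:
        pass
    return metrics
```

Calling read_container_metrics with a container's cgroup directory returns a plain dictionary, which could then be turned into Prometheus-style gauges and counters by whatever exporter library you prefer.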

