So I guess she is talking about this case: 1) something like systemd starts a pr...

ars · on Oct 28, 2014

Is systemd going to become like the NSA leak? Will every article, no matter what, somehow become about it?

This has nothing to do with systemd.

And systemd's init doesn't die anyway: all the extra stuff runs in separate processes, and process 1 is indeed very simple.

chubot · on Oct 28, 2014

systemd is the most notable process manager that has cgroups support, and the article is talking about process managers that "fail".

It doesn't matter if it's not PID 1 that fails. The article is saying: if the process that is supposed to receive the OOM notification and do the killing itself fails, you will observe symptoms that are very hard to diagnose.

PID 1 dying is of course really bad, but if other processes in the init system die, your system can still be hosed, as pointed out here.

ars · on Oct 28, 2014

> systemd is the most notable process manager that has cgroups support

So? Not everything is about systemd.

systemd does not disable the OOM killer, so this has nothing to do with systemd.

You can create cgroups yourself, you know; you don't need systemd for that.
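
As a rough sketch of what doing it by hand looks like, assuming the cgroup v1 memory controller is mounted at /sys/fs/cgroup/memory (this varies by distro) and you are root: a cgroup is just a directory, and its limit is a file inside it. The helper names and the default root below are made up for illustration.

```python
import os

# Assumed mount point of the cgroup v1 memory controller; adjust for your system.
CGROUP_V1_MEMORY_ROOT = "/sys/fs/cgroup/memory"


def create_memory_cgroup(name, limit_bytes, root=CGROUP_V1_MEMORY_ROOT):
    """Create a memory cgroup by hand and set its hard limit (hypothetical helper)."""
    path = os.path.join(root, name)
    # On a real cgroupfs, creating the directory makes the kernel
    # populate it with the control files.
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "memory.limit_in_bytes"), "w") as f:
        f.write(str(limit_bytes))
    return path


def add_process(cgroup_path, pid):
    """Move a process into the cgroup by writing its PID to cgroup.procs."""
    with open(os.path.join(cgroup_path, "cgroup.procs"), "w") as f:
        f.write(str(pid))
```

Something like `add_process(create_memory_cgroup("demo", 64 * 1024 * 1024), os.getpid())` would then put the current process under a 64 MiB limit. No systemd involved.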

glandium · on Oct 29, 2014

> So? Not everything is about systemd.

It's time for an equivalent to Godwin's law for systemd.

pdw · on Oct 28, 2014

I don't think the standard Linux kernel supports user-space OOM handling. That's a patch set that Google has been pushing for a few years, but that hasn't been accepted yet (as far as I know). So I'm guessing she's talking about Google's server infrastructure.

DSMan195276 · on Oct 28, 2014

My understanding is that the program running in the cgroup disables the OOM killer and handles OOM itself, so only whatever is inside the cgroup is frozen (presumably not systemd). Most programs have no need to disable the OOM killer, so it's unlikely they'd run into this problem (they'd just get killed off instead of hanging things).

IgorPartola · on Oct 28, 2014

I don't think that's how it works. A process inside a cgroup has consumed all of the cgroup's RAM. Outside that cgroup you run `ps aux`, which lists all processes, including those inside the container, and it hangs forever. Unless I am misunderstanding it and each cgroup gets its own /proc filesystem.

epochwolf · on Oct 28, 2014

This is only an issue when you disable the OOM killer. If your OOM killer replacement dies, then you have a problem, and you shouldn't have disabled it in the first place.

The kernel isn't going to protect you from turning off critical bits.
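
For reference, with the cgroup v1 memory controller the switch in question is the cgroup's memory.oom_control file: writing 1 to it disables the kernel OOM killer for that cgroup, and reading it reports `oom_kill_disable` and `under_oom`. Here is a minimal sketch for spotting a cgroup that is waiting on a userspace handler, assuming that file format; the helper names are hypothetical, and the check is only a heuristic.

```python
import os


def parse_oom_control(text):
    """Parse the 'key value' lines of a cgroup v1 memory.oom_control file."""
    status = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            status[parts[0]] = int(parts[1])
    return status


def waiting_on_userspace(status):
    """True if the kernel OOM killer is off and the cgroup is under OOM."""
    return status.get("oom_kill_disable") == 1 and status.get("under_oom") == 1


def read_oom_status(cgroup_path):
    """Read and parse memory.oom_control for one cgroup (assumes v1 layout)."""
    with open(os.path.join(cgroup_path, "memory.oom_control")) as f:
        return parse_oom_control(f.read())
```

If `waiting_on_userspace(read_oom_status(path))` stays true and nothing is acting on it, the replacement handler is probably the thing that died.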

IgorPartola · on Oct 28, 2014

Yes, I read the article. My point is that a container that ran out of RAM and is running a buggy process manager affects the entire system, not just the container.