linux_wiki:oom_killer

OOM Killer

General Information

The Linux OOM (Out Of Memory) Killer is a kernel function that automatically kills processes when the system runs critically low on memory.

Checklist

  • Distro(s): Any (except for EL based kernel information at the end)

The OOM Killer calculates how “bad” each process is in order to determine which one to kill.

If invoked, the goal is to kill processes so that:

  1. we lose the minimum amount of work done
  2. we recover a large amount of memory
  3. we don't kill anything innocent of eating tons of memory
  4. we want to kill the minimum amount of processes (one)
  5. we try to kill the process the user expects us to kill

In practice, the selection heuristic applies these rules:

  • Processes that have the PF_SWAPOFF flag set will be killed first
  • Processes which fork a lot of child processes are next in line
  • Kill off niced processes, since they are typically less important
  • Superuser processes are usually more important, so try to avoid killing those
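
The result of this scoring can be inspected per process through /proc/<pid>/oom_score. The following is a minimal sketch, not an official tool: the function name list_oom_scores and its optional proc-root argument are assumptions made for illustration.

```shell
#!/bin/sh
# Print "score pid name" for every process, highest OOM score first.
# The proc root is a parameter (default /proc) so the function can be
# pointed at another directory; that parameter is an assumption of this sketch.
list_oom_scores() {
    proc_root="${1:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        # Processes can exit while we iterate, so skip unreadable entries.
        [ -r "$dir/oom_score" ] || continue
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || name="?"
        printf '%s %s %s\n' "$score" "${dir##*/}" "$name"
    done | sort -k1,1nr
}

# Usage: list_oom_scores | head -n 5
```

The processes at the top of the output are the most likely candidates if the OOM Killer is invoked.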

Besides an application/process dying, evidence of the OOM Killer running can be found in the kernel log file.

Example:

grep -i "out of memory" /var/log/kernel
Jan 15 10:05:32 dbserver01 kernel: Out of memory: Kill process 39018 (mysqld) score 441 or sacrifice child
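
To summarize many such log entries, the process ID, name, and score can be pulled out of each line. This is a sketch that assumes the message format shown in the example above; other kernel versions may word the message differently, and the function name parse_oom_lines is hypothetical.

```shell
#!/bin/sh
# Read kernel log lines on standard input and print "pid name score"
# for every "Out of memory: Kill process" message found.
# Lines that do not match the expected format are silently skipped.
parse_oom_lines() {
    sed -n 's/.*Out of memory: Kill process \([0-9][0-9]*\) (\(.*\)) score \([0-9][0-9]*\).*/\1 \2 \3/p'
}

# Usage: parse_oom_lines < /var/log/kernel
```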

Normally, when the OOM Killer is invoked, it is because an application has been poorly coded or configured to request far more memory than it actually needs. When that application then misbehaves and starts to crash or fork, it consumes memory very quickly, and the OOM Killer steps in and starts killing processes.

The OS itself can also be misconfigured.

There are two settings that can be tuned to change the system behavior surrounding memory and swapping. Depending upon how they are adjusted, this can make it less or more likely for OOM to be called.

The settings are both in the virtual file system, /proc:

  • overcommit_memory: /proc/sys/vm/overcommit_memory
  • swappiness: /proc/sys/vm/swappiness

The overcommit_memory setting allows applications to reserve (but not use) more total memory than the system has available. This is similar to memory overcommitment in virtual machines.

This setting should stay at the default (0) unless you really know what you are doing and it's an edge case.

Possible settings are 0, 1, or 2:

  • 0 - Heuristic overcommit handling. Obvious overcommits of address space are refused. Used for a typical system. It ensures a seriously wild allocation fails while allowing overcommit to reduce swap usage. root is allowed to allocate slightly more memory in this mode. This is the default.
  • 1 - Always overcommit. Appropriate for some scientific applications.
  • 2 - Don't overcommit. The total address space commit for the system is not permitted to exceed swap plus a configurable percentage (vm.overcommit_ratio, default is 50) of physical RAM. Depending on the percentage you use, in most situations this means a process will not be killed while attempting to use already-allocated memory but will receive errors on memory allocation as appropriate.
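
As a worked example of the limit described for mode 2, the sketch below computes swap plus a percentage of RAM. It is a simplification: the kernel's own accounting includes further adjustments (huge pages, for example), so the authoritative figure is the CommitLimit line in /proc/meminfo. The function name commit_limit_kb is hypothetical.

```shell
#!/bin/sh
# Approximate commit limit in kB for vm.overcommit_memory=2:
#   limit = swap + ram * ratio / 100
# Arguments: RAM in kB, swap in kB, ratio in percent (defaults to 50).
commit_limit_kb() {
    ram_kb=$1
    swap_kb=$2
    ratio=${3:-50}
    echo $(( swap_kb + ram_kb * ratio / 100 ))
}

# Example: 16 GiB RAM, 4 GiB swap, default ratio of 50
# commit_limit_kb 16777216 4194304    # prints 12582912 (12 GiB)
```

On a running system, `grep -E '^(CommitLimit|Committed_AS)' /proc/meminfo` shows the limit the kernel actually uses and how much address space is currently committed.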

Check the setting:

cat /proc/sys/vm/overcommit_memory
0

To change the setting persistently (for example, to mode 2): edit /etc/sysctl.conf as root, add the line shown, save, and reload the configuration with sysctl -p:

vim /etc/sysctl.conf
vm.overcommit_memory=2
:wq
sysctl -p
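
For scripted changes, the manual edit can be replaced by a small helper that is safe to run repeatedly. This is only a sketch: the function name set_sysctl_conf is hypothetical, it must be run as root against /etc/sysctl.conf, and `sysctl -p` is still needed afterwards to apply the change.

```shell
#!/bin/sh
# Set key=value in a sysctl-style config file.
# Any existing uncommented line for the key is dropped and the new
# setting is appended, so repeated runs do not create duplicates.
set_sysctl_conf() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    # Escape dots so "vm.swappiness" is matched literally by grep.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    tmp=$(mktemp) || return 1
    # grep exits non-zero when nothing is left to print; that is not an error here.
    grep -v -E "^[[:space:]]*${key_re}[[:space:]]*=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back through the original file to keep its ownership and mode.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Usage (as root): set_sysctl_conf vm.overcommit_memory 2 && sysctl -p
```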

Swappiness (vm.swappiness) controls how readily the kernel moves memory pages out of RAM and onto swap space on disk.

The value ranges from 0 to 100.

  • 0 = Very aggressively avoid swapping for as long as possible
    • High risk of OOM killing from memory and I/O pressure
  • 10 = Red Hat recommended for Oracle databases
  • 60 = Linux default
  • 100 = Aggressively swap from memory to disk

Check the setting:

cat /proc/sys/vm/swappiness
60

To change the setting persistently: edit /etc/sysctl.conf as root, add the line shown, save, and reload the configuration with sysctl -p:

vim /etc/sysctl.conf
vm.swappiness=1
:wq
sysctl -p

Starting in kernel 3.5-rc1, the way that swappiness behaves when set to “0” was changed. Red Hat backported this behavior change to RHEL kernel 2.6.32-303.

Previously, vm.swappiness=0 meant that the kernel would avoid swapping for as long as it could, but would still swap when memory got tight.

With the change, vm.swappiness=0 means the kernel does not swap until the very last possible moment.

By that point, the OOM Killer is typically invoked before anything gets a chance to swap.

If you are trying to avoid swapping as much as possible, do not set swappiness below 1 on the newer kernels.

On the newer kernels, vm.swappiness=1 behaves like vm.swappiness=0 did on the older ones.

  • Kernels >= 3.5-rc1 (or RHEL kernels >= 2.6.32-303)
    • vm.swappiness=1 (at least)
  • Kernels < 3.5-rc1 (or RHEL kernels < 2.6.32-303)
    • vm.swappiness=0 (or a number <= 10)
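
The version thresholds above can be checked in a script. The sketch below assumes that any version string starting with "2.6.32-" carries the RHEL build number right after the dash and compares everything else against 3.5. The function name min_safe_swappiness is hypothetical.

```shell
#!/bin/sh
# Print the lowest swappiness value that still lets the kernel swap
# before the OOM Killer is invoked: 1 on kernels with the new
# behavior, 0 on older kernels.
min_safe_swappiness() {
    ver=$1
    case "$ver" in
        2.6.32-*)
            # RHEL 6 style version: take the build number after "2.6.32-".
            build=${ver#2.6.32-}
            build=${build%%[!0-9]*}
            if [ "${build:-0}" -ge 303 ]; then echo 1; else echo 0; fi
            return
            ;;
    esac
    # Mainline style version: compare major.minor against 3.5.
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%[!0-9]*}
    if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "${minor:-0}" -ge 5 ]; }; then
        echo 1
    else
        echo 0
    fi
}

# Usage: min_safe_swappiness "$(uname -r)"
```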
  • Last modified: 2019/05/25 23:50