linux_wiki:kernel_tuning

Differences

This shows you the differences between two versions of the page.

linux_wiki:kernel_tuning [2015/04/28 20:06]
billdozor created
linux_wiki:kernel_tuning [2019/05/25 23:50] (current)
Line 6: Line 6:
  
 **Checklist**
-  * Any distro, but many examples are done via Red Hat based systems
+  * Distro(s): Any
  
 ----

-===== Hugepages =====
+====== Hugepages ======

 Problem: Unable to access a VM's console, and it shows out of memory errors on boot.
Line 28: Line 28:
 ----

-===== Swappiness =====
+==== HugePage Bug ====

-TODO: VM Swappiness content will go here.
+On some kernels, khugepaged can start running at 100% CPU utilization. This is typically seen on systems running processor- or memory-intensive workloads.
 + 
 +Other symptoms include: 
 +  * Normal commands hang (w, uptime, ps) 
 +  * Processes showing above 100% CPU (e.g., sometimes up to 1,300%)
 + 
 +Details 
 +  * Known Affected kernels 
 +    * CentOS 6: 2.6.32-431.el6.x86_64 
 +  * Reference bug report: https://bugzilla.redhat.com/show_bug.cgi?id=879801 
 + 
 +=== Workaround Fix === 
 + 
 +Disable hugepage defragmenting: 
 +<code bash> 
 +echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag 
 +</code> 
 + 
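 +Add a new cron entry so the setting is reapplied at boot:
 +<code bash>
 +vim /etc/cron.d/hugepage_defrag
 +
 +# Disable kernel huge page defrag due to bugzilla bug: 879801
 +# Path matches the CentOS 6 location used above; upstream kernels use
 +# /sys/kernel/mm/transparent_hugepage/defrag instead.
 +@reboot root /bin/echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
 +</code>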
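
To confirm which defrag mode is active, the bracketed value in the sysfs file can be read back. The following is a minimal sketch, assuming the file uses the usual `always madvise [never]` layout. The function name `thp_defrag_mode` and its optional base-directory argument are hypothetical, added only so that both directory names (with and without the `redhat_` prefix) can be checked.

```shell
# Print the active transparent hugepage defrag mode (the value in brackets).
# $1 = base directory, defaults to /sys/kernel/mm (hypothetical parameter).
thp_defrag_mode() {
    thp_base="${1:-/sys/kernel/mm}"
    # RHEL/CentOS 6 kernels use the redhat_ prefixed directory.
    for thp_dir in redhat_transparent_hugepage transparent_hugepage; do
        thp_file="$thp_base/$thp_dir/defrag"
        if [ -r "$thp_file" ]; then
            # Extract the text between [ and ].
            sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$thp_file"
            return 0
        fi
    done
    echo "no transparent hugepage defrag file found under $thp_base" >&2
    return 1
}
```

Calling `thp_defrag_mode` with no arguments reads the live system. After the workaround is applied it should print `never`.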
 + 
 +=== Permanent Fix === 
 + 
 +Ultimately, the permanent fix is to update the kernel to a newer version. 
 + 
 +TODO: Will add confirmed kernel versions that this bug is fixed in. 
 + 
 +---- 
 + 
 +====== Swappiness ====== 
 + 
 +Swappiness controls how readily the kernel moves memory pages out of RAM and onto swap space.
 + 
 +The setting is from 0 to 100. 
 +  * 0 = Very aggressively avoid swapping for as long as possible 
 +    * High risk of OOM killing from memory and I/O pressure 
 +  * 10 = Red Hat recommended for Oracle databases 
 +  * **60 = Linux default** 
 +  * 100 = Aggressively swap from memory to disk 
 + 
 +Check the setting: 
 +<code bash> 
 +cat /proc/sys/vm/swappiness 
 +60 
 +</code> 
 + 
 +To change the setting: edit sysctl.conf, add a line, save, and re-read the config: 
 +<code bash> 
 +# Open the file and add the line: vm.swappiness=10
 +vim /etc/sysctl.conf
 +# After saving (:wq), re-read the config
 +sysctl -p
 +</code> 
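
For scripted setups, the same edit can be made without opening an editor. This is a rough sketch, not a definitive tool: the helper name `set_sysctl_key` is hypothetical, it assumes GNU `sed` (for `-i`), and it only handles simple `key=value` lines.

```shell
# Set "key=value" in a sysctl config file: replace the key's line if it
# already exists, otherwise append it. Safe to run more than once.
# $1 = key, $2 = value, $3 = config file (defaults to /etc/sysctl.conf)
set_sysctl_key() {
    key="$1"
    value="$2"
    conf="${3:-/etc/sysctl.conf}"
    # Escape dots so the key matches literally in the regex.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=" "$conf" 2>/dev/null; then
        sed -i "s/^[[:space:]]*${pattern}[[:space:]]*=.*/${key}=${value}/" "$conf"
    else
        printf '%s=%s\n' "$key" "$value" >> "$conf"
    fi
}
```

As root, `set_sysctl_key vm.swappiness 10 && sysctl -p` would then apply the value. Commented-out lines are left untouched because the match requires the key at the start of the line.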
 + 
 +More details about OOM: https://owlbearconsulting.com/doku.php?id=linux_wiki:oom_killer 
 + 
 +---- 
 + 
 +====== Page Cache ====== 
 + 
 +Tuning how often cached pages in memory are flushed to disk. 
 + 
 +Check current settings: 
 +<code bash> 
 +sysctl -a | grep dirty 
 +</code> 
 + 
 +Background (async) pagecache flushing: 
 +<code bash> 
 +vm.dirty_background_ratio = 3 
 +</code> 
 +  * Explanation: Start flushing dirty pages to disk asynchronously once they reach 3% of available memory.
 +  * Default value = 10 
 + 
 +Foreground (sync) pagecache flushing: 
 +<code bash> 
 +vm.dirty_ratio = 15 
 +</code> 
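 +  * Explanation: Once dirty pages reach 15% of available memory, processes that write data are blocked while the pages are flushed synchronously. If a process stays blocked for longer than 120 seconds (the default kernel.hung_task_timeout_secs), the kernel logs a hung task warning, and it panics if kernel.hung_task_panic is enabled.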
 +  * Default value = 20 
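
To get a feel for what these percentages mean on a given machine, the ratios can be converted into approximate sizes. This is a sketch under a stated simplification: it uses MemTotal, whereas the kernel applies the ratios to available (free plus reclaimable) memory, so the results are upper-bound estimates. The function names are hypothetical.

```shell
# Approximate the amount of dirty data (in kB) that a ratio corresponds to.
# $1 = memory size in kB, $2 = ratio in percent. Integer math, rounds down.
dirty_threshold_kb() {
    echo $(( $1 * $2 / 100 ))
}

# Read MemTotal (kB) from a meminfo-style file; defaults to /proc/meminfo.
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}
```

For example, `dirty_threshold_kb "$(mem_total_kb)" 3` estimates the background flush threshold. On a system with 16777216 kB (16 GiB) of memory, 3% works out to 503316 kB and 15% to 2516582 kB.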
 + 
 +---- 
 + 
 +====== File Descriptors ====== 
 + 
 +Viewing and setting the system wide file descriptors. 
 + 
 +**View system wide file descriptor information** 
 +<code bash> 
 +sysctl fs.file-nr 
 +fs.file-nr = 6144 0 809542 
 +</code> 
 + 
 +The three numbers returned: 
 +  * 6144 = allocated file handles
 +  * 0 = allocated but unused file handles
 +  * 809542 = system max 
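
The three values can be turned into a usage percentage to see how close the system is to its limit. A minimal sketch, with a hypothetical function name, that takes the three fs.file-nr fields as arguments:

```shell
# Print the percentage of the system-wide limit currently in use,
# given the three fs.file-nr values: allocated, unused, max.
fd_usage_percent() {
    awk -v a="$1" -v u="$2" -v m="$3" \
        'BEGIN { printf "%.2f\n", (a - u) * 100 / m }'
}
```

On a live system this could be called as `fd_usage_percent $(cat /proc/sys/fs/file-nr)`. With the example values above (6144 0 809542) it prints 0.76.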
 + 
 + 
 +**Change System FD Limits** 
 + 
 +Edit /etc/sysctl.conf 
 +<code bash> 
 +fs.file-max = 810542 
 +</code> 
 +Example above increases system wide max by 1,000. 
 + 
 +More details, including per user settings: https://owlbearconsulting.com/doku.php?id=linux_wiki:file_descriptors 
 + 
 +---- 
 + 
 +====== Kernel Panic ====== 
 + 
 +Reboot the system 10 seconds after a kernel panic:
 +<code bash> 
 +kernel.panic = 10 
 +</code>
  
 ----
  
  • Last modified: 2019/05/25 23:50