linux_wiki:high_system_load

Differences

This shows you the differences between two versions of the page.

linux_wiki:high_system_load [2015/10/07 22:04]
billdozor [Troubleshooting Steps]
linux_wiki:high_system_load [2019/05/25 23:50] (current)
Line 6: Line 6:
  
 **Checklist** **Checklist**
-  * Distro: Enterprise Linux 6.x+  * Distro(s): Enterprise Linux 6 
 + 
 +---- 
 + 
 +====== Understanding System Load ====== 
 + 
 +Load average is shown by both "uptime" and "top" as three values: the average load over the last 1, 5, and 15 minutes.
 + 
 +===== Traffic/Bridge analogy ===== 
 + 
 +Reposting the takeaway here in case the source goes away. 
 +Source: [[http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages|ScoutBlog Load Average]] 
 + 
 +**On a Single Core CPU System** 
 +  * The server is a bridge operator. 
 +  * Cars are processes. 
 +  * Cars on the bridge are using CPU time. 
 +  * Cars waiting to go on the bridge are waiting for CPU time (because the bridge is backed up and they cannot get CPU time immediately). 
 +  * Load of 0.00 means there is no traffic on the bridge. 
 +  * Load of 1.00 means the bridge is at capacity. No more cars (processes) can get CPU time at this very second without waiting. 
 +  * Load over 1.00 means there is a backup. 
 +    * 2.00 => there are "two lanes" worth of cars (processes). One lane is being processed, the other lane is waiting for CPU time. 
 + 
 +**Multi-CPU/Core Systems** 
 +  * Load is relative to how many CPUs are on the system. 
 +    * 1 CPU/core: 100% capacity is load 1.00 
 +    * 2 CPUs/cores: 100% capacity is load 2.00 
 +    * 4 CPUs/cores: 100% capacity is load 4.00 
 +  * Example: From the analogy above, each CPU Core can actively process 1 bridge lane. 
 + 
 +===== Calculate Overall CPU Load ===== 
 +  * Get number of CPUs<code bash>grep -c '^processor' /proc/cpuinfo 
 +# or 
 +nproc</code> 
 +  * Load Average / Number of Processors = load as a decimal fraction of total CPU capacity 
 +    * Example: load average 1.5 / 2 processors = 0.75, or 75% system load on a dual-core system.
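
The calculation above can be sketched as a small shell function. This is only an illustration, not part of the original page: the name ''load_pct'' is hypothetical, and the live check assumes ''awk'', ''nproc'', and ''/proc/loadavg'' are available, as on a typical Linux system.

```shell
# load_pct LOAD_AVG NUM_CPUS
# Hypothetical helper: prints load as a whole-number percentage of CPU capacity.
load_pct() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.0f\n", (load / cpus) * 100 }'
}

# Example from the text: load average 1.5 on 2 processors
load_pct 1.5 2    # prints 75

# Live values: the first field of /proc/loadavg is the 1-minute load average
if [ -r /proc/loadavg ]; then
    read -r one_min _ < /proc/loadavg
    cpus=$(nproc)
    echo "1-min load ${one_min} on ${cpus} CPUs = $(load_pct "$one_min" "$cpus")%"
fi
```

A result above 100 means processes are queuing for CPU time.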
  
 ---- ----
Line 15: Line 50:
  
 Typically built in Typically built in
-  * uptime +  * top => live system process view 
-  * top +  * uptime => system uptime and load averages 
-  * vmstat+  * vmstat => virtual memory stats (memory, swap, i/o, cpu)
  
 Need to install (if using a minimal install base) Need to install (if using a minimal install base)
-  * iotop +  * iostat (sysstat package) => print i/o statistics 
-  * iostat (sysstat package)+  * iotop => live disk i/o 
 +  * lsof => list open files
  
 Base Repo Base Repo
 <code bash> <code bash>
-yum -y install iotop sysstat+yum -y install iotop lsof sysstat
 </code> </code>
  
Line 32: Line 68:
 ====== Troubleshooting Steps ====== ====== Troubleshooting Steps ======
  
-  - **Know how many processors you have**. This is essential to determine if load is high. +  - **Know how many processors you have**. This is essential to determine if load is high. See "Understanding System Load" above for more details.
-    - <code bash>grep -c processor /proc/cpuinfo</code>+    - <code bash>grep -c '^processor' /proc/cpuinfo</code>
     - %Load (decimal) = (Load Average / Number Processors)     - %Load (decimal) = (Load Average / Number Processors)
     - Example: Number of processors = 2, load average seen = 1.50     - Example: Number of processors = 2, load average seen = 1.50
Line 53: Line 89:
       - SWAP: "so" => Memory swapped to disk each second.       - SWAP: "so" => Memory swapped to disk each second.
         - If either are high, memory is most likely also very low.         - If either are high, memory is most likely also very low.
 +      - MEMORY: "free" => memory free. If this is low, there is probably swapping going on as well.
     - **Further investigate either high CPU/Memory use or Disk I/O**     - **Further investigate either high CPU/Memory use or Disk I/O**
  
 ---- ----
  
-==== High CPU ====+===== High CPU =====
  
 Clues that you should investigate high CPU usage: Clues that you should investigate high CPU usage:
Line 74: Line 111:
 ---- ----
  
-==== High Memory Use ====+===== High Memory Use =====
 + 
 +Notes on Linux memory management 
 +  * Linux uses free memory in RAM as a buffer cache to speed up application performance. 
 +  * When memory is needed, the buffer cache shrinks to allow other applications to use it. 
 +  * **Actual free memory = Memory Free + Buffers + Cached**
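
The formula above can be checked against ''/proc/meminfo''. The following is a minimal sketch: the function name ''avail_kb'' and the sample values are made up for illustration, and it assumes the usual ''MemFree:'', ''Buffers:'', and ''Cached:'' field names.

```shell
# avail_kb: reads meminfo-style lines on stdin and prints
# MemFree + Buffers + Cached in kB (hypothetical helper).
# Exact field matching keeps "SwapCached:" out of the sum.
avail_kb() {
    awk '$1 == "MemFree:" || $1 == "Buffers:" || $1 == "Cached:" { sum += $2 }
         END { printf "%d\n", sum }'
}

# Illustrative sample values, not taken from a real system
printf 'MemTotal: 8000000 kB\nMemFree: 500000 kB\nBuffers: 100000 kB\nCached: 2400000 kB\n' | avail_kb   # prints 3000000

# On a live system
if [ -r /proc/meminfo ]; then
    avail_kb < /proc/meminfo
fi
```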
  
 Clues that you should investigate high memory usage: Clues that you should investigate high memory usage:
Line 96: Line 138:
 ---- ----
  
-==== Disk I/O ====+===== Disk I/O =====
 + 
 +  * I/O wait (wa) is the percentage of time a CPU is waiting on disk. 
 +    * If I/O wait % is > (1 / number of CPU cores), for example above 25% on a 4-core system, then the CPUs are spending a lot of time waiting on disk. 
 +  * Easiest ways to improve disk I/O 
 +    * Give the system more memory 
 +    * Tune the application to rely more on in-memory caches and less on disk
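
The I/O wait rule of thumb above can be expressed as a quick check. This is a sketch only: ''iowait_high'' is a hypothetical helper, and the "wa" value is assumed to be read by hand from "top" or "vmstat".

```shell
# iowait_high WA_PERCENT NUM_CORES
# Hypothetical helper: succeeds (exit status 0) when the I/O wait
# percentage is greater than 100 / number of cores.
iowait_high() {
    awk -v wa="$1" -v cores="$2" 'BEGIN { exit !(wa > 100 / cores) }'
}

# 4 cores => threshold is 25%
if iowait_high 30 4; then echo "high"; else echo "ok"; fi   # prints high
if iowait_high 20 4; then echo "high"; else echo "ok"; fi   # prints ok
```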
  
 Clues that you should investigate high Disk I/O: Clues that you should investigate high Disk I/O:
Line 102: Line 150:
   * High CPU "wa" (wait)   * High CPU "wa" (wait)
  
 +\\
 +**iostat** - View I/O stats with extended statistics, every 3 seconds
 +<code bash>
 +iostat -x 3
 +</code>
 +  * "%util" => If this is close to 100%, the listed "Device" is the one to investigate.
 +
 +\\
 +**iotop** - Live disk I/O similar to top
 <code bash> <code bash>
-iostat 
 iotop iotop
 </code> </code>
 +
 +\\
 +**lsof** - If a particular device is discovered, another option for further details is to list open files for that mount point.
 +  * Device discovered via iostat
 +  * Mount point discovered
 +    * If 'dm' device:<code bash>ls -l /dev/mapper</code>
 +  * Then search lsof for that mount point:<code bash>lsof | grep /var/</code>
  
 ---- ----
  
  • linux_wiki/high_system_load.1444269867.txt.gz
  • Last modified: 2019/05/25 23:50
  • (external edit)