Differences
This shows you the differences between two versions of the page.
linux_wiki:high_system_load [2015/10/07 22:18] billdozor [Troubleshooting Tools] |
linux_wiki:high_system_load [2019/05/25 23:50] (current) |
||
---|---|---|---|
Line 6: | Line 6: | ||
**Checklist** | **Checklist** | ||
- | * Distro: Enterprise Linux 6.x | + | * Distro(s): Enterprise Linux 6 |
+ | |||
+ | ---- | ||
+ | |||
+ | ====== Understanding System Load ====== | ||
+ | |||
+ | Load average can be seen in both " | ||
+ | |||
+ | ===== Traffic/ | ||
+ | |||
+ | Reposting takeaway here in case it goes away. | ||
+ | Source: [[http:// | ||
+ | |||
+ | **On a Single Core CPU System** | ||
+ | * The server is a bridge operator. | ||
+ | * Cars are processes. | ||
+ | * Cars on the bridge are using CPU time. | ||
+ | * Cars waiting to go on the bridge are waiting for CPU time (because the bridge is backed up and they cannot get CPU time immediately). | ||
+ | * Load of 0.00 means there is no traffic on the bridge. | ||
+ | * Load of 1.00 means the bridge is at capacity. No more cars (processes) can get CPU time at this very second without waiting. | ||
+ | * Load over 1.00 means there is a backup. | ||
+ | * 2.00 => there are "two lanes" worth of cars (processes). One lane is being processed; the other lane is waiting for CPU time. | ||
+ | |||
+ | **Multi-CPU/ | ||
+ | * Load is relative to how many CPUs are on the system. | ||
+ | * 1 CPU/core: 100% utilization is load 1.00 | ||
+ | * 2 CPUs/cores: 100% utilization is load 2.00 | ||
+ | * 4 CPUs/cores: 100% utilization is load 4.00 | ||
+ | * Example: From the analogy above, each CPU Core can actively process 1 bridge lane. | ||
+ | |||
+ | ===== Calculate Overall CPU Load ===== | ||
+ | * Get number of CPUs< | ||
+ | OR | ||
+ | nproc</ | ||
+ | * Load Average / NumProcessors = decimal % load | ||
+ | * Example: LoadAvg(1.5) / 2 Processors = 0.75 or 75% system load on a dual-core system. | ||
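The calculation above can be sketched as a small shell helper. This is a minimal sketch, not part of the original page: ''load_percent'' is a hypothetical function name, and reading the 1-minute load average from /proc/loadavg is an assumption about where the load value comes from.

```shell
#!/usr/bin/env bash
# load_percent LOAD_AVG NUM_CPUS  (hypothetical helper, name chosen for illustration)
# Prints (LOAD_AVG / NUM_CPUS) as a whole-number percentage.
load_percent() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.0f\n", (load / cpus) * 100 }'
}

# Assumption: the 1-minute load average is the first field of /proc/loadavg,
# and nproc reports the number of available processors.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 _ < /proc/loadavg
    echo "Current system load: $(load_percent "$load1" "$(nproc)")%"
fi
```

With the example figures from the list, ''load_percent 1.5 2'' prints 75.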
---- | ---- | ||
Line 33: | Line 68: | ||
====== Troubleshooting Steps ====== | ====== Troubleshooting Steps ====== | ||
- | - **Know how many processors you have**. This is essential to determine if load is high. | + | - **Know how many processors you have**. This is essential to determine if load is high. See " |
- | - <code bash> | + | - <code bash> |
- %Load (decimal) = (Load Average / Number Processors) | - %Load (decimal) = (Load Average / Number Processors) | ||
- Example: Number of processors = 2, load average seen = 1.50 | - Example: Number of processors = 2, load average seen = 1.50 | ||
Line 59: | Line 94: | ||
---- | ---- | ||
- | ==== High CPU ==== | + | ===== High CPU ===== |
Clues that you should investigate high CPU usage: | Clues that you should investigate high CPU usage: | ||
Line 76: | Line 111: | ||
---- | ---- | ||
- | ==== High Memory Use ==== | + | ===== High Memory Use ===== |
+ | |||
+ | Notes on Linux memory management | ||
+ | * Linux uses free memory in RAM as a buffer cache to speed up application performance. | ||
+ | * When memory is needed, the buffer cache shrinks to allow other applications to use it. | ||
+ | * **Actual free memory = Memory Free + Buffers + Cached** | ||
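The formula above can be checked against /proc/meminfo. A minimal sketch, assuming the ''MemFree'', ''Buffers'' and ''Cached'' fields are reported in kB as they normally are; ''actual_free_kb'' is a hypothetical helper name, and the optional file argument exists only so the function can be pointed at a saved copy of meminfo.

```shell
#!/usr/bin/env bash
# actual_free_kb [MEMINFO_FILE]  (hypothetical helper, name chosen for illustration)
# Sums MemFree + Buffers + Cached (in kB) from a meminfo-style file.
actual_free_kb() {
    awk '$1 == "MemFree:" || $1 == "Buffers:" || $1 == "Cached:" { sum += $2 }
         END { printf "%d\n", sum }' "${1:-/proc/meminfo}"
}

# Only read the live file when it is present (Linux).
if [ -r /proc/meminfo ]; then
    echo "Actual free memory: $(actual_free_kb) kB"
fi
```

The exact match on the field name keeps ''SwapCached'' out of the sum.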
Clues that you should investigate high memory usage: | Clues that you should investigate high memory usage: | ||
Line 98: | Line 138: | ||
---- | ---- | ||
- | ==== Disk I/O ==== | + | ===== Disk I/O ===== |
+ | |||
+ | * I/O wait (wa) is the percentage of time a CPU is waiting on disk. | ||
+ | * If I/O wait % is greater than (1 / number of CPU cores), for example above 25% on a 4-core system, then the CPUs are spending a lot of time waiting on disk. | ||
+ | * Easiest ways to improve disk I/O | ||
+ | * Give the system more memory | ||
+ | * Tune the application to use in-memory caches rather than disk | ||
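The I/O wait rule of thumb above can be written as a quick check. A minimal sketch, assuming the rule is read as "wa % greater than 100 / number of cores"; ''iowait_high'' is a hypothetical helper name, and the wa value is whatever you read off the CPU summary (the 30 and 4 below are made-up illustration values).

```shell
#!/usr/bin/env bash
# iowait_high WA_PERCENT NUM_CPUS  (hypothetical helper, name chosen for illustration)
# Succeeds (exit 0) when WA_PERCENT is greater than (1 / NUM_CPUS) as a percentage.
iowait_high() {
    awk -v wa="$1" -v cpus="$2" 'BEGIN { exit !(wa > 100 / cpus) }'
}

# Illustration values only: 30% wa on a 4-core system (threshold 25%).
if iowait_high 30 4; then
    echo "I/O wait is above the 1/cores threshold; investigate disk I/O"
fi
```

On a live system the second argument could be ''$(nproc)'' instead of a fixed core count.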
Clues that you should investigate high Disk I/O: | Clues that you should investigate high Disk I/O: | ||
Line 104: | Line 150: | ||
* High CPU " | * High CPU " | ||
- | iostat - View I/O stats with extended statistics, every 3 seconds | + | \\ |
+ | **iostat** - View I/O stats with extended statistics, every 3 seconds | ||
<code bash> | <code bash> | ||
iostat -x 3 | iostat -x 3 | ||
Line 110: | Line 157: | ||
* " | * " | ||
- | iotop - Live disk I/O similar to top | + | \\ |
+ | **iotop** - Live disk I/O similar to top | ||
<code bash> | <code bash> | ||
iotop | iotop | ||
</ | </ | ||
+ | |||
+ | \\ | ||
+ | **lsof** - If iostat points to a particular device, another option for further details is to list the open files for that device's mount point. | ||
+ | * Device discovered via iostat | ||
+ | * Mount point discovered | ||
+ | * If ' | ||
+ | * Then search lsof for that mount point:< | ||
---- | ---- | ||