linux_wiki:xymon_cpu_load_threshold_calc

Differences

This shows you the differences between two versions of the page.

linux_wiki:xymon_cpu_load_threshold_calc [2017/01/09 15:32]
billdozor [The Script]
linux_wiki:xymon_cpu_load_threshold_calc [2019/05/25 23:50]
Line 1: Line 1:
-====== Xymon CPU Load Threshold Calc ====== 
- 
-**General Information** 
- 
-Posted to the Xymon Community at: https://wiki.xymonton.org/doku.php/monitors:cpu-load-calc 
- 
-Calculate (and set) a Xymon client's CPU load thresholds based on the number of processors the client reports in its hostdata files.
- 
-This is meant to run periodically on the Xymon server via cron.
- 
-It allows for far more useful load monitoring than arbitrarily setting one generic load threshold for every system, and avoids spending a lot of time editing config files for each system.
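
As a rough illustration of the calculation described above, the sketch below multiplies a processor count by a multiplier. The function name `calc_threshold` is hypothetical, and it uses awk for the arithmetic rather than the bc call used by the script on this page, so treat it as a minimal sketch and not the author's implementation.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): multiply a processor
# count by a multiplier and print the result with one decimal place.
# Assumption: awk stands in for bc here so the sketch has no extra dependency.
calc_threshold() {
  local procs="$1" multiplier="$2"
  awk -v p="${procs}" -v m="${multiplier}" 'BEGIN { printf "%.1f\n", p * m }'
}

# Example: a 4-processor client with multipliers of 1.0 (warning) and 1.5 (critical)
echo "warning:  $(calc_threshold 4 1.0)"
echo "critical: $(calc_threshold 4 1.5)"
```

With those values the warning threshold comes out at 4.0 and the critical threshold at 6.0, so a busier or larger machine is judged against its own capacity and not against a fixed number.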
- 
-**Checklist** 
-  * Xymon server installed/configured 
-  * Xymon clients checking in 
- 
----- 
- 
-====== Installation ====== 
- 
-Installation instructions. 
- 
-===== Xymon Client side ===== 
- 
-No client modifications required. 
- 
----- 
- 
-===== Xymon Server side ===== 
- 
-  - Install the 'bc' package (provides the floating-point arithmetic that bash lacks natively)
-    - Enterprise Linux 6/7<code bash>yum install bc</code> 
-  - Create the cpu-load-calc.sh script somewhere such as: /root/scripts/cpu-load-calc.sh 
-    - See source below for contents 
-    - Edit the "Customize Here" section 
-      - Edit multipliers if desired (threshold = number of procs * multiplier, for both the warning and critical CPU load thresholds)
-  - Create the auto load directory on the Xymon server 
-    - Default: /etc/xymon/analysis.d/auto-cpuload.d 
-  - Add the auto load directory to the Xymon main analysis.cfg file so it is included 
-    - Default: "directory /etc/xymon/analysis.d/auto-cpuload.d" to /etc/xymon/analysis.cfg 
-  - Set the default load in /etc/xymon/analysis.cfg to a very low warning level of "0.1" and a very high critical level such as "100.0".
-    - That way, a hostdata file is generated for any system that has not yet had its load auto-calculated (and you don't get alert emails/texts, as it is only at warning).
-  - Set up the script to run automatically via cron (-v is verbose output)
-    - Example (saved as /etc/cron.hourly/xymon-calc-cpuload.sh)<code bash>
-#!/bin/bash 
-# Description: Execute xymon script to auto calculate cpu load from hostdata 
- 
-/root/scripts/cpu-load-calc.sh -v &> /root/scripts/cpu-load-calc.log</code> 
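
The directory-related steps in the list above could be sketched roughly as follows. The function name `ensure_auto_load_dir` is hypothetical, and the sketch assumes that a single line of the form `directory <path>` in analysis.cfg is all that is needed to include the drop-in directory, as the steps describe.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original page): create the drop-in
# directory and make sure the main analysis config includes it exactly once.
ensure_auto_load_dir() {
  local dir="$1" cfg="$2"
  mkdir -p "${dir}" || return 1
  # -x matches the whole line, -F treats the path as a literal string
  if ! grep -qxF "directory ${dir}" "${cfg}" 2> /dev/null; then
    echo "directory ${dir}" >> "${cfg}"
  fi
}

# Usage with the default paths from the steps above (run as a user that can
# write to /etc/xymon):
# ensure_auto_load_dir /etc/xymon/analysis.d/auto-cpuload.d /etc/xymon/analysis.cfg
```

Checking for the line before appending keeps the helper safe to run repeatedly, which matters if it ends up in a provisioning script.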
- 
----- 
- 
-====== The Script ====== 
- 
-<code bash cpu-load-calc.sh> 
-#!/bin/bash 
-# Title: cpu-load-calc.sh 
-# Description: Calculate a xymon client's cpu load (Run on Xymon Server periodically with cron) 
-# Dependency: Requires 'bc' package 
- 
-#======================= 
-# Customize Here 
-#======================= 
- 
-# Warning and Critical Load Multipliers (num of procs * multiplier) 
-load_warn_multiplier=1.0 
-load_crit_multiplier=1.5 
- 
-# Directory to save auto load thresholds 
-auto_load_dir="/etc/xymon/analysis.d/auto-cpuload.d" 
- 
-# Xymon server's hostdata directory 
-xymon_hostdata_dir="/var/lib/xymon/hostdata" 
- 
-# Xymon server's main analysis config file 
-xymon_analysis_cfg="/etc/xymon/analysis.cfg" 
-#======================= 
-# End of Customize 
-#======================= 
- 
-#======================= 
-# Pre-Run Error Checking 
-#======================= 
-## Dependency Check ## 
-if ! command -v bc &> /dev/null; then
-  echo ">> Error! Dependent package 'bc' (basic calculator) not detected. Exiting..."
-  exit 1
-fi
- 
-## Does the Auto Load Directory exist? 
-if [[ ! -d ${auto_load_dir} ]]; then 
-  echo ">> Error! The directory (${auto_load_dir}) does not exist or is not a directory. Exiting..." 
-  exit 1
-fi 
- 
-## Write Access Check 
-if ! touch "${auto_load_dir}/testfile" &> /dev/null; then
-  echo ">> Error! User '$(whoami)' does not have write access to ${auto_load_dir}! Exiting..." 
-  exit 1 
-else 
-  rm -f ${auto_load_dir}/testfile &> /dev/null 
-fi 
- 
-## Check if the auto_load_dir is included in main analysis config file 
-if ! grep -q "^directory ${auto_load_dir}" "${xymon_analysis_cfg}" 2> /dev/null; then
-  echo -e ">> Warning! Auto load directory (${auto_load_dir}) is not included in ${xymon_analysis_cfg}. Continuing, but auto CPU load settings will not take effect until 'directory ${auto_load_dir}' is added to ${xymon_analysis_cfg}.\n"
-fi 
-#======================= 
-# End of Pre-Run Error Checking 
-#======================= 
- 
-#=============================== 
-# Functions; Main starts after 
-#=============================== 
- 
-function show_usage 
-{ 
-  echo -e "\n####==== Xymon Client Auto Load Thresholds ====####" 
-  echo -e "\nDescription: Calculate a xymon client's cpu load."
-  echo -e "\n--Usage" 
-  echo -e "$0      => No arguments, configure with no verbosity." 
-  echo -e "$0 -v   => Verbose output." 
-  echo -e "$0 -r   => Refresh CPU load data (force hostdata update)." 
-  echo -e "$0 -h   => Display usage." 
-} 
- 
-# Force snapshots of hostdata 
-function force_hostdata 
-{ 
-  # Use node name passed as argument 
-  node_name=${1} 
- 
-  # Lie to Xymon that the node's cpu is green, then yellow, forcing a hostdata snapshot 
-  xymon 127.0.0.1 "status ${node_name}.cpu green $(date)" 
-  xymon 127.0.0.1 "status ${node_name}.cpu yellow $(date)" 
-} 
- 
-#======================= 
-# Get Script Arguments 
-#======================= 
-# Reset POSIX variable in case it has been used previously in this shell 
-OPTIND=1 
- 
-# By default, no verbose output 
-verbose_output="no" 
-refresh_cpus="no" 
- 
-while getopts "hrv" opt; do 
-  case "${opt}" in 
-    h) # -h (help) argument 
-      show_usage 
-      exit 0 
-    ;; 
-    r) # -r (refresh cpus) argument
-      refresh_cpus="yes" 
-    ;; 
-    v) # -v (verbose) argument 
-      verbose_output="yes" 
-    ;; 
-    *) # invalid argument
-      show_usage
-      exit 1
-    ;; 
-  esac 
-done 
- 
-#======================= 
-# Main Program 
-#======================= 
-echo -e "== Xymon Client Auto Load Thresholds ==" 
-echo -e "Load Warning Multiplier: ${load_warn_multiplier}" 
-echo -e "Load Critical Multiplier: ${load_crit_multiplier}" 
-echo -e "Saving configs to: ${auto_load_dir}" 
- 
-# For each node reporting host data 
-for node in $(ls ${xymon_hostdata_dir}); do 
- 
-  if [[ ${verbose_output} == "yes" ]]; then 
-    echo -e "\n>> Working on node: ${node}" 
-  fi 
- 
-  if [[ ${refresh_cpus} == "yes" ]]; then 
-    if [[ ${verbose_output} == "yes" ]]; then 
-      echo -e "\n-> Refreshing hostdata..." 
-    fi 
-    # Force an update of hostdata 
-    force_hostdata ${node} 
-  fi 
- 
-  # Get the number of procs reported from node's most recent host data file 
-  node_num_procs="$(awk '/nproc/ { getline; print; exit }' "${xymon_hostdata_dir}/${node}/$(ls -t "${xymon_hostdata_dir}/${node}/" | head -1)")"
-   
-  # If node_num_procs is empty or not a whole number, move to the next node
-  if [[ -z ${node_num_procs} || ! ${node_num_procs} =~ ^[[:space:]]*[0-9]+[[:space:]]*$ ]]; then
-    # Did not find 'nproc' in the host data file or no number from nproc returned 
- 
-    if [[ ${verbose_output} == "yes" ]]; then 
-      echo "-> Warning! Could not find 'nproc' in ${node}'s host data file or no number returned. Skipping..." 
-    fi 
- 
-    continue 
-  fi 
- 
-  # Calculate the warning and critical load thresholds (normalize as a floating point with bc) 
-  load_warning=$(echo "${node_num_procs} * ${load_warn_multiplier}" | bc) 
-  load_critical=$(echo "${node_num_procs} * ${load_crit_multiplier}" | bc) 
- 
-  if [[ ${verbose_output} == "yes" ]]; then 
-    echo -e "-> Number of Procs: ${node_num_procs}" 
-    echo -e "-> Warning at: ${load_warning}" 
-    echo -e "-> Critical at: ${load_critical}" 
-    echo -e "-> Creating node analysis drop in file..." 
-  fi 
- 
-  # Create analysis drop in file 
-  echo "# ${node}'s CPU Load Thresholds (Warning Critical)" > ${auto_load_dir}/${node}.cfg 
-  echo "HOST=${node}" >> ${auto_load_dir}/${node}.cfg 
-  echo "  LOAD  ${load_warning}  ${load_critical}" >> ${auto_load_dir}/${node}.cfg 
-done 
- 
-echo -e "\n== Auto Load Thresholds Complete ==" 
- 
-exit 0 
-</code> 
- 
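
To make the processor-count lookup in the script easier to follow, here is a small standalone sketch of the same awk idea: find a line mentioning `nproc` and take the line after it. The helper names `get_nproc` and `is_number` are hypothetical, and the sample snapshot layout (a `[nproc]` line followed by the count) is assumed purely for illustration; real hostdata files may look different.

```shell
#!/bin/bash
# Hypothetical helper: print the line that follows the first line mentioning "nproc"
get_nproc() {
  awk '/nproc/ { getline; print; exit }' "$1"
}

# Hypothetical validation: accept only strings made up entirely of digits,
# so values such as "4x" or an empty string are rejected
is_number() {
  [[ "$1" =~ ^[0-9]+$ ]]
}

# Build a small sample snapshot (the layout is an assumption, see above)
sample="$(mktemp)"
printf '%s\n' '[uptime]' ' 10:00:00 up 3 days' '[nproc]' '4' '[free]' > "${sample}"

procs="$(get_nproc "${sample}")"
if is_number "${procs}"; then
  echo "Number of procs: ${procs}"
fi
rm -f "${sample}"
```

Anchoring the regular expression with `^` and `$` is a deliberate choice in this sketch: an unanchored digit pattern would accept any string that merely contains a digit.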
----- 
  
  • linux_wiki/xymon_cpu_load_threshold_calc.txt
  • Last modified: 2019/05/25 23:50
  • (external edit)