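Zabbix has a large following these days, but if you only monitor a handful of servers, the veteran Nagios is still a very good choice. Its alerting is very capable, and the program is small and light on resources. Nagios has no graphing out of the box. You can pair it with Cacti, but that setup is fairly complicated, so I still prefer pnp4nagios.
One-click install scripts for nagios and pnp4nagios are in my GitHub repository: https://github.com/zhangnq/nagios/tree/master/setup
The default pnp4nagios graphs are not pretty: if a check returns several data series, pnp4nagios draws a separate graph for each one. Using a memory-check script as the example, this post shows how a custom pnp4nagios template can produce a cleaner, combined graph.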
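Nagios client
The client needs a memory-check script, because the default plugins do not include one.
1. Add the memory-check script, with content along these lines: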
#!/bin/bash
# Nagios exit codes
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

help () {
    local command=$(basename "$0")
    echo "NAME
    ${command} -- check memory status
SYNOPSIS
    ${command} [OPTION]
DESCRIPTION
    -w warning=
    -c critical=
USAGE:
    $0 -w 50% -c 60%" 1>&2
    # Bad or missing arguments are a usage problem, so report UNKNOWN
    exit ${STATE_UNKNOWN}
}

# Abort unless the argument consists of digits only
check_num () {
    local num_str="$1"
    if ! echo "${num_str}" | grep -Eq '^[0-9]+$'; then
        echo "${num_str} is not a positive integer!" 1>&2
        exit ${STATE_UNKNOWN}
    fi
}

# Parse -w and -c, stripping the optional % sign
while getopts w:c: opt
do
    case "$opt" in
    w) warning=$OPTARG
       warning_num=$(echo "${warning}" | sed 's/%//g')
       check_num "${warning_num}"
       ;;
    c) critical=$OPTARG
       critical_num=$(echo "${critical}" | sed 's/%//g')
       check_num "${critical_num}"
       ;;
    *) help;;
    esac
done
shift $(( OPTIND - 1 ))

# Both thresholds are required and no extra arguments are allowed
if [ $# -gt 0 ] || [ -z "${warning_num}" ] || [ -z "${critical_num}" ]; then
    help
fi

if [ "${warning_num}" -ge "${critical_num}" ]; then
    echo "-w ${warning} must be lower than -c ${critical}!" 1>&2
    exit ${STATE_UNKNOWN}
fi

# Turn /proc/meminfo lines such as "MemTotal: 3953000 kB" into shell
# assignments (MemTotal=3953000, values in kB), skipping names with parentheses
datas=$(awk -F':|k' '$2~/[0-9]+/{datas[$1]=$2}END{for (data in datas) {print data"="datas[data]}}' /proc/meminfo | grep -Ev '[)|(]')
var=$(echo "${datas}" | sed 's/ //g')
eval "${var}"

# Used memory excludes the page cache and buffers
MemUsed=$(echo "${MemTotal}-${MemFree}-${Cached}-${Buffers}" | bc)
MemUsage=$(echo "${MemUsed}/${MemTotal}*100" | bc -l)
MemUsage_num=$(echo "${MemUsage}/1" | bc)

# Convert kB to MB for the status text and the perfdata
MemTotal_MB=$(echo "${MemTotal}/1024" | bc)
MemUsed_MB=$(echo "${MemUsed}/1024" | bc)
MemFree_MB=$(echo "${MemFree}/1024" | bc)
Cached_MB=$(echo "${Cached}/1024" | bc)
Buffers_MB=$(echo "${Buffers}/1024" | bc)

# Status text first, then the perfdata after the pipe
message () {
    local stat="$1"
    echo "MEMORY is ${stat} - Usage: ${MemUsage_num}%. Total: ${MemTotal_MB} MB Used: ${MemUsed_MB} MB Free: ${MemFree_MB} MB | Used=${MemUsed_MB};; Cached=${Cached_MB};; Buffers=${Buffers_MB};; Free=${MemFree_MB};;"
}

[ ${MemUsage_num} -lt ${warning_num} ] && message "OK" && exit ${STATE_OK}
[ ${MemUsage_num} -ge ${critical_num} ] && message "Critical" && exit ${STATE_CRITICAL}
[ ${MemUsage_num} -ge ${warning_num} ] && message "Warning" && exit ${STATE_WARNING}
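The part of the plugin output after the pipe is the perfdata, and it is what pnp4nagios graphs. Because this script emits four labels (Used, Cached, Buffers, Free), the default template draws four separate graphs. As a rough sketch of how that string breaks down, the helper below splits one line of output into label and value pairs. The function name perfdata_pairs and the sample numbers are made up for illustration and are not part of Nagios or pnp4nagios.

```shell
#!/bin/bash
# Illustrative helper: list "label value" pairs from one line of plugin output.
perfdata_pairs () {
    local output="$1"
    # No pipe means the plugin returned no perfdata.
    case "${output}" in
        *'|'*) ;;
        *) return 0 ;;
    esac
    # Keep everything after the first '|'.
    local perfdata="${output#*|}"
    local item label value
    for item in ${perfdata}; do
        # Each item looks like label=value;warn;crit;min;max
        label="${item%%=*}"
        value="${item#*=}"
        value="${value%%;*}"
        echo "${label} ${value}"
    done
}

# Sample output with invented numbers, in the format the script prints.
sample='MEMORY is OK - Usage: 37%. Total: 3953 MB Used: 1480 MB Free: 310 MB | Used=1480;; Cached=1973;; Buffers=190;; Free=310;;'
perfdata_pairs "${sample}"
```

Running it prints one line per data series (Used 1480, Cached 1973, Buffers 190, Free 310), which is the set of series the custom template later combines into a single graph.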
The script usually lives in /usr/local/nagios/libexec and is named check_mem.sh.
2. Then edit the nrpe.cfg configuration file and restart nrpe, with commands along these lines:
# Work in the plugin directory that nrpe.cfg points to
cd /usr/local/nagios/libexec
wget http://download.chekiang.info/nagios/check_mem.sh
chmod +x check_mem.sh
chown nagios:nagios check_mem.sh
cat >>/usr/local/nagios/etc/nrpe.cfg<<"EOF"
command[check_mem]=/usr/local/nagios/libexec/check_mem.sh -w 80% -c 90%
EOF
sleep 3
# Restart NRPE
/root/restart_nrpe.sh
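Appending to nrpe.cfg with cat adds a second copy of the command line if the steps are run twice. One way to guard against that, sketched here under the assumption that a duplicate definition is unwanted, is to check for the command name before appending. The helper add_nrpe_command is hypothetical and not part of NRPE, and the example runs against a scratch file rather than the real configuration.

```shell
#!/bin/bash
# Hypothetical helper: append an NRPE command definition only if that
# command name is not already defined in the given config file.
add_nrpe_command () {
    local cfg="$1" name="$2" cmdline="$3"
    # Match "command[name]=" at the start of a line, brackets taken literally.
    if grep -q "^command\[${name}\]=" "${cfg}" 2>/dev/null; then
        echo "command[${name}] already defined in ${cfg}, skipping"
        return 0
    fi
    echo "command[${name}]=${cmdline}" >> "${cfg}"
}

# Example against a scratch file, so no real configuration is touched.
cfg="$(mktemp)"
add_nrpe_command "${cfg}" check_mem '/usr/local/nagios/libexec/check_mem.sh -w 80% -c 90%'
add_nrpe_command "${cfg}" check_mem '/usr/local/nagios/libexec/check_mem.sh -w 80% -c 90%'
cat "${cfg}"
rm -f "${cfg}"
```

The second call only reports that the command exists, so the file ends up with a single check_mem line.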
Nagios server
1. Once the check_mem.sh plugin is installed on the client, add the check_mem service on the server and restart nagios.
define service{
        use                     local-service,srv-pnp
        host_name               blog.nbhao.org
        service_description     check memory usage
        check_command           check_nrpe!check_mem
        notification_options    w,c
        }
2. Go to the pnp4nagios check_commands configuration directory, for example /usr/local/pnp4nagios/etc/check_commands/. It contains a few sample files by default. Add check_nrpe.cfg with the following content:
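# Use the first argument of the check command as the custom template name
CUSTOM_TEMPLATE = 1
# Store values as instantaneous readings
DATATYPE = GAUGE
# Do not take the RRD minimum from the plugin's perfdata when the RRD file is created
USE_MIN_ON_CREATE = 0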
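With CUSTOM_TEMPLATE = 1, the service defined earlier (check_nrpe!check_mem) is drawn with a template named after its first argument, check_mem, instead of the generic check_nrpe. The function below only illustrates that mapping as I understand it from the pnp4nagios documentation: index 0 is the command name and index 1 the first argument. It is not how pnp4nagios implements it, and template_name is a made-up name.

```shell
#!/bin/bash
# Illustration only: pick the template file name for a check_command string
# given a single CUSTOM_TEMPLATE index (0 = command name, 1 = first argument).
template_name () {
    local check_command="$1" index="$2"
    # Split "check_nrpe!check_mem" on '!' into (check_nrpe check_mem).
    local IFS='!'
    local parts=( ${check_command} )
    echo "${parts[${index}]}.php"
}

template_name 'check_nrpe!check_mem' 1   # prints check_mem.php
template_name 'check_nrpe!check_mem' 0   # prints check_nrpe.php
```

This is why the template file in the next step has to be called check_mem.php.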
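3. Go to the pnp4nagios template directory, for example /usr/local/pnp4nagios/share/templates.dist, and add check_mem.php with content like the following. (pnp4nagios also reads share/templates, which is the usual place for custom templates because files there are not replaced by upgrades.)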
<?php
// Alpha channel appended to every colour (CC is roughly 80% opaque)
$alpha = 'CC';
$colors = array(
    '#850707' . $alpha,
    '#FFDB87' . $alpha,
    '#25345C' . $alpha,
    '#88008A' . $alpha,
    '#4F7774' . $alpha,
);

// The plugin reports its perfdata in MB
$opt[1] = sprintf('-T 55 -l 0 --vertical-label "MB" --title "%s / Memory Usage"', $hostname);
$def[1] = '';
$count = 0;

foreach ($DS as $i) {
    $def[1] .= rrd::def("var$i", $rrdfile, $DS[$i], 'AVERAGE');
    if ($i == '1') {
        // The first series is the base area
        $def[1] .= rrd::area("var$i", $colors[$count], rrd::cut(ucfirst($NAME[$i]), 15));
    } else {
        // Every later series is stacked on top of the previous ones
        $def[1] .= rrd::area("var$i", $colors[$count], rrd::cut(ucfirst($NAME[$i]), 15), 'STACK');
    }
    $def[1] .= rrd::gprint("var$i", array('LAST','MAX','AVERAGE'), "%4.2lf %s\\t");
    $count++;
}
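The template draws the first series as a plain area and stacks every later one on top, so the upper edge of each coloured band is a running total. Since the script computes Used as Total minus Free, Cached and Buffers, the top of the last band should land on total memory, give or take a few MB of rounding. A small sketch of that arithmetic, using invented sample values and a hypothetical helper named stack_tops:

```shell
#!/bin/bash
# Hypothetical helper: print the running total after each name=value pair,
# i.e. where the top edge of each stacked area ends up on the graph.
stack_tops () {
    local total=0 pair name value
    for pair in "$@"; do
        name="${pair%%=*}"
        value="${pair#*=}"
        total=$(( total + value ))
        echo "${name} ${total}"
    done
}

# Invented values in MB, in the order the plugin emits them.
stack_tops Used=1480 Cached=1973 Buffers=190 Free=310
```

With these numbers the bands top out at 1480, 3453, 3643 and 3953, the last being the machine's total memory in this example.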
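Once this is in place, wait a few minutes for nagios to collect data, and the graph rendered with the custom pnp4nagios template will appear.
The same approach works for similar charts such as network interface traffic and disk reads and writes.
References: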
https://github.com/June-Wang/NagiosPlugins
http://docs.pnp4nagios.org/pnp-0.6/tpl
http://www.itnms.info/discuz/forum.php?mod=viewthread&tid=2788&page=1