
Implementing full-network monitoring with Zabbix

zhang
2024-10-09


Remote monitoring through the Java gateway

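  • On the client, edit the Tomcat startup script /app/tools/tomcat/bin/catalina.sh
# Add after line 124. Comments must stay outside the quoted string,
# otherwise they become part of CATALINA_OPTS.
#   jmxremote                  enable remote JMX monitoring
#   jmxremote.port             JMX listening port
#   jmxremote.authenticate     false = no password authentication
#   jmxremote.ssl              false = no encryption
#   java.rmi.server.hostname   this host's internal IP address
CATALINA_OPTS="$CATALINA_OPTS \
-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=12345 \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false \
-Djava.rmi.server.hostname=172.16.1.9"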
  • On the server, load the zabbix-java-gateway image

  • Add the Java gateway to the server's docker-compose file
#version: "3.8"
services:  
  db:  
    image: mysql:8.0-debian
    container_name: zbx_db
    networks:   
      - zmx_zbx_net  
    restart: always   
    volumes:  
      - ./zbx_db/:/var/lib/mysql/
    environment:  
      MYSQL_ROOT_PASSWORD: "${ROOT_PASS}"
      MYSQL_DATABASE: "${ZBX_DB}"
      MYSQL_USER:     "${ZBX_USER}"
      MYSQL_PASSWORD: "${ZBX_PASS}"
    command:  
      - --character-set-server=utf8  
      - --collation-server=utf8_bin  
      - --default-authentication-plugin=mysql_native_password  

  zbx_server:
    image: zabbix/zabbix-server-mysql:7.0.9-ubuntu-python
    build:
      context: .
      dockerfile: Dockerfile-dingding-weixin
    container_name: zabbix-server-mysql-7.0
    networks:   
      - zmx_zbx_net  
    restart: always   
    ports:  
      - 10051:10051
    depends_on:
      - db
    environment:  
      DB_SERVER_HOST: "db"
      MYSQL_ROOT_PASSWORD: "${ROOT_PASS}"
      MYSQL_DATABASE: "${ZBX_DB}"
      MYSQL_USER:     "${ZBX_USER}"
      MYSQL_PASSWORD: "${ZBX_PASS}"
      ZBX_STATSALLOWEDIP: "127.0.0.1,172.16.1.0/24,172.100.1.0/24"
      ZBX_JAVAGATEWAY_ENABLE: "true"
      ZBX_JAVAGATEWAY: zbx_java_gateway
      ZBX_JAVAGATEWAYPORT: 10052

  zbx_java_gateway:
     image: zabbix/zabbix-java-gateway:7.0.9-ubuntu
     container_name: zabbix-java-gateway-7.0
     networks:
       - zmx_zbx_net
     restart: always
     ports:
       - 10052:10052
     depends_on:
       - zbx_server

  zbx_web:
    image: zabbix/zabbix-web-nginx-mysql:7.0.9-ubuntu
    container_name: zabbix-web-nginx-mysql
    networks:   
      - zmx_zbx_net  
    restart: always   
    ports:  
      - 80:8080
    depends_on:
      - db
      - zbx_server
    environment:  
      ZBX_SERVER_HOST: "zbx_server"
      DB_SERVER_HOST: "db"
      MYSQL_ROOT_PASSWORD: "${ROOT_PASS}"
      MYSQL_DATABASE: "${ZBX_DB}"
      MYSQL_USER:     "${ZBX_USER}"
      MYSQL_PASSWORD: "${ZBX_PASS}"

  
networks:  
  zmx_zbx_net:  
    driver: bridge  
    ipam:  
      config:  
        - subnet: 172.100.0.0/16  
          ip_range: 172.100.1.0/24  
          gateway: 172.100.1.1

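  • Recreate the containers and check for the java process

  • Add the monitored host in the web frontend

  • Run the discovery rule immediately

  • Check the status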

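Implementing full-network monitoring

Checking DNS resolution, domain expiry, and certificate expiry

Environment preparation

Hostname IP (public/internal)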
lb01 10.0.0.5/172.16.1.5
web03 10.0.0.9/172.16.1.9
web04 10.0.0.10/172.16.1.10
nfs01 10.0.0.31/172.16.1.31
backup 10.0.0.41/172.16.1.41
m03-zabbix-server 10.0.0.63/172.16.1.63

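One-command deployment of the Zabbix agent with Ansible

  • Approach
  • Prepare the inventory file hosts listing the target machines
  • Prepare the custom key (UserParameter) file to distribute
  • Prepare the Zabbix repository file zabbix.repo
  • Prepare the Zabbix agent configuration file zabbix_agent2.conf

The roles directory

  • Inventory file hosts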
[root@ans /server/ans/roles]# cat hosts
[lb]
172.16.1.5
172.16.1.6
[web]
172.16.1.9
172.16.1.10
[db]
172.16.1.52
[nfs]
172.16.1.31
[bak]
172.16.1.41
  • Zabbix repository file
[root@ans /server/ans/roles/zabbix-client/files]# cat zabbix.repo 
[zabbix]
name=zabbix
baseurl=https://mirrors.aliyun.com/zabbix/zabbix/7.0/rhel/7/x86_64/
enabled=1
gpgcheck=0
  • Zabbix agent configuration file
[root@ans /server/ans/roles/zabbix-client/files]# cat zabbix_agent2.conf 
PidFile=/run/zabbix/zabbix_agent2.pid
LogFile=/var/log/zabbix/zabbix_agent2.log
LogFileSize=0
Server=172.16.1.63
ServerActive=127.0.0.1
Hostname=Zabbix server
Include=/etc/zabbix/zabbix_agent2.d/*.conf
PluginSocket=/run/zabbix/agent.plugin.sock
ControlSocket=/run/zabbix/agent.sock
Include=/etc/zabbix/zabbix_agent2.d/plugins.d/*.conf
  • Zabbix custom key file
[root@ans /server/ans/roles/zabbix-client]# cat files/sys.conf 
UserParameter=sys.zombie,top -bn1 | awk 'NR==2{print $(NF-1)}'
UserParameter=user.login.ip[*],lastlog -u root | awk 'NR==2{print $$3}'

tasks

[root@ans /server/ans/roles/zabbix-client]# cat tasks/main.yml 
- name: 1. Configure the Zabbix repository
  copy:
    src: zabbix.repo
    dest: /etc/yum.repos.d/
    backup: yes

- name: 2. Install the Zabbix agent
  yum:
    name: zabbix-agent2
    state: present

- name: 3. Distribute the configuration files
  copy:
    src: "{{ item.src }}"
    dest: "{{ item.dest }}"
  loop:
    - {src: zabbix_agent2.conf, dest: /etc/zabbix/}
    - {src: sys.conf, dest: /etc/zabbix/zabbix_agent2.d/}
  notify:
    - restart_zabbix

- name: 4. Start the Zabbix agent
  systemd:
    name: zabbix-agent2
    enabled: yes
    state: started

handlers

[root@ans /server/ans/roles/zabbix-client]# cat handlers/main.yml 
- name: restart_zabbix
  systemd:
    name: zabbix-agent2
    state: restarted

Run top.yml

[root@ans /server/ans/roles]# ansible-playbook -i hosts top.yml 

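Monitoring DNS resolution (configured on lb)

  • Put this custom check on any one of the load balancer nodes
  • Whether DNS resolves the domain: check with nslookup
  • Domain expiry: print the remaining time (implemented as a shell script)
  • DNS query counts: view them on the provider's web console or through its API

Writing the domain check script on the client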

[root@lb01.zmx.cn /server/scripts]# cat check_dns.sh 
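#!/bin/bash
#author: lidao996 
#url: zmxedu.com
#desc: check whether the given domain name resolves
#prints 1 if it resolves
#prints 0 if it does not

#1.vars
url=$1

#2.validate that the argument is a domain name (not implemented here)
#3.make sure the nslookup command exists
command -v nslookup &>/dev/null || {
  yum install -y bind-utils &>/dev/null
}
#4.check
if nslookup "$url" &>/dev/null; then
   #1 means the name resolves
   echo "1"
else
   #0 means it does not
   echo "0"
fi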
[root@lb01.zmx.cn /server/scripts]# bash check_dns.sh  www.zhangmianxin.xin
1
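
The script's comments mention checking that the argument is a domain name, but no such step is implemented. Below is a minimal sketch of that validation, assuming bash. The function name is_domain and the pattern are illustrative, not part of the original script, and the pattern is deliberately strict (ASCII labels only, at least one dot).

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) when the argument looks like a
# DNS name such as www.example.com, fails (exit 1) otherwise.
is_domain() {
  local name=$1
  # One label: starts and ends with a letter or digit, hyphens allowed inside.
  local label='[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?'
  # Whole name: one or more "label." groups followed by an alphabetic TLD.
  local re='^('"$label"'\.)+[A-Za-z]{2,63}$'
  [[ $name =~ $re ]]
}
```

In check_dns.sh it could be called right after the vars block, for example `is_domain "$url" || { echo 0; exit 0; }`, so that malformed input never reaches nslookup.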

Writing the Zabbix key

[root@lb01.zmx.cn /server/scripts]# cat /etc/zabbix/zabbix_agent2.d/dns.conf 
UserParameter=check.dns[*],/bin/bash   /server/scripts/check_dns.sh  "$1"
  • Test on the client
[root@lb01.zmx.cn ~]# zabbix_agent2 -t check.dns[zhangmianxin.xin]
check.dns[zhangmianxin.xin]                   [s|1]

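Testing the key from the Zabbix server container, with a troubleshooting case

  • Troubleshooting case

  • The zabbix user on the client needs sudo privileges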
[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# vim /etc/sudoers

[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# tail -1 /etc/sudoers
zabbix ALL=(ALL)  NOPASSWD:ALL

[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# visudo -c
/etc/sudoers: parsed OK

  • Check whether DNS resolution succeeds
zabbix@ed0836c905c0:~$ zabbix_get -s 172.16.1.5  -k  check.dns[zhangmianxin.xin]
1

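Adding the item in the web frontend

  • Add the new item and test it

  • Add a trigger

Monitoring domain expiry and certificate expiry

Writing the script on the client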

[root@lb01.zmx.cn ~]# cat /server/scripts/check_https_expire.sh 
#!/bin/bash
##############################################################
# File Name:30.check_url_guoqi.sh
# Version:V1.0
# Author:zmx lidao996
# Organization:www.zmxedu.com
# Desc:
##############################################################

#1.vars
export LANG=en_US.UTF-8



#check domain expiry: days until the registration expires
check_domain() {
  local expire_date=`whois "$url" |egrep "Expiry|Expiration" |head -n1 |awk -F ": " '{print $2}'`
  local expire_date_second=`date -d "${expire_date}" +%s`
  local date_second_now=`date +%s`
  local date_expire_days=`echo "(${expire_date_second} - ${date_second_now} )/60/60/24" |bc`
  echo "$date_expire_days"
  
}

#check certificate expiry: days until the certificate expires
check_https() {
  #a curl reachability check could also be added here
  local expire_date=`curl -v "https://www.$url"  |& grep expire |awk -F ": |GMT" '{print $2}'`
  local expire_date_second=`date -d "${expire_date}" +%s`
  local date_second_now=`date +%s`
  local date_expire_days=`echo "(${expire_date_second} - ${date_second_now} )/60/60/24" |bc`
  echo "$date_expire_days"
  
}

#main
main() {
 choice=$1
 url=$2
 case "$choice" in
     domain) 
            check_domain ;;
     https)  
            check_https  ;;
 esac 
}

main "$@"

[root@lb01.zmx.cn /server/scripts]# bash check_https_expire.sh domain zhangmianxin.xin
309
[root@lb01.zmx.cn /server/scripts]# bash check_https_expire.sh https zhangmianxin.xin
0
  • Write the client key
[root@lb01.zmx.cn /server/scripts]# cat /etc/zabbix/zabbix_agent2.d/dns.conf 

UserParameter=check.domain_https[*],sudo /bin/bash /server/scripts/check_https_expire.sh  "$1"   "$2"

Testing the key from the Zabbix server container

zabbix@ed0836c905c0:~$ zabbix_get -s 172.16.1.5  -k  check.domain_https[domain,zhangmianxin.xin]
309
zabbix@ed0836c905c0:~$ zabbix_get -s 172.16.1.5  -k  check.domain_https[https,zhangmianxin.xin]
0

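Adding items and triggers in the web frontend

Domain expiry

  • Item: days remaining until the domain expires

  • Trigger: fires when 30 days or fewer remain

Certificate expiry

  • Item: days remaining until the certificate expires

  • Trigger: fires when fewer than 30 days remain

  • With 0 days remaining on the certificate, an alert is raised

  • Check domain and certificate expiry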
zabbix@ed0836c905c0:~$ zabbix_get -s 172.16.1.5  -k  check.domain_https[domain,baidu.com]
1175
zabbix@ed0836c905c0:~$ zabbix_get -s 172.16.1.5  -k  check.domain_https[https,baidu.com]
382
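
A reading of 0 days deserves a second look. With GNU date, an empty string passed to `date -d` resolves to today's midnight, so when the whois or curl lookup returns nothing the day calculation yields 0 instead of failing; that is a plausible explanation for the 0 reported for zhangmianxin.xin. The sketch below separates "lookup failed" from "expires today". It assumes GNU date; the function name days_left and the optional second argument (a fixed "now" in epoch seconds, handy for testing) are illustrative.

```shell
#!/bin/bash
# days_left EXPIRY [NOW_EPOCH]
# Prints the whole days from NOW_EPOCH (default: current time) to EXPIRY.
# Prints nothing on stdout and returns 1 when EXPIRY is empty or unparsable.
days_left() {
  local expiry=$1 now=${2:-$(date +%s)} expiry_epoch
  if [ -z "$expiry" ] || ! expiry_epoch=$(date -d "$expiry" +%s 2>/dev/null); then
    echo "days_left: no usable expiry date" >&2
    return 1
  fi
  echo $(( (expiry_epoch - now) / 86400 ))
}
```

Inside check_domain or check_https the last three lines could then become `days_left "$expire_date"`. How an empty result shows up in Zabbix depends on the item's value type, so the caller may prefer to print a sentinel value of its own choosing instead.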

Load balancer: monitoring the nginx/Tengine status page

(1) General lb monitoring: prepare Tengine

# configure the nginx repository
[root@lb01.zmx.cn ~]# cat /etc/yum.repos.d/nginx.repo 
[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/7/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true
# install nginx
[root@lb01.zmx.cn ~]# yum -y install nginx
# replace the existing nginx binary with the nginx-tengine binary
# then check the version with nginx -V
[root@lb01.zmx.cn ~]# nginx -V
Tengine version: Tengine/3.1.0
nginx version: nginx/1.24.0
built by gcc 7.3.0 (GCC) 
built with OpenSSL 1.1.1f  31 Mar 2020
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx \
--modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf \
--error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log \
--pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock \
--http-client-body-temp-path=/var/cache/nginx/client_temp \
--http-proxy-temp-path=/var/cache/nginx/proxy_temp \
--http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp \
--http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp \
--http-scgi-temp-path=/var/cache/nginx/scgi_temp \
--user=nginx --group=nginx --with-compat --with-file-aio --with-threads \
--with-http_addition_module --with-http_auth_request_module --with-http_dav_module \
--with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module \
--with-http_mp4_module --with-http_random_index_module --with-http_realip_module \
--with-http_secure_link_module --with-http_slice_module --with-http_ssl_module \
--with-http_stub_status_module --with-http_sub_module --with-http_v2_module \
--with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module \
--with-stream_ssl_module --with-stream_ssl_preread_module \
--with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC' \
--with-ld-opt='-Wl,-z,relro -Wl,-z,now -pie' --add-module=./modules/ngx_http_upstream_check_module --add-module=./modules/ngx_http_upstream_session_sticky_module

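Physical layer:

System layer: template + sys.conf

Service layer: the nginx status module (stub_status) has a built-in template; the nginx upstream health check (check) needs a script and custom items

Access layer: logs, status codes, and error/failed counts through custom items; TCP/IP connection counts and concurrency

(1) Physical layer

Physical servers: monitor hardware information over IPMI (megacli, ipmitool)

(2) System layer

Template + sys.conf

System-layer monitoring

  • Template: Linux by Zabbix agent (CPU, memory, disk, load, network, disk I/O, logged-in users, uptime)
  • Additions: zombie processes, suspended processes, file changes (aide --check), and whether a login came from the bastion host.
To tell whether a login came from the bastion host,
check whether the IP address reported by lastlog -u root is the bastion host's.

(3) Service layer: nginx

  • Self-hosted: nginx, haproxy, lvs
For nginx or similar servers:
enable the nginx status module
enable the nginx upstream health-check status module.
  • Load balancer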
[root@lb01 ~]# cat /etc/nginx/conf.d/blog.conf 
upstream check_pools {
  server 10.0.0.7:80;
  server 10.0.0.8:80;
  check interval=3000 rise=2 fall=5 timeout=1000 type=tcp;
  #check_http_send "HEAD / HTTP/1.0\r\nHost: blog.zmxlinux.cn\r\n\r\n";
  #check_http_expect_alive http_2xx http_3xx;
}
server {
  listen 80;
  server_name blog.zmxlinux.cn;
  error_log /var/log/nginx/check-error.log notice;
  # best added to each individual site
  location /lb_status {
    # upstream health-check status page
    check_status;
    access_log off;
  }
  location / {
    proxy_pass http://check_pools;
    include proxy.conf;
    #XFF
  }
}
  • Add the status module (stub_status)
[root@lb01 ~]# cat /etc/nginx/conf.d/default.conf
 server {
    listen       80 default_server;
    server_name  localhost;
    default_type text/plain;
    location / {
      #return 200 "website is ok";
      index index.html;
    }
    location /status {
      #allow 221.218.213.9 ;
      allow 127.0.0.1;
      allow 10.0.0.1;
      allow 172.16.1.0/24;
      deny all;
      stub_status;
    }
 }
# check with curl
curl 172.16.1.5/status
curl -v 172.16.1.5/status
  • Status output
[root@lb01.zmx.cn ~]# curl -v 172.16.1.5/status
*   Trying 172.16.1.5:80...
* Connected to 172.16.1.5 (172.16.1.5) port 80 (#0)
> GET /status HTTP/1.1
> Host: 172.16.1.5
> User-Agent: curl/7.71.1
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Server: Tengine/3.1.0
< Date: Sat, 26 Jul 2025 12:38:12 GMT
< Content-Type: text/plain
< Content-Length: 111
< Connection: keep-alive
< 
Active connections: 1 
server accepts handled requests request_time
 2 2 2 0
Reading: 0 Writing: 1 Waiting: 0 
* Connection #0 to host 172.16.1.5 left intact
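  • Field reference

Clone the built-in Zabbix nginx template and adapt it for Tengine

  • Change the macro values

Testing the key that fetches the Tengine status page

# key used by the built-in template
web.page.get["{$NGINX.STUB_STATUS.HOST}","{$NGINX.STUB_STATUS.PATH}","{$NGINX.STUB_STATUS.PORT}"]
# key used for testing
web.page.get["localhost","status","80"]
  • Test from the server
zabbix@7bb11638da53:~$ zabbix_get -s 172.16.1.5  -k web.page.get["localhost","status","80"]
# output of the master item that the dependent items parse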
HTTP/1.1 200 OK
Connection: close
Content-Length: 111
Content-Type: text/plain
Date: Sat, 26 Jul 2025 12:46:52 GMT
Server: Tengine/3.1.0

Active connections: 1 
server accepts handled requests request_time
 3 3 3 0
Reading: 0 Writing: 1 Waiting: 0

Modifying the dependent items

  • Regular expressions that extract the needed values
Server: Tengine/(.*)


server accepts handled requests request_time\s+([0-9]+) ([0-9]+) ([0-9]+) ([0-9]+)

Reading: ([0-9]+) Writing: ([0-9]+) Waiting: ([0-9]+)


var a = value.match(/server accepts handled requests request_time\s+([0-9]+) ([0-9]+) ([0-9]+) ([0-9]+)/)
if (a) {
    return a[1]-a[2]
}
# text that the regular expressions are matched against
HTTP/1.1 200 OK
Connection: close
Content-Length: 111
Content-Type: text/plain
Date: Sat, 26 Jul 2025 12:46:52 GMT
Server: Tengine/3.1.0

Active connections: 1 
server accepts handled requests request_time
 3 3 3 0
Reading: 0 Writing: 1 Waiting: 0 
  • Tengine version

  • Test

  • Finally, link the template to the host

  • View the latest data
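
The dependent items above pull values out of the status page with regular expressions, and the JavaScript step derives dropped connections as accepts minus handled. The same extraction can be tried on the command line before touching the template. This is a rough sketch using awk, assuming the Tengine output layout shown above (with the extra request_time column); the function name stub_status_field and the field names it accepts are made up for illustration.

```shell
#!/bin/bash
# stub_status_field FIELD
# Reads a stub_status page on stdin and prints one value.
# FIELD: active | accepts | handled | requests | dropped | reading | writing | waiting
stub_status_field() {
  awk -v want="$1" '
    /^Active connections:/ { v["active"] = $3 }
    # The counters sit on the line after the header line.
    /^server accepts handled requests/ {
      getline
      v["accepts"] = $1; v["handled"] = $2; v["requests"] = $3
      v["dropped"] = $1 - $2
    }
    /^Reading:/ { v["reading"] = $2; v["writing"] = $4; v["waiting"] = $6 }
    END { if (want in v) print v[want]; else exit 1 }
  '
}
```

For example, `curl -s 172.16.1.5/status | stub_status_field dropped` prints the dropped-connection count for the node used in this post.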

Load balancer: custom monitoring of a specific site's backends (the exam system)

Preparation

  • Upgrade nginx to Tengine to get the health-check module
# stop nginx
systemctl stop nginx 
# check that the nginx repository file is configured
[root@docker-lb01 ~]# cat /etc/yum.repos.d/nginx.repo 
[nginx-stable]
name=nginx stable repo
# make sure the release in the URL is 7
baseurl=http://nginx.org/packages/centos/7/$basearch/
gpgcheck=1
enabled=1
# back up the nginx binary
[root@docker-lb01 ~]# cp /usr/sbin/nginx   /usr/sbin/nginx.bak
# replace it with Tengine
cp nginx-tengine-3.1.0 /usr/sbin/nginx
chmod +x /usr/sbin/nginx 
chown root:root /usr/sbin/nginx
# confirm the replacement: the health-check module should be listed
[root@docker-lb01 ~]# nginx -V 2>&1  |  grep -i 'check_module'
# syntax check, then start nginx
[root@docker-lb01 ~]# nginx -t && systemctl start nginx 

Monitoring the web site's backends

  • Load balancer configuration file
[root@docker-lb01 ~]# cat /etc/nginx/conf.d/exam.conf 
upstream exam_l7_pools {
 server 10.0.0.7:80;
 server 10.0.0.8:80;
 hash $remote_addr consistent;

 check interval=3000 rise=2 fall=5 timeout=1000 type=http;
 check_http_send "HEAD /index.html HTTP/1.0\r\nHost:stu.zmx.cn\r\nUser-Agent: lb_check\r\n\r\n";
 check_http_expect_alive http_2xx http_3xx;
}
server {
 listen 80;
 server_name admin.zmx.cn;
 location / {
 proxy_pass http://exam_l7_pools;
 proxy_set_header Host $http_host;
 proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
 proxy_set_header X-Real-Ip $remote_addr;
   }

}
server {
 listen 80;
 server_name stu.zmx.cn;
  location / {
 proxy_pass http://exam_l7_pools;
 proxy_set_header Host $http_host;
 proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
 proxy_set_header X-Real-Ip $remote_addr;
  }
 location /lb_status {
   check_status;
   access_log off;
   #allow
   #deny   
 } 

}
  • Syntax check
nginx -t
  • Restart nginx
systemctl restart nginx 
  • Test in a browser

  • Test on the command line
curl -s -H Host:stu.zmx.cn 'localhost/lb_status?format=csv'
curl -s -H Host:stu.zmx.cn 'localhost/lb_status?format=csv' | grep -wi up |  wc -l
  • Write the script
#!/bin/bash
##############################################################
# File Name:/server/scripts/check_lb_pools.sh
# Version:V1.0
# Author:lidao996
# Organization:zhubaolin.blog.csdn.net
# Desc:
##############################################################
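#1.vars
url=$1
#2.fetch the health-check status once, then count all backends and those that are up
status=`curl -s -H "Host:${url}" "localhost/lb_status?format=csv"`
total=`echo "$status" |grep -c .`
up=`echo "$status" |grep -wi up |wc -l`
#3.percentage of backends that are up (0 when no backend is listed)
#  multiply before dividing so that scale=2 does not truncate the ratio first
if [ "$total" -eq 0 ]; then
  echo 0
else
  echo "scale=2; $up * 100 / $total" |bc -l
fi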

  • Edit the agent include file
vim /etc/zabbix/zabbix_agent2.d/lb.conf
UserParameter=check.lb[*],/bin/bash /server/scripts/check_lb_pools.sh "$1"
  • Restart the Zabbix agent
systemctl restart zabbix-agent2.service 
  • Test on the client
[root@docker-lb01 ~]# zabbix_agent2 -t check.lb[stu.zmx.cn]
  • Test from the server
zabbix_get -s 172.16.1.5 -k check.lb[stu.zmx.cn]
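
The percentage can also be computed in a single awk pass that looks only at the status column, which avoids counting a line as up just because the word appears somewhere else in it. This sketch assumes the CSV columns are index, upstream, server, status, rise count, fall count, check type, port; that layout is an assumption to verify against the real output of lb_status?format=csv. The function name pool_up_percent is illustrative.

```shell
#!/bin/bash
# pool_up_percent
# Reads check_status CSV on stdin and prints the percentage of backends
# whose status column (assumed to be column 4) is "up".
pool_up_percent() {
  awk -F, '
    NF >= 4 { total++; if (tolower($4) == "up") up++ }
    END {
      if (total == 0) { print "0.00"; exit }   # no backends listed
      printf "%.2f\n", up * 100 / total
    }
  '
}
```

Usage would look like `curl -s -H Host:stu.zmx.cn 'localhost/lb_status?format=csv' | pool_up_percent`.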

Monitoring in the web frontend

  • Add the item

  • Add a trigger

  • Execute now and check the result

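Monitoring the keepalived nodes

  • Monitor whether keepalived has failed over
  • Item: check whether a keepalived process is running
  • Item: check whether the node holds the VIP, which shows whether a master/backup switch has happened
  • Custom item
  • Write the monitoring script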
[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# cat /server/scripts/chk_vip.sh 
#!/bin/bash
##############################################################
# File Name:chk_vip.sh
# Version:V1.0
# Author:zmx lidao996
# Organization:www.zmxedu.com
# Desc:
##############################################################
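#1.vars
vip=$1
#2.count the interface addresses that match the VIP (-F: treat the dots literally)
vip_cnt=`ip a |grep -wF "${vip}" |wc -l`
#3.print 1 if the VIP is present on this node, otherwise 0
if [ "$vip_cnt" -gt 0 ];then
	echo 1
else
	echo 0
fi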
  • Write the agent include file
[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# cat vip.conf 
UserParameter=keepalived.vip[*],/bin/bash /server/scripts/chk_vip.sh "$1"
  • Restart the Zabbix agent
systemctl restart zabbix-agent2.service
  • Test on the client
[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# zabbix_agent2 -t keepalived.vip[10.0.0.3]

  • Test from the server
zabbix@7bb11638da53:~$ zabbix_get -s 172.16.1.5 -k keepalived.vip[10.0.0.3]
1

Monitoring in the web frontend

  • Add the item

  • Add a trigger

  • View the latest data

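Monitoring the access log

  • nginx access log (ELK service)
  • Number of occurrences per IP address
  • Status codes and how often each occurs
  • Custom monitoring script (counts requests per status code)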
[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# cat /server/scripts/check_ngx_access_log.sh 
#!/bin/bash
##############################################################
# File Name:check_ngx_access_log.sh
# Version:V1.0
# Author:zmx 
# Organization:www.zmx.com
# Desc:
##############################################################

#1.vars
access_files="/var/log/nginx/access.log"
code=$1
#2.case
case "$code" in
	200) awk '{print $9}' $access_files |grep -w "200"|wc -l;;
	206) awk '{print $9}' $access_files |grep -w "206"|wc -l;;
	301) awk '{print $9}' $access_files |grep -w "301"|wc -l;;
	302) awk '{print $9}' $access_files |grep -w "302"|wc -l;;
	304) awk '{print $9}' $access_files |grep -w "304"|wc -l;;
	400) awk '{print $9}' $access_files |grep -w "400"|wc -l;;
	401) awk '{print $9}' $access_files |grep -w "401"|wc -l;;
	403) awk '{print $9}' $access_files |grep -w "403"|wc -l;;
	404) awk '{print $9}' $access_files |grep -w "404"|wc -l;;
	405) awk '{print $9}' $access_files |grep -w "405"|wc -l;;
	413) awk '{print $9}' $access_files |grep -w "413"|wc -l;;
	500) awk '{print $9}' $access_files |grep -w "500"|wc -l;;
	502) awk '{print $9}' $access_files |grep -w "502"|wc -l;;
	503) awk '{print $9}' $access_files |grep -w "503"|wc -l;;
	504) awk '{print $9}' $access_files |grep -w "504"|wc -l;;
esac
  • Edit the Zabbix agent include file
[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# cat acess.conf 
UserParameter=nginx.log.status[*],sudo /bin/bash /server/scripts/check_ngx_access_log.sh "$1"
  • Restart the Zabbix agent
systemctl restart zabbix-agent2.service
  • Test on the client
[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# zabbix_agent2 -t nginx.log.status[200]
nginx.log.status[200]                         [s|440]

  • Test from the server
zabbix@7bb11638da53:~$ zabbix_get -s 172.16.1.5 -k nginx.log.status[200]
440
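
The case statement in check_ngx_access_log.sh repeats the same pipeline for every code. As an alternative sketch, one awk comparison on the status field handles any three-digit code. It assumes, like the original, that the status is field 9 of the default combined log format; the function name count_status and the optional log-path argument are illustrative.

```shell
#!/bin/bash
# count_status CODE [LOGFILE]
# Prints how many lines of LOGFILE have CODE in field 9 (the status code
# in nginx's default combined format). Prints 0 and returns 1 for input
# that is not a three-digit HTTP status code.
count_status() {
  local code=$1 log=${2:-/var/log/nginx/access.log}
  [[ $code =~ ^[1-5][0-9][0-9]$ ]] || { echo 0; return 1; }
  awk -v code="$code" '$9 == code { n++ } END { print n + 0 }' "$log"
}
```

The script body could then shrink to `count_status "$1"`, and codes that the case statement does not list (for example 499) would work without further edits.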

Monitoring in the web frontend

  • Add the item

  • Add a trigger

  • Add a graph

Monitoring the error log

  • nginx error log: count failed/denied/error entries in the last 5000 lines
tail -5000 /var/log/nginx/error.log|grep -c -i error
  • Custom monitoring
# command-line test
start=`date  +%Y\/%m\/%d" "%H:%M -d "-1min"`
echo $start
2025/07/24 21:07

start=`date  +"%Y\/%m\/%d %H:%M" -d "-1min"`
echo $start
2024\/07\/26 11:37

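# print the error-log entries from the last minute
# (the sed address needs the second form of $start, with escaped slashes)
sed -n "/${start}/,\$p" /var/log/nginx/error.log
  • Edit the agent include file
# count error/failed/denied entries in the last 1000 lines of the error log
UserParameter=check.ngx.error,sudo tail -n1000 /var/log/nginx/error.log |egrep -i 'error|failed|denied'|wc -l
# check the system auth log for abnormal logins (brute-force attempts)
UserParameter=check.error.login,sudo tail -n1000 /var/log/secure |egrep -i 'fail'|wc -l
  • Restart the service
systemctl restart zabbix-agent2.service
  • Test on the client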
[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# zabbix_agent2 -t check.ngx.error
check.ngx.error                               [s|999]
[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# zabbix_agent2 -t check.error.login
check.error.login                             [s|2]

  • Troubleshooting: a key was defined twice. Deleting either one of the duplicate definitions fixes it.

  • Test from the server
zabbix@7bb11638da53:~$ zabbix_get -s 172.16.1.5 -k check.ngx.error
999
zabbix@7bb11638da53:~$ zabbix_get -s 172.16.1.5 -k check.error.login
2
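
The sed range used in the command-line test only starts printing if some log line contains that exact minute, so a quiet minute yields nothing even when later lines exist, and counting the last 1000 lines mixes old and new errors. A sketch of a time-based count follows, assuming each error-log entry starts with a `YYYY/MM/DD HH:MM:SS` timestamp, which sorts correctly as plain text. The function name errors_since and its arguments are illustrative.

```shell
#!/bin/bash
# errors_since "YYYY/MM/DD HH:MM" [LOGFILE]
# Counts entries at or after the given minute that mention
# error, failed, or denied (case-insensitive).
errors_since() {
  local since=$1 log=${2:-/var/log/nginx/error.log}
  awk -v since="$since" '
    # Only consider lines that begin with a timestamp.
    /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9] / &&
    substr($0, 1, 16) >= since &&
    tolower($0) ~ /error|failed|denied/ { n++ }
    END { print n + 0 }
  ' "$log"
}
```

A call such as `errors_since "$(date -d '-1 min' '+%Y/%m/%d %H:%M')"` would cover the last minute; no slash escaping is needed because the timestamp is compared as a string rather than used as a regular expression.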

Monitoring in the web frontend

  • View the latest data

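Monitoring TCP/IP

  • Custom monitoring
  • Edit the Zabbix agent configuration
# established connections (concurrency)
UserParameter=net.tcp.estab,sudo ss -ant | grep -i estab | wc -l
# connections that are about to close (*-WAIT states)
UserParameter=net.tcp.wait,sudo ss -ant | grep -i wait | wc -l
  • Restart the service
systemctl restart zabbix-agent2.service
  • Test on the client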
[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# zabbix_agent2 -t net.tcp.estab
net.tcp.estab                                 [s|1]
[root@docker-lb01 /etc/zabbix/zabbix_agent2.d]# zabbix_agent2 -t net.tcp.wait
net.tcp.wait                                  [s|25]
  • Test from the server
zabbix@7bb11638da53:~$ zabbix_get -s 172.16.1.5 -k net.tcp.estab
2
zabbix@7bb11638da53:~$ zabbix_get -s 172.16.1.5 -k net.tcp.wait
36
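
The two keys above grep the whole line of `ss -ant` output. A slightly stricter sketch matches only the State column and skips the header line. It reads the `ss -ant` output on stdin so that it can be tried against saved output; the function name tcp_state_count is illustrative.

```shell
#!/bin/bash
# tcp_state_count PATTERN
# Reads `ss -ant` output on stdin and counts the rows whose State column
# (column 1) matches PATTERN, case-insensitively. The header row is skipped.
tcp_state_count() {
  awk -v pat="$1" '
    NR > 1 && toupper($1) ~ toupper(pat) { n++ }
    END { print n + 0 }
  '
}
```

For example, `ss -ant | tcp_state_count ESTAB` and `ss -ant | tcp_state_count WAIT` correspond to net.tcp.estab and net.tcp.wait.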

Monitoring in the web frontend

  • Add the item

  • Add a trigger

  • View the latest data

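Web monitoring

(1) nginx and PHP

Application: code, WAR packages, JAR packages

Services: nginx, php, tomcat, jar

System: template + custom items

Monitoring the nginx service (application monitoring)

  • Prepare the test pages
# test pages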
cd /app/code/blog/
echo 'blog' > chk_ngx.html

cat >chk_php.php<<'EOF' 
<?php
phpinfo();
?>
EOF

cat >chk_db.php<<'EOF' 
<?php
// database address
$db_host='172.16.1.51';
// database user and password
$db_user='blog';
$db_pass='blog';
// database name
$db_name="blog";

$link_id=mysqli_connect($db_host,$db_user,$db_pass,$db_name);
if($link_id){
  echo "mysql successful \n" ;
}
else{
echo "connection failed!\n" ;
}
?>
EOF
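  • Add a web scenario in the web frontend

  • Steps:
  • Check nginx

  • Check nginx + PHP

  • Check the database + PHP

  • View the web monitoring results

  • Graphs

  • Add a trigger
last(/web01-172.16.1.7/web.test.fail[监控blog业务是否正常])<>0
web.test.fail returns the number of the step that failed in the scenario; it returns 0 when every step succeeds.

  • Stop the web service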

Monitoring the PHP service

  • Edit the nginx configuration file /etc/nginx/conf.d/default.conf
[root@web01.zmx.cn ~]# cat /etc/nginx/conf.d/default.conf

server {
  listen 80 default_server;
  server_name status.zmx.cn;
  location /status {
    stub_status;
  }

  location /php_status {
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
  }
  
  location /php_ping {
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
  }
}
  • Edit the PHP-FPM configuration file /etc/php-fpm.d/www.conf
[root@web01.zmx.cn ~]# egrep -n  '^pm.status|^ping' /etc/php-fpm.d/www.conf
240:pm.status_path = /php_status
252:ping.path = /php_ping
257:ping.response = pong
  • Restart nginx and PHP-FPM
 systemctl restart nginx php-fpm.service 
  • Test on the command line
[root@web01.zmx.cn ~]# curl 10.0.0.7/php_status
pool:                 www
process manager:      dynamic
start time:           27/Jul/2025:20:28:18 +0800
start since:          38
accepted conn:        17
listen queue:         0
max listen queue:     0
listen queue len:     128
idle processes:       5
active processes:     1
total processes:      6
max active processes: 1
max children reached: 0
slow requests:        0
  • Test from the server
zabbix@7bb11638da53:~$ zabbix_get -s 172.16.1.7 -k  web.page.get["localhost","php_status","80"]
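
Before adapting the template, it can help to confirm that individual values are extractable from the status page. A small sketch with awk follows, assuming the plain-text `name: value` layout shown in the curl output above. The function name fpm_status_field is illustrative, and it is meant for the numeric fields only, since the start time value itself contains colons.

```shell
#!/bin/bash
# fpm_status_field "FIELD NAME"
# Reads a php-fpm status page on stdin and prints the value of one field,
# e.g. "active processes". Returns 1 when the field is not present.
fpm_status_field() {
  awk -F': *' -v want="$1" '
    $1 == want { print $2; found = 1 }
    END { exit !found }
  '
}
```

For example, `curl -s 10.0.0.7/php_status | fpm_status_field "active processes"` prints the number of busy workers.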

  • Adapt the built-in Zabbix PHP-FPM template

  • Link the template to the host

  • Add template tags in bulk

  • View the latest data