Zabbix

监控系统

What to monitor?

Devices / Software
- Server, Router, Switches, I/O System etc.
- Operating System, Networks, Applications, etc.
Incidents(意外状况)
- DB down, Replication stpoped, Server not reachable, etc.
Critical Events(关键事件)
- Disk more than n% full or less than m Gbyte free, Replication more than n seconds lagging, Data node down.(磁盘利用率，主从复制从服务器延迟)
100% CPU utilization, etc
- Alert, mmediate intervention, fire fighting
Trends (includes time)
- Nginx 结束请求的时间，图形方式获取
How long des it take until
- my disk is full?
- my Index Memory is fiiled up?
When does is happen
- Peak? Backup?
How often does it happen? Does it happen periodically?
- once a day? Always at sunday night?
How does it correlate to other infromatioins?
- I/O problems during

NMS(Client) ---------协议---------> agent(Server)