System Monitoring with Xymon/Other Docs/About System Monitoring

In most systems, one or more "servers" hold the data for all of the monitored "hosts". There are a few distinct approaches to how servers gather that data. The simplest systems are server-only: data are gathered from the hosts remotely, with no cooperation from the monitored hosts. An example might be a system that only records ICMP ECHO round-trip times or checks that a web page returns HTTP status code 200. Such systems are limited in the data they can gather, but they require no configuration of the monitored hosts beyond perhaps ensuring that firewall rules allow the checks through.
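As an illustration of a server-only check, the following is a minimal Python sketch of the HTTP 200 test mentioned above. The function name `http_check` and its default timeout are assumptions made for this example and are not part of any particular monitoring system.

```python
import urllib.error
import urllib.request


def http_check(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET request to `url` ends with HTTP status 200.

    Hypothetical helper for illustration. Note that urlopen follows
    redirects, so the status seen here is that of the final response.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # HTTP error statuses (HTTPError is a subclass of URLError),
        # refused connections, DNS failures and timeouts all count
        # as a failed check.
        return False
```

A server would typically run such a check on a schedule and record the result for each host; nothing needs to be installed on the host itself.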

To gather data that can only be obtained on the host itself, for example CPU load, some kind of "agent" must run on the host to collect the data and return it to the server. Some agent-based systems use a standard communications protocol, usually SNMP, to transfer the data from host to server; others use a non-standard protocol that is specific to the monitoring system.
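The data-gathering half of an agent can be very small. The sketch below, which assumes a Unix-like host, reads the system load averages and packages them for transfer. The function name `collect_load_sample` and the dictionary layout are hypothetical choices for this example, not the format of any real agent.

```python
import os
import socket
import time


def collect_load_sample() -> dict:
    """Collect the 1-, 5- and 15-minute load averages of the local host.

    Hypothetical agent helper. os.getloadavg() is only available on
    Unix-like systems.
    """
    load1, load5, load15 = os.getloadavg()
    return {
        "host": socket.gethostname(),   # which host the sample is from
        "timestamp": int(time.time()),  # when it was taken (Unix time)
        "load1": load1,
        "load5": load5,
        "load15": load15,
    }
```

How the sample then reaches the server depends on the protocol the monitoring system uses.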

In most SNMP-based systems, the host may send an SNMP TRAP to the server when an unusual operational event occurs (e.g. a hard drive failing or a system cold start). Most of the time, however, data are gathered periodically from the SNMP agent via SNMP GET requests. Not all SNMP-based systems support both TRAP and GET. SNMP can also be used to configure hosts via SET requests, but this is seldom used for monitoring purposes. SNMP-based systems are especially popular for network devices like routers and switches.

Systems that rely on custom communications protocols can operate in one of two ways: the host periodically contacts the server ("push style", e.g. Xymon), or the server periodically contacts the agent on the host ("pull style", e.g. Nagios using NRPE).
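The difference between the two styles is mainly a matter of which side listens and which side initiates the connection. The following Python sketch contrasts them using a made-up one-line text protocol over TCP. It is not the Xymon or NRPE wire format, and every class and function name in it is hypothetical.

```python
import socket
import socketserver


def read_metric() -> str:
    """Stand-in for a real measurement taken on the monitored host."""
    return "load1=0.42"


# Pull style: the agent listens; the server connects and reads the value.
class PullAgentHandler(socketserver.StreamRequestHandler):
    def handle(self) -> None:
        # Runs on the monitored host each time the server polls it.
        self.wfile.write(read_metric().encode() + b"\n")


def pull_from_agent(agent_addr) -> str:
    """Server side of pull: connect to the agent and read one line."""
    with socket.create_connection(agent_addr, timeout=5) as conn:
        return conn.makefile("r").readline().strip()


# Push style: the server listens; the agent connects and reports.
class PushServerHandler(socketserver.StreamRequestHandler):
    def handle(self) -> None:
        # Runs on the monitoring server each time an agent reports in.
        line = self.rfile.readline().decode().strip()
        self.server.received.append(line)


class PushServer(socketserver.TCPServer):
    """Monitoring server that stores every report it receives."""

    def __init__(self, addr) -> None:
        super().__init__(addr, PushServerHandler)
        self.received = []


def push_to_server(server_addr) -> None:
    """Agent side of push: connect to the server and send one line."""
    with socket.create_connection(server_addr, timeout=5) as conn:
        conn.sendall(read_metric().encode() + b"\n")
```

In the pull case the monitored host must accept inbound connections from the server (an instance of `socketserver.TCPServer` with `PullAgentHandler`), whereas in the push case only the server needs a listening port. This affects how firewall rules have to be written for each style.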