A few years ago, we were approached by a European company providing IT services to businesses in Central Africa. One of the most vital deliverables requested by their customers was the monitoring of widely distributed equipment. The problem was that network connections in the remote areas where this equipment was located were unreliable, so data were frequently lost due to WAN outages.
Centralized monitoring mostly gathers equipment performance data (CPU utilization, memory usage, file system space consumption, disk input and output performance) along with connection characteristics such as network traffic statistics, network resource availability and response time, and internet connection quality. For the history to be complete, these data need to be transferred to the central server continuously and without disruption. What happens when the network connection is lost for half an hour or even for a whole day? Well, the usual scenario is that these data are lost, so in the case of equipment failure, it’s hard to determine the precise timing and cause of the problem.
To avoid such data losses, Raden Solutions added specific functionality to NetXMS, its open-source network monitoring and management system. The first feature is the introduction of proxy agents. Instead of polling SNMP-capable devices and servers in each location directly from the central monitoring server, one of the agents at each site is dedicated to working as a proxy. Nothing more than a modest on-site computer is needed to run a proxy agent. The data collection schedule is uploaded to the proxy, which then does the polling on behalf of the server.
When the connection to the central server is up, data are sent immediately to the server for processing and storage. When the network is down, the proxy agent stores collected data locally and resynchronizes them with the central server as soon as the connection is restored. While the stored data are being transferred, the agent continues to collect new data, so there are no blind spots. The storage capacity of the proxy agent depends on the requirements of the end user. While storing data for up to several weeks is technically possible, the biggest constraint is the overall quality of the connection: if it is restored after a very long break, the large amount of accumulated data could cause serious congestion. So proxy agents usually help with short disruptions, from minutes to several days.
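The store-and-forward behaviour described above can be illustrated with a small Python sketch. This is not NetXMS code: the class name `StoreAndForwardBuffer`, the `send` callback, and the `max_samples` and `batch_limit` parameters are hypothetical names chosen for the example. It assumes the transport reports success or failure for each sample, and it caps how many samples are replayed per call as one possible way to limit congestion after a long outage.

```python
from collections import deque
from typing import Callable, Deque, Tuple

# (timestamp, metric name, value); an illustrative sample format, not NetXMS's.
Sample = Tuple[float, str, float]


class StoreAndForwardBuffer:
    """Hypothetical sketch of a proxy-side buffer that survives link outages."""

    def __init__(self, send: Callable[[Sample], bool], max_samples: int = 100_000):
        # `send` returns True if the server accepted the sample, False otherwise.
        self._send = send
        # Bounded queue: once full, the oldest samples are dropped first.
        self._backlog: Deque[Sample] = deque(maxlen=max_samples)

    def submit(self, sample: Sample) -> None:
        """Queue a newly collected sample, then try to deliver the backlog."""
        self._backlog.append(sample)
        self.flush()

    def flush(self, batch_limit: int = 500) -> int:
        """Send queued samples oldest-first; stop on failure or at the limit."""
        sent = 0
        while self._backlog and sent < batch_limit:
            if not self._send(self._backlog[0]):
                break  # link is down; keep the sample for the next attempt
            self._backlog.popleft()
            sent += 1
        return sent

    def pending(self) -> int:
        """Number of samples still waiting to be delivered."""
        return len(self._backlog)
```

When the link is up and the backlog is empty, `submit` delivers each sample immediately. During an outage, samples accumulate in collection order, and once the link returns, every later `submit` call drains up to `batch_limit` older samples while new ones keep arriving.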
Add to this another NetXMS feature called zoning and you’ll get an even more robust network management solution. Zoning is most helpful for sites that are not directly reachable by the server or have overlapping IP address ranges. NetXMS combines each site or group of sites into zones, with a separate proxy agent assigned to each zone. Accordingly, all communication between the devices and the server is directed through the proxy without the need for the central server to access each and every device on site. Agents can connect back to the server identifying themselves with certificates, which allows proxy agents to be run on dynamic IP addresses. This solution is especially valuable for shipping companies where connection is only periodically available — for example, when the ship is close to land or there is a good satellite signal.
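To see why zones make overlapping IP ranges manageable, consider the following Python sketch. It is an illustration under stated assumptions and not the NetXMS data model: `Zone`, `ZoneRegistry`, and their methods are hypothetical names. The idea it shows is that a device is identified by the pair (zone, IP address) rather than by IP address alone, and that each zone names the proxy through which its devices are reached.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Dict, Tuple


@dataclass(frozen=True)
class Zone:
    zone_id: int
    name: str
    proxy: str  # hypothetical identifier of the zone's proxy agent


class ZoneRegistry:
    """Hypothetical registry keyed by (zone id, IP address)."""

    def __init__(self) -> None:
        self._zones: Dict[int, Zone] = {}
        self._nodes: Dict[Tuple[int, str], str] = {}

    def add_zone(self, zone: Zone) -> None:
        self._zones[zone.zone_id] = zone

    def add_node(self, zone_id: int, address: str, node_name: str) -> None:
        if zone_id not in self._zones:
            raise KeyError(f"unknown zone {zone_id}")
        # Normalise the address so equivalent spellings map to the same key.
        key = (zone_id, str(ip_address(address)))
        if key in self._nodes:
            raise ValueError(f"{address} is already registered in zone {zone_id}")
        self._nodes[key] = node_name

    def route(self, zone_id: int, address: str) -> Tuple[str, str]:
        """Return (proxy, node name) for a device in the given zone."""
        node_name = self._nodes[(zone_id, str(ip_address(address)))]
        return self._zones[zone_id].proxy, node_name


registry = ZoneRegistry()
registry.add_zone(Zone(1, "Site A", proxy="proxy-a"))
registry.add_zone(Zone(2, "Site B", proxy="proxy-b"))
# The same private address is used at both sites without conflict.
registry.add_node(1, "192.168.1.10", "router-a")
registry.add_node(2, "192.168.1.10", "router-b")
```

Because the lookup key includes the zone, the same private address at two sites resolves to two different devices, each reached through its own proxy.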
Together, these additions significantly improve the reliability of NetXMS in monitoring networks of widely distributed equipment. Painful data loss due to unstable network connections is now a thing of the past!