Collecting enormous amounts of SNMP data from all around the world? No problem!

One of our customers, a global company with offices around the world, was looking for a solution capable of collecting, monitoring and exporting data from more than 12,000 network devices. We used two features of NetXMS, plus a specialized database, to meet this challenge.

NetXMS server

In our last blog, we looked at cases demonstrating how steady data collection can be maintained in areas with poor internet coverage. The NetXMS zoning feature was of key importance there, and it also comes in handy when connectivity is not an issue, but the amount of data is. Imagine the need to monitor thousands of network devices over SNMP, collecting hundreds of metrics from each device. Sounds challenging enough? Add to this checking the collected data for threshold violations and storing it for several weeks for quick access, as well as exporting it to an external database for analytics and visualisation.

Zoning allows a network to be split into separate zones, with the server communicating with the devices in each zone through a proxy. Proxies solve the load-balancing problem: they absorb large flows of data and let monitoring capacity be scaled out horizontally. With zoning enabled, instead of polling SNMP-capable devices and servers in each location directly, the central monitoring server uploads the data collection schedule to the zone proxy, which then does the polling on its behalf. Proxies can easily be added or removed as needed to match the expansion or contraction of the network.
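To make the schedule-upload idea concrete, here is a minimal sketch in Python of how a central server could partition devices among zone proxies that poll on its behalf. It illustrates the pattern only and is not NetXMS code: the Device and ZoneProxy classes, the round-robin distribution and the stubbed-out poll are assumptions made for this example.

```python
# Illustrative sketch of the zone/proxy polling pattern -- not NetXMS code.
# Device, ZoneProxy, distribute() and the stubbed poll() are hypothetical names.
from dataclasses import dataclass, field

@dataclass
class Device:
    address: str
    oids: list[str]                      # SNMP OIDs to collect from this device

@dataclass
class ZoneProxy:
    name: str
    schedule: list[Device] = field(default_factory=list)

    def poll(self) -> dict:
        """Poll every device in the uploaded schedule on behalf of the server."""
        results = {}
        for device in self.schedule:
            # A real proxy would issue SNMP GET requests here; the sketch
            # just records placeholders so it stays self-contained.
            results[device.address] = {oid: None for oid in device.oids}
        return results

def distribute(devices: list[Device], proxies: list[ZoneProxy]) -> None:
    """Central server uploads the data collection schedule to the zone proxies."""
    for i, device in enumerate(devices):
        proxies[i % len(proxies)].schedule.append(device)

if __name__ == "__main__":
    proxies = [ZoneProxy("proxy-eu"), ZoneProxy("proxy-us")]
    devices = [Device(f"10.0.0.{n}", ["1.3.6.1.2.1.2.2.1.10.1"]) for n in range(1, 7)]
    distribute(devices, proxies)
    for proxy in proxies:
        print(proxy.name, "polls", len(proxy.schedule), "devices")
```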

At the same time, zoning ensures high availability: if one proxy fails for any reason, the remaining proxy agents take over its data collection and distribute it evenly among themselves. Failure of a single proxy therefore does not overload the others, and data collection is not interrupted even for a moment, which at this scale would be catastrophic. While doing their primary job, proxies also monitor themselves and report their status to the server, so that data collection can be optimised quickly and effectively.
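As a rough sketch of that failover behaviour, and under the assumption of a simple even redistribution (the actual NetXMS algorithm may differ), the devices of a failed proxy could be spread across the survivors like this:

```python
# Hedged sketch of even redistribution after a proxy failure; proxy names,
# device addresses and the round-robin choice are illustrative assumptions.
def redistribute(assignments: dict[str, list[str]], failed: str) -> dict[str, list[str]]:
    """Move the failed proxy's devices evenly onto the surviving proxies."""
    orphaned = assignments.pop(failed, [])
    survivors = sorted(assignments)          # deterministic order for the example
    for i, device in enumerate(orphaned):
        assignments[survivors[i % len(survivors)]].append(device)
    return assignments

zones = {
    "proxy-eu": ["10.0.0.1", "10.0.0.2"],
    "proxy-us": ["10.0.1.1", "10.0.1.2"],
    "proxy-ap": ["10.0.2.1", "10.0.2.2"],
}
print(redistribute(zones, "proxy-ap"))
```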

Another feature, the fan-out data driver, was used to feed the collected data into an external database, InfluxDB, where it was used for analytics and visualisation.
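For a sense of what ends up in InfluxDB, the sketch below writes a single SNMP-style data point using the official influxdb-client Python library. The fan-out driver handles this internally; the URL, token, organisation, bucket and measurement names here are placeholders, not values from the actual deployment.

```python
# Illustrative write to InfluxDB 2.x; connection details and names are placeholders.
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

# One collected metric: inbound octets on an interface of one monitored device.
point = (
    Point("interface_traffic")
    .tag("device", "10.0.0.1")
    .field("ifInOctets", 123456)
)
write_api.write(bucket="network-metrics", record=point)
client.close()
```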

A particular problem we faced was storage of the collected performance data. Traditional SQL databases (Oracle, PostgreSQL, Microsoft SQL Server and others) are not very efficient at deleting relatively small portions of data from huge tables. This is why, to improve performance and solve housekeeping issues, we used a specialised time-series database, TimescaleDB, as the backend database.
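The gain comes from the fact that TimescaleDB partitions a hypertable into time-based chunks, so old data can be removed by dropping whole chunks instead of running row-by-row DELETEs. The sketch below shows the idea with psycopg2 against TimescaleDB 2.x; the table name, its columns and the eight-week retention are made-up examples, not the actual NetXMS schema.

```python
# Hedged example of chunk-based housekeeping in TimescaleDB 2.x; the 'metrics'
# table, its columns and the retention interval are hypothetical.
import psycopg2

conn = psycopg2.connect("dbname=netxms user=netxms")   # placeholder connection string
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS metrics (
            ts      TIMESTAMPTZ      NOT NULL,
            node_id INTEGER          NOT NULL,
            value   DOUBLE PRECISION
        );
    """)
    # Turn the table into a hypertable partitioned by time.
    cur.execute("SELECT create_hypertable('metrics', 'ts', if_not_exists => TRUE);")
    # Housekeeping: drop whole chunks older than 8 weeks in one cheap operation,
    # instead of deleting millions of individual rows.
    cur.execute("SELECT drop_chunks('metrics', INTERVAL '8 weeks');")
conn.close()
```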

The combination of NetXMS features with a custom mix of databases allowed us to create a highly capable, flexible and elegant setup for an operation of this scale.
