rescaled Argus
Argus is the result of an endeavour to build a more reliable monitoring platform, one that service providers actually want to use, not only to receive critical notifications but also to monitor and report the availability of hosted services and track SLA compliance.
Existing solutions either didn't offer the expected feature set or required major compromises.
Solved Problems
Argus solves a variety of issues found in competing monitoring platforms.
Distributed Monitoring
The majority of uptime monitoring services perform their checks from a single (sometimes not even selectable) location. However, the internet is not a perfect world, and local outages are common. If there's a small network interruption between the monitoring server and the monitoring target, the checks fail immediately and a false-positive notification is triggered, even though the monitored service is in fact still available to 99.999% of the internet.
A minority of monitoring services solve this issue with a cross-check after the primary monitoring server reports an outage. In this case, a second check from another location is performed to verify the first result. While that's definitely the better approach, it's usually either a paid feature or run from a similar location that may suffer from the same connectivity issue as the first check instance.
Argus solves this issue by running checks from multiple locations in parallel. The check results are aggregated by a monitoring result collector and then evaluated against a defined quorum for each monitor.
A monitor is considered truly unavailable only if at least as many check locations as the defined quorum report an issue with it. This gives customers peace of mind when it comes to receiving critical notifications, e.g. via phone call at night.
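The quorum evaluation described above can be illustrated with a minimal sketch. This is not Argus's actual implementation: the `CheckResult` type, the `is_unavailable` function, and the location names are hypothetical and chosen only for illustration. The sketch assumes that each location reports a simple pass/fail result per monitor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    location: str  # hypothetical location identifier, e.g. "fra"
    ok: bool       # True if the check from this location succeeded


def is_unavailable(results: list[CheckResult], quorum: int) -> bool:
    """Return True if at least `quorum` locations report a failure."""
    if quorum < 1:
        raise ValueError("quorum must be at least 1")
    failing = sum(1 for r in results if not r.ok)
    return failing >= quorum


# One of three locations fails, e.g. because of a local network interruption.
results = [
    CheckResult("fra", ok=False),
    CheckResult("nyc", ok=True),
    CheckResult("sgp", ok=True),
]

print(is_unavailable(results, quorum=2))  # prints False: no notification
print(is_unavailable(results, quorum=1))  # prints True: a single failure suffices
```

With a quorum of 2, the single failing location is treated as a local problem and no alert is raised; lowering the quorum to 1 reproduces the single-location behaviour described earlier.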
Tedious Configuration
Popular uptime monitoring services require you to configure, for each monitor individually, which contacts are notified in which case. Our customers work with hundreds of monitors, and adding or removing contacts from each one of them would be (and indeed was) extremely tedious.
Argus solves this issue with a tag-based configuration approach. While you're still able to configure individual contacts for individual monitors, the recommended approach is to tag monitors and contacts to put them into common groups. Customers can then use a dynamic rule builder to link monitors with certain tags to contacts with certain other tags, so that those contacts are notified in case of a notifiable event.
In the past, onboarding a new employee to receive monitoring notifications meant that one would have to edit every monitor to make sure the employee receives all desired alerts.
For our customers, this process has become as simple as adding a new contact with a certain tag. No other configuration needs to be touched or modified.
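The tag-based approach can be sketched as follows. This is only an illustration, not Argus's rule builder: the `Monitor`, `Contact`, and `Rule` types, the `contacts_to_notify` function, and all tag names are hypothetical. The sketch also assumes that a rule matches when a monitor or contact carries all of the tags the rule lists; the real matching semantics may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Monitor:
    name: str
    tags: frozenset[str] = frozenset()


@dataclass(frozen=True)
class Contact:
    name: str
    tags: frozenset[str] = frozenset()


@dataclass(frozen=True)
class Rule:
    monitor_tags: frozenset[str]  # tags a monitor must carry (assumed: all of them)
    contact_tags: frozenset[str]  # tags a contact must carry (assumed: all of them)


def contacts_to_notify(monitor: Monitor, contacts: list[Contact], rules: list[Rule]) -> list[str]:
    """Return the names of contacts that any matching rule links to the monitor."""
    notified: list[str] = []
    for rule in rules:
        if not rule.monitor_tags <= monitor.tags:
            continue  # rule does not apply to this monitor
        for contact in contacts:
            if rule.contact_tags <= contact.tags and contact.name not in notified:
                notified.append(contact.name)
    return notified


monitor = Monitor("shop-api", frozenset({"production", "web"}))
rules = [Rule(frozenset({"production"}), frozenset({"on-call"}))]
contacts = [
    Contact("alice", frozenset({"on-call"})),
    Contact("bob", frozenset({"management"})),
]
print(contacts_to_notify(monitor, contacts, rules))  # prints ['alice']

# Onboarding a new employee: add one tagged contact, touch no monitor or rule.
contacts.append(Contact("carol", frozenset({"on-call"})))
print(contacts_to_notify(monitor, contacts, rules))  # prints ['alice', 'carol']
```

The second call shows the onboarding case from the text: the new contact receives alerts for every monitor tagged `production` without any monitor being edited.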
Roadmap
We're still early in our journey to develop a likeable and usable monitoring platform. While we have already solved the first issues and pain points found with other major providers, we're not done yet.
Our roadmap contains several other quality-of-life improvements over our competition, and we're eager to tackle the challenges to become better every day.
In the near future you'll see features around the following topics:
- Automated Recurring SLA Reports
- Chain of Escalation
- On-Call Rotation
If you're missing a crucial feature in Argus or have big ideas on how to make monitoring and notifications better, please don't hesitate to contact us to talk about them.