Icinga 2 can generate its own alerts when a host or service has reached a certain state (hosts: UP, DOWN, or UNREACHABLE, services: OK, WARNING, CRITICAL, or UNKNOWN). With the right configuration the monitoring software sends out emails, text or instant messages, etc. to users or user groups. (Check out our documentation to learn more about Notification objects.) As soon as you want to organise notifications for more than a few people or several teams with different on-call duties, things can become a bit uncomfortable in Icinga 2.
Eight years ago, in May 2009, all we knew was that we wanted to create a better monitoring tool and work with our community and its demands: integrate long-awaited patches and functionality, and care about feedback. Community members from all around the globe joined that vision: Germany, Austria, Italy, UK, India, USA, Australia, Brazil, Belgium,… Many of us had never worked in an open source project before, and we try to make things right. We are passionate, with lots of emotional discussions, and still taking our ideas to the next level.
March was all about our lovely community. We’ve had Icinga Camp Berlin and San Francisco, and also joined FLOSSUK. You’ll also recognize that our Puppet module for Icinga 2 was officially approved by Puppet. Blerim released icingabeat and blogged about it at the Elastic blog. And many more things happened …
We’ve also thought about 1st of April, but hey – we have so many great things to share and work on, we’ll skip it for 2017 ;-)
Icinga 2 is used by companies of all sizes and allows you to monitor servers, applications, network hardware, services, web pages, etc. However, there is a problem when monitoring external services: many people do not have the view from the outside. You can monitor everything from within your own network, but that does not guarantee that what your customers see and use works the same way.
In practice, Icinga 2 has a proven way to integrate satellites into its own monitoring environment with its cluster and zoning concept. If you want to monitor external services, the satellite zone must be outside of your own network. Our partner NETWAYS started an open source service which offers Icinga 2 either as a fully managed instance or as a satellite.
Icinga 2 satellite helps you to check your services from your customer’s point of view. Get insights into your service metrics by choosing different locations around the globe.
- US West
There are multiple preselected check plugins available for the most common scenarios. Monitor all kinds of different services including your self-operated Icinga 2 monitoring instance. All check plugins include performance metrics like latency, execution time and status codes.
A couple of available plugins for the Icinga 2 satellite:
- Web service checks
- E-Mail service checks
- Availability checks: Ping, ICMP, TCP/UDP
- Icinga 2: Icinga 2’s customized protocol
Full Icinga instance with Icinga Director and Grafana
Perhaps a satellite is not enough and you are considering replacing your current setup. There is also a full-featured Icinga instance with web-based configuration and an integrated performance metrics solution.
Configuring your hosts and services is easy. We offer a set of preconfigured common checks based on our varied and long-standing experience with monitoring projects.
- System health
- Network protocols
- Application servers
Graphed metrics help you to identify load peaks immediately. Every time Icinga 2 runs a check it also collects performance metrics. Those can be used to help you to find and understand possible bottlenecks in your infrastructure.
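To make the idea of per-check performance metrics concrete, here is a minimal Python sketch that extracts them from a plugin's output line. It assumes the common plugin convention of appending `label=value[unit];warn;crit;min;max` after a `|` character. The names `PerfValue` and `parse_perfdata` are hypothetical helpers for illustration, not part of Icinga 2, and the parser is deliberately simplified (it skips non-numeric values and ignores min/max).

```python
import re
from typing import NamedTuple, Optional


class PerfValue(NamedTuple):
    """One performance data point (hypothetical helper type)."""
    label: str
    value: float
    unit: str
    warn: Optional[str]
    crit: Optional[str]


# label=value[unit][;warn][;crit], labels may be single-quoted.
# Anything after the crit field (min/max) is consumed and ignored.
_PERF_RE = re.compile(
    r"(?:'([^']+)'|([^\s=']+))"   # quoted or bare label
    r"=([-+]?\d*\.?\d+)"          # numeric value
    r"([^;\s]*)"                  # optional unit, e.g. ms or %
    r"(?:;([^;\s]*))?"            # optional warning threshold
    r"(?:;([^;\s]*))?"            # optional critical threshold
    r"\S*"                        # rest of the token (min/max)
)


def parse_perfdata(plugin_output: str) -> list[PerfValue]:
    """Return the metrics found after the first '|' in a plugin output line."""
    _, _, perf = plugin_output.partition("|")
    results = []
    for m in _PERF_RE.finditer(perf):
        label = m.group(1) or m.group(2)
        results.append(
            PerfValue(label, float(m.group(3)), m.group(4),
                      m.group(5) or None, m.group(6) or None)
        )
    return results


if __name__ == "__main__":
    line = ("PING OK - Packet loss = 0%, RTA = 0.80 ms"
            "|rta=0.800000ms;100.000000;500.000000;0.000000 pl=0%;5;10;0")
    for metric in parse_perfdata(line):
        print(metric)
```

Values parsed this way are the kind of time series that end up in a graphing backend, where load peaks and bottlenecks become visible.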
Metrics are stored for up to 12 months. Full Icinga API support and smooth integration with the most common operating systems make getting started a breeze.
Today I am happy to announce that Icinga is now part of the Chef Cookbook Program. We were able to get certified with our Icinga2 Cookbook which can be used to install and manage either Icinga2 Core or Icingaweb2. A big shout-out goes to Virender Khatri for maintaining and developing most of this Cookbook.
Chef is a configuration management software that helps automate the installation and configuration of software on your servers. By turning infrastructure into code, Chef makes it possible to abstract installation processes and configurations into Cookbooks.
The Partner Cookbook Program sets guidelines and best practices for cookbook creation and maintenance. By using cookbooks with the partner badge, users can be sure that the cookbook has been reviewed and fulfils Chef's quality standards.
Given that the very development of Icinga arose from the need for additional functionalities in open source monitoring, it’s little surprise that the tool has become indispensable for so many IT professionals. Its configurability and flexibility allow for a sophisticated approach to monitoring, which is both scalable and extensible to large, complex environments.
The 2016 State of Monitoring Report confirmed the popularity of Icinga, with a large number of respondents naming it their primary systems monitoring tool. Of those that reported using the tool, almost 40% stated that they work in an enterprise organization with 1000+ employees. The vast majority reported that they push changes to production several times per day, and that infrastructure changes are deployed a few times per week. But at the same time, 64% stated that their IT team employs 20 or fewer people. Given the complexity and scale of the environments that Icinga monitors, and the ability of the tool to create sophisticated and detailed checks, it is perhaps understandable that resource-strapped IT teams encounter challenges managing and triaging high volumes of alerts.
Too much of a good thing?
In effect, sysadmins can become victims of their own success. They’ve configured the tool to effectively identify potential problems – but identifying and separating critical issues from warnings and high-impact from low-impact ones is challenging, particularly when dealing with complex, noisy environments.
Plus, most sysadmins don’t just have Icinga alerts to consider. According to the State of Monitoring, almost all Icinga users utilize anywhere from two to 10 additional tools in their stack, which means that multiple systems may trigger alerts related to the same issue. Lacking any form of unification between the various tools, IT teams are left with a pile of manual work to establish relationships amongst alerts before triage can begin. That chaos leads to missed SLAs, long mean time to resolve (MTTR), and, ultimately, customer outages. When you consider the sheer amount of machine data generated by agile, fragmented systems, there’s really no question that the scale goes beyond what any human – or even a small army of humans – is capable of managing.
Alert correlation – why it matters
BigPanda helps companies improve detection, accelerate remediation, and increase productivity through automated alert correlation. For a tool like Icinga, which is capable of doing so much, alert correlation provides an additional – and critical – layer of insight that allows sysadmins to effectively prioritize alerts and know what to do next.
To illustrate, let’s consider a typical “day in the life” of your average app. At any given time, checks on the app could complain of a whole host of issues: disk space, CPU consumption, low memory. Depending on the nature of these issues, some might be urgent while others aren’t. Some that aren’t urgent may be precursors to future high severity events. How do you know what matters and what to do next? Enter BigPanda.
The BigPanda correlation engine consumes alerts generated by Icinga, in addition to any other monitoring tool an organization might be using, and then normalizes and groups them into related, high-level incidents. This not only significantly reduces the number of items that IT has to deal with, but it also centralizes alerts from all monitoring tools into a single pane of glass for easy management and tracking. BigPanda automatically enriches incidents with contextual information – such as runbooks, metrics, related incidents, and configuration items and code deploys – helping you quickly gain the understanding required to properly prioritize and remediate incidents.
The BigPanda integration: How it works
BigPanda intelligently correlates alerts across monitoring systems by evaluating three main parameters:
Check: the check or error condition that triggered the alert. Similar checks or error conditions across alerts and alert sources are a strong indicator that items are related.
Time: the rate at which related alerts occur. Alerts occurring around the same time are more likely to be related than alerts occurring far apart.
Context: the host, host group, service, application, cloud, or other infrastructure element that emits the alerts. Alerts are more likely to be related when they come from the same components of your infrastructure.
Every new alert generated by Icinga will be automatically correlated against existing active incidents in BigPanda. If there’s no match, a new incident will be created. The system works out-of-the-box, without requiring customers to define an explicit set of rules or build a dependency model for their environments. However, BigPanda customers can customize the correlation rules to suit unique requirements based on their specific systems, applications, topologies, and division of duties. On average, BigPanda users benefit from compression rates upwards of 90% between raw alerts and correlated incidents.
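As a rough illustration of the idea, the following Python sketch groups alerts using the three parameters described above: check, time, and context. It is a toy model under stated assumptions, not BigPanda's actual correlation engine. The `Alert` and `Incident` types, the `correlate` function, and the 300-second window are all hypothetical, and context is reduced to the host name.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    host: str         # context: the element that emitted the alert
    check: str        # check: the check or error condition
    timestamp: float  # time: seconds since some reference point


@dataclass
class Incident:
    alerts: list[Alert] = field(default_factory=list)

    def matches(self, alert: Alert, window: float) -> bool:
        """An alert joins an incident if it is close in time to the
        incident's latest alert and shares a check or a host with it."""
        close_in_time = abs(alert.timestamp - self.alerts[-1].timestamp) <= window
        same_check = any(a.check == alert.check for a in self.alerts)
        same_context = any(a.host == alert.host for a in self.alerts)
        return close_in_time and (same_check or same_context)


def correlate(alerts: list[Alert], window: float = 300.0) -> list[Incident]:
    """Match each alert against existing incidents; open a new one if none fits."""
    incidents: list[Incident] = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        for incident in incidents:
            if incident.matches(alert, window):
                incident.alerts.append(alert)
                break
        else:
            # No existing incident matched, so a new incident is created.
            incidents.append(Incident([alert]))
    return incidents


if __name__ == "__main__":
    raw = [
        Alert("web1", "disk", 0),
        Alert("web1", "cpu", 60),
        Alert("db1", "disk", 120),
        Alert("mail1", "smtp", 5000),
    ]
    for number, incident in enumerate(correlate(raw), start=1):
        print(number, [(a.host, a.check) for a in incident.alerts])
```

In this example four raw alerts collapse into two incidents, which is the kind of compression between raw alerts and correlated incidents that the paragraph above refers to. A real engine would weigh these signals far more carefully than this fixed window and exact-match comparison.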
The best part? BigPanda can be set up in minutes with clicks, not code. We offer a tight and seamless integration that supports both Icinga and Icinga2, and is ready in a few easy steps.
This is a guest blogpost by our partner BigPanda.