Hi all,

At this point in development it's time to write something about the upcoming API for Icinga 2 v2.4. We've been busy over the past months designing, refining, and planning its development. To give you an insight into what's going on and what to expect, lean back and grab a coffee or two.

Hint: Follow Icinga on Twitter for faster updates :-) And make sure to join Icinga Camp Portland where we have talks and demos ready for you :)

 

Design

You might have seen it already and wondered why the cluster functionality contains the ApiListener configuration object, including x509 connection handling. Generally speaking, the cluster API is an internal core interface, not something we'd like to expose to users or programmatic scripts.

We've also been discussing whether to expose the existing JSON-RPC interface to users. While JSON-RPC is still cool, it would have made it tremendously hard to provide client libraries and examples. In the end it would be yet another proprietary API protocol, and we certainly want something easy but flexible for our Icinga 2 API. Looking at existing APIs and recommendations made by community members (thanks, Michael Medin, for believing in that), we decided to go for a REST API after building mockups and analyzing use cases.

In order to define our own URL schema we looked into other APIs such as those of DigitalOcean, Foreman, etc. and created concepts and to-dos for our very own schema.

 

Purpose

The main purpose of the Icinga 2 API is to act as a replacement for a variety of existing tools and interfaces:

  • send_nsca: pass a check result to Icinga 2 via the actions interface
  • Livestatus: status queries and sending commands
  • External command pipe: send commands (without quirky local permission problems and/or SELinux)
  • SNMP traps: handlers can create/modify objects at runtime and send check results
  • Perfdata/OCSP commands: receive check results directly as an event stream
  • Inventory/auto-discovery: external applications create/modify objects at runtime (PuppetDB/Foreman, CMDB, AWS, etc.)

Target audience:

  • (web) applications fetching data and providing their own filters and restrictions
  • admins with root permissions querying the API on their own
  • scripts which pull/push data automatically (including command restrictions)

 

Main Requirements

  • RESTful URL schema
  • Basic API framework including an HTTP server
  • ApiUser config object for authentication: Basic Auth or x509 client certificate name (a default will be created upon installation)
  • Authorization and simple permissions (e.g. restrict users to specific commands, such as acknowledgements only)
  • HTTP handlers to interpret and process requests (GET, POST, PUT, DELETE)
  • URL schema versioning, JSON output, dashes in URLs (no underscores)
  • URL parameters including object filters and column limiting
  • Dependency tracking for object deletion (services depend on hosts, etc.)
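
The Basic Auth option from the list above simply means the credentials travel base64-encoded in the Authorization header. A minimal stdlib sketch building such a request (the root:icinga credentials match the curl examples later in this post; actually sending the request is omitted):

```python
import base64
from urllib.request import Request

user, password = "root", "icinga"
# Basic Auth is just "user:password" encoded as base64
token = base64.b64encode(f"{user}:{password}".encode()).decode()

req = Request("https://localhost:5665/v1/objects/hosts")
req.add_header("Authorization", "Basic " + token)
req.add_header("Accept", "application/json")

# The x509 alternative would instead load a client certificate via
# ssl.SSLContext.load_cert_chain() and pass that context to urlopen().
print(req.get_header("Authorization"))
```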

 

Configuration Management

The main idea behind it is to allow external applications to create configuration packages and stages based on configuration files and directory trees. This replaces ad-hoc SSH connections and similar workarounds for dumping configuration files onto the Icinga 2 host directly. When you push a new configuration stage to a package, Icinga 2 validates the configuration asynchronously and populates a status log which can be fetched in a separate request.

Example: Create the config package “puppet”:

$ curl -k -s -u root:icinga -X POST https://localhost:5665/v1/config/packages/puppet | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "package": "puppet",
            "status": "Created package."
        }
    ]
}

Push a new stage with a config file to the package (this one has an error in it for better demo cases):

$ curl -k -s -u root:icinga -X POST -d '{ "files": { "conf.d/test.conf": "object Host \"cfg-mgmt\" { chec_command = \"dummy\" }" } }' https://localhost:5665/v1/config/stages/puppet | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "package": "puppet",
            "stage": "imagine-1441133065-1",
            "status": "Created stage."
        }
    ]
}

If configuration validation fails, the old active stage remains active. If everything is successful, the new config stage is activated and live. Older stages remain available in order to have some sort of revision system in place.

List all config packages, their active stage and other stages. That way you may iterate over all of them programmatically, e.g. to access older revisions.

$ curl -k -s -u root:icinga -X GET https://localhost:5665/v1/config/packages | python -m json.tool
{
    "results": [
        {
            "active-stage": "",
            "name": "aws",
            "stages": []
        },
        {
            "active-stage": "",
            "name": "puppet",
            "stages": [
                "imagine-1441133065-1"
            ]
        }
    ]
}
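
A client iterating over these packages can spot stages that were uploaded but never activated. A small sketch using the sample payload shown above:

```python
import json

# Sample response body from GET /v1/config/packages, as shown above
body = """
{
    "results": [
        {"active-stage": "", "name": "aws", "stages": []},
        {"active-stage": "", "name": "puppet", "stages": ["imagine-1441133065-1"]}
    ]
}
"""

packages = json.loads(body)["results"]
# Stages exist but none is active: validation failed or is still pending
pending = [p["name"] for p in packages if p["stages"] and not p["active-stage"]]
print(pending)  # ['puppet']
```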

Since we don't have an active stage for “puppet” yet, there must have been an error. Fetch the “startup.log” file and check the config validation errors:

$ curl -k -s -u root:icinga -X GET https://localhost:5665/v1/config/files/puppet/imagine-1441133065-1/startup.log
...

critical/config: Error: Attribute 'chec_command' does not exist.
Location:
/var/lib/icinga2/api/packages/puppet/imagine-1441133065-1/conf.d/test.conf(1): object Host "cfg-mgmt" { chec_command = "dummy" }
                                                                                                       ^^^^^^^^^^^^^^^^^^^^^^

critical/config: 1 error

Apart from populating the local configuration, the config file management interface also supports “zones.d” trees, which are automatically taken into account for the well-known cluster config sync.

This API feature is mainly required for the upcoming Icinga Web 2 Config Tool for Icinga 2.

 

Create Objects at Runtime

Objects can be created by sending a PUT request including all required object attributes. Icinga 2 will validate all objects and return detailed errors on failure.

Objects created by the API are persisted on disk. In the next development sprint we’ll also finish the cluster synchronization – new objects will automatically be synced amongst authorized cluster nodes, no manual configuration required.

Example: Create host “google.com” with object attributes. The required “check_command” attribute is provided by the imported “generic-host” template.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com' \
-X PUT \
-d '{ "templates": [ "generic-host" ], "attrs": { "address": "8.8.8.8", "vars.os" : "Linux" } }' \
| python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "status": "Object was created."
        }
    ]
}

Creating new objects automatically triggers apply rule evaluation: based on host.address and host.vars.os, the “ping4” and “ssh” services are generated.

If the configuration validation fails, the new object will not be created and the response body contains a detailed error message. The following example omits the required check_command attribute.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com' \
-X PUT \
-d '{ "attrs": { "address": "8.8.8.8", "vars.os" : "Linux" } }' \
| python -m json.tool
{
    "results": [
        {
            "code": 500.0,
            "errors": [
                "Error: Validation failed for object 'google.com' of type 'Host'; Attribute 'check_command': Attribute must not be empty."
            ],
            "status": "Object could not be created."
        }
    ]
}

 

Modify Objects at Runtime

In case you want to modify attributes at runtime, we've implemented a cool internal event handler system which notifies external interfaces (DB IDO, cluster, etc.) about changes. You are not limited to specific attributes as known from Icinga 1.x, but can change (nearly) everything; changing a host's address at runtime is not an issue, for example. All modified attributes are persisted on disk and survive a restart. These modified attributes will result in object versions (to be implemented) throughout the cluster synchronization.

Example for existing object google.com:

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com' \
-X POST \
-d '{ "attrs": { "address": "8.8.4.4", "vars.os" : "Windows" } }' \
| python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "name": "google.com",
            "status": "Attributes updated.",
            "type": "Host"
        }
    ]
}

One thing to note: there's also support for indexers, e.g. “vars.os” instead of declaring “vars” as a JSON dictionary.
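
The two notations are equivalent; a small sketch converting a nested attrs dictionary into dotted indexer keys (the flatten helper is purely illustrative, not part of the API):

```python
def flatten(d, prefix=""):
    """Convert nested dictionaries into dotted indexer keys,
    e.g. {"vars": {"os": "Windows"}} -> {"vars.os": "Windows"}."""
    out = {}
    for key, value in d.items():
        path = prefix + "." + key if prefix else key
        if isinstance(value, dict):
            out.update(flatten(value, path))
        else:
            out[path] = value
    return out

attrs = {"address": "8.8.4.4", "vars": {"os": "Windows"}}
print(flatten(attrs))  # {'address': '8.8.4.4', 'vars.os': 'Windows'}
```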

Take a different example: lower the “retry_interval” for all hosts in a not-UP state:

curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts?filter=host.state!=0' -X POST -d '{ "attrs": { "retry_interval": 30 } }' | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "name": "host-oob",
            "status": "Attributes updated.",
            "type": "Host"
        },
        {
            "code": 200.0,
            "name": "google.com",
            "status": "Attributes updated.",
            "type": "Host"
        }
    ]
}

 

Delete Objects at Runtime

Deleting objects is a bit trickier: what happens if you delete a host object that has several services depending on it? In the past, the host would have been deleted and the services would have remained in an inconsistent state. The solution sounds simple: track the object dependencies and only delete such dependency chains if the user explicitly says so (cascading delete). If not, the DELETE request returns an error. You may also only delete objects created by the API; that's a safety measure preventing unwanted mixes of static configuration, config management and runtime config changes.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com?cascade=1' -X DELETE | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "name": "google.com",
            "status": "Object was deleted.",
            "type": "Host"
        }
    ]
}

Note: apply rules must be statically configured or passed through the config management API. Newly created objects automatically trigger apply rule evaluation (e.g. a host with an address automatically gets the “ping4” check assigned if that apply rule is in place).

 

Status Queries

While Livestatus and DB IDO do not expose all object attributes, the Icinga 2 API allows you to fetch all object types and their runtime configuration and state attributes. Apart from accessing a single object you may also use the same filter expressions known from apply rules to fetch a filtered list of objects.

You can select specific attributes by adding them as URL parameters using ?attrs=…. Multiple attributes must be added one by one, e.g. ?attrs=host.address&attrs=host.name.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com?attrs=host.name&attrs=host.address' -X GET | python -m json.tool
{
    "results": [
        {
            "attrs": {
                "host.address": "8.8.8.8",
                "host.name": "google.com"
            }
        }
    ]
}
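
Building such repeated attrs parameters by hand gets tedious; Python's urllib handles them via a list of tuples (the endpoint URL matches the example above):

```python
from urllib.parse import urlencode

# Repeated "attrs" parameters select individual columns, one tuple each
params = [("attrs", "host.name"), ("attrs", "host.address")]
query = urlencode(params)
url = "https://localhost:5665/v1/objects/hosts/google.com?" + query
print(query)  # attrs=host.name&attrs=host.address
```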

 

Another cool thing: the check results also contain the executed command, which is pretty helpful for testing your configuration. You can also check the group membership of a host, modify the attributes at runtime, and retrieve their status again.

(Screenshots: host status queries and a service check result query.)

Hint: if you want to view JSON in your browser, look for extensions such as JSONView for Chrome.

Finishing this task is scheduled for the next weeks, some details are still missing.

 

Actions

Actions provide well-known runtime commands for scheduling downtimes, acknowledging problems, adding comments, etc. By using the same filter expressions as found in the config language, you have lots of possibilities to trigger actions. Furthermore, all passed attributes are easily identified by their name. Forget about Icinga 1.x or Nagios using “SCHEDULE_HOST_DOWNTIME;host1;1110741500;1110748700;1;0;7200;foo;comment”!
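
Note that filter expressions have to be percent-encoded when passed as URL parameters; that's where the %22 (for the double quotes) in the examples below comes from. A conservative sketch encoding a whole filter with Python's stdlib:

```python
from urllib.parse import quote

# Filters use the same expression syntax as the Icinga 2 config language
filter_expr = 'host.name=="google.com"'

# safe="" encodes everything non-unreserved: '=' -> %3D, '"' -> %22
encoded = quote(filter_expr, safe="")
url = "https://localhost:5665/v1/actions/reschedule-check?type=Host&filter=" + encoded
print(encoded)  # host.name%3D%3D%22google.com%22
```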

Example: Reschedule check for host “google.com” using a filter.

curl -u root:icinga -k -s 'https://localhost:5665/v1/actions/reschedule-check?type=Host&filter=host.name==%22google.com%22' -X POST  | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "status": "Successfully rescheduled check for google.com."
        }
    ]
}

Example: Acknowledge all service problems at once.

curl -u root:icinga -k -s 'https://localhost:5665/v1/actions/acknowledge-problem?type=Service&filter=service.state!=0' -d '{ "author": "michi", "comment": "Mega outage. Will take care." }' -X POST  | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "status": "Successfully acknowledged problem for host-oob!service-oob"
        },
...
        {
            "code": 200.0,
            "status": "Successfully acknowledged problem for google.com!ssh"
        }
    ]
}

 

One more: schedule a downtime for all hosts having the custom attribute “vars.os” set to “Linux”, e.g. for a general Puppet run rebooting the boxes on kernel updates.

curl -u root:icinga -k -s 'https://localhost:5665/v1/actions/schedule-downtime?type=Host&filter=host.vars.os==%22Linux%22' -d '{ "author" : "michi", "comment": "Maintenance.", "start_time": 1441136260, "end_time": 1441137260, "duration": 1000 }' -X POST | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "downtime_id": "imagine-1441136548-1",
            "legacy_id": 11.0,
            "status": "Successfully scheduled downtime with id 11 for object google.com."
        },

...

        {
            "code": 200.0,
            "downtime_id": "imagine-1441136548-12",
            "legacy_id": 22.0,
            "status": "Successfully scheduled downtime with id 22 for object imagine.Speedport_W_921V_1_36_0009."
        }
    ]
}

Event Streams

Clients can register on event streams and filter these events, e.g. to only receive not-ok states. The following example is from our concept phase to give you an idea:

Request:

$ curl -k -s -u root:icinga -X POST 'https://localhost:5665/v1/events?queue=michi&types=CheckResult&filter=event.check_result.exit_status==2'

{"check_result":{ ... },"host":"www.icinga.org","service":"ping4","timestamp":1445421319.7226390839,"type":"CheckResult"}
{"check_result":{ ... },"host":"www.icinga.org","service":"ping4","timestamp":1445421324.7226390839,"type":"CheckResult"}
{"check_result":{ ... },"host":"www.icinga.org","service":"ping4","timestamp":1445421329.7226390839,"type":"CheckResult"}

Note: This is not implemented yet. Development sprint is scheduled for CW42.

Btw: ochp and ocsp commands should be fairly easy to replace with event stream clients forwarding all events to your umbrella monitoring system.
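
Assuming the stream delivers one JSON object per line as in the concept above, such a forwarding client boils down to a read-parse-dispatch loop. A sketch (the check_result body is abbreviated here to the exit_status field used in the filter above):

```python
import json

# Two sample event lines in the format sketched above
lines = [
    '{"check_result":{"exit_status":2},"host":"www.icinga.org","service":"ping4","timestamp":1445421319.72,"type":"CheckResult"}',
    '{"check_result":{"exit_status":0},"host":"www.icinga.org","service":"http","timestamp":1445421324.72,"type":"CheckResult"}',
]

critical = []
for line in lines:
    event = json.loads(line)
    # Forward only CRITICAL check results (exit status 2)
    if event["type"] == "CheckResult" and event["check_result"]["exit_status"] == 2:
        critical.append((event["host"], event["service"]))
print(critical)  # [('www.icinga.org', 'ping4')]
```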

 

Reflection

List all URL endpoints (objects, types, attributes) including details. Take an example: the Icinga 2 types follow a hierarchical order: Host inherits from Checkable, which inherits from CustomVarObject, which inherits from ConfigObject, which inherits from Object. Using that information, including all the object attributes, you'll get:

  • all object attributes
  • all object type prototypes (e.g. Object#clone)
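
The inheritance chain described above can be walked programmatically. A sketch; the parent links below are hard-coded from the text rather than fetched from a real /v1/types response:

```python
# Parent links for the Host type chain described above
BASES = {
    "Host": "Checkable",
    "Checkable": "CustomVarObject",
    "CustomVarObject": "ConfigObject",
    "ConfigObject": "Object",
    "Object": None,
}

def ancestry(type_name):
    """Return the type and all of its base types, most derived first."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = BASES[type_name]
    return chain

print(ancestry("Host"))
```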

(Screenshots: reflection of the Host, Dictionary and ApiUser types.)

 

(HTTP) Clients

OK, there's curl and alternatives on the shell. We'll also work on the icinga2 console, providing an HTTP client to directly connect to the Icinga 2 API.

But there's yet another cool thing: Icinga Studio. It connects to the Icinga 2 API and provides a type hierarchy including all objects and their runtime configuration and state. It is built with wxWidgets, making it cross-platform (Linux, Windows, Mac OS X). We'll prepare packages for it in the next weeks as well (only where wxWidgets is available). For now it helps with debugging and testing; at some later point we may consider changing its read-only state to allow runtime modifications :-)

(Screenshots: Icinga Studio.)

 

Future

We've discussed, designed, re-evaluated and (pair) programmed quite a lot in the past weeks. Our goal is to have 2.4 ready right before OSMC in November this year, where you'll get the whole package.

We’ll have the latest and greatest Icinga 2 API snapshot with us at Icinga Camp Portland right after PuppetConf – join us for live demos, talks, feedback & some G&T of course :)

In case you're an addon developer, or want to start playing with the API: our documentation is not complete yet, but will be updated frequently in the next weeks. Our Vagrant boxes use the latest and greatest snapshot packages too! :-)

Cheers from the Icinga 2 Core development team,

Michael, Gunnar & Jean-Marcel