Plan File

Plans can be written YAML files, deployed with Blip, and used by specifying config.plans.files. File paths are relative to the current working directory of blip.

Name

Plan names are exactly as written in config.plans.files. For example:

plans:
  files:
    - "/blip/plan1.yaml"
    - ../plan2.yaml
    - ${HOME}/plan3.yaml

The first plan name is /blip/plan1.yaml (Blip does not strip base paths). The second plan name is ../plan2.yaml (Blip does not replace relative paths). The third plan name is /home/user/plan3.yaml where HOME=/home/user because interpolation happens first.

Syntax

A plan file consists of one or more levels with a unique name and frequency. Each level has the following structure:

level-name:
  freq: <duration string>
  collect:
    domain-name:
      <domain config>

level-name is any name you want to call the level. freq is how often the level is collected, expressed as a Go duration string like “5s” for “every 5 seconds”. Both level name and frequency must be unique within the same plan (they can be reused in different plans).

Each level has a collect subsection under which domains are specified. And each domain has a domain-specific configuration that includes:

  • metrics: A list of domain-specific metrics to collect
  • options: A key-value map of options
  • errors: A key-value map of error policies

These values are documented for each domain and printed on the command line by --print-domains.

Since Blip automatically levels up overlapping frequencies (described in Intro / Plans), it’s conventional to define levels from most to least frequent, as in this example:

performance:
  freq: 5s
  collect:
    status.global:
      options:
        all: yes
      metrics:
        # ALL because of option ^
    innodb:
      metrics:
        - trx_rseg_history_len # HLL

data-size:
  freq: 60m
  collect:
    data.size:
      # All defaults

The example above specifies two levels: performance and data-size.

The performance level is collected every 5 seconds—the most frequent. For this level, Blip collect metrics from two domains: status.global and innodb.

The data-size level is collected every 60 minutes—the least frequent. For this level, Blip collects metrics for one domain: data.size.

When the two levels overlap (every 60 minutes), Blip collects metrics for both levels. Therefore, you should never repeat metrics (in the same domain) in a plan. However, you can repeat domains at different levels (and frequencies) like:

performance:
  freq: 5s
  collect:
    status.global:
      metrics:
        - Queries
        - Threads_running

standard:
  freq: 20s
  collect:
    status.global:
      metrics:
        - Bytes_sent
        - Bytes_received

Every 5 seconds, Blip will collect status.global metrics Queries and Threads_running. And every 20 seconds it will collect those two plus status.global metrics Bytes_sent and Bytes_received.

Repeat domains; never metrics.

You can repeat domains at different levels to collect more metrics, but don’t repeat metrics in a plan. See also Metrics / Collecting / Reusing.

Interpolation

Blip interpolates domain option values, like:

level:
  freq: 10s
  collect:
    domain:
      options:
        foo: "${FOO}"

The domain option value ${FOO} will be replaced with the value of the FOO environment variable.

Plan file interpolation has the same syntax and rules as config file interpolation.