Domains

This page documents the metric domains from which Blip currently collects metrics. Use --print-domains to list these domains from the command line:

$ blip --print-domains | less

Each domain begins with a table:

Blip version
Blip version in which the domain was added or changed.
MySQL config
Whether MySQL must be explicitly or specially configured to provide the metrics.
Sources
MySQL source of metrics.
Derived metrics
Derived metrics. Omitted if none.
Group keys
Metric groups. Omitted if none.
Meta
Metric meta. Omitted if none.
Options
Domain options. Omitted if none.
Error policy
MySQL error codes handled by optional error policy. Omitted if none.


aws.rds

Amazon RDS for MySQL

Blip version v1.0.0
MySQL config no
Sources Amazon RDS API

Collects Amazon RDS metrics.

innodb

InnoDB Metrics

Blip version v1.0.0
MySQL config maybe
Sources information_schema.innodb_metrics
Meta subsystem = SUBSYSTEM column
Options all

Metrics from INFORMATION_SCHEMA.INNODB_METRICS.

Options

  • all
    Default: no
    If yes, all InnoDB metrics are collected (the whole table). If no (the default), only the explicitly listed InnoDB metrics are collected. If enabled, only InnoDB metrics enabled by the MySQL configuration are collected (WHERE status='enabled' in the table).
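
The three values of the all option can be pictured as three different queries against the source table. The following is a minimal sketch, not Blip's actual code: the function name innodb_metrics_query and the selected columns are assumptions made for illustration.

```python
# Hypothetical helper for illustration only; Blip's real query may differ.
BASE = "SELECT name, subsystem, count FROM information_schema.innodb_metrics"


def innodb_metrics_query(all_option, metrics):
    """Return the SELECT implied by the value of the 'all' option."""
    if all_option == "yes":
        # Whole table, regardless of which metrics are listed
        return BASE
    if all_option == "enabled":
        # Only metrics enabled by the MySQL configuration
        return BASE + " WHERE status = 'enabled'"
    if all_option == "no":
        # Only the explicitly listed metrics
        if not metrics:
            raise ValueError("no metrics listed and option all=no")
        names = ", ".join("'" + m.replace("'", "''") + "'" for m in metrics)
        return BASE + " WHERE name IN (" + names + ")"
    raise ValueError("invalid value for option all: " + repr(all_option))


print(innodb_metrics_query("no", ["lock_deadlocks", "trx_rseg_history_len"]))
```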

percona.response-time

Percona Server Query Response Time

Blip version v1.0.0
MySQL config yes
Sources Percona Server 5.7 RTD plugin
Derived metrics pN (gauge) for each value in the percentiles option
Meta pN=pA: where pN is collected percentile and pA is actual percentile
Options flush
real-percentiles
Error policy unknown-table

The percona.response-time domain collects query response time percentiles from the Percona Server 5.7 Response Time Distribution plugin.

This domain is functionally identical to query.response-time; only one option name is different:

percona.response-time query.response-time
flush truncate-table

See query.response-time for details.

Error Policy

  • unknown-table
    MySQL error 1109: Unknown table ‘query_response_time’ in information_schema

query.response-time

MySQL Query Response Time

Blip version v1.0.0
MySQL config yes
Sources MySQL 8.0 p_s.events_statements_histogram_global
Derived metrics pN (gauge)
Meta pN=pA: where pN is collected percentile and pA is actual percentile
Options real-percentiles
truncate-table
truncate-timeout
Error policy table-not-exist
truncate-timeout

The query.response-time domain collects query response time percentiles. By default, it reports the P999 (99.9th percentile) response time in microseconds.

To convert units, use the TransformMetrics plugin or write a custom sink.

Derived metrics

  • pN
    Type: gauge
    Response time percentile to collect, where N is between 1 and 999. (The “p” prefix is required.) p95 collects the 95th percentile. p999 collects the 99.9th percentile. The response time value is reported in microseconds. The true percentile might be slightly greater depending on how the histogram buckets are configured. For example, if collecting p95, the real percentile might be p95.8.
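
To see why the real percentile can exceed the configured one, consider a minimal sketch of reading a percentile from a bucketed histogram. This is not Blip's implementation: the function names, the rule that one or two digits mean N/100 and three digits mean N/1000, and the bucket data are all assumptions for illustration.

```python
def parse_percentile(name):
    """Convert 'p95' to 0.95 and 'p999' to 0.999 (assumed naming rule)."""
    if not name.startswith("p") or not name[1:].isdigit():
        raise ValueError("percentile must look like pN: " + repr(name))
    digits = name[1:]
    n = int(digits)
    if not 1 <= n <= 999:
        raise ValueError("N must be between 1 and 999: " + repr(name))
    return n / 100 if len(digits) <= 2 else n / 1000


def percentile_from_histogram(buckets, p):
    """Return (value_us, actual_percentile) for the first bucket reaching p.

    buckets is a list of (upper_bound_us, count) sorted by upper bound.
    """
    total = sum(count for _, count in buckets)
    if total == 0:
        raise ValueError("histogram is empty")
    running = 0
    for upper_bound_us, count in buckets:
        running += count
        actual = running / total
        if actual >= p:
            # The bucket boundary rarely lands exactly on p, so the
            # percentile actually reported is at or slightly above p.
            return upper_bound_us, actual
    return buckets[-1][0], 1.0


# Made-up histogram: 1000 queries in four buckets
buckets = [(100, 900), (200, 58), (400, 30), (800, 12)]
print(percentile_from_histogram(buckets, parse_percentile("p95")))
# prints (200, 0.958)
```

In this example the configured p95 is answered by the bucket that covers 95.8% of queries, which is the kind of difference the real-percentiles option reports in meta.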

Options

  • real-percentiles
    Default: yes
    If yes (default), reports the real percentile in meta for each percentile in options. MySQL (and Percona Server) use histograms with variable bucket ranges. Therefore, the P99 might actually be P98.9 or P99.2. Meta key pN indicates the configured percentile, and its value pA indicates the actual percentile that was used.

  • truncate-table
    Default: yes
    Truncate performance_schema.events_statements_histogram_global after each collection. This resets percentile values so that each collection represents the global query response time during the collection interval rather than during the entire uptime of MySQL. However, truncating the table interferes with other tools reading (or truncating) the table.

  • truncate-timeout
    Default: 250ms
    The amount of time to wait while attempting to truncate performance_schema.events_statements_histogram_global. Normally, truncating a table is nearly instantaneous, but metadata locks can block the operation.

Error Policy

  • table-not-exist
    MySQL error 1146: Table ‘performance_schema.events_statements_histogram_global’ doesn’t exist

  • truncate-timeout
    Truncation failures on table performance_schema.events_statements_histogram_global

repl

MySQL Replication

Blip version v1.0.1
Sources ≥ MySQL 8.0.22: SHOW REPLICA STATUS
≤ MySQL 8.0.21: SHOW SLAVE STATUS
MySQL config no
Derived metrics running (gauge)
Meta source = Source_Host or Master_Host
Options report-not-a-replica

The repl domain collects replication metrics. Currently, it collects a single derived metric: running (described below).

A future release will collect these MySQL metrics:

Replica Status Variable Collected
Slave_IO_Running
Slave_SQL_Running
Relay_Log_Space
Seconds_Behind_Master
Auto_Position

Derived metrics

  • running
    Type: gauge

    Value  Meaning
    1      ☑ MySQL is a replica
           ☑ Slave_IO_Running=Yes
           ☑ Slave_SQL_Running=Yes
           ☑ Last_Errno=0
    0      MySQL is a replica, but the IO or SQL thread is not running, or a replication error occurred
    -1     MySQL is not a replica: SHOW SLAVE|REPLICA STATUS returns no output

    Replication lag does not affect the running metric: replication can be running but lagging.

Options

  • report-not-a-replica
    Default: no
    If yes, report repl.running = -1 if not a replica. If no, drop the metric if not a replica.
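
The running values and the report-not-a-replica option described above can be sketched as a small decision function. This is an illustrative sketch, not Blip's code: the function names are hypothetical, and the status row is modeled as a plain dict keyed by column name (the Replica_* names are the SHOW REPLICA STATUS equivalents of the Slave_* columns).

```python
def repl_running(status):
    """Map one replica status row (dict), or None if no output, to 1, 0, or -1."""
    if not status:
        return -1  # not a replica: the statement returned no rows
    io = status.get("Replica_IO_Running", status.get("Slave_IO_Running"))
    sql = status.get("Replica_SQL_Running", status.get("Slave_SQL_Running"))
    errno = int(status.get("Last_Errno", 0))
    # 1 only if both threads are running and there is no replication error
    return 1 if io == "Yes" and sql == "Yes" and errno == 0 else 0


def collect_running(status, report_not_a_replica=False):
    """Return the value to report, or None if the metric is dropped."""
    value = repl_running(status)
    if value == -1 and not report_not_a_replica:
        return None
    return value


healthy = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes", "Last_Errno": "0"}
print(collect_running(healthy))
print(collect_running(None, report_not_a_replica=True))
```

Note that Seconds_Behind_Master is not consulted, which matches the statement that lag does not affect the running metric.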

repl.lag

MySQL Replication Lag

Blip version v1.0.0
Sources Blip Heartbeat
MySQL config yes
Derived metrics current (gauge): Current replication lag (milliseconds)
Meta source = Option source-id
Options network-latency
repl-check
report-no-heartbeat
report-not-a-replica
source-id
source-role
table
writer

The repl.lag collector measures and reports MySQL replication lag from a source using the Blip heartbeat. By default, it reports replication lag from the latest timestamp (heartbeat), which presumes there is only one writable node in the replication topology at all times. See Heartbeat to learn more.

Derived metrics

  • current
    Type: gauge
    The current replication lag in milliseconds. This is an instantaneous measurement: replication lag at one moment. As such, it might not detect if lag is “flapping”: oscillating between near-zero and a higher value. However, it will always detect if replication is steadily lagged and if the lag increases. A future feature might measure and record lag between report intervals.

Options

  • network-latency
    Default: 50
    Network latency (in milliseconds) between source and replicas. The value must be an integer >= 0. (Do not suffix with “ms”.) See Heartbeat > Accuracy.

  • repl-check
    MySQL global system variable, like server_id. (Do not prefix with “@”.) If the value is zero, replica lag is not collected. See Heartbeat > Repl Check.

  • report-no-heartbeat
    Default: no
    If yes, report the value -1 when there is no heartbeat from the source. If no, drop the metric when there is no heartbeat from the source.

  • report-not-a-replica
    Default: no
    If yes, report repl.lag.current = -1 if not a replica. If no, drop the metric if not a replica.

  • source-id
    Source ID to report lag from. The default (no value) reports lag from the latest (most recent) timestamp. See Heartbeat > Source Following.

  • source-role
    Source role to report lag from. If set, the most recent timestamp is used. See Heartbeat > Source Following.

  • table
    Default: blip.heartbeat
    Blip heartbeat table.

  • writer
    Default: blip
    Type of heartbeat writer. Only blip is currently supported.
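
As a rough mental model of how a heartbeat timestamp, network-latency, and report-no-heartbeat interact, consider the sketch below. It is deliberately simplified and is an assumption, not Blip's algorithm: it treats lag as the age of the latest heartbeat minus the configured network latency, clamped at zero. The function names are hypothetical; the Heartbeat documentation is the authority on how Blip actually measures lag.

```python
def lag_ms(now_ms, heartbeat_ms, network_latency_ms=50):
    """Simplified model: heartbeat age minus expected network latency."""
    lag = now_ms - heartbeat_ms - network_latency_ms
    return max(lag, 0)  # never report negative lag


def collect_current(now_ms, heartbeat_ms, network_latency_ms=50,
                    report_no_heartbeat=False):
    """Return the value to report, or None if the metric is dropped."""
    if heartbeat_ms is None:
        # No heartbeat from the source: report -1 or drop the metric
        return -1 if report_no_heartbeat else None
    return lag_ms(now_ms, heartbeat_ms, network_latency_ms)


print(collect_current(now_ms=10_000, heartbeat_ms=9_000))
# prints 950
```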

size.binlog

Binary Log Storage Size

Blip version v1.0.0
Sources SHOW BINARY LOGS
MySQL config no
Derived metrics bytes: Total size of all binary logs in bytes
Error policy access-denied
binlog-not-enabled

Error Policy

  • access-denied
    MySQL error 1227: access denied on SHOW BINARY LOGS.

  • binlog-not-enabled
    MySQL error 1381: binary logging not enabled.

Derived metrics

  • bytes
    Type: gauge
    Total size of all binary logs in bytes.

size.database

Database Storage Sizes

Blip version v1.0.0
MySQL config no
Derived metrics bytes: Database size in bytes
Group keys db

Derived metrics

  • bytes
    Type: gauge
    Database size in bytes.

size.table

Table Storage Sizes

Blip version v1.0.0
MySQL config no
Derived metrics bytes: Table size in bytes
Group keys db, tbl

Derived metrics

  • bytes
    Type: gauge
    Table size in bytes.

status.global

Global Status Variables

Blip version v1.0.0
MySQL config no
Sources SHOW GLOBAL STATUS

status.global collects the primary source of MySQL server metrics: SHOW GLOBAL STATUS.

stmt.current

Statement Metrics

Statements are the second level of the event hierarchy:

transactions
└── statements
    └── stages
        └── waits

All queries are statements, but not all statements are queries. For example, “dump binary log” is a statement used by replicas, but it is not a query in the typical sense. As a result, this domain is much lower-level than the query domain even though the metrics are nearly identical.

Statement metrics are reported as summary statistics: average, maximum, and so forth.

stmt.current reports summary statistics for currently running statements.

tls

TLS (SSL) Status and Configuration

Blip version v1.0.0
MySQL config no
Sources Global variables
Derived metrics enabled: True (1) if have_ssl=YES, else false (0)

Derived metrics

  • enabled
    Type: bool
    True (1) if have_ssl = YES, else false (0).

have_ssl is deprecated as of MySQL 8.0.26. This domain does not currently support the tls_channel_status table.

trx

Transactions

Blip version v1.0.0
MySQL config no
Sources information_schema.innodb_trx
Derived metrics oldest: Time of oldest active trx in seconds

Derived metrics

  • oldest
    Type: gauge
    Time of oldest active (still running) transaction in seconds.

var.global

MySQL System Variables

Blip version v1.0.0
MySQL config no
Sources SHOW GLOBAL VARIABLES, SELECT @@GLOBAL.<var>, Performance Schema

var.global collects global MySQL system variables (“sysvars”).

These are not technically metrics, but some are required to calculate utilization percentages. For example, it’s common to report max_connections to gauge the percentage of max connections used: Max_used_connections / max_connections * 100, which would be status.global.max_used_connections / var.global.max_connections * 100 in Blip metric naming convention.
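
The utilization formula above is simple enough to sketch directly. This example assumes the two values have already been collected into a dict keyed by Blip-style metric names; the function name and the sample numbers are made up for illustration.

```python
def max_connections_used_pct(metrics):
    """Percentage of max_connections reached by the connection high-water mark."""
    used = metrics["status.global.max_used_connections"]
    limit = metrics["var.global.max_connections"]
    if limit <= 0:
        raise ValueError("max_connections must be positive")
    return used / limit * 100


sample = {
    "status.global.max_used_connections": 75,
    "var.global.max_connections": 150,
}
print(max_connections_used_pct(sample))
# prints 50.0
```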

wait.io.table

Table I/O Wait Metrics

Blip version v1.0.0
MySQL config yes
Sources performance_schema.table_io_waits_summary_by_table
Options exclude
include
truncate-table
truncate-timeout
all
Error policy truncate-timeout
Group keys db, tbl

Summarized table I/O wait metrics from performance_schema.table_io_waits_summary_by_table. All columns in that table can be specified, or use option all to collect all columns.

Options

  • include
    A comma-separated list of database or table names to include (overrides option exclude).

  • exclude
    Default: mysql.*,information_schema.*,performance_schema.*,sys.*
    A comma-separated list of database or table names to exclude (ignored if include is set).

  • truncate-table
    Default: yes
    If yes (the default), truncate the source table after each collection to reset its data.

  • truncate-timeout
    Default: 250ms
    The amount of time to wait while attempting to truncate performance_schema.table_io_waits_summary_by_table. Normally, truncating a table is nearly instantaneous, but metadata locks can block the operation.

  • all
    Default: no
    If yes, all performance_schema.table_io_waits_summary_by_table metrics are collected—all columns. If no (the default), only the explicitly listed performance_schema.table_io_waits_summary_by_table metrics are collected.
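
The precedence between include and exclude can be sketched as a filter over (db, tbl) pairs. This is an illustration under stated assumptions, not Blip's code: the function names are hypothetical, and the matching rule (an entry is db, db.*, or db.tbl, with no other wildcards) is inferred from the default exclude value.

```python
DEFAULT_EXCLUDE = "mysql.*,information_schema.*,performance_schema.*,sys.*"


def parse_list(value):
    """Split a comma-separated option value into non-empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def matches(entry, db, tbl):
    """True if entry ('db', 'db.*', or 'db.tbl') covers the given table."""
    entry_db, _, entry_tbl = entry.partition(".")
    if entry_db != db:
        return False
    return entry_tbl in ("", "*", tbl)


def should_collect(db, tbl, include="", exclude=DEFAULT_EXCLUDE):
    """Apply include if set (exclude is then ignored); otherwise apply exclude."""
    included = parse_list(include)
    if included:
        return any(matches(e, db, tbl) for e in included)
    return not any(matches(e, db, tbl) for e in parse_list(exclude))


print(should_collect("app", "users"))
print(should_collect("mysql", "user"))
print(should_collect("mysql", "user", include="mysql.user"))
```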

Error Policy

  • truncate-timeout
    Truncation failures on table performance_schema.table_io_waits_summary_by_table