Domains

This page documents the metric domains from which Blip currently collects metrics. Use --print-domains to list these domains from the command line:

$ blip --print-domains | less

Each domain begins with a table:

Blip version: Blip version in which the domain was added or changed.
MySQL config: Whether MySQL must be explicitly or specially configured to provide the metrics.
Sources: MySQL source of metrics.
Derived metrics: Derived metrics. Omitted if none.
Group keys: Metric groups. Omitted if none.
Meta: Metric meta. Omitted if none.
Options: Domain options. Omitted if none.
Error policy: MySQL error codes handled by optional error policy. Omitted if none.

aws.rds
innodb
percona.response-time
query.response-time
repl
repl.lag
size.binlog
size.database
size.table
status.global
stmt.current
tls
trx
var.global
wait.io.table

aws.rds

Amazon RDS for MySQL

Blip version	v1.0.0
MySQL config	no
Sources	Amazon RDS API

Collects Amazon RDS metrics.

innodb

InnoDB Metrics

Blip version	v1.0.0
MySQL config	maybe
Sources	`information_schema.innodb_metrics`
Meta	• `subsystem` = `SUBSYSTEM` column
Options	• `all`

Metrics from INFORMATION_SCHEMA.INNODB_METRICS.

Options

all
Default: no
If yes, all InnoDB metrics are collected (the whole table). If no (the default), only the explicitly listed InnoDB metrics are collected. If set to enabled, only InnoDB metrics enabled by the MySQL configuration are collected (WHERE status='enabled' in the table).
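
The values of the all option can be pictured as different queries against the source table. The following is a minimal sketch, not Blip's actual implementation: the helper name `innodb_metrics_query` and the exact SQL text are assumptions made for illustration. Only the table name and the status='enabled' condition come from the description above.

```python
def innodb_metrics_query(all_option: str, listed_metrics: list[str]) -> str:
    """Build an illustrative SELECT for the innodb domain (hypothetical helper)."""
    base = "SELECT subsystem, name, count FROM information_schema.innodb_metrics"
    if all_option == "yes":
        # Whole table, regardless of whether each metric is enabled in MySQL.
        return base
    if all_option == "enabled":
        # Only the metrics that the MySQL configuration has enabled.
        return base + " WHERE status = 'enabled'"
    if all_option == "no":
        # Only the metrics explicitly listed for the domain.
        if not listed_metrics:
            raise ValueError("option all=no requires at least one listed metric")
        names = ", ".join("'" + m.replace("'", "''") + "'" for m in listed_metrics)
        return base + f" WHERE name IN ({names})"
    raise ValueError(f"unsupported value for option all: {all_option!r}")


print(innodb_metrics_query("no", ["trx_rseg_history_len"]))
```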

percona.response-time

Percona Server Query Response Time

Blip version	v1.0.0
MySQL config	yes
Sources	Percona Server 5.7 RTD plugin
Derived metrics	• `pN` (gauge) for each value in the `percentiles` option
Meta	• `pN=pA`: where `pN` is collected percentile and `pA` is actual percentile
Options	• `flush` • `real-percentiles`
Error policy	• `unknown-table`

The percona.response-time domain collects query response time percentiles from the Percona Server 5.7 Response Time Distribution plugin.

This domain is functionally identical to query.response-time; only one option name is different:

`percona.response-time`	`query.response-time`
`flush`	`truncate-table`

See query.response-time for details.

Error Policy

unknown-table
MySQL error 1109: Unknown table ‘query_response_time’ in information_schema

query.response-time

MySQL Query Response Time

Blip version	v1.0.0
MySQL config	yes
Sources	MySQL 8.0 p_s.events_statements_histogram_global
Derived metrics	• `pN` (gauge)
Meta	• `pN=pA`: where `pN` is collected percentile and `pA` is actual percentile
Options	• `real-percentiles` • `truncate-table` • `truncate-timeout`
Error policy	• `table-not-exist` • `truncate-timeout`

The query.response-time domain collects query response time percentiles. By default, it reports the P999 (99.9th percentile) response time in microseconds.

To convert units, use the TransformMetrics plugin or write a custom sink.

Derived metrics

pN
Type: gauge
Response time percentile to collect, where N is between 1 and 999. (The “p” prefix is required.) p95 collects the 95th percentile. p999 collects the 99.9th percentile. The response time value is reported in microseconds. The true percentile might be slightly greater depending on how the histogram buckets are configured. For example, if collecting p95, the real percentile might be p95.8.
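
To see why the real percentile can exceed the requested one, consider a simplified histogram in which each bucket records an upper bound and the cumulative fraction of statements at or below that bound. The sketch below is an illustration under assumptions, not Blip's code: the function name, the sample buckets, and the use of microseconds for bucket bounds are all made up for the example.

```python
def percentile_from_histogram(buckets, target):
    """Return (value_us, actual_percentile) for the requested percentile.

    buckets: list of (upper_bound_us, cumulative_fraction), sorted by bound.
    target:  requested percentile as a fraction, for example 0.95 for p95.
    """
    for upper_us, quantile in buckets:
        # The first bucket whose cumulative fraction reaches the target
        # determines both the reported value and the real percentile.
        if quantile >= target:
            return upper_us, round(quantile * 100, 1)
    raise ValueError("histogram never reaches the requested percentile")


# Hypothetical histogram: no bucket boundary falls exactly on 95%.
buckets = [(100, 0.50), (1_000, 0.90), (10_000, 0.958), (100_000, 1.0)]
value_us, actual = percentile_from_histogram(buckets, 0.95)
print(value_us, actual)
```

With these sample buckets, a request for p95 is answered by the bucket that covers 95.8% of statements, which is the kind of difference that the pN=pA meta value records.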

Options

real-percentiles
Default: yes
If yes (default), reports the real percentile in meta for each percentile in options. MySQL (and Percona Server) use histograms with variable bucket ranges. Therefore, the P99 might actually be P98.9 or P99.2. Meta key pN indicates the configured percentile, and its value pA indicates the actual percentile that was used.
truncate-table
Default: yes
Truncate performance_schema.events_statements_histogram_global after each collection. This resets percentile values so that each collection represents the global query response time during the collection interval rather than during the entire uptime of MySQL. However, truncating the table interferes with other tools reading (or truncating) the table.
truncate-timeout
Default: 250ms
The amount of time to wait while attempting to truncate performance_schema.events_statements_histogram_global. Normally, truncating a table is nearly instantaneous, but metadata locks can block the operation.

Error Policy

table-not-exist
MySQL error 1146: Table ‘performance_schema.events_statements_histogram_global’ doesn’t exist
truncate-timeout
Truncation failures on table performance_schema.events_statements_histogram_global

repl

MySQL Replication

Blip version	v1.0.1
Sources	≥ MySQL 8.0.22: `SHOW REPLICA STATUS`; ≤ MySQL 8.0.21: `SHOW SLAVE STATUS`
MySQL config	no
Derived metrics	• `running` (gauge)
Meta	• `source` = `Source_Host` or `Master_Host`
Options	• `report-not-a-replica`

The repl domain collects replication metrics. Currently, it collects a single derived metric: running (described below).

A future release will collect these MySQL metrics:

Replica Status Variable	Collected
Slave_IO_Running	✓
Slave_SQL_Running	✓
Relay_Log_Space	✓
Seconds_Behind_Master	✓
Auto_Position	✓

Derived metrics

running
Type: gauge

Value	Meaning
1	☑ MySQL is a replica ☑ `Slave_IO_Running=Yes` ☑ `Slave_SQL_Running=Yes` ☑ `Last_Errno=0`
0	MySQL is a replica, but IO and SQL threads are not running or a replication error occurred
-1	MySQL is not a replica: `SHOW SLAVE|REPLICA STATUS` returns no output

Replication lag does not affect the running metric: replication can be running but lagging.

Options

report-not-a-replica
Default: no
If yes, report repl.running = -1 if not a replica. If no, drop the metric if not a replica.
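
The value table and the report-not-a-replica option can be summarized in a few lines. This is a hedged sketch rather than Blip's implementation: the function name `repl_running` and the dict-shaped status row are assumptions. The fallback to `Replica_*` column names reflects the renamed columns in the output of SHOW REPLICA STATUS.

```python
from typing import Optional


def repl_running(status: Optional[dict], report_not_a_replica: bool = False) -> Optional[int]:
    """Derive the running gauge from one replica status row (hypothetical helper).

    status is None or empty when SHOW SLAVE|REPLICA STATUS returns no output.
    Returns None when the metric should be dropped.
    """
    if not status:
        # Not a replica: report -1 only if the option asks for it.
        return -1 if report_not_a_replica else None

    def col(new_name: str, old_name: str):
        # Newer MySQL versions use Replica_* names; older ones use Slave_*.
        return status.get(new_name, status.get(old_name))

    healthy = (
        col("Replica_IO_Running", "Slave_IO_Running") == "Yes"
        and col("Replica_SQL_Running", "Slave_SQL_Running") == "Yes"
        and int(status.get("Last_Errno", 0)) == 0
    )
    return 1 if healthy else 0


row = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "No", "Last_Errno": 0}
print(repl_running(row))
```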

repl.lag

MySQL Replication Lag

Blip version	v1.0.0
Sources	Blip Heartbeat
MySQL config	yes
Derived metrics	• `current` (gauge): Current replication lag (milliseconds)
Meta	• `source` = Option `source-id`
Options	• `network-latency` • `repl-check` • `report-no-heartbeat` • `report-not-a-replica` • `source-id` • `source-role` • `table` • `writer`

The repl.lag collector measures and reports MySQL replication lag from a source using the Blip heartbeat. By default, it reports replication lag from the latest timestamp (heartbeat), which presumes there is only one writable node in the replication topology at all times. See Heartbeat to learn more.

Derived metrics

current
Type: gauge
The current replication lag in milliseconds. This is an instantaneous measurement: replication lag at one moment. As such, it might not detect lag that is “flapping”: oscillating between near-zero and a higher value. However, it will always detect replication that is steadily lagged and lag that increases. A future feature might measure and record lag between report intervals.
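
The difference between flapping and steady lag under instantaneous sampling can be shown with made-up numbers. The sketch below does not model how Blip measures lag. It only assumes a hypothetical series of true lag values (one per second, in milliseconds) and a collector that reads the value at fixed intervals.

```python
def sample(series, every):
    """Return the instantaneous readings taken every `every` steps."""
    return series[::every]


# Hypothetical true lag per second, in milliseconds.
flapping = [0, 4000] * 4                       # oscillates between 0 and 4000
steady = [4000 + 100 * i for i in range(8)]    # lagged and slowly increasing

print(sample(flapping, 2))  # readings land only on the near-zero moments
print(sample(steady, 2))    # every reading shows the lag and its growth
```

In this contrived case, sampling every two seconds sees only zeros for the flapping series, while the steadily lagged series is visible in every reading.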

Options

network-latency
Default: 50
Network latency (in milliseconds) between source and replicas. The value must be an integer >= 0. (Do not suffix with “ms”.) See Heartbeat > Accuracy.
repl-check
MySQL global system variable, like server_id. (Do not prefix with “@”.) If the value is zero, replica lag is not collected. See Heartbeat > Repl Check.
report-no-heartbeat
Default: no
If yes, report the value -1 when there is no heartbeat from the source. If no, drop the metric when there is no heartbeat from the source.
report-not-a-replica
Default: no
If yes, report repl.lag.current = -1 if not a replica. If no, drop the metric if not a replica.
source-id
Source ID to report lag from. The default (no value) reports lag from the latest (most recent) timestamp. See Heartbeat > Source Following.
source-role
Source role to report lag from. If set, the most recent timestamp is used. See Heartbeat > Source Following.
table
Default: blip.heartbeat
Blip heartbeat table.
writer
Default: blip
Type of heartbeat writer. Only blip is currently supported.

size.binlog

Binary Log Storage Size

Blip version	v1.0.0
Sources	`SHOW BINARY LOGS`
MySQL config	no
Derived metrics	• `bytes`: Total size of all binary logs in bytes
Error policy	• `access-denied` • `binlog-not-enabled`

Error Policy

access-denied
MySQL error 1227: access denied on SHOW BINARY LOGS.
binlog-not-enabled
MySQL error 1381: binary logging not enabled.

Derived metrics

bytes
Type: gauge
Total size of all binary logs in bytes.
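
As a rough illustration of how bytes relates to its source, the sketch below sums the File_size column over rows shaped like the output of SHOW BINARY LOGS. The function name and the sample rows are hypothetical, and this is not Blip's code.

```python
def binlog_bytes(rows):
    """Sum File_size over rows shaped like SHOW BINARY LOGS output."""
    return sum(int(row["File_size"]) for row in rows)


# Hypothetical rows; a real server returns one row per binary log file.
rows = [
    {"Log_name": "binlog.000001", "File_size": 1073741824},
    {"Log_name": "binlog.000002", "File_size": 52428800},
]
print(binlog_bytes(rows))
```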

size.database

Database Storage Sizes

Blip version	v1.0.0
MySQL config	no
Derived metrics	• `bytes`: Database size in bytes
Group keys	`db`

Derived metrics

bytes
Type: gauge
Database size in bytes.

size.table

Table Storage Sizes

Blip version	v1.0.0
MySQL config	no
Derived metrics	• `bytes`: Table size in bytes
Group keys	`db`, `tbl`

Derived metrics

bytes
Type: gauge
Table size in bytes.

status.global

Global Status Variables

Blip version	v1.0.0
MySQL config	no
Sources	`SHOW GLOBAL STATUS`

status.global collects the primary source of MySQL server metrics: SHOW GLOBAL STATUS.

stmt.current

Statement Metrics

Statements are the second level of the event hierarchy:

transactions
└── statements
    └── stages
        └── waits

All queries are statements, but not all statements are queries. For example, “dump binary log” is a statement used by replicas, but it is not a query in the typical sense. As a result, this domain is much more low-level than the query domain even though the metrics are nearly identical.

Statement metrics are reported as summary statistics: average, maximum, and so forth.

stmt.current reports summary statistics for currently running statements.

tls

TLS (SSL) Status and Configuration

Blip version	v1.0.0
MySQL config	no
Sources	Global variables
Derived metrics	• `enabled`: True (1) if have_ssl=YES, else false (0)

Derived metrics

enabled
Type: bool
True (1) if have_ssl = YES, else false (0).

have_ssl is deprecated as of MySQL 8.0.26. This domain does not currently support the tls_channel_status table.

trx

Transactions

Blip version	v1.0.0
MySQL config	no
Sources	`information_schema.innodb_trx`
Derived metrics	• `oldest`: Time of oldest active trx in seconds

Derived metrics

oldest
Type: gauge
Time of oldest active (still running) transaction in seconds.
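
One way to picture the oldest metric is as the age of the earliest start time among active transactions. The following sketch assumes start times taken from a column such as trx_started in information_schema.innodb_trx. The function name and the choice to return 0 when there are no active transactions are assumptions for illustration, not documented Blip behavior.

```python
from datetime import datetime


def oldest_trx_seconds(trx_started, now):
    """Return the age in seconds of the oldest active transaction.

    trx_started: list of datetime start times, one per active transaction.
    now:         the current time on the same clock.
    """
    if not trx_started:
        # Assumption: no active transactions is treated as an age of zero.
        return 0.0
    return (now - min(trx_started)).total_seconds()


now = datetime(2024, 1, 1, 12, 0, 30)
started = [datetime(2024, 1, 1, 12, 0, 0), datetime(2024, 1, 1, 12, 0, 20)]
print(oldest_trx_seconds(started, now))
```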

var.global

MySQL System Variables

Blip version	v1.0.0
MySQL config	no
Sources	`SHOW GLOBAL VARIABLES`, `SELECT @@GLOBAL.<var>`, Performance Schema

var.global collects global MySQL system variables (“sysvars”).

These are not technically metrics, but some are required to calculate utilization percentages. For example, it’s common to report max_connections to gauge the percentage of max connections used: Max_used_connections / max_connections * 100, which would be status.global.max_used_connections / var.global.max_connections * 100 in Blip metric naming convention.
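
The utilization formula above can be written as a small function. This is a minimal sketch: the function name and the flat dict of metric values are assumptions for illustration, and only the two metric names and the formula come from the text.

```python
def connection_utilization_pct(metrics):
    """Compute max-connections utilization as a percentage.

    metrics: dict keyed by Blip-style metric names (hypothetical input shape).
    """
    used = metrics["status.global.max_used_connections"]
    limit = metrics["var.global.max_connections"]
    if limit <= 0:
        raise ValueError("max_connections must be greater than zero")
    return used / limit * 100


values = {
    "status.global.max_used_connections": 75,
    "var.global.max_connections": 300,
}
print(connection_utilization_pct(values))
```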

wait.io.table

Table I/O Wait Metrics

Blip version	v1.0.0
MySQL config	yes
Sources	`performance_schema.table_io_waits_summary_by_table`
Options	• `exclude` • `include` • `truncate-table` • `truncate-timeout` • `all`
Error policy	• `truncate-timeout`
Group keys	`db`, `tbl`

Summarized table I/O wait metrics from performance_schema.table_io_waits_summary_by_table. All columns in that table can be specified, or use option all to collect all columns.

Options

include
A comma-separated list of database or table names to include (overrides option exclude).
exclude
Default: mysql.*,information_schema.*,performance_schema.*,sys.*
A comma-separated list of database or table names to exclude (ignored if include is set).
truncate-table
Default: yes
If the source table should be truncated to reset data after each retrieval.
truncate-timeout
Default: 250ms
The amount of time to wait while attempting to truncate performance_schema.table_io_waits_summary_by_table. Normally, truncating a table is nearly instantaneous, but metadata locks can block the operation.
all
Default: no
If yes, all performance_schema.table_io_waits_summary_by_table metrics are collected—all columns. If no (the default), only the explicitly listed performance_schema.table_io_waits_summary_by_table metrics are collected.
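
The precedence between include and exclude can be sketched as a simple filter. This is an illustration under assumptions rather than Blip's implementation: the function names are hypothetical, and the matching rule (an entry of `db` or `db.*` matches every table in that database, while `db.tbl` matches one table) is a guess at how the list entries are interpreted.

```python
DEFAULT_EXCLUDE = "mysql.*,information_schema.*,performance_schema.*,sys.*"


def _entries(csv):
    """Split a comma-separated option value into trimmed, non-empty entries."""
    return [e.strip() for e in csv.split(",") if e.strip()]


def _matches(entry, db, tbl):
    """Assumed rule: 'db' or 'db.*' matches the database, 'db.tbl' one table."""
    entry_db, _, entry_tbl = entry.partition(".")
    if entry_db != db:
        return False
    return entry_tbl in ("", "*") or entry_tbl == tbl


def keep_table(db, tbl, include="", exclude=DEFAULT_EXCLUDE):
    """Return True if metrics for db.tbl should be collected."""
    included = _entries(include)
    if included:
        # include overrides exclude: only listed names are kept.
        return any(_matches(e, db, tbl) for e in included)
    return not any(_matches(e, db, tbl) for e in _entries(exclude))


print(keep_table("app", "orders"), keep_table("mysql", "user"))
```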

Error Policy

truncate-timeout
Truncation failures on table performance_schema.table_io_waits_summary_by_table