Domains
This page documents the metric domains from which Blip currently collects metrics. Use --print-domains to list these domains from the command line:
$ blip --print-domains | less
Each domain begins with a table:
- Blip version: Blip version in which the domain was added or changed.
- MySQL config: Whether MySQL must be explicitly or specially configured to provide the metrics.
- Sources: MySQL source of metrics.
- Derived metrics: Derived metrics. Omitted if none.
- Group keys: Metric groups. Omitted if none.
- Meta: Metric meta. Omitted if none.
- Options: Domain options. Omitted if none.
- Error policy: MySQL error codes handled by optional error policy. Omitted if none.
- aws.rds
- innodb
- percona.response-time
- query.response-time
- repl
- repl.lag
- size.binlog
- size.database
- size.table
- status.global
- stmt.current
- tls
- trx
- var.global
- wait.io.table
aws.rds
Amazon RDS for MySQL
Blip version | v1.0.0 |
MySQL config | no |
Sources | Amazon RDS API |
Collects Amazon RDS metrics.
innodb
InnoDB Metrics
Blip version | v1.0.0 |
MySQL config | maybe |
Sources | information_schema.innodb_metrics |
Meta | • subsystem = SUBSYSTEM column |
Options | • all |
Metrics from INFORMATION_SCHEMA.INNODB_METRICS.
Options
all
Default: no
If yes, all InnoDB metrics are collected (the whole table). If no (the default), only the explicitly listed InnoDB metrics are collected. If enabled, only InnoDB metrics enabled by the MySQL configuration are collected (WHERE status='enabled' in the table).
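The three values of the all option can be pictured as three variants of one query. The following is a minimal sketch, not Blip's actual implementation: the function name, the selected columns, and the way listed metrics are passed in are assumptions made for illustration.

```python
from typing import Sequence

# Assumed column list; the real collector may select different columns.
BASE_QUERY = "SELECT name, subsystem, count FROM information_schema.innodb_metrics"


def innodb_metrics_query(all_option: str = "no", metrics: Sequence[str] = ()) -> str:
    """Return a query for the given value of option `all` (hypothetical helper)."""
    if all_option == "yes":
        # Whole table, regardless of status.
        return BASE_QUERY
    if all_option == "enabled":
        # Only metrics enabled by the MySQL configuration.
        return BASE_QUERY + " WHERE status = 'enabled'"
    if all_option == "no":
        # Only the explicitly listed metrics.
        if not metrics:
            raise ValueError("option all=no requires at least one listed metric")
        quoted = ", ".join("'" + m.replace("'", "''") + "'" for m in metrics)
        return f"{BASE_QUERY} WHERE name IN ({quoted})"
    raise ValueError(f"invalid value for option all: {all_option!r}")
```

For example, `innodb_metrics_query("no", ["trx_rseg_history_len"])` restricts the query to a single named metric.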
percona.response-time
Percona Server Query Response Time
Blip version | v1.0.0 |
MySQL config | yes |
Sources | Percona Server 5.7 RTD plugin |
Derived metrics | • pN (gauge) for each value in the percentiles option |
Meta | • pN=pA : where pN is collected percentile and pA is actual percentile |
Options | • flush • real-percentiles |
Error policy | • unknown-table |
The percona.response-time domain collects query response time percentiles from the Percona Server 5.7 Response Time Distribution plugin.
This domain is functionally identical to query.response-time; only one option name is different:
percona.response-time | query.response-time |
---|---|
flush | truncate-table |
See query.response-time for details.
Error Policy
unknown-table
MySQL error 1109: Unknown table ‘query_response_time’ in information_schema
query.response-time
MySQL Query Response Time
Blip version | v1.0.0 |
MySQL config | yes |
Sources | MySQL 8.0 p_s.events_statements_histogram_global |
Derived metrics | • pN (gauge) |
Meta | • pN=pA : where pN is collected percentile and pA is actual percentile |
Options | • real-percentiles • truncate-table • truncate-timeout |
Error policy | • table-not-exist • truncate-timeout |
The query.response-time domain collects query response time percentiles. By default, it reports the P999 (99.9th percentile) response time in microseconds.
To convert units, use the TransformMetrics plugin or write a custom sink.
Derived metrics
pN
Type: gauge
Response time percentile to collect, where N is between 1 and 999. (The “p” prefix is required.) p95 collects the 95th percentile. p999 collects the 99.9th percentile. The response time value is reported in microseconds. The true percentile might be slightly greater depending on how the histogram buckets are configured. For example, if collecting p95, the real percentile might be p95.8.
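To see why the real percentile can exceed the configured one, consider a histogram whose buckets each carry a cumulative quantile. The sketch below is illustrative only: the function names, the (upper bound, quantile) tuple layout, and the rule that values of 100 or more denote tenths of a percent are assumptions, not Blip's code.

```python
from typing import List, Tuple


def parse_percentile(name: str) -> float:
    """Convert a pN name to a fraction: 'p95' -> 0.95, 'p999' -> 0.999."""
    digits = name[1:]
    if not name.startswith("p") or not digits.isdigit():
        raise ValueError(f"invalid percentile name: {name!r}")
    n = int(digits)
    if not 1 <= n <= 999:
        raise ValueError(f"percentile out of range: {name!r}")
    # Assumption: N >= 100 is tenths of a percent, otherwise whole percent.
    return n / 1000 if n >= 100 else n / 100


def pick_bucket(buckets: List[Tuple[int, float]], target: float) -> Tuple[int, float]:
    """Return (response time in microseconds, actual percentile) for the first
    bucket whose cumulative quantile reaches the target fraction.

    `buckets` is a list of (upper_bound_us, cumulative_quantile) sorted ascending.
    """
    for upper_us, quantile in buckets:
        if quantile >= target:
            return upper_us, round(quantile * 100, 1)
    raise ValueError("no bucket reaches the target percentile")


# Made-up histogram: no bucket ends exactly at 95%, so p95 lands on 95.8%.
buckets = [(100, 0.50), (200, 0.90), (400, 0.958), (800, 1.0)]
value_us, actual = pick_bucket(buckets, parse_percentile("p95"))
```

Here `value_us` is 400 and `actual` is 95.8, which corresponds to the p95.8 example in the text.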
Options
- real-percentiles
  Default: yes
  If yes (default), reports the real percentile in meta for each percentile in options. MySQL (and Percona Server) use histograms with variable bucket ranges. Therefore, the P99 might actually be P98.9 or P99.2. Meta key pN indicates the configured percentile, and its value pA indicates the actual percentile that was used.
- truncate-table
  Default: yes
  Truncate performance_schema.events_statements_histogram_global after each collection. This resets percentile values so that each collection represents the global query response time during the collection interval rather than during the entire uptime of MySQL. However, truncating the table interferes with other tools reading (or truncating) the table.
- truncate-timeout
  Default: 250ms
  The amount of time to wait while attempting to truncate performance_schema.events_statements_histogram_global. Normally, truncating a table is nearly instantaneous, but metadata locks can block the operation.
Error Policy
- table-not-exist
  MySQL error 1146: Table ‘performance_schema.events_statements_histogram_global’ doesn’t exist
- truncate-timeout
  Truncation failures on table performance_schema.events_statements_histogram_global
repl
MySQL Replication
Blip version | v1.0.1 |
Sources | ≥ MySQL 8.0.22: SHOW REPLICA STATUS; ≤ MySQL 8.0.21: SHOW SLAVE STATUS |
MySQL config | no |
Derived metrics | • running (gauge) |
Meta | • source = Source_Host or Master_Host |
Options | • report-not-a-replica |
The repl domain collects replication metrics. Currently, it collects a single derived metric: running (described below).
A future release will collect these MySQL metrics:
Replica Status Variable | Collected |
---|---|
Slave_IO_Running | ✓ |
Slave_SQL_Running | ✓ |
Relay_Log_Space | ✓ |
Seconds_Behind_Master | ✓ |
Auto_Position | ✓ |
Derived metrics
running
Type: gauge
Value | Meaning |
---|---|
1 | MySQL is a replica, Slave_IO_Running=Yes, Slave_SQL_Running=Yes, and Last_Errno=0 |
0 | MySQL is a replica, but IO and SQL threads are not running or a replication error occurred |
-1 | MySQL is not a replica: SHOW SLAVE STATUS (or SHOW REPLICA STATUS) returns no output |
Replication lag does not affect the running metric: replication can be running but lagging.
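The value rules for running can be expressed as a small function over one row of replica status output. This is a sketch under assumptions rather than Blip's code: the function name and the dict-based row are hypothetical, and the report_not_a_replica parameter mirrors the option described below.

```python
from typing import Mapping, Optional


def repl_running(
    status: Optional[Mapping[str, str]],
    report_not_a_replica: bool = False,
) -> Optional[int]:
    """Derive the `running` gauge from one replica status row.

    `status` is None or empty when the statement returns no output,
    which means MySQL is not a replica. Returns None when the metric
    should be dropped.
    """
    if not status:
        return -1 if report_not_a_replica else None
    # Accept both the newer (Replica_*) and older (Slave_*) column names.
    io = status.get("Replica_IO_Running") or status.get("Slave_IO_Running")
    sql = status.get("Replica_SQL_Running") or status.get("Slave_SQL_Running")
    last_errno = int(status.get("Last_Errno", 0))
    healthy = io == "Yes" and sql == "Yes" and last_errno == 0
    return 1 if healthy else 0
```

Note that Seconds_Behind_Master is deliberately not consulted: a lagging but running replica still yields 1.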
Options
report-not-a-replica
Default: no
If yes, report repl.running = -1 if not a replica. If no, drop the metric if not a replica.
repl.lag
MySQL Replication Lag
Blip version | v1.0.0 |
Sources | Blip Heartbeat |
MySQL config | yes |
Derived metrics | • current (gauge): Current replication lag (milliseconds) |
Meta | • source = Option source-id |
Options | • network-latency • repl-check • report-no-heartbeat • report-not-a-replica • source-id • source-role • table • writer |
The repl.lag collector measures and reports MySQL replication lag from a source using the Blip heartbeat. By default, it reports replication lag from the latest timestamp (heartbeat), which presumes there is only one writable node in the replication topology at all times. See Heartbeat to learn more.
Derived metrics
current
Type: gauge
The current replication lag in milliseconds. This is an instantaneous measurement: replication lag at one moment. As such, it might not detect if lag is “flapping”: oscillating between near-zero and a higher value. However, it will always detect if replication is steadily lagged and if the lag increases. A future feature might measure and record lag between report intervals.
Options
- network-latency
  Default: 50
  Network latency (in milliseconds) between source and replicas. The value must be an integer >= 0. (Do not suffix with “ms”.) See Heartbeat > Accuracy.
- repl-check
  MySQL global system variable, like server_id. (Do not prefix with “@”.) If the value is zero, replica lag is not collected. See Heartbeat > Repl Check.
- report-no-heartbeat
  Default: no
  If yes, no heartbeat from the source is reported as value -1. If no, the metric is dropped if no heartbeat from the source.
- report-not-a-replica
  Default: no
  If yes, report repl.running = -1 if not a replica. If no, drop the metric if not a replica.
- source-id
  Source ID to report lag from. The default (no value) reports lag from the latest (most recent) timestamp. See Heartbeat > Source Following.
- source-role
  Source role to report lag from. If set, the most recent timestamp is used. See Heartbeat > Source Following.
- table
  Default: blip.heartbeat
  Blip heartbeat table.
- writer
  Default: blip
  Type of heartbeat writer. Only blip is currently supported.
size.binlog
Binary Log Storage Size
Blip version | v1.0.0 |
Sources | SHOW BINARY LOGS |
MySQL config | no |
Derived metrics | • bytes : Total size of all binary logs in bytes |
Error policy | • access-denied • binlog-not-enabled |
Error Policy
- access-denied
  MySQL error 1227: access denied on SHOW BINARY LOGS.
- binlog-not-enabled
  MySQL error 1381: binary logging not enabled.
Derived metrics
bytes
Type: gauge
Total size of all binary logs in bytes.
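As a rough illustration, the bytes value is the sum of the File_size column over all rows returned by SHOW BINARY LOGS. The helper below is a hypothetical sketch that assumes the rows have already been fetched as dicts; it is not Blip's implementation.

```python
from typing import Iterable, Mapping, Union


def binlog_bytes(rows: Iterable[Mapping[str, Union[str, int]]]) -> int:
    """Sum File_size across rows of SHOW BINARY LOGS output."""
    return sum(int(row["File_size"]) for row in rows)


# Made-up sample rows for illustration.
sample_rows = [
    {"Log_name": "binlog.000001", "File_size": 1073741824},
    {"Log_name": "binlog.000002", "File_size": "4096"},
]
total = binlog_bytes(sample_rows)
```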
size.database
Database Storage Sizes
Blip version | v1.0.0 |
MySQL config | no |
Derived metrics | • bytes : Database size in bytes |
Group keys | db |
Derived metrics
bytes
Type: gauge
Database size in bytes.
size.table
Table Storage Sizes
Blip version | v1.0.0 |
MySQL config | no |
Derived metrics | • bytes : Table size in bytes |
Group keys | db , tbl |
Derived metrics
bytes
Type: gauge
Table size in bytes.
status.global
Global Status Variables
Blip version | v1.0.0 |
MySQL config | no |
Sources | SHOW GLOBAL STATUS |
status.global collects the primary source of MySQL server metrics: SHOW GLOBAL STATUS.
stmt.current
Statement Metrics
Statements are the second level of the event hierarchy:
transactions
└── statements
└── stages
└── waits
All queries are statements, but not all statements are queries. For example, “dump binary log” is a statement used by replicas, but it is not a query in the typical sense. As a result, this domain is much more low-level than the query domain even though the metrics are nearly identical.
Statement metrics are reported as summary statistics: average, maximum, and so forth.
stmt.current reports summary statistics for currently running statements.
tls
TLS (SSL) Status and Configuration
Blip version | v1.0.0 |
MySQL config | no |
Sources | Global variables |
Derived metrics | • enabled : True (1) if have_ssl=YES, else false (0) |
Derived metrics
enabled
Type: bool
True (1) if have_ssl = YES, else false (0).
have_ssl is deprecated as of MySQL 8.0.26. This domain does not currently support the tls_channel_status table.
trx
Transactions
Blip version | v1.0.0 |
MySQL config | no |
Sources | information_schema.innodb_trx |
Derived metrics | • oldest : Time of oldest active trx in seconds |
Derived metrics
oldest
Type: gauge
Time of oldest active (still running) transaction in seconds.
var.global
MySQL System Variables
Blip version | v1.0.0 |
MySQL config | no |
Sources | SHOW GLOBAL VARIABLES , SELECT @@GLOBAL.<var> , Performance Schema |
var.global collects global MySQL system variables (“sysvars”).
These are not technically metrics, but some are required to calculate utilization percentages. For example, it’s common to report max_connections to gauge the percentage of max connections used: Max_used_connections / max_connections * 100, which would be status.global.max_used_connections / var.global.max_connections * 100 in Blip metric naming convention.
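The utilization formula above could be applied downstream of Blip, for example in a sink or dashboard. The following sketch assumes the two values have already been read from status.global and var.global; the function name is hypothetical.

```python
def connection_utilization(max_used_connections: int, max_connections: int) -> float:
    """Percentage of max connections used:
    status.global.max_used_connections / var.global.max_connections * 100
    """
    if max_connections <= 0:
        raise ValueError("max_connections must be greater than zero")
    return max_used_connections / max_connections * 100


# Made-up values: 75 of 150 connections used.
pct = connection_utilization(75, 150)
```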
wait.io.table
Table I/O Wait Metrics
Blip version | v1.0.0 |
MySQL config | yes |
Sources | performance_schema.table_io_waits_summary_by_table |
Options | • exclude • include • truncate-table • truncate-timeout • all |
Error policy | • truncate-timeout |
Group keys | db , tbl |
Summarized table I/O wait metrics from performance_schema.table_io_waits_summary_by_table. All columns in that table can be specified, or use option all to collect all columns.
Options
- include
  A comma-separated list of database or table names to include (overrides option exclude).
- exclude
  Default: mysql.*,information_schema.*,performance_schema.*,sys.*
  A comma-separated list of database or table names to exclude (ignored if include is set).
- truncate-table
  Default: yes
  If the source table should be truncated to reset data after each retrieval.
- truncate-timeout
  Default: 250ms
  The amount of time to wait while attempting to truncate performance_schema.table_io_waits_summary_by_table. Normally, truncating a table is nearly instantaneous, but metadata locks can block the operation.
- all
  Default: no
  If yes, all performance_schema.table_io_waits_summary_by_table metrics are collected (all columns). If no (the default), only the explicitly listed performance_schema.table_io_waits_summary_by_table metrics are collected.
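The precedence between include and exclude can be sketched as a filter over db.tbl names. This is an assumption-heavy illustration, not Blip's implementation: the function names are hypothetical, and the matching rules (shell-style wildcards, with a bare database name treated as db.*) are guesses based on the default exclude value.

```python
from fnmatch import fnmatchcase
from typing import List

DEFAULT_EXCLUDE = "mysql.*,information_schema.*,performance_schema.*,sys.*"


def _patterns(option: str) -> List[str]:
    """Split a comma-separated option value into db.tbl patterns."""
    patterns = []
    for item in option.split(","):
        item = item.strip()
        if not item:
            continue
        # Assumption: a bare database name means every table in that database.
        patterns.append(item if "." in item else item + ".*")
    return patterns


def keep_table(db: str, tbl: str, include: str = "", exclude: str = DEFAULT_EXCLUDE) -> bool:
    """Return True if metrics for db.tbl should be collected."""
    name = f"{db}.{tbl}"
    included = _patterns(include)
    if included:
        # include overrides exclude: only listed names are kept.
        return any(fnmatchcase(name, p) for p in included)
    return not any(fnmatchcase(name, p) for p in _patterns(exclude))
```

With the defaults, `keep_table("app", "users")` is true and `keep_table("mysql", "user")` is false; setting include to `app.orders` keeps only that one table.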
Error Policy
truncate-timeout
Truncation failures on table performance_schema.table_io_waits_summary_by_table