林昶
林昶
Published on 2025-09-29 / 60 Visits
1
0

MySQL 监控部署文档

📋 文档概述

本文档详细说明如何在 Kubernetes 环境中部署 MySQL 监控系统,监控多个环境的 MySQL 实例(dev、uat 等)。

🎯 架构组件

  • Prometheus Operator: 管理 Prometheus 实例

  • mysqld_exporter: MySQL 指标导出器

  • Grafana: 数据可视化

  • Kubernetes: 容器编排平台

🔧 前置条件

  • Kubernetes 集群(已安装 KubeSphere)

  • kubectl 命令行工具

  • 访问 MySQL 实例的权限


第一部分:MySQL 服务器配置

1.1 创建监控用户

在每个 MySQL 服务器上执行:

-- 创建专用监控用户
CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'StrongPassword123!' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';
GRANT SELECT ON performance_schema.* TO 'exporter'@'localhost';
FLUSH PRIVILEGES;

1.2 安装和配置 mysqld_exporter

# 下载 mysqld_exporter
MYSQLD_EXPORTER_VERSION="0.17.2"
wget https://github.com/prometheus/mysqld_exporter/releases/download/v${MYSQLD_EXPORTER_VERSION}/mysqld_exporter-${MYSQLD_EXPORTER_VERSION}.linux-amd64.tar.gz

# 解压安装
tar xvf mysqld_exporter-${MYSQLD_EXPORTER_VERSION}.linux-amd64.tar.gz
sudo mv mysqld_exporter-${MYSQLD_EXPORTER_VERSION}.linux-amd64/mysqld_exporter /usr/local/bin/
sudo chmod +x /usr/local/bin/mysqld_exporter

# 创建系统用户
sudo useradd -rs /bin/false mysqld_exporter

# 创建配置文件(根据环境调整密码)
sudo tee /etc/mysqld_exporter.cnf > /dev/null <<EOF
[client]
user=exporter
password=StrongPassword123!
host=localhost
port=3306
EOF

sudo chown mysqld_exporter:mysqld_exporter /etc/mysqld_exporter.cnf
sudo chmod 600 /etc/mysqld_exporter.cnf

1.3 创建 systemd 服务

sudo tee /etc/systemd/system/mysqld_exporter.service > /dev/null <<'EOF'
[Unit]
Description=Prometheus MySQL Exporter
After=network.target

[Service]
User=mysqld_exporter
Environment=DATA_SOURCE_NAME="exporter:StrongPassword123!@(localhost:3306)/"
ExecStart=/usr/local/bin/mysqld_exporter \
  --config.my-cnf=/etc/mysqld_exporter.cnf \
  --collect.global_status \
  --collect.info_schema.tables \
  --collect.info_schema.tablestats \
  --collect.info_schema.innodb_metrics \
  --collect.perf_schema.tablelocks \
  --collect.perf_schema.file_events \
  --collect.perf_schema.eventswaits \
  --collect.perf_schema.indexiowaits \
  --collect.perf_schema.tableiowaits \
  --collect.slave_status \
  --collect.auto_increment.columns \
  --collect.info_schema.processlist \
  --collect.binlog_size \
  --collect.info_schema.query_response_time \
  --collect.info_schema.userstats \
  --web.listen-address=0.0.0.0:9104
Restart=always
RestartSec=5s

[Install]
WantedBy=multi-user.target
EOF

# 启动服务
sudo systemctl daemon-reload
sudo systemctl enable mysqld_exporter
sudo systemctl start mysqld_exporter

# 验证服务
sudo systemctl status mysqld_exporter
curl http://localhost:9104/metrics | grep mysql_up

1.4 防火墙配置

# 开放 9104 端口
sudo firewall-cmd --add-port=9104/tcp --permanent
sudo firewall-cmd --reload

# 验证端口可访问性
curl http://<服务器IP>:9104/metrics | grep mysql_up

第二部分:Kubernetes 配置

2.1 创建抓取配置 Secret

创建 mysql-scrape-config.yaml

apiVersion: v1
kind: Secret
metadata:
  name: additional-scrape-configs
  namespace: kubesphere-monitoring-system
type: Opaque
stringData:
  mysql-scrape.yaml: |
    - job_name: 'mysql-dev'
      static_configs:
        - targets: ['192.168.181.153:9104']
          labels:
            instance: 'mysql-dev'
            environment: 'dev'
            account: 'dev-account'
            group: 'dev-group'
            vendor: 'internal'
            region: 'cn-east-1'
      scrape_interval: 30s
      scrape_timeout: 10s
      metrics_path: /metrics

    - job_name: 'mysql-uat'
      static_configs:
        - targets: ['192.168.181.121:9104']
          labels:
            instance: 'mysql-uat'
            environment: 'uat'
            account: 'uat-account'
            group: 'uat-group'
            vendor: 'internal'
            region: 'cn-east-1'
      scrape_interval: 15s
      scrape_timeout: 10s
      metrics_path: /metrics

kubectl apply -f mysql-scrape-config.yaml

2.2 更新 Prometheus CRD

获取当前 Prometheus 配置:

kubectl -n kubesphere-monitoring-system get prometheus k8s -o yaml > prometheus-crd.yaml

编辑 prometheus-crd.yaml,在 spec部分添加:

spec:
  additionalScrapeConfigs:
    name: additional-scrape-configs
    key: mysql-scrape.yaml
  # ... 其他现有配置保持不变

应用更新:

kubectl apply -f prometheus-crd.yaml

2.3 验证配置

# 检查 Secret
kubectl -n kubesphere-monitoring-system get secret additional-scrape-configs

# 检查 Prometheus 配置
kubectl -n kubesphere-monitoring-system port-forward svc/prometheus-k8s 9090
# 访问 http://localhost:9090/config 查看配置

# 检查目标状态
curl http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | select(.scrapePool | startswith("mysql"))'

第三部分:Grafana 配置

3.1 添加 Prometheus 数据源

如果尚未添加,在 Grafana 中:

  1. 导航到 ​Configuration → Data Sources

  2. 点击 ​Add data source

  3. 选择 ​Prometheus

  4. 配置:

    • Name: Prometheus-MySQL

    • URL: http://prometheus-k8s.kubesphere-monitoring-system.svc.cluster.local:9090

  5. 点击 ​Save & Test

3.2 导入 MySQL 监控仪表盘

方法一:使用官方仪表盘 ID

  1. 导航到 ​Dashboards → New → Import

  2. 输入仪表盘 ID: 7362(MySQL Overview)

  3. 选择数据源: Prometheus-MySQL

  4. 点击 ​Import

方法二:使用自定义 JSON

创建 mysql-dashboard.json

{
  "dashboard": {
    "title": "MySQL 多环境监控",
    "tags": ["mysql", "database"],
    "timezone": "browser",
    "panels": [...],
    "templating": {
      "list": [
        {
          "name": "environment",
          "type": "query",
          "query": "label_values(mysql_up, environment)",
          "refresh": 1,
          "includeAll": true,
          "multi": true,
          "allValue": ".*"
        },
        {
          "name": "instance",
          "type": "query",
          "query": "label_values(mysql_up{environment=~\"$environment\"}, instance)",
          "refresh": 1,
          "includeAll": true,
          "multi": true
        }
      ]
    }
  }
}

导入自定义仪表盘。

3.3 配置告警规则(可选)

创建 mysql-alerts.yaml

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: mysql-alerts
  namespace: kubesphere-monitoring-system
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  groups:
  - name: mysql.rules
    rules:
    - alert: MySQLDown
      expr: mysql_up == 0
      for: 1m
      labels:
        severity: critical
      annotations:
        summary: "MySQL instance {{ $labels.instance }} is down"
        description: "MySQL instance {{ $labels.instance }} has been down for more than 1 minute."
    
    - alert: MySQLHighConnections
      expr: mysql_global_status_threads_connected / mysql_global_variables_max_connections > 0.8
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: "High MySQL connections on {{ $labels.instance }}"
        description: "MySQL connections are at {{ $value | humanizePercentage }} of maximum."
        
    - alert: MySQLSlowQueries
      expr: rate(mysql_global_status_slow_queries[5m]) > 5
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High slow query rate on {{ $labels.instance }}"
        description: "Slow query rate is {{ $value }} queries per second."

应用告警规则:

kubectl apply -f mysql-alerts.yaml

第四部分:故障排查和验证

4.1 常见问题排查

问题 1:mysqld_exporter 无法连接 MySQL

# 测试 MySQL 连接
mysql -u exporter -p'StrongPassword123!' -h localhost -e "SHOW STATUS LIKE 'Uptime';"

# 检查 exporter 日志
journalctl -u mysqld_exporter -f

问题 2:Prometheus 无法抓取指标

# 检查网络连通性
kubectl run -it --rm --image=curlimages/curl test-curl -- curl http://192.168.181.153:9104/metrics

# 检查防火墙
sudo firewall-cmd --list-ports | grep 9104

问题 3:Grafana 无数据

# 检查 Prometheus 数据
curl http://localhost:9090/api/v1/query?query=mysql_up

# 检查标签匹配
curl http://localhost:9090/api/v1/series?match[]=mysql_up

4.2 验证步骤

# 1. 验证 mysqld_exporter 工作
curl http://192.168.181.153:9104/metrics | grep mysql_up

# 2. 验证 Prometheus 抓取
kubectl -n kubesphere-monitoring-system exec -it prometheus-k8s-0 -c prometheus -- \
curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | select(.scrapePool | startswith("mysql"))'

# 3. 验证指标存在
kubectl -n kubesphere-monitoring-system port-forward svc/prometheus-k8s 9090
curl http://localhost:9090/api/v1/query?query=mysql_up

# 4. 验证 Grafana 仪表盘
# 访问 Grafana,检查 MySQL 仪表盘是否显示数据

4.3 性能优化建议

  1. 调整抓取间隔​:

    • 开发环境:30-60秒

    • 生产环境:15-30秒

  2. 限制收集的指标​:

    # 只收集必要的指标
    --collect.global_status \
    --collect.info_schema.tables \
    --collect.slave_status
  3. 设置数据保留策略​:

    spec:
      retention: 7d  # 根据需求调整

第五部分:维护和监控

5.1 日常监控检查清单

  • [ ] MySQL 实例状态 (mysql_up)

  • [ ] 连接数监控 (mysql_global_status_threads_connected)

  • [ ] 查询性能 (mysql_global_status_queries)

  • [ ] 复制状态 (mysql_global_status_slave_status)

  • [ ] 缓冲池使用率 (mysql_global_status_innodb_buffer_pool_pages_total)

5.2 备份和恢复

备份配置​:

# 备份 Prometheus 配置
kubectl -n kubesphere-monitoring-system get secret additional-scrape-configs -o yaml > additional-scrape-configs-backup.yaml

# 备份仪表盘配置
# 在 Grafana 中导出仪表盘 JSON

恢复配置​:

# 恢复 Secret
kubectl apply -f additional-scrape-configs-backup.yaml

# 重启 Prometheus 使配置生效
kubectl -n kubesphere-monitoring-system delete pods -l app.kubernetes.io/name=prometheus

5.3 扩展监控

添加新的 MySQL 实例​:

  1. 在新服务器上部署 mysqld_exporter

  2. 更新 additional-scrape-configsSecret

  3. 添加新的 job 配置

监控其他数据库指标​:

  • 慢查询日志分析

  • 表空间使用情况

  • 锁等待统计

  • 复制延迟监控


📞 支持信息

日志位置

  • mysqld_exporter: /var/log/mysqld_exporter.logjournalctl -u mysqld_exporter

  • Prometheus: kubectl logs -n kubesphere-monitoring-system prometheus-k8s-0

  • Grafana: kubectl logs -n kubesphere-monitoring-system grafana-pod-name

有用的命令

# 检查所有 MySQL 目标状态
curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | select(.scrapePool | startswith("mysql")) | {job: .scrapePool, instance: .labels.instance, health: .health}'

# 检查特定指标
curl http://localhost:9090/api/v1/query?query=mysql_global_status_connections

# 强制重载配置
curl -X POST http://localhost:9090/-/reload

✅ 完成检查清单

  • [ ] 所有 MySQL 实例已部署 mysqld_exporter

  • [ ] Kubernetes Secret 配置正确

  • [ ] Prometheus 能够抓取所有 MySQL 指标

  • [ ] Grafana 仪表盘正常显示数据

  • [ ] 告警规则配置正确

  • [ ] 防火墙端口已开放

  • [ ] 监控用户权限正确配置


Comment