自动化监控运维(一)Prometheus 构建

零、 说明

  • 两种部署方式
    • 二进制包部署+systemd管理
    • docker 部署

一、 二进制方式部署

1.1 下载

下载连接

1.2 解压

tar -xzvf prometheus-2.47.0.linux-amd64.tar.gz
cp -r prometheus-2.47.0.linux-amd64 /usr/local/prometheus

1.3 运行测试

cd /usr/local/prometheus
./prometheus --config.file=./prometheus.yml

默认数据目录: ./data

1.4 访问测试

curl http://192.168.128.203:9090/metrics

有数据返回一般就没什么问题了.

1.5 使用 systemd 管理

# /usr/local/prometheus-2.47/prometheus.service
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target

[Service]
Type=simple
User=root
Restart=on-failure
ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/usr/local/prometheus-2.47/prometheus \
--config.file=/usr/local/prometheus-2.47/prometheus.yml \
--storage.tsdb.path /usr/local/prometheus-2.47/data \
--storage.tsdb.retention=15d

[Install]
WantedBy=multi-user.target

systemctl daemon-reload
systemctl enable prometheus --now

二、 使用 docker 构建

2.1 基于centos7.9镜像构建

docker build --network=host -t prometheus-2.47.0 .

  • -network=host可以有效避免yum install的报错
  • -t prometheus-2.47.0 指定构建后的镜像名称

Dockerfile: vim Dockerfile

FROM centos:centos7

# Define Prometheus home and its version
ARG PROMETHEUS_HOME=/opt/prometheus
ARG PROMETHEUS_VERSION=2.47.0

# Define TAR & folder names, as well as download URL for easier use
ARG PROMETHEUS_TAR_MAYOR=prometheus-${PROMETHEUS_VERSION}.linux-amd64
ARG PROMETHEUS_TAR_FULLNAME=${PROMETHEUS_TAR_MAYOR}.tar.gz
ARG PROMETHEUS_URL=https://github.com/prometheus/prometheus/releases/download/v${PROMETHEUS_VERSION}/${PROMETHEUS_TAR_FULLNAME}

# Install wget; download Prometheus
RUN yum install -y wget && \
wget ${PROMETHEUS_URL}

# Untar the file and rename it to "prometheus"
RUN tar xvfz ${PROMETHEUS_TAR_FULLNAME} -C /opt && \
mv /opt/${PROMETHEUS_TAR_MAYOR} /opt/prometheus && \
rm -rf /opt/prometheus/prometheus.yml

RUN groupadd -r prometheus && \
useradd -g prometheus -s /bin/bash -c "Prometheus user" prometheus && \
chown -R prometheus:prometheus /opt/prometheus

COPY config/prometheus.yml /opt/prometheus/prometheus.yml

RUN chown prometheus:prometheus /opt/prometheus/prometheus.yml && \
chmod 755 /opt/prometheus/prometheus.yml

EXPOSE 9090
VOLUME [ "/opt/prometheus" ]
WORKDIR ${PROMETHEUS_HOME}

USER prometheus

ENTRYPOINT [ "./prometheus" ]
CMD [ "--config.file=prometheus.yml" ]

运行镜像测试:
docker run -p 9090:9090 prometheus-2.47.0


没有条件访问github的使用这个Dockerfile
docker build -f prometheus.yml -t prometheus-2.49-rc2 .

FROM centos:centos7.9.2009

ARG PROMETHEUS_HOME=/usr/local/prometheus

# Add file
COPY ./prometheus-2.49.0-rc.2.linux-amd64.tar.gz /opt

# tar file and rename to "prometheus"
RUN mkdir -p ${PROMETHEUS_HOME}
RUN tar -xzvf /opt/prometheus-2.49.0-rc.2.linux-amd64.tar.gz -C /usr/local/prometheus --strip-components=1

RUN groupadd -r prometheus && \
useradd -g prometheus -s /bin/bash -c "Prometheus user" prometheus && \
chown -R prometheus:prometheus /usr/local/prometheus

# port
EXPOSE 9090

VOLUME [ "/usr/local/prometheus" ]

WORKDIR ${PROMETHEUS_HOME}

USER prometheus

ENTRYPOINT [ "./prometheus" ]
CMD [ "--config.file=prometheus.yml" ]

2.2 使用 docker-compose 快速构建

vim Prometheus.yaml

version: "3"
services:
prometheus:
image: prometheus-2.47.0:latest
user: "1000:1000"
ports:
- "9090:9090"
container_name: prometheus
restart: always
volumes:
- /usr/local/prometheus-2.47/prometheus.yml:/opt/prometheus/prometheus.yml
networks:
default:
driver: bridge

运行:

docker-compose -f prometheus.yaml up -d

三、 常见问题

3.1 Prometheus查询不到数据

报错详情: Error on ingesting samples that are too old or are too far into the future
报错详情: err=”out of bounds”

caller=scrape.go:1741 level=warn component="scrape manager" scrape_pool=prometheus target=http://192.168.128.203:9090/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=666

caller=scrape.go:1405 level=warn component="scrape manager" scrape_pool=prometheus target=http://192.168.128.203:9090/metrics msg="Append failed" err="out of bounds"

引用

问题解决: 构建镜像或者其他原因造成的容器内时间和容器外时间不一致,且Prometheus已获取了错误时间内的相关数据.
清空容器内 /opt/prometheus/data 的数据 并重启容器.

清空容器内错误数据:
docker exec -it prometheus /bin/bash
cd /opt/prometheus/data
rm -rf 0* wal/0* wal/checkpoint.0*

重启容器:
docker-compose -f prometheus.yaml down
docker-compose -f prometheus.yaml up -d