Mirror requests to a second Grafana Mimir cluster
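Use request mirroring when you want to set up a test Grafana Mimir cluster that receives the same series that a primary cluster ingests, and you don't control the Prometheus remote write configuration.
If you do control the Prometheus remote write configuration, configure two remote write entries in Prometheus instead. For more information about Prometheus remote write configuration, refer to the Prometheus remote write reference.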
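As a minimal sketch of the two-entry approach, a Prometheus configuration fragment could look like the following. The host names `mimir-primary` and `mimir-secondary` and port `8080` are assumptions borrowed from the Envoy example later on this page; replace them with the addresses of your own distributors. `/api/v1/push` is the Grafana Mimir remote write endpoint.

```yaml
# prometheus.yml (fragment). Host names and ports are placeholders.
remote_write:
  # Primary Grafana Mimir cluster.
  - url: http://mimir-primary:8080/api/v1/push
  # Test cluster that should receive the same series.
  - url: http://mimir-secondary:8080/api/v1/push
```

Prometheus keeps a separate queue for each remote write entry, so with this setup no proxy is needed between Prometheus and the two clusters.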
Use Envoy proxy for mirroring
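You can use Envoy proxy to mirror HTTP requests to a secondary upstream cluster. From a network path perspective, run Envoy in front of the distributors of both clusters.
With this setup, Envoy proxy sends each request to the primary Grafana Mimir cluster and then mirrors it to the secondary cluster in the background.
The performance and availability of the secondary cluster have no impact on requests to the primary cluster. The response sent to the client always comes from the primary cluster.
Requests from Envoy to the secondary cluster are "fire and forget": Envoy doesn't wait for a request to complete on the secondary cluster before it sends the response back to the client.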
The following diagram shows a simplified network structure.
Example Envoy configuration
The following Envoy configuration shows an example with two Grafana Mimir clusters. Envoy listens on port 9900 and proxies all requests to mimir-primary:8080, while also mirroring them to mimir-secondary:8080.
admin:
  # No access logs.
  access_log_path: /dev/null
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }

static_resources:
  listeners:
    - name: mimir_listener
      address:
        socket_address: { address: 0.0.0.0, port_value: 9900 }
      filter_chains:
        - filters:
            - name: envoy.http_connection_manager
              config:
                stat_prefix: mimir_ingress
                route_config:
                  name: all_routes
                  virtual_hosts:
                    - name: all_hosts
                      domains: ["*"]
                      routes:
                        - match: { prefix: "/" }
                          route:
                            cluster: mimir_primary

                            # Specifies the upstream timeout. This spans between the point at which the entire downstream
                            # request has been processed and when the upstream response has been completely processed.
                            timeout: 15s

                            # Specifies the cluster that requests will be mirrored to. The performance
                            # and availability of the secondary cluster have no impact on the requests to the primary
                            # one. The response to the client will always be the one from the primary one. In this sense,
                            # the requests from Envoy to the secondary cluster are "fire and forget".
                            request_mirror_policies:
                              - cluster: mimir_secondary
                http_filters:
                  - name: envoy.router
  clusters:
    - name: mimir_primary
      type: STRICT_DNS
      connect_timeout: 1s
      # Replace mimir-primary and 8080 with the address and port of the distributor of your primary Mimir cluster.
      hosts: [{ socket_address: { address: mimir-primary, port_value: 8080 }}]
      dns_refresh_rate: 5s
    - name: mimir_secondary
      type: STRICT_DNS
      connect_timeout: 1s
      # Replace mimir-secondary and 8080 with the address and port of the distributor of your secondary Mimir cluster.
      hosts: [{ socket_address: { address: mimir-secondary, port_value: 8080 }}]
      dns_refresh_rate: 5s
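The example above uses the `config:` and `hosts:` fields from Envoy's v2 configuration API, which recent Envoy releases no longer accept. The following is a hedged sketch of the same setup expressed with the v3 API, not an officially published configuration: it keeps the listener port, cluster names, and timeouts from the example above, replaces `config:` with `typed_config:`, and replaces `hosts:` with `load_assignment:`. The host names `mimir-primary` and `mimir-secondary` remain placeholders, and the admin access log setting is omitted. Validate the file against the Envoy version you run before relying on it.

```yaml
admin:
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }

static_resources:
  listeners:
    - name: mimir_listener
      address:
        socket_address: { address: 0.0.0.0, port_value: 9900 }
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: mimir_ingress
                route_config:
                  name: all_routes
                  virtual_hosts:
                    - name: all_hosts
                      domains: ["*"]
                      routes:
                        - match: { prefix: "/" }
                          route:
                            cluster: mimir_primary
                            # Upstream timeout for the primary cluster.
                            timeout: 15s
                            # Mirror every request to the secondary cluster ("fire and forget").
                            request_mirror_policies:
                              - cluster: mimir_secondary
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
    - name: mimir_primary
      type: STRICT_DNS
      connect_timeout: 1s
      dns_refresh_rate: 5s
      load_assignment:
        cluster_name: mimir_primary
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    # Placeholder: distributor of the primary Mimir cluster.
                    socket_address: { address: mimir-primary, port_value: 8080 }
    - name: mimir_secondary
      type: STRICT_DNS
      connect_timeout: 1s
      dns_refresh_rate: 5s
      load_assignment:
        cluster_name: mimir_secondary
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    # Placeholder: distributor of the secondary Mimir cluster.
                    socket_address: { address: mimir-secondary, port_value: 8080 }
```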