菜单
开源

启用尾部采样

您可以在必要或期望时使用尾部采样来降低采样百分比,例如为了运行时或出口流量相关的成本。概率采样策略易于实现,但也存在丢弃您以后可能需要的相关数据的风险。

尾部采样与 Grafana Alloy 配合使用。Alloy 配置文件使用 Alloy 配置语法 编写。

配置尾部采样

要开始使用尾部采样,请在您的配置文件中定义采样策略。

如果您使用 Alloy 的多实例部署,请添加负载均衡并指定查找设置中其他 Alloy 实例的解析机制。

要查看负载均衡的所有可用配置选项,请参阅 Alloy 组件参考

Alloy 示例

Alloy 使用 otelcol.processor.tail_sampling 组件 进行尾部采样。

alloy
otelcol.receiver.otlp "default" {
  http {}
  grpc {}

  output {
    traces  = [otelcol.processor.tail_sampling.policies.input]
  }
}

// The Tail Sampling processor will use a set of policies to determine which received
// traces to keep and send to Tempo.
otelcol.processor.tail_sampling "policies" {
    // Total wait time from the start of a trace before making a sampling decision.
    // Note that smaller time periods can potentially cause a decision to be made
    // before the end of a trace has occurred.
    decision_wait = "30s"

    // The following policies follow a logical OR pattern, meaning that if any of the
    // policies match, the trace will be kept. For logical AND, you can use the `and`
    // policy. Every span of a trace is examined by each policy in turn. A match will
    // cause a short-circuit.

    // This policy defines that traces that contain errors should be kept.
    policy {
        // The name of the policy can be used for logging purposes.
        name = "sample-erroring-traces"
        // The type must match the type of policy to be used, in this case examining
        // the status code of every span in the trace.
        type = "status_code"
        // This block determines the error codes that should match in order to keep
        // the trace, in this case the OpenTelemetry 'ERROR' code.
        status_code {
            status_codes = [ "ERROR" ]
        }
    }

    // This policy defines that only traces that are longer than 200ms in total
    // should be kept.
    policy {
        // The name of the policy can be used for logging purposes.
        name = "sample-long-traces"
        // The type must match the policy to be used, in this case the total latency
        // of the trace.
        type = "latency"
        // This block determines the total length of the trace in milliseconds.
        latency {
            threshold_ms = 200
        }
    }

    // The output block forwards the kept traces onto the batch processor, which
    // will marshall them for exporting to the Grafana OTLP gateway.
    output {
        traces = [otelcol.exporter.otlp.default.input]
    }
}

otelcol.exporter.otlp "default" {
  client {
    endpoint = env("OTLP_ENDPOINT")
  }
}