Pipelines

Caution

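Promtail has been deprecated and is in Long-Term Support (LTS) through February 28, 2026. Promtail will reach End-of-Life (EOL) on March 2, 2026. Migration resources are available in the Grafana documentation.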

A detailed look at how to set up Promtail to process your log lines, including extracting metrics and labels.

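A pipeline transforms a single log line, its labels, and its timestamp. A pipeline is composed of a set of stages. There are 4 types of stages: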

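  1. Parsing stages parse the current log line and extract data from it. The extracted data is then available to other stages.
  2. Transform stages transform data extracted by previous stages.
  3. Action stages take data extracted by previous stages and do something with it. Actions can:
    1. Add labels to the log line or modify existing ones
    2. Change the timestamp of the log line
    3. Change the content of the log line
    4. Create a metric based on the extracted data
  4. Filtering stages optionally apply a subset of stages or drop entries based on some condition.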
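
As a rough illustration of the four stage types, the following minimal sketch chains one stage of each kind. The JSON field names (`level`, `message`) and the choice to drop `DEBUG` entries are assumptions made for this example, not part of any particular deployment.

```yaml
pipeline_stages:
  # Parsing stage: read the log line as JSON and extract two fields
  # (field names are hypothetical) into the extracted map.
  - json:
      expressions:
        level: level
        msg: message

  # Transform stage: upper-case the extracted "level" value in place.
  - template:
      source: level
      template: '{{ ToUpper .Value }}'

  # Action stage: promote the extracted "level" value to a label.
  - labels:
      level:

  # Filtering stage: drop the entry when the extracted level is exactly DEBUG.
  - drop:
      source: level
      value: DEBUG
```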

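A typical pipeline starts with a parsing stage (such as a regex or json stage) to extract data from the log line. A series of action stages then does something with the extracted data. The most common action stage is the labels stage, which turns extracted data into labels.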

Another common stage is the match stage, which selectively applies stages or drops entries based on a LogQL stream selector and filter expressions.
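
For instance, a match stage can discard entries outright instead of running nested stages. The sketch below assumes a hypothetical `app="example"` label and a hypothetical `healthcheck` substring; `action: drop` and `drop_counter_reason` are options of the match stage.

```yaml
pipeline_stages:
  # Drop every entry from the (hypothetical) "example" app whose log line
  # contains the word "healthcheck". No nested stages are needed when
  # the action is "drop".
  - match:
      selector: '{app="example"} |= "healthcheck"'
      action: drop
      # Reason recorded on Promtail's dropped-entries counter
      # (the reason string here is illustrative).
      drop_counter_reason: healthcheck_noise
```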

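Note that pipelines cannot currently be used to deduplicate logs. For example, Grafana Loki will receive the same log line multiple times if:

  1. Two scrape configs read from the same file
  2. Duplicate log lines in a file are sent through a pipeline. Deduplication is not performed.

However, Loki does perform some deduplication at query time for logs that have the exact same nanosecond timestamp, labels, and log contents.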
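
The first case can arise from a configuration like the sketch below, where two jobs tail the same path. The job names, label values, and path are hypothetical and only illustrate the pitfall.

```yaml
scrape_configs:
# Both jobs match the same files, so each line is read and sent twice.
# No pipeline stage can merge these duplicate entries.
- job_name: app-logs-a
  static_configs:
  - targets: [localhost]
    labels:
      job: app
      __path__: /var/log/app/*.log
- job_name: app-logs-b
  static_configs:
  - targets: [localhost]
    labels:
      job: app
      __path__: /var/log/app/*.log
```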

The following example gives a good sense of what you can achieve with a pipeline:

yaml
scrape_configs:
- job_name: kubernetes-pods-name
  kubernetes_sd_configs: ....
  pipeline_stages:

  # This stage is only going to run if the scraped target has a label
  # of "name" with value "promtail".
  - match:
      selector: '{name="promtail"}'
      stages:
      # The regex stage parses out a level, timestamp, and component. At the end
      # of the stage, the values for level, timestamp, and component are only
      # set internally for the pipeline. Future stages can use these values and
      # decide what to do with them.
      - regex:
          expression: '.*level=(?P<level>[a-zA-Z]+).*ts=(?P<timestamp>[T\d-:.Z]*).*component=(?P<component>[a-zA-Z]+)'

      # The labels stage takes the level and component entries from the previous
      # regex stage and promotes them to a label. For example, level=error may
      # be a label added by this stage.
      - labels:
          level:
          component:

      # Finally, the timestamp stage takes the timestamp extracted from the
      # regex stage and promotes it to be the new timestamp of the log entry,
      # parsing it as an RFC3339Nano-formatted value.
      - timestamp:
          format: RFC3339Nano
          source: timestamp

  # This stage is only going to run if the scraped target has a label of
  # "name" with a value of "nginx" and if the log line contains the word "GET"
  - match:
      selector: '{name="nginx"} |= "GET"'
      stages:
      # This regex stage extracts a new output by matching against some
      # values and capturing the rest.
      - regex:
          expression: \w{1,3}.\w{1,3}.\w{1,3}.\w{1,3}(?P<output>.*)

      # The output stage changes the content of the captured log line by
      # setting it to the value of output from the regex stage.
      - output:
          source: output

  # This stage is only going to run if the scraped target has a label of
  # "name" with a value of "jaeger-agent".
  - match:
      selector: '{name="jaeger-agent"}'
      stages:
      # The JSON stage reads the log line as a JSON string and extracts
      # the "level" field from the object for use in further stages.
      - json:
          expressions:
            level: level

      # The labels stage pulls the value from "level" that was extracted
      # from the previous stage and promotes it to a label.
      - labels:
          level:
- job_name: kubernetes-pods-app
  kubernetes_sd_configs: ....
  pipeline_stages:
  # This stage will only run if the scraped target has a label of "app"
  # with a name of *either* grafana or prometheus.
  - match:
      selector: '{app=~"grafana|prometheus"}'
      stages:
      # The regex stage will extract a level and component for use in further
      # stages, allowing the level to be defined as either lvl=<level> or
      # level=<level> and the component to be defined as either
      # logger=<component> or component=<component>
      - regex:
          expression: ".*(lvl|level)=(?P<level>[a-zA-Z]+).*(logger|component)=(?P<component>[a-zA-Z]+)"

      # The labels stage then promotes the level and component extracted from
      # the regex stage to labels.
      - labels:
          level:
          component:

  # This stage will only run if the scraped target has a label "app"
  # with a value of "some-app" and the log line doesn't contain the word "info"
  - match:
      selector: '{app="some-app"} != "info"'
      stages:
      # The regex stage tries to extract a Go panic by looking for panic:
      # in the log message.
      - regex:
          expression: ".*(?P<panic>panic: .*)"

      # The metrics stage is going to increment a panic_total metric counter
      # which Promtail exposes. The counter is only incremented when panic
      # was extracted from the regex stage.
      - metrics:
          panic_total:
            type: Counter
            description: "total count of panic"
            source: panic
            config:
              action: inc

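Data accessible to stages

The following sections describe the types of data accessible to each stage (although not every stage uses all of them):

Label set

The current set of labels for the log line. It is initialized to the set of labels that were scraped along with the log line. The label set is only modified by action stages, but filtering stages read from it.

The final label set is indexed by Loki and can be used in queries.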
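
As a small sketch of action stages that modify the label set directly, the fragment below adds a fixed label and removes another before the entry is sent. The label name `env` and its value are assumptions for illustration.

```yaml
pipeline_stages:
  # Add a constant label to every entry (name and value are hypothetical).
  - static_labels:
      env: staging

  # Remove the "filename" label so it is not indexed by Loki.
  - labeldrop:
      - filename
```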

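Extracted map

A collection of key-value pairs extracted during a parsing stage. Subsequent stages operate on the extracted map, either transforming its values or taking action with them. At the end of a pipeline, the extracted map is discarded; for a parsing stage to be useful, it must therefore always be paired with at least one action stage.

The extracted map is initialized with the same set of initial labels that were scraped along with the log line. This initial data allows pipeline stages that only operate on the extracted map to act on label values. For example, log entries tailed from files have the label filename, whose value is the path of the tailed file. When a pipeline executes for such a log entry, the initial extracted map contains filename with the same value as the label.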
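
Because filename is already present in the extracted map, a stage can parse it without first reading the log line. The sketch below assumes a hypothetical directory layout of `/var/log/<service>/...` and uses the regex stage's `source` option to parse the path instead of the log line.

```yaml
pipeline_stages:
  # Parse the "filename" value from the extracted map rather than the
  # log line. The path layout is an assumption for this example.
  - regex:
      source: filename
      expression: '/var/log/(?P<service>[^/]+)/.*\.log'

  # Promote the captured "service" value to a label.
  - labels:
      service:
```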

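Log timestamp

The current timestamp for the log line. Action stages can modify this value. If left unset, it defaults to the time when the log was scraped.

The final value of the timestamp is sent to Loki.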
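
A minimal sketch of overriding the timestamp, assuming log lines that begin with a time such as `2024-01-02 15:04:05` in UTC (the line format is an assumption). A custom `format` is written using Go's reference time layout.

```yaml
pipeline_stages:
  # Capture the leading timestamp text into the extracted map as "time".
  - regex:
      expression: '^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})'

  # Parse the captured value and use it as the entry's timestamp.
  - timestamp:
      source: time
      format: '2006-01-02 15:04:05'
      location: UTC
      # On parse failure, derive a timestamp from the last known one
      # instead of keeping the scrape time.
      action_on_failure: fudge
```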

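Log line

The current log line, represented as text. It is initialized to the text that Promtail scraped. Action stages can modify this value.

The final value of the log line is sent to Loki as the text content of the given log entry.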
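
As one way to change the log line content, the replace stage rewrites the text matched by capture groups. The sketch below assumes log lines that contain a hypothetical `password=<value>` fragment and masks the value.

```yaml
pipeline_stages:
  # Replace the text captured by the group (the password value) with
  # asterisks; the rest of the log line is left unchanged.
  - replace:
      expression: 'password=(\S+)'
      replace: '****'
```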

Stages

Refer to the Promtail Stages Configuration Reference for the schema of the various supported stages.