Cilium compatibility

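Cilium is an open source, cloud native solution for providing, securing, and observing network connectivity between workloads. In some cases, the eBPF programs that Cilium uses can conflict with the eBPF programs that Beyla uses and cause problems.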

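Beyla and Cilium both use eBPF traffic control classifier programs, BPF_PROG_TYPE_SCHED_CLS. These programs attach to the ingress and egress data paths of the kernel networking stack. Together they form a chain of packet filters. Each packet filter can inspect the contents of a packet and perform operations such as redirecting or dropping it.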

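Beyla programs never disrupt the flow of a packet, but Cilium changes packet flow as part of its operation. If Cilium processes a packet before Beyla does, it can affect Beyla's ability to process that packet.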

Attachment modes

Beyla uses either the Traffic Control eXpress (TCX) API in the Linux kernel or the Netlink interface to attach traffic control (TC) programs.

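The TCX API is available since kernel version 6.6 and is the preferred method for attaching TC programs. It provides a linked-list mechanism that lets programs attach at the head, middle, or tail of the list. Beyla and Cilium automatically detect whether the kernel supports TCX and use it by default.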
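
For a rough manual check of whether a node is likely to offer TCX, the kernel release can be compared against 6.6. The following is a minimal sketch only: the function name kernel_supports_tcx is hypothetical, the check looks at the version number alone, and distribution kernels may differ from upstream. Beyla and Cilium perform their own detection, so this is not how either tool decides.

```shell
#!/bin/sh
# Sketch: decide whether a kernel release string is at least 6.6,
# the version from which the TCX API is available.

# kernel_supports_tcx RELEASE (hypothetical helper)
# Returns 0 when RELEASE (for example "6.8.0-45-generic") is 6.6 or newer.
kernel_supports_tcx() {
  release=$1
  major=${release%%.*}      # text before the first dot
  rest=${release#*.}        # text after the first dot
  minor=${rest%%[!0-9]*}    # leading digits of the remainder
  [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "${minor:-0}" -ge 6 ]; }
}

if kernel_supports_tcx "$(uname -r)"; then
  echo "Kernel version is 6.6 or newer: TCX should be available"
else
  echo "Kernel version is older than 6.6: expect Netlink attachment"
fi
```

Run it on each node, because nodes in the same cluster can run different kernels.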

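When Beyla and Cilium both use TCX, they don't interfere with each other. Beyla attaches its eBPF programs to the head of the list and Cilium attaches to the tail. TCX is the preferred mode of operation whenever it is available.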

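The legacy Netlink interface relies on clsact qdiscs and BPF TC filters to attach eBPF programs to network interfaces. The kernel executes filters in priority order: lower numbers have higher priority, and 1 is the highest priority.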

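When TCX isn't available, both Beyla and Cilium use the Netlink interface to install eBPF programs. If Beyla detects that Cilium runs programs with priority 1, Beyla exits with an error. You can resolve this error by configuring Cilium to use a priority greater than 1.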

Beyla also refuses to run if it is configured to use Netlink attachments and detects that Cilium uses TCX.

You can configure the priority of the Cilium Netlink programs with the bpf-filter-priority configuration option:

shell
cilium config set bpf-filter-priority 2

This ensures that Beyla programs always run before Cilium programs.
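
To see which priorities are in use on an interface, the filters attached through Netlink can be listed with the tc tool from iproute2. The sketch below assumes tc is installed on the node and uses eth0 as a placeholder interface name; the helper list_filter_prefs is hypothetical and only extracts the value that follows the pref keyword in the tc output.

```shell
#!/bin/sh
# Sketch: list the TC filter priorities ("pref") attached to an interface.

# list_filter_prefs (hypothetical helper): read `tc filter show` output on
# stdin and print the distinct pref values, lowest (highest priority) first.
list_filter_prefs() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "pref") print $(i + 1) }' | sort -n -u
}

# IFACE is a placeholder; set it to the interface you want to inspect.
IFACE=${IFACE:-eth0}

if command -v tc >/dev/null 2>&1; then
  for direction in ingress egress; do
    echo "$direction priorities on $IFACE:"
    # Ignore errors such as a missing interface so the loop keeps going.
    { tc filter show dev "$IFACE" "$direction" 2>/dev/null || true; } | list_filter_prefs
  done
else
  echo "tc not found; install iproute2 to inspect filters"
fi
```

If the output includes 1 for filters that belong to Cilium, that matches the condition under which Beyla exits with an error.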

In scenarios where the kernel uses both TCX and Netlink attachments, TCX programs run before programs attached through the Netlink interface.

Beyla attachment mode configuration

Refer to the configuration documentation to configure the Beyla TC attachment mode with the BEYLA_BPF_TC_BACKEND configuration option.
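
As a minimal sketch, the option can be set as an environment variable before starting Beyla. The value tcx is an assumption taken from the traffic_control_backend setting used in the demo configuration on this page; check the configuration documentation for the full list of accepted values.

```shell
# Assumption: tcx is an accepted value, as in the demo configuration's
# traffic_control_backend setting. Other values are not shown here.
export BEYLA_BPF_TC_BACKEND=tcx
echo "Beyla TC backend: $BEYLA_BPF_TC_BACKEND"
```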

You can configure Cilium through its enable-tcx boolean configuration option. For more information, refer to the Cilium documentation.

shell
# Use true to enable TCX attachments or false to disable them
cilium config set enable-tcx true

Beyla and Cilium demo

The following example demonstrates Beyla and Cilium working together in a Kubernetes environment to propagate trace context.

Install Cilium

Follow the instructions in the Cilium documentation to install Cilium in a kind-hosted Kubernetes cluster.

If the kernel where you deploy Cilium doesn't support TCX, configure Cilium to use priority 2 for its eBPF programs:

shell
cilium config set bpf-filter-priority 2

Deploy sample services

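Use the following definitions to deploy the sample services. These small services call one another and let you see how Beyla works with trace context propagation: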

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-deployment
  labels:
    app: node
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node
  template:
    metadata:
      labels:
        app: node
    spec:
      containers:
        - name: node
          image: ghcr.io/grafana/beyla-test/nodejs-testserver
          ports:
            - containerPort: 3030
              hostPort: 3030
---
apiVersion: v1
kind: Service
metadata:
  name: node-service
spec:
  type: NodePort
  selector:
    app: node
  ports:
    - name: node
      protocol: TCP
      port: 30030
      targetPort: 3030
      nodePort: 30030
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: go-deployment
  labels:
    app: go-testserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: go-testserver
  template:
    metadata:
      labels:
        app: go-testserver
    spec:
      containers:
        - name: go-testserver
          image: ghcr.io/grafana/beyla-test/go-testserver
          ports:
            - containerPort: 8080
              hostPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: testserver
spec:
  type: NodePort
  selector:
    app: go-testserver
  ports:
    - name: go-testserver
      protocol: TCP
      port: 8080
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: python-deployment
  labels:
    app: python-testserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: python-testserver
  template:
    metadata:
      labels:
        app: python-testserver
    spec:
      containers:
        - name: python-testserver
          image: ghcr.io/grafana/beyla-test/python-testserver
          ports:
            - containerPort: 8083
              hostPort: 8083
---
apiVersion: v1
kind: Service
metadata:
  name: pytestserver
spec:
  type: NodePort
  selector:
    app: python-testserver
  ports:
    - name: python-testserver
      protocol: TCP
      port: 8083
      targetPort: 8083
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rails-deployment
  labels:
    app: rails-testserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rails-testserver
  template:
    metadata:
      labels:
        app: rails-testserver
    spec:
      containers:
        - name: rails-testserver
          image: ghcr.io/grafana/beyla-test/rails-testserver
          ports:
            - containerPort: 3040
              hostPort: 3040
---
apiVersion: v1
kind: Service
metadata:
  name: utestserver
spec:
  type: NodePort
  selector:
    app: rails-testserver
  ports:
    - name: rails-testserver
      protocol: TCP
      port: 3040
      targetPort: 3040

Deploy Beyla

Create the Beyla namespace:

kubectl create namespace beyla

Apply permissions:

yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: beyla
  name: beyla
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: beyla
rules:
  - apiGroups: [ "apps" ]
    resources: [ "replicasets" ]
    verbs: [ "list", "watch" ]
  - apiGroups: [ "" ]
    resources: [ "pods", "services", "nodes" ]
    verbs: [ "list", "watch" ]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: beyla
subjects:
  - kind: ServiceAccount
    name: beyla
    namespace: beyla
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: beyla

Deploy Beyla:

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: beyla
  name: beyla-config
data:
  beyla-config.yml: |
    attributes:
      kubernetes:
        enable: true
    routes:
      unmatched: heuristic
    # instrument only the four sample deployments
    discovery:
      services:
        - k8s_deployment_name: "nodejs-deployment"
        - k8s_deployment_name: "go-deployment"
        - k8s_deployment_name: "python-deployment"
        - k8s_deployment_name: "rails-deployment"
    trace_printer: text
    ebpf:
      enable_context_propagation: true
      traffic_control_backend: tcx
      disable_blackbox_cp: true
      track_request_headers: true
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  namespace: beyla
  name: beyla
spec:
  selector:
    matchLabels:
      instrumentation: beyla
  template:
    metadata:
      labels:
        instrumentation: beyla
    spec:
      serviceAccountName: beyla
      hostPID: true
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: beyla
          image: grafana/beyla:main
          securityContext:
            privileged: true
            readOnlyRootFilesystem: true
          volumeMounts:
            - mountPath: /config
              name: beyla-config
            - mountPath: /var/run/beyla
              name: var-run-beyla
          env:
            - name: BEYLA_CONFIG_PATH
              value: "/config/beyla-config.yml"
      volumes:
        - name: beyla-config
          configMap:
            name: beyla-config
        - name: var-run-beyla
          emptyDir: {}

Forward the port to the host and trigger a request:

shell
kubectl port-forward services/node-service 30030:30030 &
curl http://localhost:30030/traceme

Finally, check your Beyla Pod logs:

shell
for pod in $(kubectl get pods -n beyla -o name | cut -d '/' -f2); do kubectl logs -n beyla "$pod" | grep "GET " | sort; done

You should see output showing the requests that Beyla detected, with trace context propagation information similar to the following:

2025-01-17 21:42:18.11794218 (5.045099ms[5.045099ms]) HTTPClient 200 GET /tracemetoo [10.244.1.92 as go-deployment.default:37450]->[10.96.214.17 as pytestserver.default:8083] size:0B svc=[default/go-deployment go] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-319fb03373427a41[cfa6d5d448e40b00]-01]
2025-01-17 21:42:18.11794218 (5.284521ms[5.164701ms]) HTTP 200 GET /gotracemetoo [10.244.2.144 as nodejs-deployment.default:57814]->[10.244.1.92 as go-deployment.default:8080] size:0B svc=[default/go-deployment go] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-cfa6d5d448e40b00[cce1e6b5e932b89a]-01]
2025-01-17 21:42:18.11794218 (1.934744ms[1.934744ms]) HTTP 403 GET /users [10.244.2.32 as python-deployment.default:46876]->[10.244.2.176 as rails-deployment.default:3040] size:222B svc=[default/rails-deployment ruby] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-57d77d99e9665c54[3d97d26b0051112b]-01]
2025-01-17 21:42:18.11794218 (2.116628ms[2.116628ms]) HTTPClient 403 GET /users [10.244.2.32 as python-deployment.default:46876]->[10.96.69.89 as utestserver.default:3040] size:256B svc=[default/python-deployment python] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-ff48ab147cc92f93[2770ac4619aa0042]-01]
2025-01-17 21:42:18.11794218 (4.281525ms[4.281525ms]) HTTP 200 GET /tracemetoo [10.244.1.92 as go-deployment.default:37450]->[10.244.2.32 as python-deployment.default:8083] size:178B svc=[default/python-deployment python] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-2770ac4619aa0042[319fb03373427a41]-01]
2025-01-17 21:42:18.11794218 (5.391191ms[5.391191ms]) HTTPClient 200 GET /gotracemetoo [10.244.2.144 as nodejs-deployment.default:57814]->[10.96.134.167 as testserver.default:8080] size:256B svc=[default/nodejs-deployment nodejs] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-202ee68205e4ef3b[9408610968fa20f8]-01]
2025-01-17 21:42:18.11794218 (6.939027ms[6.939027ms]) HTTP 200 GET /traceme [127.0.0.1 as 127.0.0.1:44720]->[127.0.0.1 as 127.0.0.1.default:3030] size:86B svc=[default/nodejs-deployment nodejs] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-9408610968fa20f8[0000000000000000]-01]
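
In the output above, every line carries the same trace ID in its traceparent field, and each span ID is followed in square brackets by the ID of its parent span. One way to confirm this on your own logs is to extract those fields. The following is a sketch that assumes the text trace printer format shown above; the helper names unique_trace_ids and span_links are hypothetical.

```shell
#!/bin/sh
# Sketch: pull trace and span IDs out of Beyla text trace lines.

# unique_trace_ids: read log lines on stdin and print each distinct
# 32-character trace ID found in a traceparent field.
unique_trace_ids() {
  sed -n 's/.*traceparent=\[00-\([0-9a-f]\{32\}\)-.*/\1/p' | sort -u
}

# span_links: read log lines on stdin and print "span_id parent_span_id"
# pairs, one per line, in input order.
span_links() {
  sed -n 's/.*traceparent=\[00-[0-9a-f]\{32\}-\([0-9a-f]\{16\}\)\[\([0-9a-f]\{16\}\)]-.*/\1 \2/p'
}

# Example using the traceparent field of the /traceme request shown above.
sample='traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-9408610968fa20f8[0000000000000000]-01]'
printf '%s\n' "$sample" | unique_trace_ids
printf '%s\n' "$sample" | span_links
```

Piping the Beyla Pod logs into unique_trace_ids should print a single ID per triggered request when context propagation works across all services. A parent ID of all zeros marks the root span of the trace.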