Cilium compatibility
Cilium is an open source, cloud-native solution for providing, securing, and observing network connectivity between workloads. In some circumstances, the eBPF programs Cilium uses can conflict with the eBPF programs Beyla uses, and cause problems.
Both Beyla and Cilium use eBPF traffic control classifier programs, of type BPF_PROG_TYPE_SCHED_CLS. These programs attach to the ingress and egress data paths of the kernel networking stack, where together they form a chain of packet filters. Each packet filter can inspect the contents of a packet and act on it, for example by redirecting or dropping the packet.
Beyla programs never interrupt the flow of a packet, but Cilium alters packet flow as part of its operation. If Cilium processes a packet before Beyla, it can affect Beyla's ability to process that packet.
Attachment modes
Beyla attaches its traffic control (TC) programs using either the Traffic Control eXpress (TCX) API or the Netlink interface of the Linux kernel.
The TCX API, available since kernel version 6.6, is the preferred way to attach TC programs. It provides a linked-list mechanism that lets a program attach at the head, middle, or tail of the list. Both Beyla and Cilium automatically detect whether the kernel supports TCX and use it by default.
When both Beyla and Cilium use TCX, they don't interfere with each other: Beyla attaches its eBPF programs at the head of the list, and Cilium attaches at the tail. TCX is the preferred mode of operation whenever it's available.
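TCX requires kernel 6.6 or newer. The following shell sketch illustrates the kind of version check Beyla and Cilium effectively perform when deciding whether to use TCX; the supports_tcx helper is illustrative and not part of either tool:

```shell
# supports_tcx MAJOR MINOR: succeeds when the kernel is 6.6 or newer,
# the minimum version for the TCX attachment API.
supports_tcx() {
  [ "$1" -gt 6 ] || { [ "$1" -eq 6 ] && [ "$2" -ge 6 ]; }
}

# Check the running kernel.
kernel=$(uname -r)
major=${kernel%%.*}
rest=${kernel#*.}
minor=${rest%%.*}
if supports_tcx "$major" "$minor"; then
  echo "kernel $kernel: TCX backend available"
else
  echo "kernel $kernel: falling back to the netlink backend"
fi
```

On kernels older than 6.6 both tools fall back to the legacy Netlink interface described below.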
The legacy Netlink interface relies on the clsact qdisc and BPF TC filters to attach eBPF programs to network interfaces. The kernel executes the filters in priority order: the lower the number, the higher the priority, with 1 being the highest priority.
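The ordering rule can be visualized with a quick one-liner: a numeric sort over (priority, program) pairs mirrors the order in which the kernel walks the filter chain. This assumes Beyla at priority 1 and Cilium reconfigured to priority 2, as recommended below:

```shell
# Netlink TC filters run lowest priority number first; sorting
# (priority, program) pairs numerically reproduces the execution order.
printf '%s\n' '2 cilium' '1 beyla' | sort -n -k1,1
# prints:
#   1 beyla
#   2 cilium
```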
When TCX isn't available, both Beyla and Cilium install their eBPF programs through the Netlink interface. If Beyla detects that Cilium is running programs at priority 1, Beyla terminates with an error. You can resolve this error by configuring Cilium to use a priority greater than 1.
Beyla also refuses to run if it's configured to use Netlink attachment and detects that Cilium is using TCX.
Configure the Cilium Netlink priority
You can configure the priority of Cilium's Netlink programs through the bpf-filter-priority configuration option:
cilium config set bpf-filter-priority 2
This ensures that Beyla programs always run before Cilium programs.
Mixing TCX and Netlink
In scenarios where the kernel uses both TCX and Netlink attachments, TCX programs run before any programs attached through the Netlink interface.
Configure the Beyla attachment mode
Refer to the configuration documentation for how to configure the Beyla TC attachment mode with the BEYLA_BPF_TC_BACKEND configuration option.
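For example, the backend can be pinned through the container environment of the Beyla DaemonSet used later in this guide. This is a hypothetical fragment showing only the tcx value, which also appears in the demo ConfigMap as traffic_control_backend; refer to the configuration documentation for the full list of accepted values:

```yaml
# Hypothetical Beyla container spec fragment: force the TCX backend.
env:
  - name: BEYLA_BPF_TC_BACKEND
    value: "tcx"
```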
You can configure Cilium through its enable-tcx boolean configuration option; refer to the Cilium documentation for more information:
cilium config set enable-tcx (true|false)
Beyla and Cilium demo
The following example demonstrates Beyla and Cilium working together in a Kubernetes environment to propagate trace context.
Install Cilium
Follow the instructions in the Cilium documentation to install Cilium into a Kubernetes cluster managed by kind.
If the kernel you deploy Cilium on doesn't support TCX, configure Cilium to use priority 2 for its eBPF programs:
cilium config set bpf-filter-priority 2
Deploy the example services
Deploy the services defined below. These are small example services that communicate with each other, letting you see how Beyla works with trace context propagation:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-deployment
  labels:
    app: node
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node
  template:
    metadata:
      labels:
        app: node
    spec:
      containers:
        - name: node
          image: ghcr.io/grafana/beyla-test/nodejs-testserver
          ports:
            - containerPort: 3030
              hostPort: 3030
---
apiVersion: v1
kind: Service
metadata:
  name: node-service
spec:
  type: NodePort
  selector:
    app: node
  ports:
    - name: node
      protocol: TCP
      port: 30030
      targetPort: 3030
      nodePort: 30030
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: go-deployment
  labels:
    app: go-testserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: go-testserver
  template:
    metadata:
      labels:
        app: go-testserver
    spec:
      containers:
        - name: go-testserver
          image: ghcr.io/grafana/beyla-test/go-testserver
          ports:
            - containerPort: 8080
              hostPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: testserver
spec:
  type: NodePort
  selector:
    app: go-testserver
  ports:
    - name: go-testserver
      protocol: TCP
      port: 8080
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: python-deployment
  labels:
    app: python-testserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: python-testserver
  template:
    metadata:
      labels:
        app: python-testserver
    spec:
      containers:
        - name: python-testserver
          image: ghcr.io/grafana/beyla-test/python-testserver
          ports:
            - containerPort: 8083
              hostPort: 8083
---
apiVersion: v1
kind: Service
metadata:
  name: pytestserver
spec:
  type: NodePort
  selector:
    app: python-testserver
  ports:
    - name: python-testserver
      protocol: TCP
      port: 8083
      targetPort: 8083
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rails-deployment
  labels:
    app: rails-testserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rails-testserver
  template:
    metadata:
      labels:
        app: rails-testserver
    spec:
      containers:
        - name: rails-testserver
          image: ghcr.io/grafana/beyla-test/rails-testserver
          ports:
            - containerPort: 3040
              hostPort: 3040
---
apiVersion: v1
kind: Service
metadata:
  name: utestserver
spec:
  type: NodePort
  selector:
    app: rails-testserver
  ports:
    - name: rails-testserver
      protocol: TCP
      port: 3040
      targetPort: 3040
Deploy Beyla
Create the Beyla namespace:
kubectl create namespace beyla
Apply the permissions:
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: beyla
  name: beyla
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: beyla
rules:
  - apiGroups: [ "apps" ]
    resources: [ "replicasets" ]
    verbs: [ "list", "watch" ]
  - apiGroups: [ "" ]
    resources: [ "pods", "services", "nodes" ]
    verbs: [ "list", "watch" ]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: beyla
subjects:
  - kind: ServiceAccount
    name: beyla
    namespace: beyla
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: beyla
Deploy Beyla:
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: beyla
  name: beyla-config
data:
  beyla-config.yml: |
    attributes:
      kubernetes:
        enable: true
    routes:
      unmatched: heuristic
    # let's instrument only the docs server
    discovery:
      services:
        - k8s_deployment_name: "nodejs-deployment"
        - k8s_deployment_name: "go-deployment"
        - k8s_deployment_name: "python-deployment"
        - k8s_deployment_name: "rails-deployment"
    trace_printer: text
    ebpf:
      enable_context_propagation: true
      traffic_control_backend: tcx
      disable_blackbox_cp: true
      track_request_headers: true
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  namespace: beyla
  name: beyla
spec:
  selector:
    matchLabels:
      instrumentation: beyla
  template:
    metadata:
      labels:
        instrumentation: beyla
    spec:
      serviceAccountName: beyla
      hostPID: true
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: beyla
          image: grafana/beyla:main
          securityContext:
            privileged: true
            readOnlyRootFilesystem: true
          volumeMounts:
            - mountPath: /config
              name: beyla-config
            - mountPath: /var/run/beyla
              name: var-run-beyla
          env:
            - name: BEYLA_CONFIG_PATH
              value: "/config/beyla-config.yml"
      volumes:
        - name: beyla-config
          configMap:
            name: beyla-config
        - name: var-run-beyla
          emptyDir: {}
Port-forward the service to the host and trigger a request:
kubectl port-forward services/node-service 30030:30030 &
curl http://localhost:30030/traceme
Finally, check your Beyla Pod logs:
for i in `kubectl get pods -n beyla -o name | cut -d '/' -f2`; do kubectl logs -n beyla $i | grep "GET " | sort; done
You should see output for the requests Beyla detected, with trace context propagation information similar to the following:
2025-01-17 21:42:18.11794218 (5.045099ms[5.045099ms]) HTTPClient 200 GET /tracemetoo [10.244.1.92 as go-deployment.default:37450]->[10.96.214.17 as pytestserver.default:8083] size:0B svc=[default/go-deployment go] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-319fb03373427a41[cfa6d5d448e40b00]-01]
2025-01-17 21:42:18.11794218 (5.284521ms[5.164701ms]) HTTP 200 GET /gotracemetoo [10.244.2.144 as nodejs-deployment.default:57814]->[10.244.1.92 as go-deployment.default:8080] size:0B svc=[default/go-deployment go] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-cfa6d5d448e40b00[cce1e6b5e932b89a]-01]
2025-01-17 21:42:18.11794218 (1.934744ms[1.934744ms]) HTTP 403 GET /users [10.244.2.32 as python-deployment.default:46876]->[10.244.2.176 as rails-deployment.default:3040] size:222B svc=[default/rails-deployment ruby] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-57d77d99e9665c54[3d97d26b0051112b]-01]
2025-01-17 21:42:18.11794218 (2.116628ms[2.116628ms]) HTTPClient 403 GET /users [10.244.2.32 as python-deployment.default:46876]->[10.96.69.89 as utestserver.default:3040] size:256B svc=[default/python-deployment python] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-ff48ab147cc92f93[2770ac4619aa0042]-01]
2025-01-17 21:42:18.11794218 (4.281525ms[4.281525ms]) HTTP 200 GET /tracemetoo [10.244.1.92 as go-deployment.default:37450]->[10.244.2.32 as python-deployment.default:8083] size:178B svc=[default/python-deployment python] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-2770ac4619aa0042[319fb03373427a41]-01]
2025-01-17 21:42:18.11794218 (5.391191ms[5.391191ms]) HTTPClient 200 GET /gotracemetoo [10.244.2.144 as nodejs-deployment.default:57814]->[10.96.134.167 as testserver.default:8080] size:256B svc=[default/nodejs-deployment nodejs] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-202ee68205e4ef3b[9408610968fa20f8]-01]
2025-01-17 21:42:18.11794218 (6.939027ms[6.939027ms]) HTTP 200 GET /traceme [127.0.0.1 as 127.0.0.1:44720]->[127.0.0.1 as 127.0.0.1.default:3030] size:86B svc=[default/nodejs-deployment nodejs] traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-9408610968fa20f8[0000000000000000]-01]
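To confirm that context propagation worked, check that every log line carries the same 32-hex-digit trace ID: the second dash-separated field of each traceparent. The sed pattern below is an illustrative sketch, shown here against the traceparent from one of the lines above:

```shell
# Extract the 32-hex-char trace ID from a traceparent field.
line='traceparent=[00-14f07e11b5e57f14fd2da0541f0ddc2f-9408610968fa20f8[0000000000000000]-01]'
echo "$line" | sed -E 's/.*traceparent=\[00-([0-9a-f]{32})-.*/\1/'
# prints 14f07e11b5e57f14fd2da0541f0ddc2f
```

All seven lines in the sample output share this trace ID, showing that Beyla stitched the requests across the Node.js, Go, Python, and Rails services into a single trace while Cilium managed the network.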