Observability Configuration

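Envoy provides rich observability features: access logs, statistics, and distributed tracing can each be configured to help monitor and debug a system.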

Access Log Configuration

Basic Access Log

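# Access logs are configured on the HTTP connection manager; the router filter
# has no access_log field. log_format replaces the deprecated format field.
- name: envoy.filters.network.http_connection_manager
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
    stat_prefix: ingress_http
    # route_config omitted for brevity
    access_log:
    - name: envoy.access_loggers.file
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
        path: "/var/log/envoy/access.log"
        log_format:
          text_format_source:
            inline_string: "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% \"%REQ(X-FORWARDED-FOR)%\" \"%REQ(USER-AGENT)%\" \"%REQ(X-REQUEST-ID)%\" \"%REQ(:AUTHORITY)%\" \"%UPSTREAM_HOST%\"\n"
    http_filters:
    - name: envoy.filters.http.router
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router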
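
When Envoy runs in a container, access logs are often written to standard output rather than to a file, so that the container runtime can collect them. The following is a minimal sketch, assuming an Envoy release that includes the stream access logger extension; the entry would go in the same access_log list as the file logger above, and the shortened format string is only illustrative.

```yaml
# Sketch: stdout access logger (assumes the stream access logger extension is available).
- name: envoy.access_loggers.stdout
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
    log_format:
      text_format_source:
        inline_string: "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %RESPONSE_FLAGS% %DURATION%\n"
```

Writing to stdout moves rotation and retention out of Envoy and into whatever collects the container's output.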

JSON Access Log

- name: envoy.access_loggers.file
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: "/var/log/envoy/access.json"
    # log_format.json_format replaces the deprecated top-level json_format field
    log_format:
      json_format:
        timestamp: "%START_TIME%"
        method: "%REQ(:METHOD)%"
        path: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
        protocol: "%PROTOCOL%"
        response_code: "%RESPONSE_CODE%"
        response_flags: "%RESPONSE_FLAGS%"
        bytes_received: "%BYTES_RECEIVED%"
        bytes_sent: "%BYTES_SENT%"
        duration: "%DURATION%"
        upstream_service_time: "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%"
        x_forwarded_for: "%REQ(X-FORWARDED-FOR)%"
        user_agent: "%REQ(USER-AGENT)%"
        request_id: "%REQ(X-REQUEST-ID)%"
        authority: "%REQ(:AUTHORITY)%"
        upstream_host: "%UPSTREAM_HOST%"

Conditional Access Log

- name: envoy.access_loggers.file
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: "/var/log/envoy/error.log"
    log_format:
      text_format_source:
        inline_string: "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% \"%REQ(X-FORWARDED-FOR)%\" \"%REQ(USER-AGENT)%\" \"%REQ(X-REQUEST-ID)%\" \"%REQ(:AUTHORITY)%\" \"%UPSTREAM_HOST%\"\n"
  # filter belongs to the access logger entry (a sibling of typed_config),
  # not to the FileAccessLog message. Only responses with status >= 400 are logged.
  filter:
    status_code_filter:
      comparison:
        op: GE
        value:
          default_value: 400
          runtime_key: access_log.error_threshold
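
Filters can be combined. The sketch below assumes that health-check probes are marked by Envoy's health check HTTP filter and should be kept out of the error log. It wraps the status code filter in an and_filter together with not_health_check_filter, and omits the format so that Envoy's default format applies.

```yaml
# Sketch: log only error responses that are not health-check requests.
- name: envoy.access_loggers.file
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: "/var/log/envoy/error.log"
  filter:
    and_filter:
      filters:
      # Matches requests that were not flagged as health checks.
      - not_health_check_filter: {}
      # Matches responses with status code >= 400 (overridable at runtime).
      - status_code_filter:
          comparison:
            op: GE
            value:
              default_value: 400
              runtime_key: access_log.error_threshold
```

An or_filter takes the same list shape and logs a request when any of its child filters matches.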

Statistics Configuration

Basic Statistics Configuration

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          # stat_prefix names this connection manager's stats tree (http.ingress_http.*)
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
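                  cluster: example_service
          # The router filter is required for requests to be forwarded.
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router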

Cluster Statistics Configuration

clusters:
- name: example_service
  connect_timeout: 0.25s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: example_service
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: example-service
              port_value: 8080
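
The listener and cluster above emit stats under the http.ingress_http. and cluster.example_service. prefixes. How those stats leave the process is configured at the bootstrap level. The following is a sketch under stated assumptions: the StatsD agent at 127.0.0.1:8125 is hypothetical, and the inclusion list is only one possible way to limit which stats are instantiated.

```yaml
# Bootstrap-level sketch; the StatsD address below is a hypothetical local agent.
stats_flush_interval: 10s          # how often stats are flushed to the sinks
stats_sinks:
- name: envoy.stat_sinks.statsd
  typed_config:
    "@type": type.googleapis.com/envoy.config.metrics.v3.StatsdSink
    address:
      socket_address:
        address: 127.0.0.1
        port_value: 8125
stats_config:
  stats_matcher:
    # Only stats matching one of these patterns are created.
    inclusion_list:
      patterns:
      - prefix: "http.ingress_http."
      - prefix: "cluster.example_service."
```

Restricting stats with a matcher reduces memory use and sink traffic, at the cost of hiding any stat that is not listed.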

Distributed Tracing Configuration

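Basic Tracing Configuration

# Envoy's v3 API defines no JaegerConfig message. Jaeger can ingest Zipkin-format
# spans, so this uses the Zipkin tracer. The block is set on the HTTP connection manager.
tracing:
  provider:
    name: envoy.tracers.zipkin
    typed_config:
      "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
      collector_cluster: jaeger
      collector_endpoint: "/api/v2/spans"
      collector_endpoint_version: HTTP_JSON

Jaeger Cluster Configuration

clusters:
- name: jaeger
  connect_timeout: 0.25s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: jaeger
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: jaeger
              # assumes Jaeger's Zipkin-compatible collector endpoint is enabled on 9411
              port_value: 9411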

Router Tracing Configuration

http_filters:
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
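    start_child_span: true  # create a child span for each upstream (egress) call
    dynamic_stats: true     # generate dynamic cluster stats (the default)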
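
Tracing every request is rarely necessary in production. The sketch below shows sampling controls on the tracing block of the HTTP connection manager; the percentages are illustrative assumptions, not recommendations from the source.

```yaml
# Sketch: sampling settings inside the HttpConnectionManager's tracing block.
tracing:
  random_sampling:
    value: 10.0    # percent of requests randomly selected for tracing
  overall_sampling:
    value: 100.0   # upper bound applied after all other sampling decisions
```

Requests whose tracing is forced, for example by a client header, are still subject to the overall_sampling bound.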

Health Check Configuration

HTTP Health Check

clusters:
- name: health_check_service
  connect_timeout: 0.25s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  health_checks:
  - timeout: 1s
    interval: 10s
    unhealthy_threshold: 3
    healthy_threshold: 2
    http_health_check:
      path: "/health"
      expected_statuses:
      - start: 200
        end: 300  # the range is half-open [start, end), so 300 covers 200-299
  load_assignment:
    cluster_name: health_check_service
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: health-service
              port_value: 8080
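
The fixed 10s interval above can be refined with the other interval fields of the health check message. The following is a hedged sketch; the durations are illustrative assumptions and should be tuned per service.

```yaml
# Sketch: additional interval controls for the HTTP health check above.
health_checks:
- timeout: 1s
  interval: 10s
  interval_jitter: 1s        # random jitter added to each check interval
  no_traffic_interval: 60s   # interval used while no traffic has been routed to the cluster
  unhealthy_interval: 5s     # interval used for hosts that are marked unhealthy
  unhealthy_threshold: 3
  healthy_threshold: 2
  http_health_check:
    path: "/health"
```

A shorter unhealthy_interval lets a recovered host return to rotation sooner without raising the probe rate for healthy hosts.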

TCP Health Check

clusters:
- name: tcp_health_service
  connect_timeout: 0.25s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  health_checks:
  - timeout: 1s
    interval: 10s
    unhealthy_threshold: 3
    healthy_threshold: 2
    tcp_health_check:
      # Payload text must be hex encoded.
      send:
        text: "50494E47"  # hex for "PING"
      receive:
      - text: "504F4E47"  # hex for "PONG"
  load_assignment:
    cluster_name: tcp_health_service
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: tcp-health-service
              port_value: 9090

Admin Interface Configuration

Admin Interface

admin:
  address:
    socket_address:
      address: 127.0.0.1  # bind to loopback: the admin interface can alter and shut down Envoy
      port_value: 9901
  access_log:
  - name: envoy.access_loggers.file
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
      path: "/var/log/envoy/admin.log"
      log_format:
        text_format_source:
          inline_string: "[%START_TIME%] Admin: %REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL% %RESPONSE_CODE% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION%\n"
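
The admin interface also serves stats in Prometheus format at /stats/prometheus. The following is a sketch of a Prometheus scrape job for it; the job name, interval, and target address are illustrative assumptions, and the target must be reachable from wherever Prometheus runs.

```yaml
# prometheus.yml fragment (sketch); job name and target are assumptions.
scrape_configs:
- job_name: envoy
  metrics_path: /stats/prometheus   # Envoy admin endpoint that renders stats for Prometheus
  scrape_interval: 15s
  static_configs:
  - targets: ["127.0.0.1:9901"]     # admin address and port from the config above
```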

Structured Logging Configuration

Structured Access Log

- name: envoy.access_loggers.file
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: "/var/log/envoy/structured.log"
    # The v3 API has no JSONLogFormat message. log_format.json_format takes a plain
    # dictionary and renders values as strings, numbers, or booleans as appropriate,
    # so %RESPONSE_CODE% and %DURATION% are emitted as JSON numbers.
    log_format:
      json_format:
        timestamp: "%START_TIME%"
        method: "%REQ(:METHOD)%"
        path: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
        status_code: "%RESPONSE_CODE%"
        duration: "%DURATION%"
        upstream_service_time: "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%"
        request_id: "%REQ(X-REQUEST-ID)%"
        user_agent: "%REQ(USER-AGENT)%"
        client_ip: "%REQ(X-FORWARDED-FOR)%"

Performance Monitoring Configuration

Latency Monitoring Log

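# Logs only requests whose total duration is 1000 ms or longer.
# The access log is set on the HTTP connection manager, not on the router filter.
- name: envoy.filters.network.http_connection_manager
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
    stat_prefix: ingress_http
    # route_config and http_filters omitted for brevity
    access_log:
    - name: envoy.access_loggers.file
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
        path: "/var/log/envoy/latency.log"
        log_format:
          text_format_source:
            inline_string: "[%START_TIME%] Latency: %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% %REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%\n"
      # filter is a sibling of typed_config on the access logger entry
      filter:
        duration_filter:
          comparison:
            op: GE
            value:
              default_value: 1000  # milliseconds
              # hypothetical key; some Envoy versions require a non-empty runtime_key
              runtime_key: access_log.latency_threshold_ms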
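
A single %DURATION% value does not show where the time was spent. The sketch below assumes a separate JSON log (the file path and field names are hypothetical) and breaks latency down with Envoy's per-phase duration operators.

```yaml
# Sketch: per-phase latency breakdown; path and JSON field names are assumptions.
- name: envoy.access_loggers.file
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: "/var/log/envoy/latency.json"
    log_format:
      json_format:
        request_id: "%REQ(X-REQUEST-ID)%"
        duration_ms: "%DURATION%"                        # total request duration
        request_duration_ms: "%REQUEST_DURATION%"        # time to receive the full request from downstream
        response_duration_ms: "%RESPONSE_DURATION%"      # request start to first upstream response byte
        response_tx_duration_ms: "%RESPONSE_TX_DURATION%"  # first upstream byte to last byte sent downstream
        upstream_host: "%UPSTREAM_HOST%"
```

Comparing response_duration_ms with duration_ms helps separate slow upstreams from slow clients.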

Best Practices

1. Logging

  • Use a structured log format
  • Set an appropriate log level
  • Implement log rotation
  • Monitor log size

2. Statistics

  • Define key business metrics
  • Set a reasonable sampling rate
  • Monitor the performance cost of metrics
  • Review metrics regularly

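3. Tracing

  • Configure an appropriate sampling rate
  • Set reasonable tracing timeouts
  • Monitor the performance cost of tracing
  • Protect sensitive information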

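4. Health Checks

  • Configure an appropriate check interval
  • Set reasonable thresholds
  • Monitor health status
  • Review the configuration regularly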

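Notes

  • Logging affects system performance
  • Log storage needs to be managed
  • Tracing adds latency
  • Sensitive data needs to be protected
  • Health checks add system overhead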

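Observability configuration gives Envoy comprehensive monitoring capabilities; configured sensibly, it helps locate and resolve problems quickly.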
