Observability Configuration
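Envoy provides rich observability features. Access logs, statistics, and distributed tracing can each be configured to help monitor and debug a system.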
Access Log Configuration
Basic access log
# Fragment of an HttpConnectionManager typed_config.
# access_log belongs to the connection manager, as a sibling of http_filters.
http_filters:
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
access_log:
- name: envoy.access_loggers.file
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: "/var/log/envoy/access.log"
    # log_format replaces the deprecated top-level format field.
    log_format:
      text_format_source:
        inline_string: "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% \"%REQ(X-FORWARDED-FOR)%\" \"%REQ(USER-AGENT)%\" \"%REQ(X-REQUEST-ID)%\" \"%REQ(:AUTHORITY)%\" \"%UPSTREAM_HOST%\"\n"
JSON access log
- name: envoy.access_loggers.file
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: "/var/log/envoy/access.json"
    # json_format nested under log_format replaces the deprecated top-level json_format field.
    log_format:
      json_format:
        timestamp: "%START_TIME%"
        method: "%REQ(:METHOD)%"
        path: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
        protocol: "%PROTOCOL%"
        response_code: "%RESPONSE_CODE%"
        response_flags: "%RESPONSE_FLAGS%"
        bytes_received: "%BYTES_RECEIVED%"
        bytes_sent: "%BYTES_SENT%"
        duration: "%DURATION%"
        upstream_service_time: "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%"
        x_forwarded_for: "%REQ(X-FORWARDED-FOR)%"
        user_agent: "%REQ(USER-AGENT)%"
        request_id: "%REQ(X-REQUEST-ID)%"
        authority: "%REQ(:AUTHORITY)%"
        upstream_host: "%UPSTREAM_HOST%"
Conditional access log
- name: envoy.access_loggers.file
  # filter is a field of the access logger entry, not of the FileAccessLog typed_config.
  # Only responses with a status code >= 400 are written.
  filter:
    status_code_filter:
      comparison:
        op: GE
        value:
          default_value: 400
          runtime_key: access_log.error_threshold
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: "/var/log/envoy/error.log"
    log_format:
      text_format_source:
        inline_string: "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% \"%REQ(X-FORWARDED-FOR)%\" \"%REQ(USER-AGENT)%\" \"%REQ(X-REQUEST-ID)%\" \"%REQ(:AUTHORITY)%\" \"%UPSTREAM_HOST%\"\n"
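The runtime_key in the filter above allows the threshold to be overridden without touching the listener definition. As a minimal sketch, assuming a static runtime layer in the bootstrap (the layer name static_layer_0 is illustrative, not from the original), the following would raise the threshold from the default of 400 to 500:

```yaml
layered_runtime:
  layers:
  - name: static_layer_0          # illustrative layer name
    static_layer:
      access_log:
        error_threshold: 500      # resolves to the key access_log.error_threshold
```

If no runtime layer supplies the key, the filter falls back to default_value.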
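Statistics Configuration
Basic statistics configuration
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          # stat_prefix roots this listener's HTTP statistics at http.ingress_http.*
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: some_service
          # The router filter is required as the last HTTP filter for requests to be forwarded.
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router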
Cluster statistics configuration
# Statistics for this cluster are emitted under cluster.example_service.*
clusters:
- name: example_service
  connect_timeout: 0.25s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: example_service
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: example-service
              port_value: 8080
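Envoy emits a large number of statistics by default, so it can help to restrict what is instantiated. The following is a minimal sketch, assuming a bootstrap-level stats_config and the stat prefixes produced by the ingress_http and example_service names used above; the choice of prefixes is illustrative.

```yaml
stats_config:
  stats_matcher:
    # Only statistics whose names match one of these patterns are created.
    inclusion_list:
      patterns:
      - prefix: "http.ingress_http."
      - prefix: "cluster.example_service."
```

An inclusion list trades visibility for lower memory and CPU cost, so any statistic not matched here will not be available for later debugging.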
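Distributed Tracing Configuration
Basic tracing configuration
# Fragment of an HttpConnectionManager typed_config.
# The v3 API has no JaegerConfig message, and the bootstrap-level tracing.http
# block is deprecated. This sketch uses the Zipkin tracer, whose span format a
# Jaeger collector can ingest when its Zipkin-compatible endpoint is enabled.
tracing:
  provider:
    name: envoy.tracers.zipkin
    typed_config:
      "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
      collector_cluster: jaeger
      collector_endpoint: "/api/v2/spans"
      collector_endpoint_version: HTTP_JSON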
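Jaeger cluster configuration
clusters:
- name: jaeger
  connect_timeout: 0.25s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: jaeger
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: jaeger
              # 9411 is the conventional Zipkin-compatible port. Port 14268 accepts
              # Jaeger's own Thrift format, which the Zipkin tracer does not produce.
              port_value: 9411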
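Router tracing configuration
http_filters:
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
    # Create a separate child span for each upstream (egress) request.
    start_child_span: true
    # Emit per-request dynamic cluster statistics (this is the default).
    dynamic_stats: true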
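Tracing every request is rarely necessary in production. As a minimal sketch, assuming the sampling fields are merged into the tracing block of the HttpConnectionManager alongside the tracer provider, the following limits randomly sampled traces to about 10% of requests; the value 10.0 is illustrative.

```yaml
tracing:
  # Percentage of requests selected for tracing when the client has not
  # requested a trace and none is forced. Defaults to 100 when unset.
  random_sampling:
    value: 10.0
```

Requests that arrive with an existing sampling decision from an upstream caller are not governed by this setting alone, so the effective trace volume also depends on client behavior.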
Health Check Configuration
HTTP health check
clusters:
- name: health_check_service
  connect_timeout: 0.25s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  health_checks:
  - timeout: 1s
    interval: 10s
    unhealthy_threshold: 3
    healthy_threshold: 2
    http_health_check:
      path: "/health"
      expected_statuses:
      # The range is half-open: start is inclusive and end is exclusive,
      # so 200-300 covers every 2xx status.
      - start: 200
        end: 300
  load_assignment:
    cluster_name: health_check_service
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: health-service
              port_value: 8080
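When many Envoy instances check the same upstream on the same schedule, their probes can arrive in bursts. The following is a minimal sketch of the health_checks entry from the cluster above with two optional timing fields added; the specific durations are illustrative assumptions, not recommendations from the original.

```yaml
health_checks:
- timeout: 1s
  interval: 10s
  interval_jitter: 1s          # random extra wait added to each interval
  no_traffic_interval: 60s     # interval used while the cluster has not yet received traffic
  unhealthy_threshold: 3
  healthy_threshold: 2
  http_health_check:
    path: "/health"
```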
TCP health check
clusters:
- name: tcp_health_service
  connect_timeout: 0.25s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  health_checks:
  - timeout: 1s
    interval: 10s
    unhealthy_threshold: 3
    healthy_threshold: 2
    tcp_health_check:
      # The text field takes a hex-encoded payload, not a literal string.
      send:
        text: "50494E47"   # "PING"
      receive:
      - text: "504F4E47"   # "PONG"
  load_assignment:
    cluster_name: tcp_health_service
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: tcp-health-service
              port_value: 9090
Admin Interface Configuration
Admin interface
admin:
  address:
    socket_address:
      # Bind to loopback. The admin interface can modify server state and
      # should not be reachable from untrusted networks.
      address: 127.0.0.1
      port_value: 9901
  access_log:
  - name: envoy.access_loggers.file
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
      path: "/var/log/envoy/admin.log"
      log_format:
        text_format_source:
          inline_string: "[%START_TIME%] Admin: %REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL% %RESPONSE_CODE% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION%\n"
Structured Log Configuration
Structured access log
- name: envoy.access_loggers.file
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: "/var/log/envoy/structured.log"
    # log_format.json_format replaces the deprecated typed_json_format field.
    # A value consisting of a single command operator keeps its native type, so
    # %RESPONSE_CODE% and %DURATION% are emitted as JSON numbers. Header values
    # such as X-ENVOY-UPSTREAM-SERVICE-TIME are emitted as strings.
    log_format:
      json_format:
        timestamp: "%START_TIME%"
        method: "%REQ(:METHOD)%"
        path: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
        status_code: "%RESPONSE_CODE%"
        duration: "%DURATION%"
        upstream_service_time: "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%"
        request_id: "%REQ(X-REQUEST-ID)%"
        user_agent: "%REQ(USER-AGENT)%"
        client_ip: "%REQ(X-FORWARDED-FOR)%"
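In containerized deployments, structured logs are often written to standard output and collected by the platform instead of being written to a file. The following is a minimal sketch, assuming the stdout access logger extension is available in the Envoy build in use; the selection of fields is illustrative.

```yaml
access_log:
- name: envoy.access_loggers.stdout
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
    log_format:
      json_format:
        timestamp: "%START_TIME%"
        status_code: "%RESPONSE_CODE%"
        duration: "%DURATION%"
        request_id: "%REQ(X-REQUEST-ID)%"
```

Writing to stdout moves rotation and retention out of Envoy and into the log collector.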
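Performance Monitoring Configuration
Latency monitoring log
# Fragment of an HttpConnectionManager typed_config.
http_filters:
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
access_log:
- name: envoy.access_loggers.file
  # Log only requests whose total duration is at least 1000 ms.
  filter:
    duration_filter:
      comparison:
        op: GE
        value:
          default_value: 1000
          # runtime_key must be non-empty; this key name is an example.
          runtime_key: access_log.latency_threshold_ms
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: "/var/log/envoy/latency.log"
    log_format:
      text_format_source:
        inline_string: "[%START_TIME%] Latency: %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% %REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%\n"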
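Health check probes that happen to be slow can clutter a latency log. As a minimal sketch, assuming the same access logger entry as above, an and_filter can combine the duration threshold with a filter that excludes health check requests; the runtime key name is hypothetical.

```yaml
filter:
  and_filter:
    filters:
    # Skip requests that Envoy identified as health checks.
    - not_health_check_filter: {}
    # Keep only requests that took at least 1000 ms.
    - duration_filter:
        comparison:
          op: GE
          value:
            default_value: 1000
            runtime_key: access_log.latency_threshold_ms   # hypothetical key name
```

An or_filter with the same structure would instead log a request when any one of the listed conditions matches.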
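Best Practices
1. Logging
- Use a structured log format
- Configure an appropriate log level
- Implement log rotation
- Monitor log size
2. Statistics
- Define key business metrics
- Set a reasonable sampling rate
- Monitor the overhead of metrics collection
- Review metrics regularly
3. Tracing
- Configure an appropriate sampling rate
- Set reasonable tracing timeouts
- Monitor the overhead of tracing
- Protect sensitive information
4. Health checks
- Configure an appropriate check interval
- Set reasonable thresholds
- Monitor health status
- Review the configuration regularly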
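Caveats
- Logging affects system performance
- Log storage needs to be managed
- Tracing adds latency
- Sensitive data needs to be protected
- Health checks add system overhead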
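Observability configuration gives Envoy comprehensive monitoring capabilities. Configured sensibly, it helps locate and resolve problems quickly.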