Securing Istio: Addressing Critical Security Gaps and Best Practices

Exploring security gaps in Istio and effective mitigation strategies, combined with best practices for multi-layered security.

Copyright Notice
This is an original article by Jimmy Song. You may repost it, but please credit this source: https://jimmysong.io/en/blog/securing-istio-addressing-critical-security-gaps-and-best-practices/

Introduction

Recently, the Wiz research team released a blog post that uncovered tenant isolation vulnerabilities in AI services, generating widespread attention. This study detailed security flaws across several AI service providers, particularly the SAP AI Core platform. Researchers were able to execute arbitrary code through legitimate AI training processes, subsequently moving laterally to take over services and gain access to customers’ private files and cloud credentials. These findings highlight the challenges that cloud services and management platforms face in ensuring isolation and sandbox environments.

In this context, Istio, as a crucial service mesh solution, faces similar security issues, especially in key functionalities like sidecar injection and traffic management. This blog discusses how to secure the Istio service mesh and provides a comprehensive set of mitigation measures. We will also discuss how multi-layer security strategies can strengthen Istio’s security against the challenges described in the Wiz report.

Overview

Istio primarily manages east-west traffic within Kubernetes, offering detailed traffic management features such as request routing, load balancing, and fault recovery policies. While Istio offers essential security features such as traffic encryption, authentication, and authorization, it should not be viewed as a standalone firewall solution. To maintain robust security for services within the Istio mesh, it is crucial to complement Istio’s security capabilities with additional measures from the underlying network and infrastructure, such as Container Network Interface (CNI) plugins and secure container implementations.

Whether in Sidecar or Ambient mode, traffic is hijacked from application pods to data plane proxies for processing and forwarding. If application traffic is not successfully intercepted or is impersonated by a rogue application masquerading as Istio, security vulnerabilities can arise.

The diagram below illustrates where security vulnerabilities due to bypassing or impersonating Istio system users might occur.

Figure: “Security vulnerabilities” in bypassing Istio’s traffic hijacking

Next, we will explore specific situations where “security vulnerabilities” arise and the strategies to address them.

Bypassing Istio Sidecar Injection

At the Namespace Level

  • Scenario: Application teams misuse namespace labels to disable Istio Sidecar injection at the namespace level.
  • Mitigation Strategy: Platform teams abstract app deployment and restrict access to the raw Kubernetes namespace resources.
  • Monitoring: Use policy engines (like OPA Gatekeeper) to ensure compliance with namespace labels, and regularly review namespace configurations.
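
As a rough illustration of the namespace review described above, the following Python sketch inspects a Namespace manifest that has already been loaded into a dict. It assumes the default opt-in injection setup, where `istio-injection=enabled` or an `istio.io/rev` label turns injection on; the function name and the sample namespace are hypothetical, and a policy engine such as OPA Gatekeeper would normally express the same rule declaratively.

```python
def namespace_injection_problems(namespace: dict) -> list[str]:
    """Return reasons a Namespace manifest would not get automatic injection."""
    meta = namespace.get("metadata") or {}
    labels = meta.get("labels") or {}
    name = meta.get("name", "<unnamed>")

    # Explicit opt-out; this label takes precedence over a revision label.
    if labels.get("istio-injection") == "disabled":
        return [f"{name}: istio-injection=disabled"]
    # Neither the classic label nor a revision label is present.
    if labels.get("istio-injection") != "enabled" and "istio.io/rev" not in labels:
        return [f"{name}: no injection label set"]
    return []


# Hypothetical namespace manifest, as it might be read from the cluster.
team_ns = {"metadata": {"name": "team-a", "labels": {"istio-injection": "disabled"}}}
print(namespace_injection_problems(team_ns))
```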

At the Pod Level

  • Scenario: Application teams misuse Pod labels to disable Istio Sidecar injection at the Pod level.
  • Mitigation Strategy:
    • Force all Pods to specify a UID that is not 1337.
    • Inspect all container images to check for UID 1337 and reject those images. This inspection can be performed using an admission webhook or by a central team managing the image registry.
  • Monitoring: Employ Admission Webhooks to enforce Sidecar injection, prohibit exclusion labels, and regularly scan and audit all pods to ensure every required pod has a Sidecar injected.
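
The exclusion check mentioned above could look roughly like this inside an admission webhook or a periodic audit job. The sketch assumes the opt-out is expressed with the `sidecar.istio.io/inject` key, either as a label or as the older annotation form; the function name and sample pod are made up for illustration.

```python
INJECT_KEY = "sidecar.istio.io/inject"


def opts_out_of_injection(pod: dict) -> bool:
    """True if the Pod manifest explicitly disables sidecar injection."""
    meta = pod.get("metadata") or {}
    # Check both the label and the legacy annotation form of the key.
    for field in ("labels", "annotations"):
        value = (meta.get(field) or {}).get(INJECT_KEY)
        if value is not None and str(value).lower() == "false":
            return True
    return False


# Hypothetical Pod manifest submitted by an application team.
pod = {"metadata": {"name": "batch-job", "labels": {INJECT_KEY: "false"}}}
print(opts_out_of_injection(pod))
```

A webhook built on this check would reject the Pod, while an audit job would only report it.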

Bypassing Traffic Redirection to Istio Sidecar

Misuse of Traffic Redirection Annotations

  • Scenario: Application teams misuse Pod annotations to exclude certain inbound or outbound ports or IPs, thereby bypassing traffic redirection.
  • Mitigation Strategy: Platform teams abstract app deployment and restrict access to the raw Kubernetes Pod resources.
  • Monitoring: Use policy engines to detect and alert on non-compliant annotations, and regularly review Pod annotations.
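
A minimal sketch of such an annotation review, assuming the overrides use Istio's `traffic.sidecar.istio.io/` annotation prefix (for example `excludeOutboundPorts`, `excludeInboundPorts`, or `excludeOutboundIPRanges`). The function name and sample pod are hypothetical.

```python
TRAFFIC_PREFIX = "traffic.sidecar.istio.io/"


def redirection_overrides(pod: dict) -> dict[str, str]:
    """Return every annotation on the Pod that alters traffic redirection."""
    annotations = (pod.get("metadata") or {}).get("annotations") or {}
    return {k: v for k, v in annotations.items() if k.startswith(TRAFFIC_PREFIX)}


# Hypothetical Pod that excludes a database port from redirection.
pod = {
    "metadata": {
        "name": "reporting",
        "annotations": {
            "traffic.sidecar.istio.io/excludeOutboundPorts": "5432",
            "team": "analytics",
        },
    }
}
print(redirection_overrides(pod))
```

Flagging the whole prefix rather than individual keys is a deliberate choice here: any override of redirection behavior is surfaced for review, including ones added in later Istio releases.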

Misuse of Pod UID

  • Scenario: Application teams run their containers as UID 1337 (the UID of the sidecar proxy, whose traffic is exempt from redirection) to bypass Istio’s iptables redirection rules.
  • Mitigation Strategy: Platform teams abstract app deployment and restrict access to the raw Kubernetes Pod resources.
  • Monitoring: Prohibit or restrict the use of UID 1337, and regularly audit Pod UID configurations to ensure no bypassing occurs.
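
The UID audit can be sketched as follows. This assumes the manifest is inspected as submitted by the application team, before sidecar injection, so any container running as 1337 is suspicious. The function name is hypothetical, and the check cannot see a `USER 1337` baked into a container image, which needs a separate image inspection.

```python
ISTIO_PROXY_UID = 1337


def containers_using_proxy_uid(pod: dict) -> list[str]:
    """Return names of containers that would run as the sidecar proxy's UID."""
    spec = pod.get("spec") or {}
    # A Pod-level runAsUser applies to containers that do not set their own.
    pod_uid = (spec.get("securityContext") or {}).get("runAsUser")
    offenders = []
    for container in (spec.get("containers") or []) + (spec.get("initContainers") or []):
        uid = (container.get("securityContext") or {}).get("runAsUser", pod_uid)
        if uid == ISTIO_PROXY_UID:
            offenders.append(container.get("name", "<unnamed>"))
    return offenders


# Hypothetical pre-injection Pod manifest.
pod = {
    "spec": {
        "containers": [
            {"name": "app", "securityContext": {"runAsUser": 1337}},
            {"name": "helper", "securityContext": {"runAsUser": 1000}},
        ]
    }
}
print(containers_using_proxy_uid(pod))
```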

Misuse of Pod Capabilities (NET_ADMIN, NET_RAW)

  • Scenario: Application teams misuse NET_ADMIN and NET_RAW capabilities to remove Istio iptables rules.
  • Mitigation Strategy: Platform teams enable Istio CNI (to avoid granting elevated privileges to application teams) and restrict access to the raw Kubernetes Pod resources.
  • Monitoring: Regularly review and monitor Pod permission configurations to ensure no over-privileged actions are taken.
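
One way to review Pod permissions is to look for explicitly added capabilities, as in this sketch. It assumes Istio CNI is in use, so application Pods have no legitimate need for these capabilities. Treating `ALL` and `privileged: true` as equally risky is an assumption of this example, and the function name is hypothetical.

```python
RISKY_CAPABILITIES = {"NET_ADMIN", "NET_RAW", "ALL"}


def risky_capabilities(pod: dict) -> dict[str, list[str]]:
    """Map container name to the risky privileges it requests."""
    spec = pod.get("spec") or {}
    findings = {}
    for container in (spec.get("containers") or []) + (spec.get("initContainers") or []):
        ctx = container.get("securityContext") or {}
        added = {str(cap).upper() for cap in (ctx.get("capabilities") or {}).get("add") or []}
        hits = sorted(added & RISKY_CAPABILITIES)
        # A privileged container holds every capability, whatever it lists.
        if ctx.get("privileged"):
            hits.append("privileged")
        if hits:
            findings[container.get("name", "<unnamed>")] = hits
    return findings


# Hypothetical Pod that asks for the capabilities needed to edit iptables.
pod = {
    "spec": {
        "containers": [
            {"name": "app", "securityContext": {"capabilities": {"add": ["NET_ADMIN", "NET_RAW"]}}},
            {"name": "web"},
        ]
    }
}
print(risky_capabilities(pod))
```

The sketch only looks at capabilities that are added explicitly; capabilities granted by the container runtime's defaults are outside what a manifest check can see.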

Bypassing Inbound Traffic Constraints

Misuse of PeerAuthentication

  • Scenario: Application teams create a PeerAuthentication resource for each namespace/workload, enabling the PERMISSIVE authentication mode.
  • Mitigation Strategy: Platform teams restrict access to the raw Istio PeerAuthentication resources.
  • Monitoring: Regularly review PeerAuthentication configurations to ensure all inbound traffic is encrypted as required.
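
A PeerAuthentication review could be sketched like this, using the resource's `spec.mtls.mode` and `spec.portLevelMtls` fields. Flagging `DISABLE` in addition to `PERMISSIVE` is an addition made for this example, and the function name and sample resource are hypothetical.

```python
WEAK_MODES = {"PERMISSIVE", "DISABLE"}


def weak_mtls_settings(peer_auth: dict) -> list[str]:
    """Return the settings in a PeerAuthentication that allow plaintext traffic."""
    spec = peer_auth.get("spec") or {}
    findings = []
    mode = (spec.get("mtls") or {}).get("mode")
    if mode in WEAK_MODES:
        findings.append(f"mtls.mode={mode}")
    # Port-level settings can weaken an otherwise STRICT policy.
    for port, setting in (spec.get("portLevelMtls") or {}).items():
        port_mode = (setting or {}).get("mode")
        if port_mode in WEAK_MODES:
            findings.append(f"portLevelMtls[{port}].mode={port_mode}")
    return findings


# Hypothetical workload-level policy created by an application team.
peer_auth = {
    "kind": "PeerAuthentication",
    "metadata": {"name": "relaxed", "namespace": "team-a"},
    "spec": {"mtls": {"mode": "STRICT"}, "portLevelMtls": {"8080": {"mode": "PERMISSIVE"}}},
}
print(weak_mtls_settings(peer_auth))
```

This does not resolve inheritance: a resource that leaves the mode unset takes its effective mode from a broader policy, which has to be reviewed separately.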

Bypassing Outbound Traffic Constraints

Misuse of ServiceEntry

  • Scenario: Application teams create a ServiceEntry to directly access external services without going through an Egress gateway.
  • Mitigation Strategy: Platform teams restrict access to the raw Istio ServiceEntry resources.
  • Monitoring: Regularly review ServiceEntry configurations to ensure no bypassing occurs.
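
A ServiceEntry review might compare the declared `spec.hosts` against a list of external hosts the platform team has approved. The allowlist, its contents, and the function name below are purely illustrative assumptions; the source does not prescribe how approval is tracked.

```python
# Hypothetical allowlist maintained by the platform team.
APPROVED_EXTERNAL_HOSTS = {"api.payments.example.com"}


def unapproved_hosts(service_entry: dict, approved: set[str]) -> list[str]:
    """Return hosts in a ServiceEntry that are not on the approved list."""
    hosts = (service_entry.get("spec") or {}).get("hosts") or []
    return [host for host in hosts if host not in approved]


# Hypothetical ServiceEntry created by an application team.
service_entry = {
    "kind": "ServiceEntry",
    "metadata": {"name": "external-apis", "namespace": "team-a"},
    "spec": {"hosts": ["api.payments.example.com", "files.example.org"]},
}
print(unapproved_hosts(service_entry, APPROVED_EXTERNAL_HOSTS))
```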

Misuse of ExternalName Services

  • Scenario: Application teams create a Kubernetes Service of type ExternalName to directly access external services without going through an Egress gateway.
  • Mitigation Strategy: Platform teams restrict access to the raw Kubernetes Service resources.
  • Monitoring: Regularly review the types of Kubernetes Service configurations to ensure no bypassing occurs.
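
Reviewing Service types can be as simple as the following sketch, which reports the external target of any Service of type ExternalName. The function name and the sample Service are hypothetical.

```python
from typing import Optional


def external_name_target(service: dict) -> Optional[str]:
    """Return the external DNS name a Service points at, or None."""
    spec = service.get("spec") or {}
    if service.get("kind") == "Service" and spec.get("type") == "ExternalName":
        return spec.get("externalName")
    return None


# Hypothetical Service that aliases an external host.
service = {
    "kind": "Service",
    "metadata": {"name": "storage", "namespace": "team-a"},
    "spec": {"type": "ExternalName", "externalName": "bucket.example.org"},
}
print(external_name_target(service))
```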

Uncontrollably Changing Istio Sidecar Configuration

Misuse of Sidecar Resources

  • Scenario: Application teams create an Istio Sidecar resource for each workload and set the outboundTrafficPolicy field to ALLOW_ANY (overriding the possible global value REGISTRY_ONLY).
  • Mitigation Strategy: Platform teams restrict access to the raw Istio Sidecar resources.
  • Monitoring: Regularly review Sidecar resource configurations to ensure no global settings are overridden.
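
The override described in this scenario lives in the Sidecar resource's `spec.outboundTrafficPolicy.mode` field, so a review can be sketched as below. The function name and sample resource are hypothetical, and the sketch assumes the mesh-wide policy is meant to be REGISTRY_ONLY.

```python
def overrides_registry_only(sidecar: dict) -> bool:
    """True if a Sidecar resource re-opens outbound traffic with ALLOW_ANY."""
    policy = (sidecar.get("spec") or {}).get("outboundTrafficPolicy") or {}
    return policy.get("mode") == "ALLOW_ANY"


# Hypothetical Sidecar resource created for a single workload.
sidecar = {
    "kind": "Sidecar",
    "metadata": {"name": "open-egress", "namespace": "team-a"},
    "spec": {"outboundTrafficPolicy": {"mode": "ALLOW_ANY"}},
}
print(overrides_registry_only(sidecar))
```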

Misuse of EnvoyFilter

  • Scenario: Application teams create an EnvoyFilter that conflicts with existing Istio objects, potentially causing DoS attacks or violating security policies.
  • Mitigation Strategy: Platform teams restrict access to the raw Istio EnvoyFilter resources.
  • Monitoring: Regularly review EnvoyFilter configurations to ensure no improper use occurs.
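
Judging whether an EnvoyFilter is "improper" needs human review, but a first pass can at least list those created outside namespaces owned by the platform team. The set of platform namespaces and the function name here are assumptions made for illustration.

```python
# Hypothetical: namespaces where the platform team may create EnvoyFilters.
PLATFORM_NAMESPACES = {"istio-system"}


def unexpected_envoy_filters(objects: list[dict]) -> list[str]:
    """Return namespace/name of EnvoyFilters outside platform namespaces."""
    findings = []
    for obj in objects:
        if obj.get("kind") != "EnvoyFilter":
            continue
        meta = obj.get("metadata") or {}
        namespace = meta.get("namespace", "default")
        if namespace not in PLATFORM_NAMESPACES:
            findings.append(f"{namespace}/{meta.get('name', '<unnamed>')}")
    return findings


# Hypothetical objects, as they might be listed from the cluster.
objects = [
    {"kind": "EnvoyFilter", "metadata": {"name": "tuning", "namespace": "istio-system"}},
    {"kind": "EnvoyFilter", "metadata": {"name": "custom-lua", "namespace": "team-a"}},
]
print(unexpected_envoy_filters(objects))
```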

Service Mesh as Part of a Layered Defense

A service mesh supplements existing security models: it enhances microservice security by adding finer-grained security policies on top of traditional security controls. However, a service mesh cannot ensure comprehensive security for microservices on its own; it should be one part of an overall security strategy.

Figure: Microservices security layered architecture

Service meshes primarily manage and control network traffic by deploying a lightweight proxy (sidecar) next to each service instance. This allows for precise traffic control and policy enforcement at the network level, such as traffic encryption, authentication, and authorization. Although service meshes offer features like traffic control, service discovery, and circuit breakers, these are essentially management of network traffic and are not sufficient to address all security issues. For instance, they cannot replace traditional security measures like application layer firewalls, intrusion detection systems, and data security.

Furthermore, service meshes rely on correct configuration and management, and improper configuration can lead to security vulnerabilities. Therefore, while service meshes are an indispensable part of modern microservices architectures, they should be combined with traditional security measures to form a comprehensive, multi-layered security strategy. Refer to “How Service Mesh Layers Microservices Security with Traditional Security to Move Fast Safely” for further insights on strengthening service mesh security.

Long-term Solutions and Community Collaboration

The Istio community conducts a security audit almost every year; see the results from 2021 and 2022. These results show that Istio’s security posture has greatly improved. Ensure that your Istio service mesh adheres to security best practices. Additionally, keep an eye on the Istio CVE Bulletins, or use tools like Tetrate Istio Subscription that can scan for various CVEs in the Istio service mesh, and deploy Istio versions that are FIPS compliant and FIPS certified.

Conclusion

Service meshes provide an additional layer of security for microservices architectures by managing service-to-service communication outside of the applications. This allows for enhanced communication security between services without changes to application code. When deploying a service mesh, it is recommended to use Istio’s Egress Gateway to manage outbound traffic, in conjunction with Kubernetes’ NetworkPolicy, so that all outbound traffic must pass through the gateway, thus preventing potential data leaks and other security threats.
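
To make the NetworkPolicy recommendation more concrete, the sketch below builds an egress policy as a plain dict and prints it as JSON, which `kubectl apply -f` accepts. It assumes the egress gateway and istiod run in `istio-system`, that cluster DNS runs in `kube-system`, and that namespaces carry the automatic `kubernetes.io/metadata.name` label. The policy name, the function name, and the choice to allow traffic inside the application's own namespace are illustrative.

```python
import json


def egress_lockdown_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that limits egress from all Pods in a namespace."""

    def in_namespace(name: str) -> dict:
        # Select every Pod in the namespace with the given name.
        return {"namespaceSelector": {"matchLabels": {"kubernetes.io/metadata.name": name}}}

    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "egress-via-istio-system", "namespace": namespace},
        "spec": {
            "podSelector": {},  # applies to every Pod in the namespace
            "policyTypes": ["Egress"],
            "egress": [
                # DNS lookups (assumed to be served from kube-system).
                {
                    "to": [in_namespace("kube-system")],
                    "ports": [{"protocol": "UDP", "port": 53}, {"protocol": "TCP", "port": 53}],
                },
                # The egress gateway and control plane (assumed in istio-system).
                {"to": [in_namespace("istio-system")]},
                # Pods in the same namespace.
                {"to": [{"podSelector": {}}]},
            ],
        },
    }


print(json.dumps(egress_lockdown_policy("team-a"), indent=2))
```

A policy like this is only enforced when the cluster's CNI plugin implements NetworkPolicy, and a real mesh would likely need further rules for traffic to other application namespaces.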

Last updated on Sep 16, 2024