Demystifying Load Balancing in Istio: Multi-Cluster Routing Practice

A detailed explanation of load balancing in Istio, especially how routing and load balancing are implemented in multi-cluster setups, with a demonstration of multi-cluster routing.

Editor’s Note
This article introduces the load balancing types supported in Istio, then proposes solutions for load balancing in multi-cluster meshes. If you already understand load balancing in Istio, you can start reading directly from the Load Balancing in Multi-Cluster Meshes section.

In my previous blog post Why You Might Still Need Istio After Using Kubernetes, I mentioned that Istio is built on top of Kubernetes. The kube-proxy component in Kubernetes already has load balancing capabilities, but it only supports Layer 4 traffic load balancing and cannot implement advanced features like service timeouts and circuit breaking. Specifically, service mesh adds the following load balancing and resiliency features compared to Kubernetes:

  1. Layer 7 Load Balancing: Service mesh operates and manages traffic at the application layer (Layer 7), enabling more fine-grained traffic identification and control. This allows for more advanced load balancing strategies, such as routing and traffic distribution based on HTTP headers, URL paths, cookies, etc.

  2. Dynamic Load Balancing: Service mesh typically has automatic load balancing capabilities, dynamically distributing traffic based on real-time health status and performance metrics of backend services. This enables intelligent load balancing strategies that route traffic to the best-performing service instances.

  3. Failure Detection and Automatic Failover: Service mesh has advanced failure detection and automatic failover capabilities. It can detect failures in backend service instances and automatically transfer traffic from failed instances to healthy ones to ensure application availability.

  4. A/B Testing and Canary Releases: Service mesh allows implementing advanced deployment strategies like A/B testing and canary releases. These strategies enable dynamic traffic allocation between different service versions and decision-making based on results.

  5. Circuit Breaking and Retries: Service mesh typically includes circuit breaking and retry mechanisms to improve application availability and stability. It can automatically execute circuit breaking operations based on backend service performance and availability, and retry requests when necessary.

  6. Global Traffic Control: Service mesh provides centralized traffic control and policy definition, allowing global management of traffic across the entire service mesh. This enables unified security, monitoring, and policy implementation.

  7. Deeply Integrated Monitoring and Tracing: Service mesh typically integrates powerful monitoring and tracing tools that provide detailed information about traffic performance and visibility, helping with troubleshooting and performance optimization.

While Kubernetes provides basic load balancing capabilities, service mesh builds more advanced load balancing and traffic management features on top of it to meet the complex needs of microservice architectures.
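To make the Layer 7 point concrete, here is a minimal sketch of header-based routing that kube-proxy cannot express: requests carrying a particular header go to one service version, everything else to another. The host `reviews` and the subset names are illustrative placeholders, not from a real deployment:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews-route
spec:
  hosts:
    - reviews
  http:
    # Requests with the header "end-user: tester" go to subset v2.
    - match:
        - headers:
            end-user:
              exact: tester
      route:
        - destination:
            host: reviews
            subset: v2
    # All other traffic falls through to subset v1.
    - route:
        - destination:
            host: reviews
            subset: v1
```

The subsets themselves would be defined in a matching DestinationRule, as discussed later in this article.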

Client-Side Load Balancing vs Server-Side Load Balancing

Client-side load balancing and server-side load balancing are two different load balancing methods, each with its own advantages in different scenarios and applications. Here’s an explanation of both, their applicable scenarios, implementation examples, and related open source projects:

Client-Side Load Balancing

The diagram for client-side load balancing is shown in Figure 1.

Figure 1: Client-side load balancing
  • Definition: In client-side load balancing, load balancing decisions are made by the service consumer (client). The client-side load balancer typically maintains a list of service instances and selects the instance to send requests to based on configuration and policies.

  • Applicable Scenarios: Client-side load balancing is suitable for the following situations:

    • Multiple clients need to access the same set of backend services, and each client can select backend services according to their own needs and policies.
    • Service consumers need more traffic control and policy definition, such as A/B testing, canary releases, etc.
  • Implementation Examples: Common open source projects that implement client-side load balancing include:

    • Ribbon: Netflix Ribbon is an open source project for client-side load balancing that can be integrated with Spring Cloud.
    • Envoy: Envoy is a high-performance proxy server that supports client-side load balancing and is widely used in service mesh and microservice architectures.
    • NGINX: Although NGINX is typically used for reverse proxy, it can also be used as a client-side load balancer.

Server-Side Load Balancing

Server-side load balancing is shown in Figure 2.

Figure 2: Server-side load balancing
  • Definition: In server-side load balancing, load balancing decisions are made by a server-side load balancer or proxy. The client only needs to send requests to the server, and the server decides which backend service instance to route the request to.

  • Applicable Scenarios: Server-side load balancing is suitable for the following situations:

    • The client doesn’t care about the specific instance of the backend service, only about sending requests to the service’s name or address.
    • Load balancing policies need to be globally configured on the server side, and clients don’t need to care about load balancing details.
  • Implementation Examples: Common open source projects that implement server-side load balancing include:

    • NGINX: NGINX can be used as a reverse proxy server to perform server-side load balancing, routing requests to backend services.
    • HAProxy: HAProxy is a high-performance load balancer typically used for server-side load balancing.
    • Amazon ELB (Elastic Load Balancer): Load balancing service provided by Amazon for routing requests to AWS backend service instances.

In practice, client-side and server-side load balancing are sometimes combined to meet specific needs. The choice of load balancing method typically depends on your architecture, deployment requirements, and performance requirements. Service meshes (like Istio) typically use client-side load balancing to implement fine-grained traffic control and policy definition, while in cloud service providers, server-side load balancing is typically used for auto-scaling and traffic management.

How Istio Implements Load Balancing

In service mesh (like Istio), client-side load balancing is typically implemented through Envoy proxy. Envoy is a high-performance proxy server that can be used to build the data plane of a service mesh for handling communication between services. Client-side load balancing is a load balancing strategy implemented on the service consumer (client) side that determines how requests should be routed to backend service instances.

The load balancing of a single-cluster single-network Istio service mesh is shown in Figure 3.

Figure 3: Load balancing of Istio service mesh in single cluster single network

The following is the general process for implementing client-side load balancing in a service mesh:

  1. Sidecar Proxy: In a service mesh, each deployed service instance is typically associated with a Sidecar proxy (usually Envoy). This Sidecar proxy is located alongside the service instance and is responsible for handling inbound and outbound traffic for that service instance.

  2. Service Registration and Discovery: In a service mesh, service instance registration and discovery is typically handled by a service registry (in Kubernetes, the platform’s built-in service discovery mechanism). These registries maintain information about service instances, including their network addresses and health status.

  3. Client-Side Load Balancing Configuration: When a client (service consumer) sends a request, the Sidecar proxy executes load balancing policies. These load balancing policies typically operate on the list of service instances obtained from the service registry. Policies can select based on various factors such as weight, health status, latency, etc.

  4. Request Routing: Based on the load balancing policy, the Sidecar proxy routes requests to the selected backend service instance. This can include using algorithms like round-robin, weighted round-robin, least connection, etc. to select the target instance.

  5. Communication Handling: Once the target instance is selected, the Sidecar proxy forwards the request to that instance and then passes the response back to the client. It can also handle connection management, failure detection, and automatic failover tasks.

In summary, client-side load balancing is a load balancing strategy implemented on the service consumer side (usually the Envoy proxy) that enables service mesh to effectively distribute traffic and handle failures of backend service instances. This approach keeps load balancing decisions under the control of the service consumer and allows for more fine-grained traffic control and policy definition. Envoy proxy is one of the key components for implementing client-side load balancing, with rich configuration options to meet different load balancing needs.

Load Balancing Types in Istio

In Istio’s DestinationRule resource, the loadBalancer section is used to configure load balancing policies, controlling how requests are distributed to different service instances or versions, as shown in the figure below.

Figure 4: Load balancing configuration parameters in Istio

From the figure, we can see that Istio supports three types of load balancing:

  • simple: Simple load balancing based on common load balancing algorithms
  • consistentHashLB: Load balancing based on consistent hashing algorithm
  • localityLbSetting: Locality-based load balancing

The following are the meanings of fields related to load balancing configuration:

  1. simple: This section defines some simple load balancing policy options, including:
    • ROUND_ROBIN: Requests are distributed to all available backend instances in turn using round-robin.
    • LEAST_CONN: Requests will be routed to the backend instance with the fewest current connections.
    • RANDOM: Requests will be routed randomly to backend instances.
    • PASSTHROUGH: Istio does not perform load balancing; the connection is forwarded directly to the original IP address requested by the caller, which is useful for specific pass-through use cases.
  2. consistentHashLB: This section allows you to configure consistent hash load balancing, including:
    • httpHeaderName: Name of the HTTP header used for hash calculation.
    • httpCookie: Configuration of HTTP cookie used for hash calculation, including name, path, and time-to-live (TTL).
    • useSourceIp: Whether to use the request’s source IP address for hash calculation.
    • httpQueryParameterName: Name of the HTTP query parameter used for hash calculation.
    • ringHash: Configure ring hash load balancing, including minimum ring size (minimumRingSize).
    • maglev: Configure Maglev load balancing, including table size (tableSize).
  3. localityLbSetting: This section is used to configure locality load balancing settings, including:
    • distribute: Defines the distribution of requests, including origin (from) and destination (to).
    • failover: Defines failover, including origin (from) and destination (to).
    • failoverPriority: Failover priority settings.
    • enabled: Whether to enable locality load balancing.

These fields allow you to select appropriate load balancing policies according to your needs and configure additional options to ensure requests are distributed to backend service instances as desired. Different strategies and configuration options can meet various load balancing needs such as performance, reliability, and traffic control. For detailed introductions to these fields, see the Istio documentation.
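As an illustration of the non-simple options, the following is a hedged sketch of cookie-based session affinity via consistent hashing. Note that in DestinationRule YAML the consistentHashLB message is written as the `consistentHash` field; the host `ratings` and the cookie name are assumptions for illustration:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: ratings-session-affinity
spec:
  host: ratings
  trafficPolicy:
    loadBalancer:
      # The consistentHashLB message appears as "consistentHash" in YAML.
      consistentHash:
        httpCookie:
          name: user       # hash on this cookie; Istio generates it if absent
          ttl: 3600s       # lifetime of the generated affinity cookie
```

With this policy, repeated requests from the same client tend to land on the same backend instance, which is useful for caches and session-bound workloads.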

How to Set Load Balancing for Services in Istio

As I mentioned in How to Understand VirtualService and DestinationRule in Istio, VirtualService is mainly used to set routing rules, while service resiliency (load balancing, timeout, retry, circuit breaking, etc.) needs to be maintained through both VirtualService and DestinationRule. Only when both resource types are deployed together can load balancing truly take effect.

The following are the general steps for setting up load balancing:

  1. Create a DestinationRule Resource: First, you need to create a DestinationRule resource that defines traffic policies and destination rules for the service. In the DestinationRule, you can specify the name of the service (host) for which to set load balancing and the load balancing policy.

    The following is an example of a DestinationRule that distributes traffic to two subsets with labels “version: v1” and “version: v2” and uses the ROUND_ROBIN load balancing policy:

    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: my-destination-rule
    spec:
      host: my-service.example.com
      trafficPolicy:
        loadBalancer:
          simple: ROUND_ROBIN
      subsets:
        - name: v1
          labels:
            version: v1
        - name: v2
          labels:
            version: v2
    
  2. Apply DestinationRule: After creating a DestinationRule, associate it with the service you want to load balance. This is typically done through Istio’s VirtualService resource: the VirtualService routes to the same host, and its subset fields refer to the subsets defined in the DestinationRule.

    The following is an example of a VirtualService that routes traffic to the subsets defined in “my-destination-rule”:

    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: my-virtual-service
    spec:
      hosts:
        - my-service.example.com
      http:
        - route:
            - destination:
                host: my-service.example.com
                subset: v1
              weight: 80
            - destination:
                host: my-service.example.com
                subset: v2
              weight: 20
    

    In the example above, based on the weight configuration, 80% of traffic will be routed to subset v1, while 20% of traffic will be routed to subset v2.

  3. Apply Configuration: Finally, apply the VirtualService and DestinationRule resources to your Istio environment to ensure load balancing rules take effect.

    Use the kubectl command to apply VirtualService and DestinationRule to Istio:

    kubectl apply -f your-service.yaml
    

Through these steps, you can set load balancing policies for your services, distribute traffic to different service versions or instances as needed, and control traffic weights. This helps implement deployment strategies such as canary releases and A/B testing. Please adjust load balancing configuration according to your specific needs and environment.

Why Configure Separately?

In Istio, load balancing and routing are two different concepts typically used to control traffic and behavior between services, so they are usually configured in two different resource objects: DestinationRule for load balancing and VirtualService for routing. This separation design has several advantages:

  1. Modularity and Clarity: Separating load balancing and routing configuration into two resource objects makes configuration more modular and clear. This way, you can more easily understand and maintain these two aspects of configuration without making configuration objects too complex.

  2. Maintainability: Separating load balancing and routing configuration makes them easier to maintain and modify because they are located in different resource objects. This way, you can change load balancing policies for different needs without affecting routing rules, and vice versa.

  3. Reusability: Modular configuration allows you to more easily reuse configuration fragments. You can use the same load balancing policies or routing rules in different DestinationRules or VirtualServices to improve configuration reusability.

  4. Fine-grained Control: Separated configuration allows you to have more fine-grained control over each aspect. You can customize different routing rules and load balancing policies for each service as needed to meet specific use cases and requirements.

Although load balancing and routing are usually configured separately, there is still a close relationship between them because routing rules determine how requests will be routed to backend services, while load balancing policies determine how traffic is distributed among the selected target services. Therefore, in Istio, these two configuration objects usually need to work together to achieve your traffic management needs. By configuring them separately, configuration becomes clearer and more maintainable, and allows for more flexibility to meet different needs.

Load Balancing in Multi-Cluster Meshes

In the microservices field, Istio has proven invaluable for managing service communication. While it excels in single-cluster scenarios, multi-cluster setups introduce unique challenges, particularly in load balancing. Next, we’ll demystify multi-cluster load balancing in Istio, providing you with a clear roadmap to solve this complex task.

Two-Tier Ingress Gateway: Key to Implementing Multi-Cluster Communication

In multi-cluster setups involving clusters from different vendors, the first step is to establish a gateway for each cluster. However, a key point that needs special attention is the need for a unique user access entry point. Although this gateway can be deployed in the same cluster, it’s generally recommended to place it in a separate cluster.

The deployment architecture of a two-tier ingress gateway is shown in Figure 5.

Figure 5: Two-tier ingress gateway

Components Required for Multi-Cluster Communication

For Istio-based multi-cluster meshes, typically running in multi-mesh, multi-network mode, enabling cross-mesh communication requires adding a Tier-1 cluster and creating the following components in each cluster:

  1. Ingress Gateway: Each mesh must have an ingress gateway.
  2. ServiceEntry: Used to allow clusters to discover each other’s endpoints.
  3. VirtualServices and DestinationRules: Critical for service discovery and routing within each cluster.

Practical Demo: A Multi-Cluster Demo

In this demo, I’ll cover three Kubernetes clusters on GKE, distributed across different regions as shown in Figure 6. Of course, you can also use clusters from a different vendor, or even mix vendors. Istio is deployed in each cluster, laying the foundation for multi-cluster communication.

On top of the Tier-1 gateway cluster, a two-tier structure is established: one Tier-2 cluster hosts only the productpage service, while the other contains the complete bookinfo service suite.

Figure 6: Deployment of demo environment

Implementing Multi-Cluster Routing and Load Balancing

To implement advanced features such as load balancing and failover, solving the multi-cluster routing problem is crucial. Since Istio is also deployed in the Tier-1 cluster, the load balancing techniques discussed earlier can be applied to this gateway.

Key steps:

  1. Create an ingress gateway in each cluster and obtain the IP address of the load balancer used by the gateway.

  2. Create VirtualServices, DestinationRules, and ServiceEntries in each cluster. Ensure ServiceEntries include the entry points of each cluster’s ingress gateway.

  3. For further testing, retrieve the IP address of the Tier-1 gateway.

    export GATEWAY_IP=$(kubectl -n tier1 get service tier1-gateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    

    Note: This step needs to be performed in the Tier-1 cluster.

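For step 2, a ServiceEntry in the Tier-1 cluster might look like the sketch below. The host name and the endpoint address are placeholders; substitute the ingress gateway load balancer IP obtained in step 1:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: bookinfo-cluster-1
spec:
  hosts:
    - productpage.cluster-1.example.com   # placeholder host for cluster 1
  location: MESH_EXTERNAL
  ports:
    - number: 80
      name: http
      protocol: HTTP
  resolution: STATIC
  endpoints:
    - address: 34.120.10.11   # ingress gateway LB IP of cluster 1 (placeholder)
```

One ServiceEntry per Tier-2 cluster makes each cluster’s ingress gateway a routable destination for the Tier-1 gateway.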
The gateway in the Tier-1 cluster serves as a unified gateway entry. You can implement multi-cluster routing by configuring Gateway, VirtualService, DestinationRule, and ServiceEntry resource objects in this cluster, as shown in Figure 7.

Figure 7: Istio resources in Tier-1 cluster

In this demo, we’ll implement multi-cluster routing based on HTTP headers and prefixes. The final routing path is shown in Figure 8.

Figure 8: Routing path

For operational details, refer to Unified Gateway in the TSB documentation.
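As a sketch of how the header-based routing in this demo can be wired up in the Tier-1 cluster, the VirtualService below matches the X-CLUSTER-SELECTOR header and routes to per-cluster ServiceEntry hosts. The gateway name, namespace, and host names are assumptions modeled on the demo, not the exact resources from the TSB documentation:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: bookinfo-tier1
  namespace: tier1
spec:
  hosts:
    - bookinfo.tetrate.io
  gateways:
    - tier1-gateway
  http:
    # Header selects cluster 1 explicitly.
    - match:
        - headers:
            X-CLUSTER-SELECTOR:
              exact: gke-jimmy-us-west1-1
      route:
        - destination:
            host: productpage.cluster-1.example.com   # ServiceEntry host (placeholder)
    # Header selects cluster 2 explicitly.
    - match:
        - headers:
            X-CLUSTER-SELECTOR:
              exact: gke-jimmy-us-west1-2
      route:
        - destination:
            host: productpage.cluster-2.example.com   # ServiceEntry host (placeholder)
    # No header: fall through to a default cluster.
    - route:
        - destination:
            host: productpage.cluster-1.example.com
```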

Testing Setup

The demo continues with actual testing using the curl command:

  1. Request URL without HTTP headers.

    curl -Ss "http://bookinfo.tetrate.io/productpage" --resolve "bookinfo.tetrate.io:80:$GATEWAY_IP" -v > index1.html
    
  2. Request URL with HTTP headers indicating preferred cluster.

    curl -Ss "http://bookinfo.tetrate.io/productpage" --resolve "bookinfo.tetrate.io:80:$GATEWAY_IP" -v -H "X-CLUSTER-SELECTOR: gke-jimmy-us-west1-1" > index2.html
    curl -Ss "http://bookinfo.tetrate.io/productpage" --resolve "bookinfo.tetrate.io:80:$GATEWAY_IP" -v -H "X-CLUSTER-SELECTOR: gke-jimmy-us-west1-2" > index3.html
    

Verify results through the exported HTML files. Open the three files index1.html, index2.html, and index3.html in a browser respectively. You’ll see that in both page 1 and page 2, the reviews and details services are unavailable, while only in page 3 are all services accessible.

Multi-Cluster Load Balancing

The demo successfully demonstrates how to leverage HTTP header- and path-based routing. Routing is the foundation of load balancing. After implementing multi-cluster routing, you can add endpoints from Tier-2 clusters to a subset and thereby configure load balancing in a DestinationRule.

You can solve the failover problem in Cluster 1 by configuring the ingress gateway in the Tier-2 cluster as an east-west gateway. Please refer to the Istio documentation.
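A hedged sketch of what such a failover configuration could look like is shown below. Note that localityLbSetting.failover only takes effect when outlierDetection is also configured; the host and region names are placeholders:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: bookinfo-failover
spec:
  host: productpage.cluster-1.example.com   # placeholder host
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        failover:
          - from: us-west1      # prefer endpoints in the local region
            to: us-east1        # fail over here when local endpoints are unhealthy
    outlierDetection:           # required for locality failover to activate
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 60s
```

Outlier detection is what marks endpoints unhealthy; without it, Envoy has no signal to trigger the locality failover.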

The Call for Automation

Although Istio provides various load balancing types based on Envoy, manually creating resource objects across multiple clusters is error-prone and inefficient. Automation, preferably adding an abstraction layer on top of Istio, is the next development stage.

Tetrate addresses this need with TSB, which is compatible with upstream Istio and provides a seamless solution for multi-cluster deployments. For more information, visit the Tetrate website.

Summary

Mastering multi-cluster load balancing in Istio is crucial for unlocking the full potential of microservices in complex environments. With careful configuration and the right tools, you can achieve robust and reliable communication between clusters, ensuring your applications run smoothly wherever they’re deployed. For more fine-grained load balancing adjustments, consider exploring EnvoyFilter. Thank you for joining us on this journey to demystify multi-cluster load balancing in Istio!

Jimmy Song

Focusing on research and open source practices in AI-Native Infrastructure and cloud native application architecture.