gRPC Load Balancing inside Kubernetes

- gRPC Kubernetes Go

Context

I have wanted to blog about this for years: how do you connect to a load-balanced Kubernetes service?
How do you deal with disconnections, re-connections and maintenance? What about gRPC specifically?
The answer is heavily tied to the network stack used by Kubernetes, but with the “Mesh Network” revolution, it’s not always clear how it works anymore and what the options are.

How it works

First, I recommend watching this great yet simple video: Container Networking From Scratch, then reading the Services clusterIP documentation.

To keep it simple: when you create a Service in Kubernetes, it creates a layer 4 proxy that load balances connections to your pods using iptables. The service endpoint is a single IP and port hiding your real pods.

The Problem

A simple TCP load balancer is good enough for a lot of things, especially for HTTP/1.1, since connections are mostly short lived: clients reconnect often, so they won’t stay attached to an old running pod.

But with gRPC over HTTP/2, the TCP connection is kept open, which can lead to issues such as staying connected to a dying pod, or unbalancing the cluster because clients end up piled onto the older pods.

One solution is to use a more advanced proxy that knows about the higher layers.

Envoy, HAProxy and Traefik are layer 7 reverse proxy load balancers: they understand HTTP/2 (and even gRPC) and can disconnect a backend pod without the clients noticing.

Edge

On the edge of your Kubernetes cluster, you need a public IP provided by your cloud provider; through the Ingress resource it exposes your internal service.

To further control your request routing you need an Ingress Controller.
It’s a reverse proxy that knows about the Kubernetes cluster and can direct requests to the right place. Envoy, HAProxy and Traefik can all act as Ingress Controllers.

Internal Services & Service Mesh

In a micro-services environment, most if not all of your micro-services will also be clients of other micro-services.

Istio, a “Mesh Network” solution, uses Envoy as a sidecar. This sidecar is configured from a central place (the control plane) and makes the micro-services talk to each other through Envoy.

This way the client does not need to know about all the topology.

That’s great, but in a controlled environment (yours), where you control all the clients, sending all the traffic through a proxy is not always necessary.

Client Load Balancing

In Kubernetes you can create a headless service: there is no single load-balanced endpoint anymore, the service’s pods are exposed directly, and Kubernetes DNS returns all of them.

Here is an example service called geoipd scaled to 3 replicas:

Name:      geoipd
Address 1: 172.17.0.18 172-17-0-18.geoipd.default.svc.cluster.local
Address 2: 172.17.0.21 172-17-0-21.geoipd.default.svc.cluster.local
Address 3: 172.17.0.9 172-17-0-9.geoipd.default.svc.cluster.local
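
Out of curiosity, you can reproduce this lookup from Go. Here is a minimal sketch, assuming it runs inside the cluster and that the service lives in the default namespace, as in the output above:

package main

import (
    "fmt"
    "net"
)

func main() {
    // Resolving the headless service returns one A record per backing pod.
    addrs, err := net.LookupHost("geoipd.default.svc.cluster.local")
    if err != nil {
        panic(err)
    }
    for _, addr := range addrs {
        fmt.Println(addr) // e.g. 172.17.0.18, 172.17.0.21, 172.17.0.9
    }
}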

It’s up to your client to connect to them all and load balance across the connections.

On the Go gRPC client side, a simple dns:/// target will fetch the entries for you, and the roundrobin package will handle the load balancing.

conn, err := grpc.Dial(
    // dns:/// resolves every A record behind the headless service
    "dns:///geoipd:9200",
    // roundrobin comes from google.golang.org/grpc/balancer/roundrobin
    grpc.WithBalancerName(roundrobin.Name),
)

This may sound like a good solution but it is not: the default refresh frequency is 30 minutes, meaning that when you add new pods, it can take up to 30 minutes before they start getting traffic! You can mitigate this by tweaking MaxConnectionAge on the gRPC server, so that connections are closed after a while, forcing clients to reconnect, re-resolve and pick up the newer pods:

gsrv := grpc.NewServer(
    // MaxConnectionAge limits how long a connection lives, to facilitate load balancing.
    // MaxConnectionAgeGrace forcibly closes it after that period; it defaults to infinity.
    grpc.KeepaliveParams(keepalive.ServerParameters{MaxConnectionAge: 2 * time.Minute}),
)

Even if you could refresh the list more often, you wouldn’t learn about pod evictions fast enough and you’d lose some traffic.

There is a nicer solution: implement a gRPC client resolver for Kubernetes that talks to the Kubernetes API to fetch the endpoints and watch them constantly. This is exactly what Kuberesolver does.

// Register kuberesolver to grpc
kuberesolver.RegisterInCluster()

conn, err := grpc.Dial(
    "kubernetes:///geoipd:9200",
    grpc.WithBalancerName(roundrobin.Name),
)

By using the kubernetes scheme you tell kuberesolver to fetch and watch the endpoints of the geoipd service.
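
For reference, here is a slightly fuller sketch of the same client with imports and cleanup spelled out; the kuberesolver import path and the insecure transport option are assumptions for this example, adapt them to your setup:

package main

import (
    "log"

    "github.com/sercand/kuberesolver" // assumed import path for Kuberesolver
    "google.golang.org/grpc"
    "google.golang.org/grpc/balancer/roundrobin"
)

func main() {
    // Register the kubernetes:/// scheme with the gRPC resolver registry.
    kuberesolver.RegisterInCluster()

    conn, err := grpc.Dial(
        "kubernetes:///geoipd:9200",
        grpc.WithBalancerName(roundrobin.Name),
        grpc.WithInsecure(), // plaintext inside the cluster; adjust to your setup
    )
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    // Use conn to build your generated service client as usual.
}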

For this to work, the pod must have get and watch access to endpoints (and services), granted through a role:

kubectl create role pod-reader-role --verb=get --verb=watch --resource=endpoints,services 
kubectl create sa pod-reader-sa 
kubectl create rolebinding pod-reader-rb --role=pod-reader-role --serviceaccount=default:pod-reader-sa 

Redeploy your app (the client) with the service account set in its pod spec:

spec:
  serviceAccountName: pod-reader-sa

Deploy, scale up, scale down, kill your pods: your client keeps sending traffic to a living pod!

I’m surprised it’s not mentioned more often: client-side load balancing has done the job for years, and the same applies inside a Kubernetes environment.
It is fine for small to medium projects and can deal with a lot of traffic; it will do the job for many of you, unless you are Netflix-sized…

Conclusion

Load-balancing proxies are great tools, especially useful on the edge of your platform. “Mesh Network” solutions are nice additions to our tool set, but the cost of operating and debugging a full mesh network could be really expensive and overkill in some situations, while a client load balancing solution is simple and easy to grasp.

Thanks to Prune who helped me with this post, and to Robteix & diligiant for reviewing.