k8s_service_deep
Service
A Kubernetes Service is a resource you create to provide a single, constant point of entry to a group of Pods (selected by a label selector) that provide the same service. A Service has an IP address and port that never change while the Service exists, whereas Pod addresses change during upgrades or when Pods are removed or added during scaling. For that reason we SHOULD NOT access Pod addresses directly; we need a dedicated, stable IP for the cases mentioned, and that is exactly what a Service provides.
For more details about Services, refer to k8s service.
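As a sketch of the idea above, a minimal Service manifest might look like the following (the name my-service and the label app: my-app are illustrative assumptions, not from the original):

```shell
# A minimal Service manifest (names are illustrative).
# The selector app: my-app picks the backend Pods; clients then use
# the Service's stable cluster IP and port 80 instead of Pod IPs.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
    - port: 80          # stable Service port
      targetPort: 8080  # container port on the backend Pods
EOF
```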
enable source ip persistence for a service
If you want connections from a particular client to be passed to the same Pod each time, you can enable session affinity based on the client's IP address by setting service.spec.sessionAffinity to "ClientIP" (the default is "None"). You can also set the maximum session sticky time via service.spec.sessionAffinityConfig.clientIP.timeoutSeconds (the default value is 10800 seconds, which works out to 3 hours).
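Putting the two fields together, a hedged example of a Service with client-IP session affinity (the service name and label are made up for illustration):

```shell
# Service with client-IP session affinity; connections from the same
# client IP keep landing on the same backend Pod for up to 3 hours.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800   # the default: 3 hours
  ports:
    - port: 80
      targetPort: 8080
EOF
```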
kube-proxy
kube-proxy is a key component of any Kubernetes deployment. Its role is to load-balance traffic destined for Services (via cluster IPs and node ports) to the correct backend Pods.
kube-proxy can run in one of three modes, each implemented with a different data plane technology: userspace, iptables, or IPVS.
The userspace mode is very old and slow and is no longer recommended, so we do not discuss it here.
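To see which mode a live kube-proxy is using, one quick check is its /proxyMode endpoint on the metrics port (10249 by default); this sketch assumes default ports and a kubeadm-style cluster:

```shell
# Ask the running kube-proxy on this node which mode it is using.
curl -s http://localhost:10249/proxyMode

# Alternatively, inspect the kube-proxy ConfigMap (kubeadm clusters):
kubectl -n kube-system get configmap kube-proxy -o yaml | grep -w mode
```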
iptables vs IPVS
- IPVS performs better as the number of Services and Pods grows
- IPVS supports more load-balancing algorithms than iptables
- IPVS supports server health checking, connection retries, etc.
Note
- the cluster IP of a Service is allocated by the API server, and Endpoints objects are managed by the endpoints controller in the controller manager; Pod IPs are assigned by the network (CNI) plugin
- kube-proxy watches the apiserver for Service and Endpoint objects, then updates iptables or IPVS rules accordingly
- kube-proxy runs on each node (in the kube-system namespace)
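The last note can be checked directly; on a typical (e.g. kubeadm-provisioned) cluster, kube-proxy runs as a DaemonSet so that one Pod is scheduled per node:

```shell
# One kube-proxy Pod per node, managed by a DaemonSet in kube-system.
kubectl -n kube-system get daemonset kube-proxy
kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide
```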
Why not use round-robin DNS to replace kube-proxy?
A question that pops up every now and then is why Kubernetes relies on proxying to forward inbound traffic to backends. What about other approaches? For example, would it be possible to configure DNS records that have multiple A values (or AAAA for IPv6), and rely on round-robin name resolution?
There are a few reasons for using proxying for Services:
- There is a long history of DNS implementations not respecting record TTLs, and caching the results of name lookups after they should have expired.
- Some apps do DNS lookups only once and cache the results indefinitely.
- Even if apps and libraries did proper re-resolution, the low or zero TTLs on the DNS records could impose a high load on DNS that then becomes difficult to manage.
Iptables
In this mode, kube-proxy watches the Kubernetes control plane for the addition and removal of Service and Endpoint objects. For each Service, it installs iptables rules, which capture traffic to the Service’s clusterIP and port, and redirect that traffic to one of the Service’s backend sets. For each Endpoint object, it installs iptables rules which select a backend Pod.
By default, kube-proxy in iptables mode chooses a backend at random.
If kube-proxy is running in iptables mode and the first Pod that is selected does not respond, the connection fails; there is no retry against another Pod.
When a Service is accessed by its cluster IP (from inside the cluster), the OUTPUT chain is evaluated, while when it is accessed via a NodePort address, the PREROUTING chain is evaluated; both jump to the KUBE-SERVICES chain created by kube-proxy. For more detail, see the enable iptables mode section below.
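These chains can be inspected directly on a node; a rough sketch (KUBE-SVC-* and KUBE-SEP-* are the conventional per-service and per-endpoint chain prefixes kube-proxy uses):

```shell
# The top-level chain kube-proxy installs for Services:
sudo iptables -t nat -L KUBE-SERVICES -n | head

# OUTPUT (cluster-IP access from the node) and PREROUTING (NodePort
# access) both jump into KUBE-SERVICES:
sudo iptables -t nat -L OUTPUT -n
sudo iptables -t nat -L PREROUTING -n

# Per-service chains (KUBE-SVC-*) fan out to per-endpoint chains
# (KUBE-SEP-*) using random selection.
sudo iptables -t nat -S | grep -e KUBE-SVC -e KUBE-SEP | head
```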
IPVS
In IPVS mode, kube-proxy watches Kubernetes Services and Endpoints, calls the netlink interface to create IPVS rules accordingly, and synchronizes IPVS rules with Kubernetes Services and Endpoints periodically. This control loop ensures that the IPVS state matches the desired state. When a Service is accessed, IPVS directs traffic to one of the backend Pods.
The IPVS proxy mode is based on netfilter hook functions, similar to iptables mode, but uses a hash table as the underlying data structure and works in kernel space. That means kube-proxy in IPVS mode redirects traffic with lower latency than in iptables mode, and with much better performance when synchronizing proxy rules. Compared to the other proxy modes, IPVS mode also supports higher network throughput.
IPVS provides more options for balancing traffic to backend Pods; these are:
- rr: round-robin
- lc: least connection (smallest number of open connections)
- dh: destination hashing
- sh: source hashing
- sed: shortest expected delay
- nq: never queue
When creating a ClusterIP type Service, IPVS proxier will do the following three things:
- Make sure a dummy interface exists on the node (defaults to kube-ipvs0)
- Bind Service IP addresses to the dummy interface
- Create IPVS virtual servers for each Service IP address respectively
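The three steps above can be observed on a node, assuming ipvsadm is installed:

```shell
# The dummy interface that holds all Service cluster IPs:
ip addr show kube-ipvs0

# List IPVS virtual servers and their backend (real) servers;
# the scheduler (e.g. "rr") appears next to each virtual server.
sudo ipvsadm -Ln
```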
config
enable iptables mode
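In a kubeadm cluster, the proxy mode lives in the kube-proxy ConfigMap; a hedged sketch of switching to iptables mode:

```shell
# Edit the kube-proxy ConfigMap and, inside config.conf, set:
#   mode: "iptables"
kubectl -n kube-system edit configmap kube-proxy

# Restart the kube-proxy Pods so the DaemonSet recreates them
# with the new configuration:
kubectl -n kube-system delete pods -l k8s-app=kube-proxy
```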
IPVS
# set from the beginning, when the cluster is created by kubeadm
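A sketch of enabling IPVS from the start with kubeadm, assuming the v1beta3 kubeadm API (the file name kubeadm-config.yaml is illustrative):

```shell
# kubeadm accepts a KubeProxyConfiguration document alongside its
# own configuration; setting mode: ipvs here makes kube-proxy start
# in IPVS mode on every node.
cat <<'EOF' > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
EOF
sudo kubeadm init --config kubeadm-config.yaml
```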
enable IPVS mode
# load module <module_name>
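The modules commonly needed are the ip_vs family plus conntrack; exact names vary by kernel version (e.g. nf_conntrack_ipv4 on older kernels), so treat this as a sketch:

```shell
# Load the kernel modules IPVS mode depends on.
for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do
  sudo modprobe "$m"
done

# Verify they are loaded:
lsmod | grep -e ip_vs -e nf_conntrack
```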
NOTE
When kube-proxy starts in IPVS proxy mode, it verifies whether IPVS kernel modules are available. If the IPVS kernel modules are not detected, then kube-proxy falls back to running in iptables proxy mode.
debug kube-proxy
# check process running
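A few starting points for debugging, assuming a kubeadm-style cluster where kube-proxy runs as a DaemonSet labelled k8s-app=kube-proxy:

```shell
# check the process is running on this node
ps aux | grep -v grep | grep kube-proxy

# check the Pods cluster-wide and tail their logs
kubectl -n kube-system get pods -l k8s-app=kube-proxy
kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=50

# confirm the active proxy mode (metrics port 10249 by default)
curl -s http://localhost:10249/proxyMode
```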
coredns
Kubernetes DNS schedules a DNS Pod and Service on the cluster, and configures the kubelets to tell individual containers to use the DNS Service’s IP to resolve DNS names.
Every Service defined in the cluster (including the DNS server itself) is assigned a DNS name. By default, a client Pod's DNS search list includes the Pod's own namespace and the cluster's default domain.
You can (and almost always should) set up a DNS service for your Kubernetes cluster using an add-on.
A cluster-aware DNS server, such as CoreDNS, watches the Kubernetes API for new Services and creates a set of DNS records for each one. If DNS has been enabled throughout your cluster then all Pods should automatically be able to resolve Services by their DNS name.
For example, if you have a Service called my-service in a Kubernetes namespace my-ns, the control plane and the DNS Service acting together create a DNS record for my-service.my-ns. Pods in the my-ns namespace should be able to find the service by doing a name lookup for my-service (my-service.my-ns would also work).
Pods in other namespaces must qualify the name as my-service.my-ns. These names will resolve to the cluster IP assigned for the Service.
Kubernetes also supports DNS SRV (Service) records for named ports. If the my-service.my-ns Service has a port named http with the protocol set to TCP, you can do a DNS SRV query for _http._tcp.my-service.my-ns to discover the port number for http, as well as the IP address.
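The naming pattern above can be written out explicitly; the echo below is pure string construction, while the lookup commands require a running cluster (the service and namespace names are the examples from the text, and cluster.local is the default cluster domain):

```shell
# The fully qualified name of a Service follows a fixed pattern:
#   <service>.<namespace>.svc.<cluster-domain>
svc=my-service; ns=my-ns
echo "${svc}.${ns}.svc.cluster.local"

# From a Pod in another namespace (requires a cluster):
#   nslookup my-service.my-ns
# SRV record for the named port "http" over TCP:
#   dig SRV _http._tcp.my-service.my-ns.svc.cluster.local
```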