Kubernetes 1.29 Release: Ambitious Mandala Theme¶
Author: mengjiao-liu
On December 13, 2023, Pacific Time, Kubernetes 1.29 with the theme "Mandala" is officially released.
This version is released 4 months after the previous version and is the third version of 2023, also the last version of this year.
The release team has tracked 49 enhancements in this version. Among them, 11 feature upgrades have been promoted to stable, 19 existing features have been optimized and upgraded to Beta, and there are as many as 19 brand new features at the Alpha level. Version 1.29 includes many important features and user experience optimizations, the next section of this article will provide detailed introduction to some important features.
Important Features¶
[Networking] KEP-3866: nftables as kube-proxy backend (Alpha)¶
In Kubernetes v1.29, Kubernetes uses nftables as the new backend for kube-proxy, and this feature is now in Alpha version. iptables has performance issues that cannot be fixed, and performance degradation increases as the size of the rule set increases. Due to its unfixable issues, the development speed of iptables in the kernel has slowed down significantly and most of it has stopped. New features will not be added to iptables, and new features and performance improvements are mainly being added to nftables. nftables can do everything that iptables can do, and do it better.
In particular, RedHat has announced that iptables has been deprecated in RHEL 9 and may be completely removed in RHEL 10 several years later. Debian has removed iptables from the required software packages in Debian 11 (Bullseye). For the above reasons, kube-proxy introduces nftables as the new backend for kube-proxy.
The subsequent goal of this feature is to make nftables the default backend for kube-proxy (replacing iptables and ipvs).
Administrators must enable the feature gate NFTablesProxyMode to make this feature available, and then run kube-proxy with the --proxy-mode=nftables flag.
It should be noted that although the nftables mode may be very similar to the iptables mode, certain CNI plugins, NetworkPolicy implementations, etc. may need to be updated to use it. This may cause some compatibility issues.
[Storage] KEP-2495: PV/PVC ReadWriteOncePod Access Mode (GA)¶
In Kubernetes, the access mode defines how persistent storage is used. These access modes are part of the PersistentVolume (PV) and PersistentVolumeClaim (PVC) specifications. ReadWriteOncePod was introduced as the fourth access mode (the previous three access modes are ReadWriteOnce, ReadOnlyMany, ReadWriteMany) in Kubernetes v1.22 as an Alpha feature and has reached GA in Kubernetes v1.29. The main purpose of this feature is to provide more flexible access control for storage resources, ensuring that only one Pod can access the storage in read-write mode. This restriction is very useful for certain specific use cases, such as ensuring that only one Pod can modify the data in the storage to prevent data conflicts or corruption.
This feature also updates the Kubernetes scheduler to support pod preemption related to ReadWriteOncePod storage. This means that if two Pods request the PersistentVolumeClaim with ReadWriteOncePod , the Pod with the highest priority will be granted access to the PersistentVolumeClaim , and any lower priority pods will be preempted from the node and will not be able to access the PersistentVolumeClaim .
The following example creates a new PVC using the ReadWriteOncePod access mode:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: single-writer-only
spec:
accessModes:
- ReadWriteOncePod
resources:
requests:
storage: 1Gi
By introducing the ReadWriteOncePod feature, Kubernetes allows users to better control access permissions to storage resources and improves the flexibility of managing storage in containerized environments.
[Auth] KEP-3299: KMS v2 Enhancement (GA)¶
One of the first things to consider when securing a Kubernetes cluster is encrypting static etcd data. KMS provides a way for providers to leverage keys stored in an external key service to perform this encryption.
Kubernetes KMS (Key Management Service) is crucial for the secure management and encryption of secrets. With the release of Kubernetes v1.29, the KMSv2 features with feature gates KMSv2 and KMSv2KDF have been upgraded to GA, and the KMSv1 feature gate is now disabled by default.
This has become a stable feature that focuses on improving the KMS plugin framework, which is crucial for secure management. These improvements ensure that Kubernetes secrets remain a powerful and secure way to store sensitive information.
[Node] KEP-753: Native Support for Sidecar Containers (Beta)¶
Native support for Sidecar containers was introduced as Alpha in Kubernetes v1.28 and upgraded to Beta in v1.29, with the SidecarContainers feature gate enabled by default.
Starting from Kubernetes v1.29, if your Pod contains one or more sidecar containers (init containers with an always restart policy), Kubelet will delay sending TERM signals to these Sidecar containers until the last main container is fully terminated. Sidecar containers will be terminated in the opposite order defined in the Pod spec. This ensures that Sidecar containers continue to serve other containers in the Pod until they are no longer needed.
[Auth/Apps] KEP-2799: Reduce Secret-based Service Account Tokens (Beta)¶
The legacy-service-account-token-cleaner controller, which runs as part of the kube-controller-manager, now has Beta support for the LegacyServiceAccountTokenCleanUp feature gate (enabled by default). The legacy-service-account-token-cleaner controller periodically deletes unused service account token keys that have not been used within the time specified by --legacy-service-account-token-clean-up-period (default is one year). The controller marks the Secret as invalid by adding a label named kubernetes.io/legacy-token-invalid-since with the current date as the value. If the invalid Secret is not used within the specified time, the controller deletes it.
The following is an example of an old token that has been marked with kubernetes.io/legacy-token-last-used and kubernetes.io/legacy-token-invalid-since labels:
apiVersion: v1
kind: Secret
metadata:
name: build-robot-secret
namespace: default
labels:
kubernetes.io/legacy-token-last-used: 2022-10-24
kubernetes.io/legacy-token-invalid-since: 2023-10-25
annotations:
kubernetes.io/service-account.name: build-robot
type: kubernetes.io/service-account-token
[Windows] KEP-1287: Pod Resource In-place Update for Windows (Alpha)¶
In Kubernetes v1.29, Windows support for Pod resource in-place update is introduced as Alpha, allowing resources to be changed without recreating Pods or restarting containers.
[Networking] KEP-1880: Dynamic Extension of IP Range for Service (Alpha)¶
Service is an abstraction that exposes applications running on a set of Pods. Services can have cluster-wide virtual IP addresses that are allocated from a pre-defined CIDR set in the kube-apiserver flag. However, users may want to add, remove, or adjust the existing IP ranges allocated for services without restarting the kube-apiserver.
This feature allows cluster administrators to dynamically adjust the size of the IP range allocated for services in the cluster using the ServiceCIDR object.
During the bootstrap installation, this feature creates a default ServiceCIDR object named kubernetes based on the value of the --service-cluster-ip-range command-line parameter of kube-apiserver .
To use this feature, users need to enable the MultiCIDRServiceAllocator feature gate and the networking.k8s.io/v1alpha1 API, and create or delete new ServiceCIDR objects to manage the available IP ranges for services.
Here is an example:
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1alpha1
kind: ServiceCIDR
metadata:
name: newservicecidr
spec:
cidrs:
- 10.96.0.0/24
EOF
[Scheduler] KEP-3633: Introduce MatchLabelKeys/MismatchLabelKeys to Pod Affinity and Pod Anti-Affinity (Alpha)¶
In Kubernetes v1.29, an enhancement feature is introduced as Alpha in PodAffinity/PodAntiAffinity. It improves the accuracy of calculations during rolling updates. This feature introduces an additional field MatchLabelKeys to PodAffinityTerm , which allows users to have finer control over the scope of PodAffinity or PodAntiAffinity on top of existing LabelSelector .
MatchLabelKeys/MismatchLabelKey is a set of Pod label keys used to select which Pods to consider. The key is used to look for the value from the incoming Pod labels. The Pod key-value labels obtained by MatchLabelKeys are combined with LabelSelector as key in (value) , and the Pod key-value obtained by MismatchLabelKeys is combined with LabelSelector as key notin (value) , to select existing Pod groups that will be considered when a Pod is incoming (anti-)affinity. Keys that do not exist in the incoming Pod labels will be ignored. The default value is an empty set. It is not allowed to have the same key in both MatchLabelKeys/MismatchLabelKey and LabelSelector at the same time. Additionally, if LabelSelector is not set, MatchLabelKeys/MatchLabelKeys cannot be set. This is an Alpha field and requires the feature gate MatchLabelKeysInPodAffinity to be enabled.
With MatchLabelSelector set, the scheduling of newly created Pods will be affected. All existing Pods will not be affected.
For example, with matchLabelKeys , Pods that have key-value pairs of app:database and pod-template-hash key with the same value will be scheduled to the same node.
apiVersion: apps/v1
kind: Deployment
metadata:
name: application-server
---
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- database
topologyKey: topology.kubernetes.io/zone
matchLabelKeys:
- pod-template-hash
[Scheduler] KEP-4247: QueueingHint Enables New Possibilities for Optimizing Pod Scheduling (Beta)¶
The throughput of the scheduler has been a perennial challenge for the Kubernetes project, and SIG Scheduling has been working hard to improve scheduling throughput with many enhancements.
The QueueingHint feature brings new possibilities for optimizing rescheduling efficiency, which can significantly reduce useless scheduling retries.
In v1.28, only one alpha plugin (DRA) supports QueueingHint, and in v1.29, some stable plugins begin to implement QueueingHints.
QueueingHint has been in Beta since its introduction in v1.28 and skips the Alpha phase, enabled by default. It is aimed at developers rather than users and can be controlled by the SchedulerQueueingHints feature gate. Because QueueingHint changes the critical path of the scheduler and adds some memory overhead, it depends on the busyness of the cluster.
[Instrumentation] KEP-727: Kubelet Resource Metrics (GA)¶
Kubelet exposes a /metrics/resource endpoint in Prometheus text exposition format using the Prometheus client library. It provides the metrics required by the cluster-level resource metrics API to replace the metrics provided by the Summary API, which are complex and numerous.
In Kubernetes v1.29, the following kubelet resource metrics have been upgraded to GA:
- container_cpu_usage_seconds_total
- container_memory_working_set_bytes
- container_start_time_seconds
- node_cpu_usage_seconds_total
- node_memory_working_set_bytes
- pod_cpu_usage_seconds_total
- pod_memory_working_set_bytes
- resource_scrape_error The deprecated (renamed) metric scrape_error has been renamed to resource_scrape_error
[Instrumentation] KEP-3466: Kubernetes Component Healthcheck SLIs (GA)¶
The metrics provided by the ComponentSLIs feature gate are now GA and unconditionally enabled at the /metrics/slis endpoint. The feature gate will be removed in 1.31.
Accessing the /metrics/slis endpoint returns data like the following:
# HELP kubernetes_healthcheck [ALPHA] This metric records the result of a single healthcheck.
# TYPE kubernetes_healthcheck gauge
kubernetes_healthcheck{name="autoregister-completion",type="healthz"} 1
kubernetes_healthcheck{name="autoregister-completion",type="readyz"} 1
kubernetes_healthcheck{name="etcd",type="healthz"} 1
kubernetes_healthcheck{name="etcd",type="readyz"} 1
kubernetes_healthcheck{name="etcd-readiness",type="readyz"} 1
kubernetes_healthcheck{name="informer-sync",type="readyz"} 1
kubernetes_healthcheck{name="log",type="healthz"} 1
kubernetes_healthcheck{name="log",type="readyz"} 1
kubernetes_healthcheck{name="ping",type="healthz"} 1
kubernetes_healthcheck{name="ping",type="readyz"} 1
# HELP kubernetes_healthchecks_total [ALPHA] This metric records the results of all healthchecks.
# TYPE kubernetes_healthchecks_total counter
kubernetes_healthchecks_total{name="autoregister-completion",status="error",type="readyz"} 1
kubernetes_healthchecks_total{name="autoregister-completion",status="success",type="healthz"} 15
kubernetes_healthchecks_total{name="autoregister-completion",status="success",type="readyz"} 14
kubernetes_healthchecks_total{name="etcd",status="success",type="healthz"} 15
kubernetes_healthchecks_total{name="etcd",status="success",type="readyz"} 15
kubernetes_healthchecks_total{name="etcd-readiness",status="success",type="readyz"} 15
kubernetes_healthchecks_total{name="informer-sync",status="error",type="readyz"} 1
kubernetes_healthchecks_total{name="informer-sync",status="success",type="readyz"} 14
kubernetes_healthchecks_total{name="log",status="success",type="healthz"} 15
kubernetes_healthchecks_total{name="log",status="success",type="readyz"} 15
kubernetes_healthchecks_total{name="ping",status="success",type="healthz"} 15
kubernetes_healthchecks_total{name="ping",status="success",type="readyz"} 15
[Storage] KEP-3107: NodeExpandSecret Added to CSI Persistent Volume Source (GA)¶
The NodeExpandSecret feature has been moved to GA in v1.29. This feature adds NodeExpandSecret to CSI persistent volume sources and allows CSI clients to send it as part of NodeExpandVolume requests to the CSI driver.
[Storage] KEP-3751: VolumeAttributesClass (Alpha)¶
The VolumeAttributesClass feature was introduced in v1.29 and is now in the Alpha stage, not enabled by default. This feature extends the Kubernetes Persistent Volume API to allow users to change volume attributes (e.g., IOPS or throughput) after they are provisioned.
It should be noted that in v1.29, the VolumeAttributesClass feature may only contain API changes and its functionality has not been implemented yet. The implementation is in external-provisioner and external-resizer , which will be released asynchronously after the release of Kubernetes v1.29.
CSI corresponding to the standard version is 1.9, and controller plugins for CSI implementations from various vendors must support the MODIFY_VOLUME capability.
[Instrumentation] KEP-3077: kube-scheduler Converted to Contextual Logging (Alpha)¶
The contextual logging feature introduced in Kubernetes v1.24 has now been successfully migrated to two components (kube-scheduler and kube-controller-manager) and some directories. This feature is aimed at providing Kubernetes with more useful logs for better troubleshooting and problem tracking. Currently, this feature is in the Alpha stage, and it can be enabled by setting the ContextualLogging feature gate.
Example:
I1113 08:43:37.029524 87144 default_binder.go:53] "Attempting to bind pod to node" logger="Bind.DefaultBinder" pod="kube-system/coredns-69cbfb9798-ms4pq" node="127.0.0.1"
[Node] KEP-127: Enable Pod Security Standards Support for User Namespaces (Alpha)¶
The UserNamespacesPodSecurityStandards feature gate has been added to enable support for Pod Security Standards in user namespaces. Enabling this feature will modify all Pod Security Standards rules to allow the setting of spec[.*].securityContext.[runAsNonRoot,runAsUser] . This feature should only be enabled if all nodes in the cluster support and enable user namespaces. The feature gate will not be upgraded or enabled by default in future Kubernetes versions.
[Node] KEP-4191: Kubelet Supports Image Filesystem Separation (Alpha)¶
Kubelet supports the separation of the image filesystem in v1.29 as Alpha, controlled by the KubeletSeparateDiskGC feature gate, which is not enabled by default. Kubelet can support separating the ImageFilesystem into writable and read-only layers, with the writable layer on the same disk as Kubelet and the images potentially on a separate filesystem.
[Node] KEP-4210: Add Support for ImageMaximumGCAge Field (Alpha)¶
The ImageMaximumGCAge field has been added to the Kubelet configuration, which allows users to set the maximum period of time that an image can be unused before being garbage collected. The field is controlled by the ImageMaximumGCAge feature gate, which is currently in Alpha and not enabled by default.
[Auth] KEP-4193: Enhanced Binding Service Account Tokens (Alpha)¶
Kubernetes v1.29 introduces 4 feature gates to enhance binding service account tokens. The kube-apiserver now adds support for the Alpha version by using the ServiceAccountTokenJTI feature gate to add the jti (JWT ID) claim to the service account tokens it issues. In addition, when the token is issued, the audit logs will add the authentication.kubernetes.io/credential-id audit annotation, and when the token is used for authentication, the authentication.kubernetes.io/credential-id entry will be added to the additional user info.
The kube-apiserver now adds support for the Alpha version by using the ServiceAccountTokenPodNodeInfo feature gate to add the node name (and the UID of the node if it exists) as additional claims in the service account tokens. These tokens are bound to Pods and when used for authentication, the authentication.kubernetes.io/node-name and authentication.kubernetes.io/node-uid will be provided as additional user info.
The kube-apiserver now adds support for the Alpha version by using the ServiceAccountTokenNodeBinding feature gate to allow tokens to be directly bound to nodes by TokenRequests and validated when used (through the ServiceAccountTokenNodeBindingValidation feature gate) to ensure that the node name and UID are still present.
Other Notices¶
- [Node] Fixed the issue of sysctl validation for Pod specs: Pods with hostNetwork are prohibited from configuring sysctl for network namespaces, and Pods with hostIPC are prohibited from configuring sysctl for IPC namespaces.
- [Scheduler] The selectedSpread plugin in kube-scheduler has been removed, please use the podTopologySpread plugin instead.
- [Scheduler] Fixed a regression issue in the scheduler framework since v1.27.0 when running score plugins. The number of SkippedScorePlugins may be greater than enabledScorePlugins , so the value of cap(len(enabledScorePlugins) - len(skippedScorePlugins)) in the initialization of the slice may be negative, which is not allowed.
- [Scheduler] Added scheduling hints to the QueueingHint plugin in kube-scheduler.
- [Storage] Contributed to the design of the VolumeAttributesClass API.
- [Instrumentation] Fully converted kube-scheduler to contextual logging.
- [Networking] Promoted PodHostIPs condition to Beta.
- [Kubeadm] Allowed deployment of kubelet that is 3 versions older than kubeadm (N-3). This aligns with recent changes made by SIG Architecture that extend the support skew between the control plane and kubelet.
- [Kubeadm] Fixed the invalid namespace field of clusterrole .
- [Test] Updated test framework methods and added tests to increase test coverage.
In the v1.29 release cycle, DaoCloud has contributed to sig-node, sig-scheduling, sig-storage, sig-instrumentation, sig-network, and kubeadm related contents, including but not limited to:
- [Node] Fixed the issue of sysctl validation for Pod specs.
- [Scheduler] Removed the selectedSpread plugin in kube-scheduler and replaced it with the podTopologySpread plugin.
- [Scheduler] Fixed a regression issue in the scheduler framework since v1.27.0.
- [Scheduler] Added scheduling hints to the QueueingHint plugin in kube-scheduler.
- [Storage] Participated in the design of the VolumeAttributesClass API.
- [Instrumentation] Fully converted kube-scheduler to contextual logging.
- [Networking] Promoted PodHostIPs condition to Beta.
- [Kubeadm] Allowed deployment of kubelet that is 3 versions older than kubeadm (N-3).
- [Kubeadm] Fixed the invalid namespace field of clusterrole .
- [Test] Updated the test framework methods and increased test coverage.
In the v1.29 release process, DaoCloud participated in over 100 issue fixes and feature developments, with approximately 111 commits from the author. For more details, please refer to the contribution list (17 of the over 200 contributors in this version are from DaoCloud).
During the Kubernetes v1.29 release cycle, several R&D engineers from DaoCloud achieved notable accomplishments. Among them, Paco became the first Chinese member to be selected for the Kubernetes Steering Committee. Cai Wei became the CNCF Fall 2023 Cloud Native Global Ambassador, as well as the organizer and host of KCD Shenzhen 2023. Liu Mengjiao became the WG Structured Logging Lead and a Kubernetes/klog Reviewer. Guo Qifeng became a Calico Big Cat Ambassador.
Upgrade Considerations¶
Serious Issues with EventedPLEG¶
The EventedPLEG feature (event-driven PLEG) was upgraded to Beta in Kubernetes v1.27, but it is not enabled by default. Serious issues were discovered during testing in v1.29, so it is recommended not to enable it! The community is currently investigating and fixing the issue, but the specific cause has not yet been identified.
Removal of in-tree Cloud Providers Upgraded to Beta¶
The removal of in-tree cloud providers in Kubernetes v1.29 has been upgraded to Beta. Users need to pay attention to the feature gates DisableCloudProviders and DisableKubeletCloudCredentialProvider , which are now set to true by default. This means that there is no built-in integration with any cloud providers (such as Azure, GCE, vSphere) in the default runtime. If you still need this feature, please set the feature gates DisableCloudProviders and DisableKubeletCloudCredentialProvider to false or use an external cloud controller manager.
For more information on how to enable and run an external cloud controller manager, please refer to the Running Cloud Controller Manager documentation.
If you need a cloud controller manager for old in-tree cloud providers, please refer to the following links:
- Cloud provider AWS
- Cloud provider Azure
- Cloud provider GCE
- Cloud provider OpenStack
- Cloud provider vSphere
Freezing of Kubernetes Legacy Package Repositories¶
The Kubernetes legacy package repositories (apt.kubernetes.io and yum.kubernetes.io) were frozen on September 13, 2023. Starting from Kubernetes v1.29, packages will only be released to the community-owned repository (pkgs.k8s.io). The deprecated old repositories and their contents are expected to be removed in January 2024. The Kubernetes project strongly recommends migrating to the new community-owned repositories as soon as possible!
For more details and instructions on how to migrate, please refer to the Legacy Package Repository Deprecation blog post.
Other Changes to Note¶
- Removal of the ClusterCIDR Alpha API for networking: The SIG has raised reasonable doubts about this feature, and after months of community discussions, the existing code still in Alpha has been removed.
- Deprecation of the status.nodeInfo.kubeProxyVersion field for Nodes: This field is inaccurate as it is set by kubelet, which actually doesn't know the kube-proxy version or whether kube-proxy is running.
- Deprecation of the flowcontrol.apiserver.k8s.io/v1beta2 API version for FlowSchema and PriorityLevelConfiguration , and the upgrade of flowcontrol.apiserver.k8s.io/v1beta3 to flowcontrol.apiserver.k8s.io/v1 . If your manifests or client software use the deprecated Beta API group, you should update them before upgrading to v1.29.
Version Sign¶
The theme of this release is: Mandala ✨
Embark on a journey through the universe of Kubernetes v1.29 with us!
This version is inspired by the beautiful art form of Mandala, which is a symbol of the perfect expression of the universe. In this release, there were approximately 40 release team members supported by hundreds of community contributors who worked hard to turn challenges into joy for millions of people around the world. The Mandala theme reflects the interconnected and vibrant nature of the community. Each contributor is an essential part, adding unique energy like the diverse patterns in Mandala art. Kubernetes grows and thrives in collaboration, responding to the harmony in Mandala creation.
The release sign was created by Mario Jason Braganza (based on Mandala art, thanks to Fibrel Ojalá), symbolizing the mini-universe of the Kubernetes project and all its members.
In the spirit of the transformative symbol of Mandala, Kubernetes v1.29 celebrates the evolution of our project. Like stars in the Kubernetes universe, every contributor, user, and supporter illuminates the path. Together, we have created a world of possibilities.
Historical Documents¶
- Kubernetes 1.28 Shocking Release, Sidecar Containers Coming
- The Most Feature-Packed in Nearly Two Years! Kubernetes 1.27 Officially Released
- Kubernetes Officially Releases v1.26, Significant Stability Improvements
- Kubernetes 1.25 Officially Released, Major Breakthroughs in Multiple Aspects
- Kubernetes 1.24 Towards a Mature Kubernetes
- Kubernetes 1.23 Officially Released, What Enhancements Are There?
- Kubernetes 1.22 Breaks Your Imagination: Swap Can Be Enabled, PSP Replacement Solution Released, and More
- Kubernetes 1.21 Shocking Release | PSP to Be Deprecated, BareMetal Gets Enhanced
References¶
- Kubernetes Enhancement Proposals https://kep.k8s.io/
- Kubernetes 1.29 Release Team https://github.com/kubernetes/sig-release/blob/master/releases/release-1.29
- Kubernetes 1.29 Changelog https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.29.md
- Kubernetes 1.29 Theme Discussion https://github.com/kubernetes/sig-release/discussions/2349