Commit 106bcbd

Update autoscaling from zero enhancement proposal with support for platform-aware autoscale from zero
This commit updates the contract between the cluster-autoscaler Cluster API provider and the infrastructure provider controllers that reconcile the Infrastructure Machine Template, so that autoscale from zero becomes platform-aware in clusters whose nodes differ in CPU architecture or operating system.

With this commit, infrastructure providers that reconcile the status of their Infrastructure Machine Templates to support autoscale from zero can fill the `status.nodeInfo` stanza with additional information about the nodes. The stanza has type `corev1.NodeSystemInfo`, mirroring the content that the rendered Node objects store in their own status field.

The cluster autoscaler can use that information, when present, to build the node template labels `kubernetes.io/arch` and `kubernetes.io/os`. If the pending pods that trigger the cluster autoscaler have a node selector or a `requiredDuringSchedulingIgnoredDuringExecution` node affinity on the architecture or operating system of the nodes where they can run, the autoscaler can filter the candidate node groups accordingly.

Users could already provide this information to the cluster autoscaler through the labels capacity annotation. However, there is no similar capability to support future labels or taints through information set by the reconcilers of the Infrastructure Machine Template status.
1 parent ccaea78 commit 106bcbd

1 file changed: 20 additions, 3 deletions
docs/proposals/20210310-opt-in-autoscaling-from-zero.md

@@ -107,8 +107,8 @@ node group. But, during a scale from zero situation (ie when a node group has ze
 autoscaler needs to acquire this information from the infrastructure provider.
 
 An optional status field is proposed on the Infrastructure Machine Template which will be populated
-by infrastructure providers to contain the CPU, memory, and GPU capacities for machines described by that
-template. The cluster autoscaler will then utilize this information by reading the appropriate
+by infrastructure providers to contain the CPU, CPU architecture, memory, and GPU capacities for machines
+described by that template. The cluster autoscaler will then utilize this information by reading the appropriate
 infrastructure reference from the resource it is scaling (MachineSet or MachineDeployment).
 
 A user may override the field in the associated infrastructure template by applying annotations to the
@@ -160,6 +160,10 @@ the template. Internally, this field will be represented by a Go `map` type uti
 for the keys and `k8s.io/apimachinery/pkg/api/resource.Quantity` as the values (similar to how resource
 limits and requests are handled for pods).
 
+Additionally, the status field could contain information about the node, such as the architecture and
+operating system. This information is not required for the autoscaler to function, but it can be useful in
+scenarios where the autoscaler needs to make decisions for clusters with heterogeneous node groups in architecture, OS, or both.
+
 It is worth mentioning that the Infrastructure Machine Templates are not usually reconciled by themselves.
 Each infrastructure provider will be responsible for determining the best implementation for adding the
 status field based on the information available on their platform.
@@ -175,6 +179,7 @@ const (
 // DockerMachineTemplateStatus defines the observed state of a DockerMachineTemplate
 type DockerMachineTemplateStatus struct {
 	Capacity corev1.ResourceList `json:"capacity,omitempty"`
+	NodeInfo *corev1.NodeSystemInfo `json:"nodeInfo,omitempty"`
 }
 
 // DockerMachineTemplate is the Schema for the dockermachinetemplates API.
@@ -186,7 +191,7 @@ type DockerMachineTemplate struct {
 	Status DockerMachineTemplateStatus `json:"status,omitempty"`
 }
 ```
-_Note: the `ResourceList` and `ResourceName` referenced are from k8s.io/api/core/v1`_
+_Note: the `ResourceList`, `ResourceName` and `NodeSystemInfo` referenced are from k8s.io/api/core/v1`_
 
 When used as a manifest, it would look like this:
 
@@ -204,8 +209,16 @@ status:
     memory: 500mb
     cpu: "1"
     nvidia.com/gpu: "1"
+  nodeInfo:
+    architecture: arm64
+    operatingSystem: linux
 ```
 
+The information stored in the `status.nodeInfo` field is rendered as labels for the node object that is created within
+the node group by the cluster autoscaler and fed into the cluster autoscaler's scheduler simulator `framework.NodeInfo` struct.
+In particular, the `architecture` and `operatingSystem` fields are used to determine the simulated node's labels
+`kubernetes.io/arch` and `kubernetes.io/os`. This logic will be implemented in the cluster autoscaler's ClusterAPI cloud provider code.
+
 #### MachineSet and MachineDeployment Annotations
 
 In cases where a user needs to provide specific resource information for a
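
Reviewer sketch for the hunk above: the mapping from `status.nodeInfo` to the simulated node's labels, and the filtering effect described in the commit message, could look roughly like the following. This is a minimal illustration, not the cluster autoscaler's code. The local `NodeSystemInfo` struct stands in for `corev1.NodeSystemInfo` so the example compiles with the standard library alone, and `nodeInfoLabels` and `selectorMatches` are hypothetical helper names.

```go
package main

import "fmt"

// NodeSystemInfo is a local stand-in for the two corev1.NodeSystemInfo
// fields this sketch needs; real code would import k8s.io/api/core/v1.
type NodeSystemInfo struct {
	Architecture    string
	OperatingSystem string
}

const (
	labelArch = "kubernetes.io/arch"
	labelOS   = "kubernetes.io/os"
)

// nodeInfoLabels (hypothetical helper) derives node template labels from
// status.nodeInfo. A nil stanza or an empty field produces no label, since
// the proposal treats this information as optional.
func nodeInfoLabels(info *NodeSystemInfo) map[string]string {
	labels := map[string]string{}
	if info == nil {
		return labels
	}
	if info.Architecture != "" {
		labels[labelArch] = info.Architecture
	}
	if info.OperatingSystem != "" {
		labels[labelOS] = info.OperatingSystem
	}
	return labels
}

// selectorMatches (hypothetical helper) reports whether every key/value pair
// of a pod's nodeSelector is present in the node template labels.
func selectorMatches(selector, labels map[string]string) bool {
	for key, want := range selector {
		got, ok := labels[key]
		if !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	labels := nodeInfoLabels(&NodeSystemInfo{Architecture: "arm64", OperatingSystem: "linux"})
	fmt.Println(labels[labelArch], labels[labelOS])

	// A pod pinned to amd64 should not trigger a scale-up of this arm64 node group.
	fmt.Println(selectorMatches(map[string]string{labelArch: "amd64"}, labels))
}
```

Returning an empty map for a missing stanza keeps providers that do not populate `status.nodeInfo` on the pre-existing code path.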
@@ -229,6 +242,8 @@ metadata:
     capacity.cluster-autoscaler.kubernetes.io/memory: "500mb"
     capacity.cluster-autoscaler.kubernetes.io/cpu: "1"
     capacity.cluster-autoscaler.kubernetes.io/ephemeral-disk: "100Gi"
+    node-info.cluster-autoscaler.kubernetes.io/cpu-architecture: "arm64"
+    node-info.cluster-autoscaler.kubernetes.io/os: "linux"
 ```
 _Note: the annotations will be defined in the cluster autoscaler, not in cluster-api._
 
@@ -246,6 +261,8 @@ metadata:
     capacity.cluster-autoscaler.kubernetes.io/taints: "key1=value1:NoSchedule,key2=value2:NoExecute"
 ```
 
+If the `capacity.cluster-autoscaler.kubernetes.io/labels` annotation specifies a label that would otherwise be generated from the `capacity` or `node-info` annotations, the autoscaler will use the label defined in `capacity.cluster-autoscaler.kubernetes.io/labels`, overriding any labels produced by processing the other annotations.
+
 ### Security Model
 
 This feature will require the service account associated with the cluster autoscaler to have
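
Reviewer sketch for the precedence rule added in this hunk: labels given explicitly through `capacity.cluster-autoscaler.kubernetes.io/labels` win over labels generated from the `capacity` or `node-info` annotations. The example below assumes the labels annotation is a comma-separated list of `key=value` pairs, by analogy with the taints annotation shown above. `parseLabels` and `mergeLabels` are hypothetical names, not functions from the autoscaler.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLabels (hypothetical helper) parses an annotation value of the assumed
// form "key1=value1,key2=value2". Entries without "=" or with an empty key
// are skipped.
func parseLabels(annotation string) map[string]string {
	out := map[string]string{}
	for _, pair := range strings.Split(annotation, ",") {
		key, value, ok := strings.Cut(strings.TrimSpace(pair), "=")
		if !ok || key == "" {
			continue
		}
		out[key] = value
	}
	return out
}

// mergeLabels (hypothetical helper) copies the generated labels first and the
// explicit labels second, so an explicit label overrides a generated one
// with the same key.
func mergeLabels(generated, explicit map[string]string) map[string]string {
	merged := make(map[string]string, len(generated)+len(explicit))
	for key, value := range generated {
		merged[key] = value
	}
	for key, value := range explicit {
		merged[key] = value
	}
	return merged
}

func main() {
	// Labels as they might be generated from the node-info annotations.
	generated := map[string]string{
		"kubernetes.io/arch": "arm64",
		"kubernetes.io/os":   "linux",
	}
	// The user-supplied labels annotation overrides the architecture.
	explicit := parseLabels("kubernetes.io/arch=amd64,tier=batch")
	fmt.Println(mergeLabels(generated, explicit))
}
```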
