Commit dec049c

Kalyan Reddy Daida authored and committed
Welcome to Stack Simplify
1 parent f94f449 commit dec049c

2 files changed, +197 -0 lines changed
@@ -0,0 +1,156 @@
# EKS - Cluster Autoscaler

## Step-01: Introduction
- The Kubernetes Cluster Autoscaler automatically adjusts the number of nodes in your cluster. It adds nodes when pods fail to launch due to a lack of resources, and removes nodes when they are underutilized and their pods can be rescheduled onto other nodes in the cluster.

## Step-02: Verify that our NodeGroup has --asg-access
- We need to ensure that the `--asg-access` flag was passed to `eksctl` when the cluster or node group was created.
- Verify this against the command we used to create our cluster node group.

### What happens if we use the --asg-access flag?
- It enables the IAM policy for cluster-autoscaler.
- Let's review our node group IAM role to confirm.
- Go to Services -> IAM -> Roles -> eksctl-eksdemo1-nodegroup-XXXXXX
- Click on the **Permissions** tab
- You should see an inline policy named `eksctl-eksdemo1-nodegroup-eksdemo1-ng-private1-PolicyAutoScaling` in the list of policies associated with this role.

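
The same check can also be done from the command line instead of the console. This is only a minimal sketch, assuming the AWS CLI is installed and configured; `has_autoscaling_policy` is a hypothetical helper name (not part of eksctl or the AWS CLI), and the role name must be replaced with the real one for your node group.

```shell
# Hypothetical helper (not part of eksctl or the AWS CLI): reads policy names
# from stdin, one per line, and succeeds if one ends with "PolicyAutoScaling".
has_autoscaling_policy() {
  grep -q 'PolicyAutoScaling$'
}

# Example usage (assumption: replace the role name with your own):
#   aws iam list-role-policies --role-name eksctl-eksdemo1-nodegroup-XXXXXX \
#     --query 'PolicyNames[]' --output text | tr '\t' '\n' | has_autoscaling_policy \
#     && echo "autoscaling policy present"
```
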
## Step-03: Deploy Cluster Autoscaler
```
# Deploy the Cluster Autoscaler to your cluster
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

# Add the cluster-autoscaler.kubernetes.io/safe-to-evict annotation to the deployment
kubectl -n kube-system annotate deployment.apps/cluster-autoscaler cluster-autoscaler.kubernetes.io/safe-to-evict="false"
```
## Step-04: Edit Cluster Autoscaler Deployment to add Cluster name and two more parameters
```
kubectl -n kube-system edit deployment.apps/cluster-autoscaler
```
- **Add cluster name**
```yml
# Before Change
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>

# After Change
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/eksdemo1
```

- **Add two more parameters**
```yml
        - --balance-similar-node-groups
        - --skip-nodes-with-system-pods=false
```
- **Sample for reference**
```yml
    spec:
      containers:
      - command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/eksdemo1
        - --balance-similar-node-groups
        - --skip-nodes-with-system-pods=false
```

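
If you would rather not edit the deployment interactively, the cluster name placeholder can be filled in before the manifest is applied. The following is a minimal sketch, assuming the manifest from Step-03 was first downloaded to a local file; `set_cluster_name` is a hypothetical helper, and it assumes the cluster name contains no `/` or `&` characters, which are special to `sed`.

```shell
# Hypothetical helper: replaces the <YOUR CLUSTER NAME> placeholder on stdin
# with the cluster name given as the first argument.
set_cluster_name() {
  sed "s/<YOUR CLUSTER NAME>/$1/g"
}

# Example usage (assumption: manifest saved locally as cluster-autoscaler-autodiscover.yaml):
#   set_cluster_name eksdemo1 < cluster-autoscaler-autodiscover.yaml | kubectl apply -f -
```

This only covers the cluster name; the two extra parameters still have to be added to the container command separately.
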
## Step-05: Set the Cluster Autoscaler image to match our current EKS Cluster version
- Open https://github.com/kubernetes/autoscaler/releases
- Find the Cluster Autoscaler release whose major and minor version match our cluster version (example: 1.16.n) and use that release.
- Our cluster version is 1.16, and the matching Cluster Autoscaler version on the releases page above is 1.16.5.
```
# Template: replace v1.XY.Z with the release that matches your cluster version
kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.XY.Z

# Update Cluster Autoscaler image version for our 1.16 cluster
kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.16.5
```

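
The version-matching rule in this step can be expressed as a small check. This is a sketch only: `minor_of` and `same_minor` are hypothetical helper names, and the versions are passed in by hand rather than read from the cluster.

```shell
# Hypothetical helper: prints the major.minor part of a version string,
# accepting an optional leading "v" (for example v1.16.5 gives 1.16).
minor_of() {
  printf '%s\n' "$1" | sed -E 's/^v?([0-9]+\.[0-9]+).*/\1/'
}

# Hypothetical helper: succeeds when both versions share the same major.minor.
same_minor() {
  [ "$(minor_of "$1")" = "$(minor_of "$2")" ]
}

# Example usage with the versions from this step:
#   same_minor 1.16 v1.16.5 && echo "versions match"
```
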
## Step-06: Verify that the image version was updated
```
kubectl -n kube-system get deployment.apps/cluster-autoscaler -o yaml
```
- **Sample partial output**
```yml
    spec:
      containers:
      - command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/eksdemo1
        - --balance-similar-node-groups
        - --skip-nodes-with-system-pods=false
        image: us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.16.5
```

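
Instead of reading through the full YAML, the image can be printed on its own with a JSONPath query. A minimal sketch, assuming the autoscaler is the first container in the pod template; `image_tag` is a hypothetical helper that just strips everything up to the last `:`.

```shell
# Hypothetical helper: prints the tag portion of an image reference
# (everything after the last ":").
image_tag() {
  printf '%s\n' "${1##*:}"
}

# Example usage (assumption: cluster-autoscaler is the first container in the pod template):
#   image=$(kubectl -n kube-system get deployment.apps/cluster-autoscaler \
#     -o jsonpath='{.spec.template.spec.containers[0].image}')
#   image_tag "$image"
```
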
## Step-07: View Cluster Autoscaler logs to verify that it is monitoring your cluster load
```
kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler
```
- Sample log reference
```log
I0607 09:14:37.793323 1 pre_filtering_processor.go:66] Skipping ip-192-168-60-30.ec2.internal - node group min size reached
I0607 09:14:37.793332 1 pre_filtering_processor.go:66] Skipping ip-192-168-27-213.ec2.internal - node group min size reached
I0607 09:14:37.793408 1 static_autoscaler.go:440] Scale down status: unneededOnly=true lastScaleUpTime=2020-06-07 09:12:27.367461648 +0000 UTC m=+37.138078060 lastScaleDownDeleteTime=2020-06-07 09:12:27.367461724 +0000 UTC m=+37.138078135 lastScaleDownFailTime=2020-06-07 09:12:27.367461801 +0000 UTC m=+37.138078213 scaleDownForbidden=false isDeleteInProgress=false scaleDownInCooldown=true
I0607 09:14:47.803891 1 static_autoscaler.go:192] Starting main loop
I0607 09:14:47.804234 1 utils.go:590] No pod using affinity / antiaffinity found in cluster, disabling affinity predicate for this loop
I0607 09:14:47.804251 1 filter_out_schedulable.go:65] Filtering out schedulables
I0607 09:14:47.804319 1 filter_out_schedulable.go:130] 0 other pods marked as unschedulable can be scheduled.
I0607 09:14:47.804343 1 filter_out_schedulable.go:130] 0 other pods marked as unschedulable can be scheduled.
I0607 09:14:47.804351 1 filter_out_schedulable.go:90] No schedulable pods
I0607 09:14:47.804366 1 static_autoscaler.go:334] No unschedulable pods
I0607 09:14:47.804376 1 static_autoscaler.go:381] Calculating unneeded nodes
I0607 09:14:47.804392 1 pre_filtering_processor.go:66] Skipping ip-192-168-60-30.ec2.internal - node group min size reached
I0607 09:14:47.804401 1 pre_filtering_processor.go:66] Skipping ip-192-168-27-213.ec2.internal - node group min size reached
I0607 09:14:47.804460 1 static_autoscaler.go:440] Scale down status: unneededOnly=true lastScaleUpTime=2020-06-07 09:12:27.367461648 +0000 UTC m=+37.138078060 lastScaleDownDeleteTime=2020-06-07 09:12:27.367461724 +0000 UTC m=+37.138078135 lastScaleDownFailTime=2020-06-07 09:12:27.367461801 +0000 UTC m=+37.138078213 scaleDownForbidden=false isDeleteInProgress=false scaleDownInCooldown=true
```

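
The log stream is verbose, so it can help to pull out only the lines of interest. The sketch below is one possible filter, based solely on the `Skipping ... - node group min size reached` message format seen in the sample log above; `nodes_at_min_size` is a hypothetical helper, and the message wording may differ in other Cluster Autoscaler versions.

```shell
# Hypothetical helper: reads Cluster Autoscaler log lines on stdin and prints
# the unique node names skipped because their node group is at minimum size.
nodes_at_min_size() {
  sed -n 's/.*Skipping \([^ ]*\) - node group min size reached.*/\1/p' | sort -u
}

# Example usage:
#   kubectl -n kube-system logs deployment.apps/cluster-autoscaler | nodes_at_min_size
```
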
## Step-08: Deploy a simple application
```
# Deploy Application
kubectl apply -f kube-manifests/
```

## Step-09: Cluster Scale UP: Scale our application to 30 pods
- Within 2 to 3 minutes, new nodes will be added one after another and the pending pods will be scheduled on them.
- The node count will not exceed 4, the maximum we set during node group creation.
```
# Terminal - 1: Keep monitoring cluster autoscaler logs
kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler

# Terminal - 2: Scale UP the demo application to 30 pods
kubectl get pods
kubectl get nodes
kubectl scale --replicas=30 deploy ca-demo-deployment
kubectl get pods

# Terminal - 2: Verify nodes
kubectl get nodes -o wide
```
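
To see why 30 replicas force a scale-up, it helps to total the resource requests: each pod in the demo deployment requests 200m of CPU. The sketch below does that arithmetic and estimates a node count. The allocatable CPU per node (1900m here) is an assumed, illustrative value, since the real figure depends on the instance type; memory requests and system pods are ignored.

```shell
# Total CPU requested, in millicores: replicas x per-pod request.
total_millicores() {
  echo $(( $1 * $2 ))
}

# Nodes needed to fit a CPU total, rounding up (ceiling division).
# Arguments: total millicores, allocatable millicores per node.
nodes_needed() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Example usage: 30 replicas at 200m each, 1900m allocatable per node (assumed value):
#   nodes_needed "$(total_millicores 30 200)" 1900
```
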
## Step-10: Cluster Scale DOWN: Scale our application to 1 pod
- It might take 5 to 20 minutes for the cluster to cool down and return to the minimum of 2 nodes that we configured during node group creation.
```
# Terminal - 1: Keep monitoring cluster autoscaler logs
kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler

# Terminal - 2: Scale down the demo application to 1 pod
kubectl scale --replicas=1 deploy ca-demo-deployment

# Terminal - 2: Verify nodes
kubectl get nodes -o wide
```

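
Rather than re-running `kubectl get nodes` by hand during the cool-down, the wait can be scripted. This is a minimal polling sketch, not a definitive implementation; `wait_for_node_count` is a hypothetical helper, and the 30-second default interval is an arbitrary assumption.

```shell
# Hypothetical helper: polls until the cluster has exactly TARGET nodes,
# or gives up after TIMEOUT seconds. Arguments: TARGET TIMEOUT [INTERVAL].
wait_for_node_count() {
  wfn_target=$1
  wfn_timeout=$2
  wfn_interval=${3:-30}   # assumed default polling interval, in seconds
  wfn_elapsed=0
  while :; do
    # Count the nodes currently registered with the cluster.
    wfn_count=$(kubectl get nodes --no-headers | wc -l | tr -d ' ')
    [ "$wfn_count" -eq "$wfn_target" ] && return 0
    [ "$wfn_elapsed" -ge "$wfn_timeout" ] && return 1
    sleep "$wfn_interval"
    wfn_elapsed=$(( wfn_elapsed + wfn_interval ))
  done
}

# Example usage: wait up to 20 minutes for the node group to return to 2 nodes:
#   wait_for_node_count 2 1200 && echo "scaled down to minimum"
```
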
## Step-11: Clean-Up
- We will leave the Cluster Autoscaler in place and undeploy only the application.
```
kubectl delete -f kube-manifests/
```
@@ -0,0 +1,41 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ca-demo-deployment
  labels:
    app: ca-nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ca-nginx
  template:
    metadata:
      labels:
        app: ca-nginx
    spec:
      containers:
      - name: ca-nginx
        image: stacksimplify/kubenginx:1.0.0
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: "200m"
            memory: "200Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: ca-demo-service-nginx
  labels:
    app: ca-nginx
spec:
  type: NodePort
  selector:
    app: ca-nginx
  ports:
  - port: 80
    targetPort: 80
    nodePort: 31233
