@@ -60,13 +60,100 @@ kubectl run --generator=run-pod/v1 apache-bench -i --tty --rm --image=httpd -- a
## Step-07: CloudWatch Log Insights
- View Container logs
+ - View Container Performance Logs

+ ## Step-08: Container Insights - Log Insights in depth
+ - Log Groups
+ - Log Insights
+ - Create Dashboard
- ## Step-08: CloudWatch Alarms from metrics
- - Create Alarms
+ ### Create Graph for Avg Node CPU Utilization
+ - Dashboard Name: EKS-Performance
+ - Widget Type: Bar
+ - Log Group: /aws/containerinsights/eksdemo1/performance
+ ```
+ STATS avg(node_cpu_utilization) as avg_node_cpu_utilization by NodeName
+ | SORT avg_node_cpu_utilization DESC
+ ```
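The query above can also be run outside the console. The following is a minimal sketch, assuming the AWS CLI is installed and configured for the cluster's region. The helper name `run_insights_query` and the one-hour time window are illustrative assumptions, not part of the original steps.

```shell
# Hypothetical helper: run one Logs Insights query over the last hour
# and print the results as a table.
run_insights_query() {
  local log_group="$1" query="$2"
  local end_time start_time query_id status

  end_time=$(date +%s)
  start_time=$((end_time - 3600))   # assumption: look back one hour

  # start-query returns a query ID that is used to fetch the results
  query_id=$(aws logs start-query \
    --log-group-name "$log_group" \
    --start-time "$start_time" \
    --end-time "$end_time" \
    --query-string "$query" \
    --query 'queryId' --output text)

  # Poll until the query leaves the Scheduled/Running states
  while true; do
    status=$(aws logs get-query-results --query-id "$query_id" \
      --query 'status' --output text)
    case "$status" in
      Scheduled|Running) sleep 1 ;;
      *) break ;;
    esac
  done

  aws logs get-query-results --query-id "$query_id" --output table
}
```

For example, `run_insights_query /aws/containerinsights/eksdemo1/performance 'STATS avg(node_cpu_utilization) as avg_node_cpu_utilization by NodeName | SORT avg_node_cpu_utilization DESC'` should print the per-node averages, provided the log group exists and contains data for the last hour.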
+
+ ### Container Restarts
+ - Dashboard Name: EKS-Performance
+ - Widget Type: Table
+ - Log Group: /aws/containerinsights/eksdemo1/performance
+ ```
+ STATS avg(number_of_container_restarts) as avg_number_of_container_restarts by PodName
+ | SORT avg_number_of_container_restarts DESC
+ ```
+
+ ### Cluster Node Failures
+ - Dashboard Name: EKS-Performance
+ - Widget Type: Table
+ - Log Group: /aws/containerinsights/eksdemo1/performance
+ ```
+ stats avg(cluster_failed_node_count) as CountOfNodeFailures
+ | filter Type="Cluster"
+ | sort @timestamp desc
+ ```
+ ### CPU Usage By Container
+ - Dashboard Name: EKS-Performance
+ - Widget Type: Bar
+ - Log Group: /aws/containerinsights/eksdemo1/performance
+ ```
+ stats pct(container_cpu_usage_total, 50) as CPUPercMedian by kubernetes.container_name
+ | filter Type="Container"
+ ```

+ ### Pods Requested vs Pods Running
+ - Dashboard Name: EKS-Performance
+ - Widget Type: Bar
+ - Log Group: /aws/containerinsights/eksdemo1/performance
+ ```
+ fields @timestamp, @message
+ | sort @timestamp desc
+ | filter Type="Pod"
+ | stats min(pod_number_of_containers) as requested, min(pod_number_of_running_containers) as running, ceil(avg(pod_number_of_containers-pod_number_of_running_containers)) as pods_missing by kubernetes.pod_name
+ | sort pods_missing desc
+ ```
+
+ ### Application log errors by container name
+ - Dashboard Name: EKS-Performance
+ - Widget Type: Bar
+ - Log Group: /aws/containerinsights/eksdemo1/application
+ ```
+ stats count() as countoferrors by kubernetes.container_name
+ | filter stream="stderr"
+ | sort countoferrors desc
+ ```
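The widgets above are created in the console. As a rough sketch of how one of them might be scripted instead, the snippet below builds a dashboard body containing the Avg Node CPU Utilization bar widget and publishes it with `aws cloudwatch put-dashboard`. The helper names `dashboard_body` and `publish_dashboard`, the widget position and size, and the widget title are assumptions for illustration. Note that `put-dashboard` replaces the entire dashboard body, so widgets added by hand would be overwritten.

```shell
# Hypothetical helper: print a dashboard body with a single
# Logs Insights bar widget for average node CPU utilization.
dashboard_body() {
  local cluster="$1" region="$2"
  cat <<EOF
{
  "widgets": [
    {
      "type": "log",
      "x": 0, "y": 0, "width": 12, "height": 6,
      "properties": {
        "region": "${region}",
        "title": "Avg Node CPU Utilization",
        "view": "bar",
        "query": "SOURCE '/aws/containerinsights/${cluster}/performance' | STATS avg(node_cpu_utilization) as avg_node_cpu_utilization by NodeName | SORT avg_node_cpu_utilization DESC"
      }
    }
  ]
}
EOF
}

# Hypothetical helper: publish the body above as the EKS-Performance dashboard.
# Caution: this overwrites any existing dashboard with the same name.
publish_dashboard() {
  aws cloudwatch put-dashboard \
    --dashboard-name EKS-Performance \
    --dashboard-body "$(dashboard_body "$1" "$2")"
}
```

With the values used in this walkthrough, the call would be `publish_dashboard eksdemo1 us-east-1`.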
+
+ - **Reference**: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-view-metrics.html
+
+
+ ## Step-09: Container Insights - CloudWatch Alarms
+ ### Create Alarms - Node CPU Usage
+ - **Specify metric and conditions**
+ - **Select Metric:** Container Insights -> ClusterName -> node_cpu_utilization
+ - **Metric Name:** eksdemo1_node_cpu_utilization
+ - **Threshold Value:** 4
+ - **Important Note:** The alarm sends a notification email whenever node CPU utilization exceeds 4%. A realistic threshold would be 80% or 90%; 4% is used here only so that the load simulation triggers the alarm.
+ - **Configure Actions**
+ - **Create New Topic:** eks-alerts
+
+ - Click on **Create Topic**
+ - **Important Note:** Complete the email subscription using the confirmation message sent to your email address.
+ - **Add name and description**
+ - **Name:** EKS-Nodes-CPU-Alert
+ - **Description:** EKS Nodes CPU alert notification
+ - Click Next
+ - **Preview**
+ - Preview and Create Alarm
+ - **Add Alarm to our custom Dashboard**
+ - Generate Load & Verify Alarm
+ ```
+ # Generate Load
+ # Note: newer kubectl releases deprecate --generator; there, omit the flag (kubectl run creates a Pod by default)
+ kubectl run --generator=run-pod/v1 apache-bench -i --tty --rm --image=httpd -- ab -n 500000 -c 1000 http://sample-nginx-service.default.svc.cluster.local/
+ ```
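The console steps above could also be approximated from the command line. This is a minimal sketch, assuming a configured AWS CLI. The helper name `create_node_cpu_alarm`, the `Average` statistic, the 300-second period, and the single evaluation period are assumptions chosen for illustration; the topic name, alarm name, description, metric, and threshold come from the steps above.

```shell
# Hypothetical helper: create the SNS topic, subscribe an email address,
# and create the node CPU alarm for the given cluster.
create_node_cpu_alarm() {
  local cluster="$1" email="$2" threshold="${3:-4}"
  local topic_arn

  # create-topic returns the ARN of the topic (existing or new)
  topic_arn=$(aws sns create-topic --name eks-alerts \
    --query 'TopicArn' --output text)

  # The subscription stays pending until the email link is confirmed
  aws sns subscribe \
    --topic-arn "$topic_arn" \
    --protocol email \
    --notification-endpoint "$email"

  # Assumptions: Average statistic, 5-minute period, 1 evaluation period
  aws cloudwatch put-metric-alarm \
    --alarm-name EKS-Nodes-CPU-Alert \
    --alarm-description "EKS Nodes CPU alert notification" \
    --namespace ContainerInsights \
    --metric-name node_cpu_utilization \
    --dimensions Name=ClusterName,Value="$cluster" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold "$threshold" \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$topic_arn"
}
```

A call such as `create_node_cpu_alarm eksdemo1 you@example.com` would use the 4% test threshold; passing `80` as a third argument gives a more realistic value.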
- ## Step-09 : Clean-Up Container Insights
+ ## Step-10 : Clean-Up Container Insights
```
# Template
curl https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluentd-quickstart.yaml | sed "s/{{cluster_name}}/cluster-name/;s/{{region_name}}/cluster-region/" | kubectl delete -f -
@@ -75,7 +162,7 @@ curl https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-i
curl https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluentd-quickstart.yaml | sed "s/{{cluster_name}}/eksdemo1/;s/{{region_name}}/us-east-1/" | kubectl delete -f -
```

- ## Step-10 : Clean-Up Application
+ ## Step-11 : Clean-Up Application
```
# Delete Apps
kubectl delete -f kube-manifests/