✨ Support for BootstrapSelfManagedAddons flag for EKS cluster creation #5222

Merged
merged 1 commit into from
Mar 18, 2025

Conversation

jas-nik
Contributor

@jas-nik jas-nik commented Nov 20, 2024

What type of PR is this?
/kind feature

What this PR does / why we need it:

Add a flag to support BootstrapSelfManagedAddons so that a bare EKS cluster can be provisioned without the default add-ons (CoreDNS, kube-proxy, aws-vpc-cni).

https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html

By default, EKS installs multiple networking add-ons during cluster creation. This includes the Amazon VPC CNI, CoreDNS, and kube-proxy.

If you’d like to disable the installation of these default networking add-ons, set the bootstrapSelfManagedAddons parameter to false when creating the cluster. This may be used for alternate CNIs, such as Cilium. Review the EKS API reference for more information.
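As a rough illustration of the behavior described above, here is a minimal Go sketch of how such a flag might be carried from a cluster spec into a create request. The types `ControlPlaneSpec` and `CreateClusterRequest` and the function `buildCreateRequest` are hypothetical stand-ins invented for this example; they are not the CAPA types or the AWS SDK types. The only fact taken from the discussion is that EKS treats the flag as true when it is not supplied.

```go
package main

import "fmt"

// ControlPlaneSpec is a hypothetical, trimmed-down stand-in for a control
// plane spec. Only the field discussed in this PR is modeled.
type ControlPlaneSpec struct {
	// BootstrapSelfManagedAddons is a pointer so that "not set" can be
	// told apart from an explicit false.
	BootstrapSelfManagedAddons *bool
}

// CreateClusterRequest is a hypothetical stand-in for an EKS create-cluster input.
type CreateClusterRequest struct {
	Name                       string
	BootstrapSelfManagedAddons bool
}

// buildCreateRequest copies the flag into the request and falls back to
// true (the EKS default) when the user did not set it.
func buildCreateRequest(name string, spec ControlPlaneSpec) CreateClusterRequest {
	bootstrap := true
	if spec.BootstrapSelfManagedAddons != nil {
		bootstrap = *spec.BootstrapSelfManagedAddons
	}
	return CreateClusterRequest{Name: name, BootstrapSelfManagedAddons: bootstrap}
}

func main() {
	disabled := false
	// Flag left unset: the default add-ons would be installed.
	fmt.Println(buildCreateRequest("default", ControlPlaneSpec{}))
	// Flag set to false: a bare cluster, ready for an alternate CNI.
	fmt.Println(buildCreateRequest("bare", ControlPlaneSpec{BootstrapSelfManagedAddons: &disabled}))
}
```

The pointer field is one possible design choice for distinguishing an unset value from false; the merged change may well model the field differently.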

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Checklist:

  • squashed commits
  • includes documentation
  • includes emojis
  • adds unit tests
  • adds or updates e2e tests

Release note:

Add a flag to support BootstrapSelfManagedAddons so that a bare EKS cluster can be provisioned without the default add-ons (CoreDNS, kube-proxy, aws-vpc-cni).

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Nov 20, 2024
@k8s-ci-robot k8s-ci-robot added needs-priority needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 20, 2024
@k8s-ci-robot
Contributor

Hi @jas-nik. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@adriananeci
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 20, 2024
@jas-nik
Contributor Author

jas-nik commented Nov 20, 2024

"failed to create new managed VPC: failed to create vpc: The maximum number of VPCs has been reached"

😞

@jas-nik
Contributor Author

jas-nik commented Nov 20, 2024

/retest

@JonnieDoe

/lgtm

@k8s-ci-robot
Contributor

@JonnieDoe: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Nov 20, 2024
@JonnieDoe

/retest

@jas-nik
Contributor Author

jas-nik commented Nov 25, 2024

/retest

@nrb
Contributor

nrb commented Dec 5, 2024

@jas-nik Sorry about the delay here, and thank you for this contribution.

I've been troubleshooting EKS CI issues, and this behavior is very welcome :)

I do ask that you add some tests for this case. We'll need at least a cluster template and to point to that template in the e2e test config.

@jas-nik
Contributor Author

jas-nik commented Jan 27, 2025

@nrb Apologies, this fell off my radar. I'll get them added.

@damdo
Member

damdo commented Feb 4, 2025

Hey @jas-nik, CAPA doesn't do merge commits. Would you be able to rebase instead?
Thanks!

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 4, 2025
@nrb
Contributor

nrb commented Feb 4, 2025

/test pull-cluster-api-provider-aws-test

Hit VPC limit.

@damdo
Member

damdo commented Feb 5, 2025

/test pull-cluster-api-provider-aws-test

@jas-nik
Contributor Author

jas-nik commented Feb 6, 2025

@damdo @nrb @richardcase would you be able to help with the VPC limit issue?

@damdo
Member

damdo commented Feb 6, 2025

/test pull-cluster-api-provider-aws-test

@nrb
Contributor

nrb commented Feb 6, 2025

@jas-nik Unfortunately we can't allocate more VPCs right now. The best we can do is retry the tests at off-peak times.

@nrb
Contributor

nrb commented Feb 6, 2025

/retest

@richardcase
Member

@jas-nik @nrb @damdo - are we getting the "VPC limit" error when running just the unit tests or the e2e tests? If it's the unit tests (i.e. via /test pull-cluster-api-provider-aws-test), which I guess is the case from the discussion, then that should be within our control to change, as the unit tests shouldn't be hitting AWS and the error may be coming from our "resource counting" code.

Also, if it is the e2e tests, then we can potentially increase the service limits.

@richardcase
Member

Are we sure that's the real error? We have a test that checks for the maximum number of VPCs:

g.Expect(err.Error()).To(ContainSubstring("The maximum number of VPCs has been reached"))

I will have a look at the logs in the morning to check.
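To illustrate why this message can show up in the logs of a passing run, here is a minimal Go sketch of a test that deliberately provokes the error and then asserts on its text. It uses only the standard library in place of the Gomega matcher quoted above, and `createVPC` and `expectsVPCLimit` are hypothetical helpers written for this example, not functions from the CAPA code base.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// vpcLimitMsg is the substring the quoted assertion looks for.
const vpcLimitMsg = "The maximum number of VPCs has been reached"

// createVPC is a hypothetical stub that always fails as if the VPC quota
// were exhausted. No AWS call is made.
func createVPC() error {
	return fmt.Errorf("failed to create vpc: %w", errors.New(vpcLimitMsg))
}

// expectsVPCLimit reports whether err is the failure the test wants to see.
func expectsVPCLimit(err error) bool {
	return err != nil && strings.Contains(err.Error(), vpcLimitMsg)
}

func main() {
	err := createVPC()
	// The error text is logged even though it is the expected outcome.
	fmt.Println("logged:", err)
	fmt.Println("assertion holds:", expectsVPCLimit(err))
}
```

In a test shaped like this, the quota message is the expected result rather than a failure, so seeing it in the output does not by itself mean the run hit a real AWS limit.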

@richardcase
Member

richardcase commented Feb 6, 2025

Looking at the logs it seems to be multiple issues with tests:

  • FAIL: TestDefaultingWebhook (0.59s) (and sub tests)
  • FAIL: TestCreateCluster/cluster_create_with_2_subnets (0.00s)
  • FAIL: TestCreateIPv6Cluster (0.01s)

It's worth running just these tests locally to see what's going on.

@jas-nik
Contributor Author

jas-nik commented Feb 6, 2025

https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/kubernetes-sigs_cluster-api-provider-aws/5336/pull-cluster-api-provider-aws-test/1887504193065848832 - One of the latest PR builds passed even after the VPC limit failure, so it might be a red herring after all. Thank you for chiming in. I still need to investigate what the actual failure is.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 28, 2025
@k8s-ci-robot k8s-ci-robot added do-not-merge/contains-merge-commits and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Mar 13, 2025
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed do-not-merge/contains-merge-commits size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 13, 2025
@jas-nik
Contributor Author

jas-nik commented Mar 13, 2025

@nrb @richardcase @damdo Looks like the issue was due to the test cases expecting the default value for the BootstrapSelfManagedAddons flag. Adjusting expectedSpec for them solves the issue, but I'm not sure that is the right approach here. Should I add a default value for the flag, or remove the default value altogether, since AWS already treats the default as true?
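The mismatch described above can be sketched in a few lines of Go. This is an illustration under assumptions, not the CAPA test code: `Spec`, `applyDefaults`, and `specsEqual` are hypothetical names, and the defaulting step is modeled as a plain function rather than whatever mechanism the project actually uses.

```go
package main

import (
	"fmt"
	"reflect"
)

// Spec is a hypothetical, reduced spec with one pre-existing field and the
// newly added flag.
type Spec struct {
	Version                    string
	BootstrapSelfManagedAddons *bool
}

// applyDefaults fills in true for the new flag when it is unset, standing
// in for the defaulting that happens before the object is stored.
func applyDefaults(in Spec) Spec {
	out := in
	if out.BootstrapSelfManagedAddons == nil {
		enabled := true
		out.BootstrapSelfManagedAddons = &enabled
	}
	return out
}

// specsEqual compares two specs deeply, following pointers.
func specsEqual(a, b Spec) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	stored := applyDefaults(Spec{Version: "v1.31"})

	// An expectation written before the field existed no longer matches.
	staleExpected := Spec{Version: "v1.31"}
	fmt.Println("stale expectation matches:", specsEqual(stored, staleExpected))

	// Adding the defaulted value to the expectation restores the match.
	enabled := true
	updatedExpected := Spec{Version: "v1.31", BootstrapSelfManagedAddons: &enabled}
	fmt.Println("updated expectation matches:", specsEqual(stored, updatedExpected))
}
```

Under this model, updating expectedSpec is the natural consequence of introducing a default: the stored object gains a field that older expectations never mention.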

@nrb
Contributor

nrb commented Mar 17, 2025

Should I add a default value for the flag, or remove the default value altogether, since AWS already treats the default as true?

I'm fine with the code you've got as-is. Having it explicit in the CAPA side is nice for users to see at a glance.

Overall this looks good to me, just one question.

@adriananeci
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 18, 2025
@nrb
Contributor

nrb commented Mar 18, 2025

/approve

I'll make a separate issue for the VPC limits; we seem to hit that a lot, so we likely need to up some limit or adjust the test code that Richard linked to.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nrb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 18, 2025
@k8s-ci-robot k8s-ci-robot merged commit e4961f8 into kubernetes-sigs:main Mar 18, 2025
17 checks passed
@nrb
Contributor

nrb commented Mar 18, 2025

Issue for the maximum VPC flake: #5415

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

7 participants