
[Feature] Add support for optionally setting lr_max_steps in the learning rate scheduler, enabling training to stop at a specified step using Trainer.max_steps without requiring modifications to the full LR schedule. #749

Open

dorotat-nv (Collaborator) opened this issue Mar 12, 2025 · 0 comments

Problem & Motivation

In Evo2, using the --max-steps argument to stop training at a specific step also modifies the learning rate schedule. This makes it difficult to test partial convergence training that stops at a given step without altering the intended LR schedule.
File: sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py

Remove the SignalAfterGivenStepCallback from the training script.

BioNeMo Framework Version

7428f5f

Category

Model/Training

Proposed Solution

Introduce a new optional argument, e.g. lr_scheduler_steps, which, when passed, sets the number of steps used by the learning rate scheduler instead of max_steps.

Expected Benefits

max_steps can then control the length of training while lr_scheduler_steps defines the LR schedule; for example, a run can stop early for a partial-convergence test while the LR curve keeps the shape intended for the full-length schedule.

Code Example
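The issue does not attach a code example; below is a minimal sketch of how the proposed flag could be wired into an argparse-based CLI such as train.py's. The flag and helper names (--lr-scheduler-steps, resolve_scheduler_steps) are illustrative assumptions, not the actual BioNeMo API.

```python
# Sketch only: names are hypothetical, not the actual train.py API.
# Idea: the LR schedule spans --lr-scheduler-steps when given, so
# --max-steps only controls when the Trainer stops.
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Sketch of the proposed CLI.")
    parser.add_argument(
        "--max-steps",
        type=int,
        required=True,
        help="Step at which the Trainer stops training.",
    )
    parser.add_argument(
        "--lr-scheduler-steps",
        type=int,
        default=None,
        help=(
            "Optional horizon for the LR schedule. When set, the scheduler "
            "uses this value instead of --max-steps, so stopping training "
            "early does not reshape the LR curve."
        ),
    )
    return parser.parse_args(argv)


def resolve_scheduler_steps(args: argparse.Namespace) -> int:
    # Fall back to max_steps when the new flag is absent, preserving the
    # current behavior where the schedule follows the Trainer's stop step.
    if args.lr_scheduler_steps is not None:
        return args.lr_scheduler_steps
    return args.max_steps


if __name__ == "__main__":
    args = parse_args()
    scheduler_steps = resolve_scheduler_steps(args)
    # scheduler_steps would be passed to the LR scheduler (e.g. as its
    # max_steps), while args.max_steps goes to the Trainer unchanged.
    print(f"Trainer stops at {args.max_steps}; LR schedule spans {scheduler_steps} steps.")
```

With this split, a partial-convergence test might pass something like --max-steps 5000 --lr-scheduler-steps 500000 (values illustrative): training stops at step 5000, but the LR decay is still computed over the 500000-step horizon.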
