Add support for APG (adaptive projected guidance) + unconditional SLG #593

Open · wants to merge 8 commits into master
Conversation

stduhpf
Contributor

@stduhpf stduhpf commented Feb 12, 2025

Implements this paper: https://arxiv.org/abs/2410.02416

TLDR:

APG is a set of 3 modifications to CFG:

  • reverse momentum: the CFG update is steered away from (or towards) the previous step's CFG update (--apg-momentum)
  • normalization: the L2 norm of the CFG update is clamped to a "norm threshold" value (--apg-nt)
  • projection: the CFG update (out_cond - out_uncond) is split into components parallel and orthogonal to out_cond, and the parallel component is scaled by a parameter "eta" (--apg-eta); eta = 1 recovers the original update, eta = 0 keeps only the orthogonal part
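Put together, the three modifications can be sketched roughly as follows (a minimal NumPy sketch based on my reading of the paper; the function and variable names are hypothetical, and the actual implementation in this PR may differ in detail):

```python
import numpy as np

def apg_update(out_cond, out_uncond, prev_update,
               eta=0.0, norm_threshold=5.0, momentum=-0.5):
    # CFG update direction (sign convention from the APG paper)
    diff = out_cond - out_uncond

    # 1) reverse momentum: with a negative coefficient, steer away
    #    from the previous step's update
    diff = diff + momentum * prev_update

    # 2) normalization: clamp the L2 norm to norm_threshold
    #    (zero or below disables it, as described above)
    if norm_threshold > 0:
        norm = np.linalg.norm(diff)
        diff = diff * min(1.0, norm_threshold / norm)

    # 3) projection: decompose along the direction of out_cond and
    #    keep only a fraction eta of the parallel component
    cond_dir = out_cond / np.linalg.norm(out_cond)
    parallel = np.dot(diff, cond_dir) * cond_dir
    orthogonal = diff - parallel
    return orthogonal + eta * parallel
```

With eta = 1 and norm_threshold <= 0 this reduces to the plain CFG update, which is a quick sanity check on the sketch.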

No extra forward pass is required, so the performance cost is negligible.

Thanks mostly to the normalization, but also to the projection, this makes it possible to take advantage of very large CFG scales without getting deep-fried output images. I'm not sure how useful the reverse momentum really is, but it was in the paper so I added it too (I think it prevents the CFG from pushing too much in the same direction at every step?).

Usage

[your usual command with cfg here] --apg-eta 0 --apg-nt 5 --apg-momentum -0.5

Recommended values:

  • eta: between 0 and 1, closer to 0 seems better. In the paper, they recommend setting it to 0 altogether
  • norm threshold: between 1 and 15 (setting it to zero or below disables the normalization)
  • momentum: negative, ideally between -1 and 0

Feel free to play around with the settings; going outside of the recommended ranges can have interesting effects, especially with eta and momentum.


I also added an experimental smoothing parameter (--apg-nt-smoothing) for the normalization. In the paper they use a "saturate" function (min(1, threshold/norm)), which has two potential issues: it has a kink (it is not continuously differentiable), and it is not invertible, since all input values above $1$ get mapped to $1$.

This experimental feature replaces the $\min(1,x)$ function with $\frac{x}{\left(1+x^{\frac{1}{p}}\right)^{p}}$, which is smooth and invertible. It behaves like $f(x)=x$ for small values of $x$ (just like the min) and converges to the original $\min(1,x)$ as $p$ goes to $0$.
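The claimed limit behavior is easy to verify numerically (a quick sketch; mapping the --apg-nt-smoothing value onto the exponent $p$ here is my assumption):

```python
def smooth_saturate(x, p):
    # Smooth, invertible replacement for min(1, x) on x > 0:
    # f(x) = x / (1 + x^(1/p))^p
    return x / (1.0 + x ** (1.0 / p)) ** p

# For small p, f closely tracks min(1, x):
#   below 1 it stays near the identity, above 1 it saturates toward 1.
```

In the normalization step this would be used as scale = smooth_saturate(threshold / norm, p) in place of the hard min(1, threshold / norm).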


Edit: I also added unconditional SLG (--slg-uncond) (I stole the idea from deepbeepmeep/Wan2GP#61)

Just a simpler version of SLG (Skip Layer Guidance, introduced in #451) for DiT models.

Default SLG requires a third forward pass through the network with some layers skipped. This increases the computing time by a bit under 50% for the SLG steps, which isn't ideal.

Unconditional SLG skips layers during the same unconditional pass used for CFG/APG. It seems to be about as effective as normal SLG, and it's even slightly faster than plain CFG, thanks to the skipped layers.

Downside: it's less flexible; --slg-scale should be kept at 0, and --cfg-scale now controls both the CFG and the SLG.
Upside: It's faster.
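The difference between the two variants can be sketched like this (the model function and its skip_layers parameter are hypothetical placeholders, not this PR's actual API, and the standard-SLG combination formula is my reading of #451):

```python
def standard_slg(model, x, cond, uncond, cfg_scale, slg_scale, skip_layers):
    out_cond = model(x, cond)
    out_uncond = model(x, uncond)
    out_skip = model(x, cond, skip_layers=skip_layers)  # extra third pass
    return out_uncond + cfg_scale * (out_cond - out_uncond) \
                      + slg_scale * (out_cond - out_skip)

def uncond_slg(model, x, cond, uncond, cfg_scale, skip_layers):
    out_cond = model(x, cond)
    # layers are skipped during the same unconditional pass used for CFG,
    # so no third forward pass is needed
    out_uncond = model(x, uncond, skip_layers=skip_layers)
    return out_uncond + cfg_scale * (out_cond - out_uncond)
```

This makes the trade-off visible: unconditional SLG does two forward passes instead of three, but the single cfg_scale now controls both effects at once.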

Setting both --slg-scale != 0 and --slg-uncond at the same time will most likely degrade image quality while using more compute. It's possible, but not recommended. (Maybe it would be worth investigating skipping different sets of layers for normal SLG and unconditional SLG, but that's getting too far out of scope for this PR.)

@stduhpf stduhpf changed the title Add support for APG (adaptive projected guidance) Add support for APG (adaptive projected guidance) + unconditional SLG Mar 13, 2025