photoMaker issue with two or more generated images / SDXL sample steps #207

Jonathhhan · 2024-03-19T02:39:43Z

Very nice feature, thanks. It seems that only the first generated image works after loading the sd_ctx (multiple images work with batch size > 1).

I used the Newton images from the example and the prompt: "man img, man with futuristic clothes".

This is the first image:

And this is the second image:

Also, with SDXL-Turbo, PhotoMaker seems to need fewer than the fixed 50 sampling steps...


Green-Sky · 2024-03-19T12:25:57Z

It looks like the cfg scale was too high for the first image.

Jonathhhan · 2024-03-19T13:21:50Z

@Green-Sky yes, I used 7 and not the recommended 5.

bssrdf · 2024-03-19T13:55:57Z

@Jonathhhan, could you provide the full command line with SDXL and Photomaker model files? In particular, did you use the file from https://huggingface.co/bssrdf/PhotoMaker?

Here is what I can generate using the Newton example images and your prompt with batch size 2.

bin/sd -m ../models/RealVisXL_V3.0.safetensors  --stacked-id-embd-dir ../models/photomaker-v1.safetensors --input-id-images-dir examples/newton_man -p "man img, man with futuristic clothes"  --cfg-scale 7 --sampling-method euler -H 1024 -W 1024  -b 2 -o newton_issu01.png
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
[INFO ] stable-diffusion.cpp:165  - loading model from '../models/RealVisXL_V3.0.safetensors'
[INFO ] model.cpp:705  - load ../models/RealVisXL_V3.0.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:188  - Stable Diffusion XL
[INFO ] stable-diffusion.cpp:194  - Stable Diffusion weight type: f16
[WARN ] stable-diffusion.cpp:200  - !!!It looks like you are using SDXL model. If you find that the generated images are completely black, try specifying SDXL VAE FP16 Fix with the --vae parameter. You can find it here: https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors
[INFO ] model.cpp:705  - load ../models/photomaker-v1.safetensors using safetensors format
[INFO ] lora.hpp:38   - loading LoRA from '../models/photomaker-v1.safetensors'
[INFO ] stable-diffusion.cpp:275  - loading stacked ID embedding (PHOTOMAKER) model file from '../models/photomaker-v1.safetensors'
[INFO ] model.cpp:705  - load ../models/photomaker-v1.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:400  - total params memory size = 7182.38MB (VRAM 7182.38MB, RAM 0.00MB): clip 1564.36MB(VRAM), unet 4900.07MB(VRAM), vae 94.47MB(VRAM), controlnet 0.00MB(VRAM), pmid 623.48MB(VRAM)
[INFO ] stable-diffusion.cpp:419  - loading model from '../models/RealVisXL_V3.0.safetensors' completed, taking 88.15s
[INFO ] stable-diffusion.cpp:436  - running in eps-prediction mode
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'examples/newton_man/newton_0.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'examples/newton_man/newton_1.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'examples/newton_man/newton_2.png'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'examples/newton_man/newton_3.jpg'
[INFO ] stable-diffusion.cpp:1602 - apply_loras completed, taking 0.00s
[INFO ] stable-diffusion.cpp:1608 - pmid_lora apply completed, taking 0.09s
[INFO ] stable-diffusion.cpp:1672 - Photomaker ID Stacking, taking 548 ms
[INFO ] stable-diffusion.cpp:1681 - sampling steps increases from 20 to 50 for PHOTOMAKER
[INFO ] stable-diffusion.cpp:1712 - get_learned_condition completed, taking 157 ms
[INFO ] stable-diffusion.cpp:1728 - sampling using Euler method
[INFO ] stable-diffusion.cpp:1732 - generating image: 1/2 - seed 42
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
  |==================================================| 50/50 - 1.84it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 27.58s
[INFO ] stable-diffusion.cpp:1732 - generating image: 2/2 - seed 43
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
  |==================================================| 50/50 - 1.79it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 27.52s
[INFO ] stable-diffusion.cpp:1777 - generating 2 latent images completed, taking 55.12s
[INFO ] stable-diffusion.cpp:1779 - decoding 2 latents
[INFO ] stable-diffusion.cpp:1789 - latent 1 decoded, taking 1.15s
[INFO ] stable-diffusion.cpp:1789 - latent 2 decoded, taking 1.17s
[INFO ] stable-diffusion.cpp:1793 - decode_first_stage completed, taking 2.31s
[INFO ] stable-diffusion.cpp:1810 - txt2img completed in 57.60s
save result image to 'newton_issu01.png'
save result image to 'newton_issu01_2.png'
double free or corruption (fasttop)
Aborted

They look fine.
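The two PhotoMaker-specific log lines above ("sampling steps increases from 20 to 50" and "start_merge_step: 10") can be sketched roughly as follows. This is only an illustration: `photomaker_schedule` and `PhotoMakerSchedule` are hypothetical names, the 20 percent style ratio is an assumption chosen because it reproduces the logged value of 10 at 50 steps, and treating 50 as a floor (rather than an exact override) is also an assumption, since the logs only show requests below 50.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper type; the numbers mirror the log lines in this thread.
struct PhotoMakerSchedule {
    int sample_steps;      // steps that are actually run
    int start_merge_step;  // first step that uses the ID-merged condition
};

// Assumption: a style ratio of 20 (percent), which matches
// "start_merge_step: 10" at 50 steps. Not taken from the project's source.
inline PhotoMakerSchedule photomaker_schedule(int requested_steps,
                                              float style_ratio_percent = 20.0f) {
    assert(requested_steps > 0);
    assert(style_ratio_percent >= 0.0f && style_ratio_percent <= 100.0f);
    const int min_steps = 50;  // value reported by the log
    const int steps = std::max(requested_steps, min_steps);
    const int merge = static_cast<int>(style_ratio_percent * steps / 100.0f);
    return {steps, merge};
}
```

Under these assumptions, a request for 15 or 20 steps is raised to 50 with the merge starting at step 10, which is why a few-step model such as SDXL-Turbo still runs 50 steps.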

Jonathhhan · 2024-03-19T14:43:43Z

@bssrdf batch processing works fine. The issue appears if I run txt2img a second time without reloading the sd_ctx. The console output looks exactly the same for both runs:

System Info:
    BLAS = 1
    SSE3 = 1
    AVX = 1
    AVX2 = 1
    AVX512 = 0
    AVX512_VBMI = 0
    AVX512_VNNI = 0
    FMA = 1
    NEON = 0
    ARM_FMA = 0
    F16C = 1
    FP16_VA = 0
    WASM_SIMD = 0
    VSX = 0
New BaseEngine 00000202288E6220
New GLFWEngine 00000202288E6220
[DEBUG] stable-diffusion.cpp:145  - Using CUDA backend
[notice ] EngineGLFW::setup(): Replaced the openFrameworks' GLFW event listeners by the imgui_impl_glfw ones. You will not have multi-window nor multi-context support. This can be enabled by defining OFXIMGUI_GLFW_FIX_MULTICONTEXT_PRIMARY_VP=1.
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
[INFO ] stable-diffusion.cpp:165  - loading model from 'data/models/sd_xl_base_1.0.safetensors'
[INFO ] model.cpp:705  - load data/models/sd_xl_base_1.0.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from 'data/models/sd_xl_base_1.0.safetensors'
[INFO ] stable-diffusion.cpp:176  - loading vae from 'data/models/vae/vae.safetensors'
[INFO ] model.cpp:705  - load data/models/vae/vae.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from 'data/models/vae/vae.safetensors'
[INFO ] stable-diffusion.cpp:188  - Stable Diffusion XL
[INFO ] stable-diffusion.cpp:194  - Stable Diffusion weight type: f16
[DEBUG] stable-diffusion.cpp:195  - ggml tensor size = 432 bytes
[DEBUG] ggml_extend.hpp:884  - clip params backend buffer size =  1564.36 MB(VRAM) (713 tensors)
[DEBUG] ggml_extend.hpp:884  - unet params backend buffer size =  4900.07 MB(VRAM) (1680 tensors)
[DEBUG] ggml_extend.hpp:884  - vae params backend buffer size =  159.68 MB(VRAM) (248 tensors)
[INFO ] model.cpp:705  - load data/models/photomaker/photomaker-v1.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from 'data/models/photomaker/photomaker-v1.safetensors'
[INFO ] lora.hpp:38   - loading LoRA from 'data/models/photomaker/photomaker-v1.safetensors'
[DEBUG] model.cpp:1343 - loading tensors from data/models/photomaker/photomaker-v1.safetensors
[DEBUG] ggml_extend.hpp:884  - lora params backend buffer size =  354.38 MB(VRAM) (10240 tensors)
[DEBUG] model.cpp:1343 - loading tensors from data/models/photomaker/photomaker-v1.safetensors
[DEBUG] lora.hpp:74   - finished loaded lora
[INFO ] stable-diffusion.cpp:275  - loading stacked ID embedding (PHOTOMAKER) model file from 'data/models/photomaker/photomaker-v1.safetensors'
[INFO ] model.cpp:705  - load data/models/photomaker/photomaker-v1.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from 'data/models/photomaker/photomaker-v1.safetensors'
[DEBUG] ggml_extend.hpp:884  - pmid params backend buffer size =  623.48 MB(VRAM) (407 tensors)
[DEBUG] stable-diffusion.cpp:296  - loading vocab
[DEBUG] clip.hpp:164  - vocab size: 49408
[DEBUG] clip.hpp:175  -  trigger word img already in vocab
[DEBUG] stable-diffusion.cpp:316  - loading weights
[DEBUG] model.cpp:1343 - loading tensors from data/models/sd_xl_base_1.0.safetensors
[DEBUG] model.cpp:1343 - loading tensors from data/models/vae/vae.safetensors
[DEBUG] model.cpp:1343 - loading tensors from data/models/photomaker/photomaker-v1.safetensors
[INFO ] stable-diffusion.cpp:415  - total params memory size = 7247.59MB (VRAM 7247.59MB, RAM 0.00MB): clip 1564.36MB(VRAM), unet 4900.07MB(VRAM), vae 159.68MB(VRAM), controlnet 0.00MB(VRAM), pmid 623.48MB(VRAM)
[INFO ] stable-diffusion.cpp:419  - loading model from 'data/models/sd_xl_base_1.0.safetensors' completed, taking 4.77s
[INFO ] stable-diffusion.cpp:436  - running in eps-prediction mode
[DEBUG] stable-diffusion.cpp:464  - finished loaded file
[DEBUG] upscaler.cpp:19   - Using CUDA backend
[INFO ] upscaler.cpp:32   - Upscaler weight type: f16
[INFO ] esrgan.hpp:164  - loading esrgan from 'data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth'
[DEBUG] ggml_extend.hpp:884  - esrgan params backend buffer size =   8.53 MB(VRAM) (192 tensors)
[INFO ] model.cpp:708  - load data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth using checkpoint format
[DEBUG] model.cpp:1221 - init from 'data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth'
[DEBUG] model.cpp:1343 - loading tensors from data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth
[INFO ] esrgan.hpp:183  - esrgan model loaded


[DEBUG] stable-diffusion.cpp:1551 - txt2img 1024x1024
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_0.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_1.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_2.png'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_3.jpg'
[DEBUG] stable-diffusion.cpp:1597 - prompt after extract and remove lora: "man img, man with futuristic clothes"
[INFO ] stable-diffusion.cpp:1602 - apply_loras completed, taking 0.00s
[DEBUG] ggml_extend.hpp:835  - lora compute buffer size: 20.50 MB(VRAM)
[INFO ] stable-diffusion.cpp:1608 - pmid_lora apply completed, taking 0.28s
[DEBUG] clip.hpp:1222 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 86 ms
[DEBUG] ggml_extend.hpp:835  - pmid compute buffer size: 40.31 MB(VRAM)
[INFO ] stable-diffusion.cpp:1672 - Photomaker ID Stacking, taking 161 ms
[DEBUG] clip.hpp:1328 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[INFO ] stable-diffusion.cpp:1681 - sampling steps increases from 15 to 50 for PHOTOMAKER
[DEBUG] clip.hpp:1328 - parse 'man , man with futuristic clothes' to [['man , man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 61 ms
[DEBUG] clip.hpp:1328 - parse '' to [['', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 54 ms
[INFO ] stable-diffusion.cpp:1712 - get_learned_condition completed, taking 117 ms
[INFO ] stable-diffusion.cpp:1728 - sampling using Euler A method
[INFO ] stable-diffusion.cpp:1732 - generating image: 1/1 - seed 2058
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
[DEBUG] ggml_extend.hpp:835  - unet compute buffer size: 830.86 MB(VRAM)
  |==================================================| 50/50 - 1.28it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 41.23s
[INFO ] stable-diffusion.cpp:1777 - generating 1 latent images completed, taking 41.23s
[INFO ] stable-diffusion.cpp:1779 - decoding 1 latents
[DEBUG] ggml_extend.hpp:835  - vae compute buffer size: 6656.00 MB(VRAM)
[DEBUG] stable-diffusion.cpp:1447 - computing vae [mode: DECODE] graph completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1789 - latent 1 decoded, taking 1.22s
[INFO ] stable-diffusion.cpp:1793 - decode_first_stage completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1812 - txt2img completed in 42.56s


[DEBUG] stable-diffusion.cpp:1551 - txt2img 1024x1024
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_0.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_1.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_2.png'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_3.jpg'
[DEBUG] stable-diffusion.cpp:1597 - prompt after extract and remove lora: "man img, man with futuristic clothes"
[INFO ] stable-diffusion.cpp:1602 - apply_loras completed, taking 0.00s
[DEBUG] ggml_extend.hpp:835  - lora compute buffer size: 20.50 MB(VRAM)
[INFO ] stable-diffusion.cpp:1608 - pmid_lora apply completed, taking 0.26s
[DEBUG] clip.hpp:1222 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 53 ms
[DEBUG] ggml_extend.hpp:835  - pmid compute buffer size: 40.31 MB(VRAM)
[INFO ] stable-diffusion.cpp:1672 - Photomaker ID Stacking, taking 127 ms
[DEBUG] clip.hpp:1328 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[INFO ] stable-diffusion.cpp:1681 - sampling steps increases from 15 to 50 for PHOTOMAKER
[DEBUG] clip.hpp:1328 - parse 'man , man with futuristic clothes' to [['man , man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 55 ms
[DEBUG] clip.hpp:1328 - parse '' to [['', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 53 ms
[INFO ] stable-diffusion.cpp:1712 - get_learned_condition completed, taking 111 ms
[INFO ] stable-diffusion.cpp:1728 - sampling using Euler A method
[INFO ] stable-diffusion.cpp:1732 - generating image: 1/1 - seed 2215
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
[DEBUG] ggml_extend.hpp:835  - unet compute buffer size: 830.86 MB(VRAM)
  |==================================================| 50/50 - 1.28it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 40.68s
[INFO ] stable-diffusion.cpp:1777 - generating 1 latent images completed, taking 40.68s
[INFO ] stable-diffusion.cpp:1779 - decoding 1 latents
[DEBUG] ggml_extend.hpp:835  - vae compute buffer size: 6656.00 MB(VRAM)
[DEBUG] stable-diffusion.cpp:1447 - computing vae [mode: DECODE] graph completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1789 - latent 1 decoded, taking 1.22s
[INFO ] stable-diffusion.cpp:1793 - decode_first_stage completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1812 - txt2img completed in 42.02s
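
In both runs the log parses the prompt twice: once as 'man img, man with futuristic clothes' and once as 'man , man with futuristic clothes', i.e. with the trigger word removed. A minimal sketch of that transformation is below. The helper name `remove_trigger_word` is hypothetical, and a plain substring search is an assumption made for brevity, not necessarily how the project does it.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drop the first occurrence of the trigger word.
// The prompt is returned unchanged when the trigger word is absent.
inline std::string remove_trigger_word(std::string prompt, const std::string& trigger) {
    assert(!trigger.empty());
    const std::string::size_type pos = prompt.find(trigger);
    if (pos != std::string::npos) {
        prompt.erase(pos, trigger.size());
    }
    return prompt;
}
```

Calling it with the prompt from this thread and the trigger word "img" yields the second string seen in the log. Note that a substring search would also match "img" inside a longer word, so a real implementation would likely work on tokens instead.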

bssrdf · 2024-03-19T15:24:06Z

Sorry, I misread your first message 😊
Can you try running more than one txt2img call but without photomaker? Just to isolate whether this is a photomaker specific issue.

Jonathhhan · 2024-03-20T09:21:50Z

@bssrdf good point. Yes, it works without PhotoMaker (if the path to the PhotoMaker model is empty). It crashes if the model is loaded and I leave "man (something) img, " out of the prompt (this is an unrelated issue, but the phrase could be a nice way to trigger PhotoMaker).

bssrdf · 2024-03-20T13:44:33Z

@Jonathhhan, can you provide details about how to run 2 txt2img without reloading sd_ctx? Did you change the code in main.cpp?

Jonathhhan · 2024-03-20T14:00:23Z

@bssrdf of course. I made an addon for openFrameworks and do not use main.cpp at all (which complicates it a little): https://github.com/Jonathhhan/ofxStableDiffusion
Most of the relevant code is in this file: https://github.com/Jonathhhan/ofxStableDiffusion/blob/main/ofxStableDiffusionExample/src/stableDiffusionThread.cpp

fszontagh · 2024-03-20T15:37:44Z

@Jonathhhan did you set the "isFreeParamsImmediatly" to false?

Jonathhhan · 2024-03-20T15:39:59Z

@fszontagh Yes.
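
For readers wondering why this flag matters when a context is reused: as far as can be inferred from its name, it releases model parameters right after they are used, which would leave nothing for a second call. The toy class below only illustrates that idea. `ToyContext` and its members are hypothetical and are not the stable-diffusion.cpp API.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy stand-in for a model context; NOT the stable-diffusion.cpp API.
class ToyContext {
public:
    ToyContext(std::vector<float> weights, bool free_params_immediately)
        : weights_(std::move(weights)),
          free_params_immediately_(free_params_immediately) {
        assert(!weights_.empty());
    }

    // "Generates" by summing the weights. Returns false when the weights
    // have already been released by an earlier call.
    bool generate(float& out) {
        if (weights_.empty()) {
            return false;
        }
        out = 0.0f;
        for (float w : weights_) {
            out += w;
        }
        if (free_params_immediately_) {
            weights_.clear();  // parameters are gone after the first call
        }
        return true;
    }

private:
    std::vector<float> weights_;
    bool free_params_immediately_;
};
```

With the flag set to false the toy context can be called repeatedly; with it set to true only the first call succeeds. Since the flag was already false here, it was ruled out as the cause.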

bssrdf · 2024-03-21T00:08:47Z

@Jonathhhan, I have reproduced the issue and implemented a fix. Please wait for the merged PR or you can try the branch. Thanks for reporting the bug.

Jonathhhan · 2024-03-21T00:12:17Z

@bssrdf thanks (I can confirm that it works now).

Jonathhhan changed the title ~~photoMaker~~ photoMaker issues with batch size and multiple images Mar 19, 2024

Jonathhhan changed the title ~~photoMaker issues with batch size and multiple images~~ photoMaker issue with two or more generated images / SDXL sample steps Mar 19, 2024

bssrdf mentioned this issue Mar 21, 2024

apply pmid lora only once for multiple txt2img calls #208

Merged
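
The PR title suggests the cause: the logs in this thread show "pmid_lora apply completed" on every txt2img call, so one plausible reading is that the LoRA delta was merged into the same weights again on the second call. The sketch below shows that effect and a once-only guard on a toy weight vector. `ToyModel` and its methods are hypothetical and do not reproduce the actual change in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy model; NOT the stable-diffusion.cpp implementation.
class ToyModel {
public:
    explicit ToyModel(std::vector<float> weights) : weights_(std::move(weights)) {}

    // Merges the delta in place on every call (the suspected old behavior).
    void apply_lora_unguarded(const std::vector<float>& delta) {
        assert(delta.size() == weights_.size());
        for (std::size_t i = 0; i < weights_.size(); ++i) {
            weights_[i] += delta[i];
        }
    }

    // Merges the delta only the first time; later calls are no-ops.
    void apply_lora_once(const std::vector<float>& delta) {
        if (lora_applied_) {
            return;
        }
        apply_lora_unguarded(delta);
        lora_applied_ = true;
    }

    const std::vector<float>& weights() const { return weights_; }

private:
    std::vector<float> weights_;
    bool lora_applied_ = false;
};
```

Applying a delta of {0.5, -0.5} twice to weights {1, 2} gives {2, 1} without the guard and {1.5, 1.5} with it. This would also be consistent with batch size > 1 working, since a batch runs inside a single txt2img call.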

Jonathhhan closed this as completed Mar 21, 2024
