Instead of moving models with .to('cpu') / .to('cuda'), models are now
deleted from the GPU entirely (no intermediate CPU copy) and reloaded on demand:
- _unload_i23d_worker(): del + gc.collect() + empty_cache()
- _ensure_i23d_worker(): lazy reload from pretrained if None
- _unload_tex_pipeline(): del + gc.collect() + empty_cache()
- _ensure_tex_pipeline(): lazy load from tex_conf if None
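
The unload/ensure pairs listed above follow a common lazy-loading pattern. A
minimal sketch of that pattern, not the code in this commit: LazyModel, its
loader argument, and _free_cuda_cache are hypothetical names used only for
illustration, and the torch import is optional so the sketch runs without a GPU.

```python
import gc

try:
    import torch
except ImportError:  # assumption: keep the sketch runnable without PyTorch
    torch = None


def _free_cuda_cache():
    """Collect garbage, then release cached CUDA blocks if CUDA is present."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()


class LazyModel:
    """Hypothetical holder for one model slot (e.g. a shape or texture model)."""

    def __init__(self, loader):
        self._loader = loader  # zero-argument callable that builds the model
        self.model = None      # None means "currently unloaded"

    def ensure(self):
        """Load the model only if it is not already resident."""
        if self.model is None:
            self.model = self._loader()
        return self.model

    def unload(self):
        """Drop the reference held here and free cached GPU memory."""
        if self.model is not None:
            self.model = None
            _free_cuda_cache()
```

Memory is only reclaimed if no other reference to the model survives, so
callers should not keep the object returned by ensure() across an unload().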
generation_all() flow in low_vram_mode:
  shape gen → _unload_i23d_worker → _ensure_tex_pipeline →
  texture gen → _unload_tex_pipeline
  (shape model reloads on next _gen_shape call via _ensure_i23d_worker)
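
The ordering above can be sketched as follows. This is an illustration under
assumptions, not the actual generation_all(): Stage and generation_all_sketch
are hypothetical stand-ins that only record when each model would be loaded,
run, and unloaded.

```python
class Stage:
    """Hypothetical stand-in for a model slot with ensure/unload hooks."""

    def __init__(self, name, log):
        self.name = name
        self.log = log       # shared list recording the order of events
        self.loaded = False

    def ensure(self):
        if not self.loaded:
            self.loaded = True
            self.log.append(f"load {self.name}")

    def unload(self):
        if self.loaded:
            self.loaded = False
            self.log.append(f"unload {self.name}")

    def run(self, x):
        if not self.loaded:
            raise RuntimeError(f"{self.name} used while unloaded")
        self.log.append(f"run {self.name}")
        return f"{self.name}({x})"


def generation_all_sketch(shape, tex, image, low_vram_mode=True):
    """Run shape then texture, keeping at most one model resident."""
    shape.ensure()            # reloads the shape model if a prior call freed it
    mesh = shape.run(image)
    if low_vram_mode:
        shape.unload()        # free VRAM before the texture model loads
    tex.ensure()
    textured = tex.run(mesh)
    if low_vram_mode:
        tex.unload()          # leave the GPU empty for the next request
    return textured
```

The point of this ordering is that the two models are never resident at the
same time, at the cost of a reload on each request.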
Startup: tex_pipeline NOT loaded in low_vram_mode (only tex_conf stored),
reducing startup VRAM from ~13.5GB to ~7.25GB.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>