Hunyuan3D_2.1_Low_VRAM/gradio_app.py
Akasei 3cd767a18d fix(gradio): prevent OOM on 16GB RAM by fully deleting models between uses
Previous hybrid strategy (i23d kept in CPU RAM, tex pipeline deleted) still caused OOM:
- i23d in CPU RAM: ~7GB
- tex loading from disk: ~7GB peak in RAM before GPU transfer
- Total: ~14GB > 16GB system RAM → OOM Killer

New strategy: fully delete both models between uses.
Neither model persists in CPU RAM between requests.
Peak RAM during any load: ~7GB (one model staging to GPU).

Changes:
- Replace _offload_i23d_to_cpu/_restore_i23d_to_gpu with
  _unload_i23d_worker/_ensure_i23d_worker (full del + reload)
- Add double gc.collect() + empty_cache before each load
- Skip i23d startup load in low_vram_mode (load on first request)
- Both models reload from local HF cache (~20-30s each)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-16 22:39:03 +08:00
