Hunyuan3D_2.1_Low_VRAM/gradio_app.py
Akasei 6534f4ba15 fix: adaptive VRAM strategy + force rembg CPU to prevent OOM
Two root causes of CUDA OOM fixed:

1. onnxruntime-gpu's CUDAExecutionProvider pre-allocated an ~12GB VRAM arena
   for bria-rmbg background removal, starving the PyTorch models.
   Fix: force CPUExecutionProvider in BackgroundRemover (rembg is
   lightweight, runs fine on CPU, frees all VRAM for shape/tex).
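   A minimal sketch of forcing CPU execution for the rembg session (names
   like make_rembg_session are illustrative, not the repo's exact API):

```python
# Hedged sketch: pin the background-removal ONNX session to CPU so
# onnxruntime-gpu never pre-allocates its CUDA memory arena.

def pick_providers(force_cpu: bool, available: list[str]) -> list[str]:
    """Choose ONNX Runtime execution providers for the rembg session."""
    if force_cpu or "CUDAExecutionProvider" not in available:
        return ["CPUExecutionProvider"]
    return ["CUDAExecutionProvider", "CPUExecutionProvider"]

def make_rembg_session(model_path: str, force_cpu: bool = True):
    import onnxruntime as ort  # deferred import keeps the sketch self-contained
    providers = pick_providers(force_cpu, ort.get_available_providers())
    # Requesting only CPUExecutionProvider prevents the ~12GB VRAM arena
    # from being pre-allocated for this session.
    return ort.InferenceSession(model_path, providers=providers)
```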

2. The previous 'always delete' strategy was wasteful on high-RAM machines.
   New adaptive strategy checks available system RAM at runtime:
   - RAM >= 16GB free: offload i23d to CPU (.to('cpu')) — fast, ~1s
   - RAM <  16GB free: full del + reload from disk — safe, ~20-30s
   This gives instant model switching on 32GB+ machines while keeping
   16GB machines safe from OOM Killer.

Helper functions:
- _prepare_for_tex(): adaptive offload/delete based on RAM check
- _ensure_i23d_worker(): restore from CPU (fast) or disk (slow)
- _get_available_ram_gb(): reads /proc/meminfo
- _can_offload_to_cpu(): threshold check with logging

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-16 22:57:32 +08:00
