Commit Graph

22 Commits

Author SHA1 Message Date
Akasei
f192c86c60 fix(oom): use mmap=True for checkpoint loading + malloc_trim + expandable_segments
Root cause: torch.load() reads the 6.9GB .ckpt into the Python heap while the
model params (~7GB) also sit in CPU RAM = ~14GB peak; with OS overhead this
exhausts 16GB of system RAM → OOM Killer.

Fix 1 - mmap=True on all torch.load() calls (torch 2.7 supports this):
  With mmap, checkpoint storage is file-backed (not heap). Only the model
  parameters (also ~7GB) exist in physical RAM during loading. Peak RAM
  drops from ~14GB to ~7GB — within safe limits on 16GB machines.
  Files changed: pipelines.py, hunyuan3ddit.py, model.py (×2), flow_matching_sit.py

Fix 2 - malloc_trim(0) after every gc.collect():
  Forces glibc to return freed heap pages to OS immediately, so Python's
  memory pool doesn't hoard freed model memory before the next load.
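A sketch of the collect-then-trim helper; `gc_and_trim` is a hypothetical name, and the trim is skipped on non-glibc platforms where `libc.so.6` cannot be loaded:

```python
import ctypes
import gc


def gc_and_trim() -> int:
    """Run gc.collect(), then ask glibc to return freed heap pages to the OS.

    malloc_trim(0) is glibc-specific; on other libcs (macOS, musl, Windows)
    loading libc.so.6 raises OSError and the trim is silently skipped.
    """
    collected = gc.collect()
    try:
        ctypes.CDLL("libc.so.6").malloc_trim(0)
    except OSError:
        pass  # non-glibc platform
    return collected


gc_and_trim()
```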

Fix 3 - PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True:
  Prevents CUDA allocator fragmentation between model switches.
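The setting only takes effect if it is in place before the CUDA allocator initializes, so it is typically exported in the shell or set at the top of the entrypoint before torch is imported. A sketch:

```python
import os

# Must be set before the first CUDA allocation, i.e. before importing torch
# in a fresh process. Expandable segments let the caching allocator grow
# existing segments instead of fragmenting fixed-size ones across model
# switches.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
```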

Fix 4 - Adaptive threshold recalculated:
  With mmap loading, loading a model requires ~7.5GB (model params) not
  14GB. CPU offload threshold lowered from 16GB → 10.5GB, enabling fast
  path on machines with more headroom.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-16 23:18:16 +08:00
Akasei
6534f4ba15 fix: adaptive VRAM strategy + force rembg CPU to prevent OOM
Two root causes of CUDA OOM fixed:

1. onnxruntime-gpu CUDAExecutionProvider pre-allocated ~12GB VRAM arena
   for bria-rmbg background removal, starving PyTorch models.
   Fix: force CPUExecutionProvider in BackgroundRemover (rembg is
   lightweight, runs fine on CPU, frees all VRAM for shape/tex).
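A sketch of the provider pinning; `select_providers` is a hypothetical helper, and the commented session line mirrors the standard onnxruntime API with an illustrative model path:

```python
def select_providers(force_cpu: bool) -> list:
    """Choose onnxruntime execution providers.

    CUDAExecutionProvider pre-allocates a large VRAM arena by default, so a
    lightweight model like rembg is pinned to CPU to leave VRAM for the
    PyTorch shape/texture models.
    """
    if force_cpu:
        return ["CPUExecutionProvider"]
    return ["CUDAExecutionProvider", "CPUExecutionProvider"]


# Typical use (model path illustrative):
# import onnxruntime as ort
# session = ort.InferenceSession("bria-rmbg.onnx",
#                                providers=select_providers(force_cpu=True))
assert select_providers(True) == ["CPUExecutionProvider"]
```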

2. Previous 'always delete' strategy was wasteful on high-RAM machines.
   New adaptive strategy checks available system RAM at runtime:
   - RAM >= 16GB free: offload i23d to CPU (.to('cpu')) — fast, ~1s
   - RAM <  16GB free: full del + reload from disk — safe, ~20-30s
   This gives instant model switching on 32GB+ machines while keeping
   16GB machines safe from OOM Killer.

Helper functions:
- _prepare_for_tex(): adaptive offload/delete based on RAM check
- _ensure_i23d_worker(): restore from CPU (fast) or disk (slow)
- _get_available_ram_gb(): reads /proc/meminfo
- _can_offload_to_cpu(): threshold check with logging

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-16 22:57:32 +08:00
Akasei
474001da6b feat(rembg): switch background removal to bria-rmbg model
Replace default u2net with bria-rmbg-2.0 for better quality.
BackgroundRemover now accepts model_name param (defaults to 'bria-rmbg').

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-16 22:14:21 +08:00
HuiwenShi
c9b21668e2 Create is_watertight.py 2025-09-24 11:35:53 +08:00
HuiwenShi
5b6885dcf4 Update chamfer_distance.py 2025-09-23 14:10:26 +08:00
HuiwenShi
34746fcbc2 Create chamfer_distance.py 2025-09-23 11:46:01 +08:00
s572915912
b3dd50ba37 Update misc.py
repair
2025-08-06 01:14:49 +08:00
s572915912
d9fc4d31bf Update hunyuandit-mini-overfitting-flowmatching-dinol518-bf16-lr1e4-4096.yaml
repair
2025-08-06 01:12:13 +08:00
s572915912
f4e0307665 Update train_deepspeed.sh 2025-07-11 18:32:16 +08:00
s572915912
f0a008279e Update pipelines.py 2025-07-11 16:51:33 +08:00
s572915912
dc2ea32d76 Update hunyuandit-mini-overfitting-flowmatching-dinol518-bf16-lr1e4-4096.yaml 2025-07-11 16:47:40 +08:00
s572915912
96349ad5d0 Update train_deepspeed.sh 2025-07-11 16:43:40 +08:00
s572915912
de7996251d Update hunyuandit-mini-overfitting-flowmatching-dinol518-bf16-lr1e4-4096.yaml 2025-07-11 16:37:32 +08:00
s572915912
af935af688 Update train_deepspeed.sh 2025-07-11 16:36:46 +08:00
s572915912
f2f19d74a8 Update hunyuandit-mini-overfitting-flowmatching-dinol518-bf16-lr1e4-4096.yaml
add explain
2025-07-11 15:53:01 +08:00
s572915912
8cd92830fb Update train_deepspeed.sh
auto detect
2025-07-11 15:51:55 +08:00
s572915912
b06e6ddf37 Update pipelines.py 2025-07-11 02:29:25 +08:00
Huiwenshi
d0b85dc7d9 fix some 2025-06-26 20:08:17 +08:00
Huiwenshi
e59169a8ec update readme 2025-06-26 16:34:51 +08:00
Huiwenshi
7c92655a0d fix shape training 2025-06-26 16:03:44 +08:00
Huiwenshi
4d67e18386 update 2025-06-14 01:39:07 +08:00
Huiwenshi
c88bee648e init 2025-06-13 23:53:14 +08:00