Hunyuan3D_2.1_Low_VRAM

Author	SHA1	Message	Date
Akasei	9bee8e1844	refactor(gradio): replace CPU offload with direct GPU unload/lazy-load. Instead of .to('cpu') / .to('cuda'), models are now deleted from the GPU outright (no CPU intermediate) and reloaded on demand: _unload_i23d_worker(): del + gc.collect() + empty_cache(); _ensure_i23d_worker(): lazy reload from pretrained if None; _unload_tex_pipeline(): del + gc.collect() + empty_cache(); _ensure_tex_pipeline(): lazy load from tex_conf if None. generation_all() flow in low_vram_mode: shape gen → _unload_i23d_worker → _ensure_tex_pipeline → texture gen → _unload_tex_pipeline (the shape model reloads on the next _gen_shape call via _ensure_i23d_worker). Startup: tex_pipeline is NOT loaded in low_vram_mode (only tex_conf is stored), reducing startup VRAM from ~13.5GB to ~7.25GB. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-16 21:15:56 +08:00
Akasei	5d0405dc68	feat(gradio): apply VRAM optimization and fix texture config. generation_all(): offload i23d_worker to CPU before texture gen and restore it afterward, mirroring the sequential strategy in batch_generate.py; this prevents OOM when both models peak simultaneously on RTX 3080. Texture config: max_num_view 8→9, resolution 768→512; 768 resolution OOMs (14.6GB activation), so 512 is the practical max for RTX 3080 20GB, and max_views 9 gives better texture coverage. Only active when the --low_vram_mode flag is passed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-16 21:05:14 +08:00
WncFht	00fa3ac012	feat: add enable_flashvdm to gradio_app.py	2025-07-13 11:44:49 +08:00
HuiwenShi	8f7b4be92e	Update gradio_app.py	2025-06-16 22:13:47 +08:00
HuiwenShi	3f102487ba	Update gradio_app.py	2025-06-16 22:12:54 +08:00
Zeqiang Lai	d2465f0427	Update gradio_app.py	2025-06-14 15:36:20 +08:00
Huiwenshi	dd93e7ce4e	fix some	2025-06-14 14:32:20 +08:00
Huiwenshi	c88bee648e	init	2025-06-13 23:53:14 +08:00

8 Commits
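
The unload/lazy-load flow described in commit 9bee8e1844 can be illustrated with a minimal sketch. It is not the code in gradio_app.py: the `LazyModel` class, its `loader` argument, and `generate_all()` are hypothetical names chosen for illustration, and the models are stand-in callables. Only the del + gc.collect() + empty_cache() sequence and the shape gen → unload → texture gen → unload ordering come from the commit message.

```python
import gc
from typing import Any, Callable, Optional

try:
    import torch  # optional here; only used to release cached CUDA memory
except ImportError:
    torch = None


class LazyModel:
    """Hypothetical holder that keeps a model resident only while needed."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader              # assumed: builds the model on the GPU
        self._model: Optional[Any] = None  # None means "not loaded"

    @property
    def loaded(self) -> bool:
        return self._model is not None

    def ensure(self) -> Any:
        # Lazy (re)load: call the loader only if the model is absent.
        if self._model is None:
            self._model = self._loader()
        return self._model

    def unload(self) -> None:
        # Drop the only reference, collect garbage, then ask PyTorch to
        # return its cached CUDA blocks (no CPU intermediate copy is kept).
        self._model = None
        gc.collect()
        if torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()


def generate_all(shape: LazyModel, texture: LazyModel, image: Any) -> Any:
    # Sequential flow: the two models are never resident at the same time.
    mesh = shape.ensure()(image)
    shape.unload()
    textured = texture.ensure()(mesh)
    texture.unload()
    return textured
```

In this sketch, memory is only released if no other reference to the model survives, so callers should avoid keeping the object returned by `ensure()` in a long-lived variable. The trade-off is a reload on every call in exchange for a lower peak VRAM footprint.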