Akasei 70289d04d7 fix: eliminate OOM on RTX 3080 via load_state_dict(assign=True) + low-VRAM mode
Root cause: torch.load() with mmap=True returns fp16 tensors, but
load_state_dict() without assign=True widens them fp16→fp32 in-place,
doubling CPU anon-rss (7 GB fp16 ckpt → 14 GB fp32 params). Combined
with the 2 GB Gradio server baseline, this exceeded the 15 GB physical
RAM limit on the second generation request.
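
The arithmetic behind this root cause can be checked with a short sketch. It uses only the figures quoted above (7 GB checkpoint, 2 GB Gradio baseline, 15 GB RAM); the function and variable names are illustrative and do not come from the repository.

```python
# Figures taken from the commit message; names are hypothetical.
CKPT_GB = 7.0        # fp16 checkpoint size
BASELINE_GB = 2.0    # Gradio server baseline
RAM_LIMIT_GB = 15.0  # physical RAM on the test machine


def widened_size_gb(size_gb: float, src_bytes: int = 2, dst_bytes: int = 4) -> float:
    """Size of the same parameters after a dtype change (fp16 = 2 bytes, fp32 = 4 bytes)."""
    return size_gb * dst_bytes / src_bytes


params_gb = widened_size_gb(CKPT_GB)   # fp32 copy of the parameters
total_gb = params_gb + BASELINE_GB     # plus the server baseline
fits = total_gb <= RAM_LIMIT_GB

print(f"fp32 params: {params_gb} GB, total: {total_gb} GB, fits in RAM: {fits}")
# prints "fp32 params: 14.0 GB, total: 16.0 GB, fits in RAM: False"
```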

Fix: add assign=True to all load_state_dict calls in pipelines.py and
autoencoders/model.py. With assign=True the mmap fp16 tensors are
assigned directly as model parameters without any fp16→fp32 copy.
When model.to('cuda') is then called, the mmap pages (file-backed,
evictable) are streamed directly to VRAM — CPU anon-rss stays near 0.

Peak RSS is now ~3.9 GB instead of 14.7 GB (killed) across all rounds.

gradio_app.py changes:
- low_vram_mode always takes the full-delete path (never CPU offload)
- glibc malloc tuning at startup (MALLOC_ARENA_MAX=1, malloc_trim)
- preemptive gc.collect(2) + malloc_trim + empty_cache at generation start
- _rlog() memory logging at each major step for monitoring
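
The cleanup and logging steps listed above might look roughly like the sketch below. It is a standard-library approximation, not the code in gradio_app.py: release_host_memory, rss_mib, and log_memory are hypothetical names (the real _rlog() may differ), and the CUDA empty_cache step is left out so the example runs without a GPU.

```python
import ctypes
import gc
import os


def release_host_memory() -> bool:
    """Run a full GC pass, then ask glibc to return free heap pages to the OS."""
    gc.collect(2)
    try:
        libc = ctypes.CDLL("libc.so.6")  # only present on glibc-based systems
        libc.malloc_trim(0)
        return True
    except (OSError, AttributeError):
        return False  # not glibc, or malloc_trim is unavailable


def rss_mib():
    """Resident set size in MiB read from /proc, or None where /proc is missing."""
    try:
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
    except (OSError, ValueError, IndexError):
        return None
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / (1024 * 1024)


def log_memory(step: str) -> str:
    """Print and return one memory log line for the given step label."""
    rss = rss_mib()
    line = f"[mem] {step}: rss=" + ("n/a" if rss is None else f"{rss:.0f} MiB")
    print(line)
    return line


trimmed = release_host_memory()
line = log_memory("generation start")
```

Note that glibc reads MALLOC_ARENA_MAX when the process starts, so in a setup like this it would normally be exported in the launching shell rather than set from inside the running Python process.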

pipelines.py:
- load_state_dict(..., assign=True) for model, vae, conditioner
- del ckpt after state dict assignment to release mmap fd early

autoencoders/model.py:
- load_state_dict(..., assign=True) in from_single_file
- load_state_dict(..., assign=True) in init_from_ckpt

Verified: 4 consecutive Playwright WebUI rounds (shape+texture) pass
with no OOM. API two-round test also passes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-17 02:03:43 +08:00


🔥 News

  • Jul 26, 2025: 🤗 We release the first open-source, simulation-capable, immersive 3D world generation model, HunyuanWorld-1.0!
  • Jun 19, 2025: 👋 We present the technical report of Hunyuan3D-2.1, please check out the details and spark some discussion!
  • Jun 13, 2025: 🤗 We release the first production-ready 3D asset generation model, Hunyuan3D-2.1!

Join our WeChat and Discord groups to discuss and get help from us.

Wechat Group Xiaohongshu X Discord

🤗 Community Contribution Leaderboard

  1. By @visualbruno
  2. By @VR-Jobs

☯️ Hunyuan3D 2.1

Architecture

Tencent Hunyuan3D-2.1 is a scalable 3D asset creation system that advances state-of-the-art 3D generation through two pivotal innovations: Fully Open-Source Framework and Physically-Based Rendering (PBR) Texture Synthesis. For the first time, the system releases full model weights and training code, enabling community developers to directly fine-tune and extend the model for diverse downstream applications. This transparency accelerates academic research and industrial deployment. Moreover, replacing the prior RGB-based texture model, the upgraded PBR pipeline leverages physics-grounded material simulation to generate textures with photorealistic light interaction (e.g., metallic reflections, subsurface scattering).

Performance

We evaluated Hunyuan3D 2.1 against other open-source and closed-source 3D generation methods. The numerical results indicate that Hunyuan3D 2.1 surpasses all baselines in the quality of generated textured 3D assets and in condition-following ability.

Model                  ULIP-T(⬆)   ULIP-I(⬆)   Uni3D-T(⬆)   Uni3D-I(⬆)
Michelangelo           0.0752      0.1152      0.2133       0.2611
Craftsman              0.0745      0.1296      0.2375       0.2987
TripoSG                0.0767      0.1225      0.2506       0.3129
Step1X-3D              0.0735      0.1183      0.2554       0.3195
Trellis                0.0769      0.1267      0.2496       0.3116
Direct3D-S2            0.0706      0.1134      0.2346       0.2930
Hunyuan3D-Shape-2.1    0.0774      0.1395      0.2556       0.3213

Model                  CLIP-FiD(⬇)   CMMD(⬇)   CLIP-I(⬆)   LPIPS(⬇)
SyncMVD-IPA            28.39         2.397     0.8823      0.1423
TexGen                 28.24         2.448     0.8818      0.1331
Hunyuan3D-2.0          26.44         2.318     0.8893      0.1261
Hunyuan3D-Paint-2.1    24.78         2.191     0.9207      0.1211

🎁 Models Zoo

Shape generation requires 10 GB of VRAM, texture generation requires 21 GB, and running shape and texture generation together requires 29 GB in total.

Model                  Description                Date         Size   Huggingface
Hunyuan3D-Shape-v2-1   Image to Shape Model       2025-06-14   3.3B   Download
Hunyuan3D-Paint-v2-1   Texture Generation Model   2025-06-14   2B     Download
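
As a quick illustration of the VRAM figures stated in this section, the sketch below checks which stages fit a given memory budget. The dictionary keys and the helper name stages_that_fit are hypothetical, and the numbers are only the requirements quoted above, not measurements.

```python
# VRAM requirements in GB as quoted in this section; names are illustrative.
VRAM_REQUIRED_GB = {
    "shape": 10,
    "texture": 21,
    "shape+texture": 29,
}


def stages_that_fit(available_gb: float) -> list:
    """Return the stages whose stated VRAM requirement fits the available budget."""
    return [name for name, need in VRAM_REQUIRED_GB.items() if available_gb >= need]


print(stages_that_fit(8))   # []
print(stages_that_fit(16))  # ['shape']
print(stages_that_fit(24))  # ['shape', 'texture']
print(stages_that_fit(32))  # ['shape', 'texture', 'shape+texture']
```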

🤗 Get Started with Hunyuan3D 2.1

Hunyuan3D 2.1 supports macOS, Windows, and Linux. Follow the steps below to use Hunyuan3D 2.1:

Install Requirements

We test our model with Python 3.10 and PyTorch 2.5.1+cu124.

pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt

cd hy3dpaint/custom_rasterizer
pip install -e .
cd ../..
cd hy3dpaint/DifferentiableRenderer
bash compile_mesh_painter.sh
cd ../..

wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth -P hy3dpaint/ckpt

Code Usage

We designed a diffusers-like API for our shape generation model (Hunyuan3D-Shape) and our texture synthesis model (Hunyuan3D-Paint).

import sys
sys.path.insert(0, './hy3dshape')
sys.path.insert(0, './hy3dpaint')
from textureGenPipeline import Hunyuan3DPaintPipeline, Hunyuan3DPaintConfig
from hy3dshape.pipelines import Hunyuan3DDiTFlowMatchingPipeline

# Generate an untextured mesh from the input image first.
shape_pipeline = Hunyuan3DDiTFlowMatchingPipeline.from_pretrained('tencent/Hunyuan3D-2.1')
mesh_untextured = shape_pipeline(image='assets/demo.png')[0]

# The paint pipeline takes a mesh file path, so save the mesh to disk first.
# Assumption: the returned mesh exposes a trimesh-style export(); the file
# name 'demo.glb' is illustrative.
mesh_path = 'demo.glb'
mesh_untextured.export(mesh_path)

# Texture the saved mesh, conditioned on the same input image.
paint_pipeline = Hunyuan3DPaintPipeline(Hunyuan3DPaintConfig(max_num_view=6, resolution=512))
mesh_textured = paint_pipeline(mesh_path, image_path='assets/demo.png')

Gradio App

You can also host a Gradio app on your own computer:

python3 gradio_app.py \
  --model_path tencent/Hunyuan3D-2.1 \
  --subfolder hunyuan3d-dit-v2-1 \
  --texgen_model_path tencent/Hunyuan3D-2.1 \
  --low_vram_mode

🔗 BibTeX

If you found this repository helpful, please cite our reports:

@misc{hunyuan3d2025hunyuan3d,
    title={Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material},
    author={Tencent Hunyuan3D Team},
    year={2025},
    eprint={2506.15442},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

@misc{hunyuan3d22025tencent,
    title={Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation},
    author={Tencent Hunyuan3D Team},
    year={2025},
    eprint={2501.12202},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

@misc{yang2024hunyuan3d,
    title={Hunyuan3D 1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation},
    author={Tencent Hunyuan3D Team},
    year={2024},
    eprint={2411.02293},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Acknowledgements

We would like to thank the contributors to the TripoSG, Trellis, DINOv2, Stable Diffusion, FLUX, diffusers, HuggingFace, CraftsMan3D, Michelangelo, Hunyuan-DiT, and HunyuanVideo repositories, for their open research and exploration.

Description
Optimizes the Hunyuan3D 2.1 model so that it can run on devices with VRAM <= 16 GB.
Readme MIT 270 MiB