
Nvidia's Llama-Mesh: A Guide With Examples

Christopher Nolan
2025-03-01 09:39:11

NVIDIA's LLaMA-Mesh model bridges the gap between text and 3D mesh generation. It lets users create 3D meshes from simple text descriptions and, conversely, identify objects from their 3D mesh data. By extending language models with 3D spatial understanding, it represents a notable step toward more general AI capabilities. Professionals and hobbyists alike will find LLaMA-Mesh a valuable asset, streamlining 3D modeling workflows in applications like Blender.

This guide explores LLaMA-Mesh's capabilities through practical examples, highlighting both its potential and limitations.

What is LLaMA-Mesh?

LLaMA-Mesh, developed by NVIDIA, extends the power of Large Language Models (LLMs) into the 3D realm. Unlike previous models, it seamlessly integrates text and 3D data, enabling 3D mesh creation using natural language prompts. Built upon a fine-tuned LLaMA-3.1-8B-Instruct base, it encodes 3D mesh data using the text-based OBJ file format.
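The OBJ format is what makes this integration possible: a mesh is just plain text, with `v` lines for vertex coordinates and `f` lines for faces that reference vertices by 1-based index, so an LLM can read and emit meshes token by token like any other text. A minimal sketch with a hand-written tetrahedron:

```python
# A tetrahedron encoded as OBJ text: "v" lines hold vertex coordinates,
# "f" lines define triangular faces via 1-indexed vertex references.
tetrahedron_obj = """\
v 0 0 0
v 1 0 0
v 0 1 0
v 0 0 1
f 1 2 3
f 1 2 4
f 1 3 4
f 2 3 4
"""

# Because the mesh is text, counting its elements is simple string work.
vertices = [l for l in tetrahedron_obj.splitlines() if l.startswith("v ")]
faces = [l for l in tetrahedron_obj.splitlines() if l.startswith("f ")]
print(len(vertices), len(faces))  # → 4 4
```

Saving this string to a `.obj` file yields a mesh that Blender can import directly.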

Accessing LLaMA-Mesh

LLaMA-Mesh is accessible in three ways:

  1. Local Execution (Hugging Face): Run the model locally via the Hugging Face repository.
  2. Blender Add-on: Utilize the model as a Blender add-on for direct integration within the software.
  3. Online Demo (Hugging Face): Access a convenient online demo on the Hugging Face platform.

The online demo's 4096-token limit contrasts with the full model's 8K token capacity, emphasizing the need for local execution to harness its full potential. The demo's interface is shown below:
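Because vertex-heavy meshes consume tokens quickly, it can help to gauge a mesh's size before sending it to the demo. The helper below is a rough sketch: the character-per-token ratio is an assumption (real counts depend on the LLaMA tokenizer), but it gives a quick sanity check against the 4096-token demo limit and the full 8K context.

```python
def obj_stats(obj_text: str) -> dict:
    """Rough size check for an OBJ mesh against LLaMA-Mesh token limits.

    Assumes ~3 characters per token as a crude proxy; the true count
    depends on the model's tokenizer and will differ somewhat.
    """
    lines = obj_text.splitlines()
    n_vertices = sum(1 for l in lines if l.startswith("v "))
    n_faces = sum(1 for l in lines if l.startswith("f "))
    approx_tokens = len(obj_text) // 3  # assumed chars-per-token ratio
    return {
        "vertices": n_vertices,
        "faces": n_faces,
        "approx_tokens": approx_tokens,
        "fits_demo_4096": approx_tokens <= 4096,
        "fits_full_8192": approx_tokens <= 8192,
    }

print(obj_stats("v 0 0 0\nv 1 0 0\nv 0 1 0\nf 1 2 3\n"))
```

For an exact count, tokenize the OBJ text with the model's own tokenizer instead of the proxy above.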

[Image: the LLaMA-Mesh online demo interface on Hugging Face]

Setting Up LLaMA-Mesh

This guide demonstrates running LLaMA-Mesh using Google Colab's A100 GPU runtime. The same principles apply to local execution with sufficient computational resources. The Hugging Face repository provides the necessary code. Key steps include importing libraries, downloading the model and tokenizer, setting the pad_token, and using standard Hugging Face workflows for inference. The code snippets below illustrate the process:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Zhengyi/LLaMA-Mesh"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# device_map="auto" already places the model on the GPU;
# chaining .cuda() on top of it is unnecessary and can misplace weights
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

# LLaMA models ship without a pad token, so reuse the EOS token for padding
if tokenizer.pad_token_id is None:
    tokenizer.pad_token_id = tokenizer.eos_token_id

prompt = "Create a 3D model of an original designer chair."
inputs = tokenizer(prompt, return_tensors="pt", padding=True).to(model.device)
output = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_length=8000,  # stay within the model's 8K-token context window
)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Default hyperparameters are used for fair comparison with the online demo.
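The decoded response interleaves conversational prose with OBJ lines, so the mesh must be extracted before it can be opened in Blender or another viewer. A minimal sketch (the sample response string below is illustrative, not actual model output):

```python
import re

def extract_obj(response: str) -> str:
    """Keep only the OBJ vertex ("v ") and face ("f ") lines of a response."""
    lines = [l.strip() for l in response.splitlines()]
    mesh = [l for l in lines if re.match(r"^[vf]\s", l)]
    return "\n".join(mesh)

# Illustrative response; in practice this comes from tokenizer.decode(output[0]).
sample = """Here is a 3D model of a chair:
v 0.0 0.0 0.0
v 1.0 0.0 0.0
v 0.0 1.0 0.0
f 1 2 3
Hope you like it!"""

obj_text = extract_obj(sample)
with open("chair.obj", "w") as fh:  # ready to import into Blender
    fh.write(obj_text)
```

Extensions such as `vn` (normals) or `vt` (texture coordinates) are not matched here; the pattern requires whitespace immediately after `v` or `f`.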

LLaMA-Mesh Examples

Three examples of increasing complexity illustrate LLaMA-Mesh's performance:

  • Example 1: A Chair: Both the online demo and the Colab-run model generated chair meshes, but with varying levels of detail and realism.

    • Online Demo Output: [image]
    • Colab Output: [image]
  • Example 2: A Torus: The model struggled to accurately represent the torus's central hole, even with increased context.

    • Online Demo Output: [image]
    • Colab Output: [image]
    • Correct Torus: [image]
  • Example 3: Klein Bottle: The online demo failed to generate a mesh, while the Colab version produced a result far from the correct geometry.

    • Colab Output: [image]
    • Correct Klein Bottle: [image]

These examples show LLaMA-Mesh's strength in creative, simple designs but its limitations with precise geometric and complex shapes.

Conclusion

LLaMA-Mesh, despite being in its early stages, demonstrates significant potential for rapid 3D mesh generation. Future improvements could address limitations in handling complex geometries and expand compatibility with 3D printing technologies.

