Transformer revolutionizes 3D modeling: MeshGPT's generation results alarm professional modelers, and netizens call it a "revolutionary idea"

王林

Nov 29, 2023 pm 02:17 PM

In computer graphics, triangle meshes are the primary representation of 3D geometry and the most widely used 3D asset format in games, films, and virtual reality interfaces. The industry typically uses triangle meshes to model the surfaces of complex objects such as buildings, vehicles, and animals. Common geometric transformations, geometry detection, rendering, and shading operations are also performed on triangle meshes.

Compared with other 3D shape representations such as point clouds or voxels, triangle meshes provide a more coherent surface representation: they are more controllable, easier to manipulate, and more compact, and they can be used directly in modern rendering pipelines, achieving higher visual quality with fewer primitives.
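
To make the representation concrete, here is a minimal sketch of an indexed triangle mesh in plain Python: a list of vertex positions plus a list of faces that index into it. The unit cube and the names `vertices`, `faces`, and `edge_set` are illustrative assumptions, not something taken from the paper.

```python
# Indexed triangle mesh of a unit cube: 8 shared vertices, 12 triangles.
vertices = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0), (0.0, 1.0, 1.0),
]
faces = [
    (0, 2, 1), (0, 3, 2),  # bottom (z = 0)
    (4, 5, 6), (4, 6, 7),  # top (z = 1)
    (0, 1, 5), (0, 5, 4),  # front (y = 0)
    (3, 7, 6), (3, 6, 2),  # back (y = 1)
    (0, 4, 7), (0, 7, 3),  # left (x = 0)
    (1, 2, 6), (1, 6, 5),  # right (x = 1)
]


def edge_set(faces):
    """Collect the undirected edges used by the triangles."""
    edges = set()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edges.add((min(u, v), max(u, v)))
    return edges


edges = edge_set(faces)
```

Because neighboring triangles share vertices and edges, the whole cube needs only 8 positions and 12 index triples, which is the kind of compactness the paragraph above refers to.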

Researchers have previously tried generating 3D models with representations such as voxels, point clouds, and neural fields. These representations must then be converted into meshes through post-processing before they can be used in downstream applications, for example by extracting an isosurface with the Marching Cubes algorithm.

Unfortunately, this approach produces overly dense, over-tessellated meshes that often exhibit bumpy artifacts caused by over-smoothing and isosurfacing.

By comparison, 3D meshes built by professional modelers are more compact and retain sharp detail with fewer triangles.

Many researchers have long hoped to solve the task of automatically generating triangle meshes to further simplify the process of creating 3D assets.

In a recent paper, researchers proposed a new solution: MeshGPT, which directly generates the mesh representation as a set of triangles.

The paper link can be found at: https://nihalsid.github.io/mesh-gpt/static/MeshGPT.pdf

Inspired by Transformer-based language models, the authors adopt a direct sequence generation approach that synthesizes a triangle mesh as a sequence of triangles.

Following the paradigm of text generation, the researchers first learn a vocabulary of triangles, in which triangles are encoded as quantized latent embeddings. To encourage the learned triangle embeddings to preserve local geometric and topological features, they use a graph convolutional encoder. The embeddings are then decoded by a ResNet decoder, which processes the sequence of tokens representing the triangles to generate the triangles' vertex coordinates. Finally, the researchers train a GPT-based architecture on the learned vocabulary to autoregressively generate the sequence of triangles that represents a mesh, yielding sharp edges and high fidelity.

Experiments across multiple categories of the ShapeNet dataset show that MeshGPT significantly improves the quality of generated 3D meshes over existing techniques: shape coverage improves by an average of 9%, and FID scores improve by 30 points.
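
For readers unfamiliar with coverage-style metrics, the following is a rough sketch of how such a score is commonly computed: each generated sample is matched to its nearest reference sample, and coverage is the fraction of reference samples matched at least once. This assumes the common definition; the article does not spell out the paper's exact formulation. The toy 2D descriptors and squared Euclidean distance stand in for whatever shape distance the evaluation actually uses, and the names `coverage` and `sq_euclidean` are hypothetical.

```python
def coverage(generated, reference, dist):
    """Fraction of reference items that are the nearest neighbor
    of at least one generated item."""
    matched = set()
    for g in generated:
        nearest = min(range(len(reference)), key=lambda i: dist(g, reference[i]))
        matched.add(nearest)
    return len(matched) / len(reference)


def sq_euclidean(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy descriptors (assumed for illustration only).
reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
generated = [(0.1, 0.1), (0.9, 0.1), (0.2, 0.0)]
cov = coverage(generated, reference, sq_euclidean)  # 2 of 4 references matched
```

A generator that collapses onto a few shapes matches only a few references and scores low, which is why coverage is read as a diversity measure.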

On social media platforms, MeshGPT has also sparked heated discussions:

One commenter wrote: "This is the truly revolutionary idea."

One netizen pointed out that the highlight of this method is that it overcomes the biggest obstacle facing other 3D modeling approaches: the ability to edit the result.

Some boldly predict that perhaps every problem left unsolved since the 1990s could benefit from ideas inspired by the Transformer.

Some users working in 3D and film production also expressed concern about their careers.

However, others pointed out that, judging from the generated examples in the paper, the method is not yet ready for large-scale use: a professional modeler could create these meshes in under 5 minutes.

One commenter suggested that the next step might be to have an LLM control the generation of the 3D seeds and to add an image model to the autoregressive part of the architecture. At that point, the production of 3D assets for games and other settings could be automated at scale.

Next, let’s take a look at the research details of the MeshGPT paper.

Overview of Method

Inspired by progress in large language models, the researchers developed a sequence-based approach that autoregressively generates a triangle mesh as a sequence of triangles. The method produces clean, coherent, and compact meshes with sharp edges and high fidelity.
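
Treating a mesh as a sequence requires some fixed ordering of its triangles. The sketch below shows one plausible canonicalization: vertices are sorted by (z, y, x), each face is rotated so its lowest vertex index comes first, and faces are sorted by their indices. This ordering convention and the function name `mesh_to_triangle_sequence` are assumptions for illustration; the article only says that the embeddings are sorted, not how.

```python
def mesh_to_triangle_sequence(vertices, faces):
    """Turn an indexed mesh into an ordered list of triangles,
    each given as three (x, y, z) positions."""
    # Sort vertices by (z, y, x) and remember each vertex's new index.
    order = sorted(
        range(len(vertices)),
        key=lambda i: (vertices[i][2], vertices[i][1], vertices[i][0]),
    )
    new_index = {old: new for new, old in enumerate(order)}
    sorted_vertices = [vertices[i] for i in order]

    canonical = []
    for face in faces:
        f = [new_index[i] for i in face]
        k = f.index(min(f))
        # Cyclic rotation puts the lowest index first and keeps the winding.
        canonical.append(tuple(f[k:] + f[:k]))
    # Order faces by their lowest index, then the next lowest, and so on.
    canonical.sort(key=lambda f: sorted(f))

    return [tuple(sorted_vertices[i] for i in f) for f in canonical]


# A unit square split into two triangles, with deliberately shuffled input.
vertices = [(1.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 3, 1), (1, 2, 0)]
seq = mesh_to_triangle_sequence(vertices, faces)
```

With a canonical order, the same mesh always maps to the same sequence regardless of how its faces were stored, which gives a sequence model a consistent target to learn.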

The researchers first learn a vocabulary of geometric embeddings from a large collection of 3D object meshes, which makes it possible to encode and decode triangles. A Transformer for mesh generation is then trained autoregressively to predict indices into this learned vocabulary.

To learn the triangle vocabulary, the researchers use a graph convolutional encoder that operates on the mesh's triangles and their neighborhoods to extract rich geometric features capturing the intricate details of 3D shapes. These features are quantized into codebook embeddings through residual quantization, which effectively reduces the sequence length of the mesh representation. After being ordered into a sequence, the embeddings are decoded by a one-dimensional ResNet, guided by a reconstruction loss. This stage lays the foundation for the subsequent Transformer training.
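
The idea behind residual quantization can be sketched in a few lines: the first codebook approximates the feature vector, the second approximates what is left over, and so on, so a vector is stored as a short list of indices. The code below is a toy illustration with random codebooks, not the learned codebooks of the paper; the function names, the sizes `dim`, `depth`, and `size`, and the zero vector added to each codebook (so that no stage can increase the error) are all assumptions.

```python
import random


def nearest(codebook, v):
    """Index of the codebook entry closest to v (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - x) ** 2 for c, x in zip(codebook[i], v)),
    )


def residual_quantize(x, codebooks):
    """Return one index per codebook; each stage encodes the leftover residual."""
    residual = list(x)
    indices = []
    for codebook in codebooks:
        i = nearest(codebook, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices, residual


def dequantize(indices, codebooks):
    """Sum the selected code vectors to reconstruct the embedding."""
    out = [0.0] * len(codebooks[0][0])
    for i, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out


rng = random.Random(0)
dim, depth, size = 4, 3, 16
codebooks = []
for d in range(depth):
    scale = 0.5 ** d  # later stages model smaller residuals
    cb = [[rng.uniform(-1, 1) * scale for _ in range(dim)] for _ in range(size - 1)]
    cb.append([0.0] * dim)  # assumed zero code: a stage may choose to add nothing
    codebooks.append(cb)

x = [rng.uniform(-1, 1) for _ in range(dim)]
indices, residual = residual_quantize(x, codebooks)
recon = dequantize(indices, codebooks)
```

Calling `residual_quantize(x, codebooks[:1])` gives plain single-codebook vector quantization, so comparing the two residuals shows directly how the extra stages refine the approximation.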

Next, the researchers use these quantized geometric embeddings to train a GPT-style decoder-only transformer. They extract the sequence of geometric embeddings from a mesh's triangles and train the transformer to predict the codebook index of the next embedding in the sequence.

After training, the transformer can be sampled autoregressively to predict a sequence of embeddings, which are then decoded to generate novel and diverse meshes with efficient, irregular triangulations similar to those of human-made meshes.
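
Autoregressive sampling itself is a simple loop: ask the model for a distribution over the next token, draw one, append it, and repeat until an end token appears. In the sketch below, `toy_next_token_probs` is a hypothetical stand-in for the trained transformer, and the token ids `START` and `END` and the vocabulary size are assumptions made for illustration.

```python
import random

START, END = 0, 1   # hypothetical special token ids
VOCAB_SIZE = 6      # toy vocabulary: 2 special tokens + 4 codebook indices


def toy_next_token_probs(prefix):
    """Stand-in for a trained transformer: one probability per token id."""
    probs = [0.0] * VOCAB_SIZE
    # END becomes more likely as the sequence grows; START is never emitted.
    p_end = min(1.0, 0.1 * len(prefix))
    probs[END] = p_end
    for t in range(2, VOCAB_SIZE):
        probs[t] = (1.0 - p_end) / (VOCAB_SIZE - 2)
    return probs


def sample_sequence(next_token_probs, rng, max_len=64):
    """Sample one token at a time, feeding each choice back in as context."""
    seq = [START]
    while len(seq) < max_len:
        probs = next_token_probs(seq)
        token = rng.choices(range(VOCAB_SIZE), weights=probs, k=1)[0]
        if token == END:
            break
        seq.append(token)
    return seq[1:]  # drop START; the remaining ids are codebook tokens


tokens = sample_sequence(toy_next_token_probs, random.Random(0))
```

In the real system the sampled indices would be looked up in the codebook and passed through the decoder to recover triangle coordinates; here they are only integer ids.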

MeshGPT uses a graph convolutional encoder to process mesh faces, exploiting geometric neighborhood information to capture strong features that represent the intricate details of 3D shapes. These features are then quantized into codebook embeddings using residual quantization, which gives better reconstruction quality than simple vector quantization. Guided by a reconstruction loss, MeshGPT orders the quantized embeddings and decodes them with a ResNet.

The Transformer generates a mesh as a sequence of token indices drawn from the pre-trained codebook vocabulary. During training, the graph encoder extracts features from the mesh faces and quantizes them into a set of face embeddings. These embeddings are flattened, delimited with start and end tokens, and fed into the GPT-style Transformer described above. The decoder is optimized with a cross-entropy loss to predict the codebook index that follows each embedding.
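
The training setup described above can be sketched as follows: per-face codebook indices are flattened into one token list, wrapped in start and end tokens, and shifted by one position to form input/target pairs for a cross-entropy loss. The special token ids, the `OFFSET` used to keep codebook indices separate from them, and both function names are assumptions for illustration.

```python
import math

START, END = 0, 1  # hypothetical special token ids
OFFSET = 2         # codebook indices are shifted past the special tokens


def build_training_pair(face_codes):
    """Flatten per-face codebook indices and add start/end tokens.

    face_codes: list of per-face index lists, e.g. [[3, 1], [0, 2]].
    Returns (inputs, targets), where targets[i] is the token after inputs[i].
    """
    flat = [code + OFFSET for face in face_codes for code in face]
    tokens = [START] + flat + [END]
    return tokens[:-1], tokens[1:]


def cross_entropy(logits, targets):
    """Mean negative log-likelihood of the targets under softmax(logits)."""
    total = 0.0
    for row, t in zip(logits, targets):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_z - row[t]
    return total / len(targets)


inputs, targets = build_training_pair([[3, 1], [0, 2]])
# inputs  == [0, 5, 3, 2, 4]
# targets == [5, 3, 2, 4, 1]

# An untrained model with uniform logits over 6 tokens scores log(6).
uniform_logits = [[0.0] * 6 for _ in targets]
loss = cross_entropy(uniform_logits, targets)
```

Training lowers this loss by making the logit of each correct next token larger than the rest, which is the same objective used for text language models.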

Experimental results

The study compared MeshGPT with common mesh generation methods, including:

Polygen, which generates polygonal meshes by first generating vertices and then generating faces conditioned on those vertices;
BSPNet, which represents a mesh through convex decomposition;
AtlasNet, which represents a 3D mesh as a deformation of multiple 2D planes.

Additionally, the study compared MeshGPT with the neural field-based SOTA method GET3D.

As shown in Figure 6, Figure 7, and Table 1, MeshGPT outperforms the baseline methods in all four categories, generating sharp, compact meshes with finer geometric detail.

Specifically, compared with Polygen, MeshGPT generates shapes with more intricate detail, whereas Polygen is more prone to accumulating errors during inference. AtlasNet often suffers from folding artifacts, resulting in lower diversity and lower shape quality. BSPNet, which uses planar BSP trees, tends to produce blocky shapes with unusual triangulation patterns. GET3D produces good high-level shape structure but uses too many triangles and yields imperfect flat surfaces.

As shown in Table 2, the study also asked users to evaluate the quality of the generated meshes. MeshGPT significantly outperformed AtlasNet, Polygen, and BSPNet in both shape quality and triangulation quality, and most users preferred MeshGPT over GET3D for shape quality (68%) and triangulation quality (73%).

Novel shapes. As shown in Figure 8, MeshGPT can generate novel shapes beyond those in the training dataset, indicating that the model does more than retrieve existing shapes.

Shape completion. As shown in Figure 9, MeshGPT can also infer multiple possible completions of a given partial shape, generating multiple shape hypotheses.

Statement

This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it removed.
