Tencent's Hunyuan large model is upgraded again: text-to-image capability released, with a full hands-on test and analysis

王林

Oct 26, 2023 pm 09:13 PM

industry · Hunyuan large model · text-to-image large model

In 2023, the launch of large models hit the accelerator, and text-to-image generation became one of the hottest application directions.

Since the birth of Stable Diffusion, text-to-image large models have kept emerging at home and abroad, and for a while it felt like a "battle of the gods". Each technology iteration brings rapid improvements in generation quality and speed.

Just today, Tencent's Hunyuan large model also announced its latest progress: its text-to-image capability is officially launched.

As soon as we tried it, we saw the Hunyuan model's understanding of China's broad and profound food culture. Here we chose "Ants Climbing a Tree", a dish name that trips up many large models, and Hunyuan generated it easily:

The question is: with so many text-to-image large models already available, what special advantages does the Hunyuan large model have?

According to the official introduction, at the algorithm and model level, current text-to-image large models still face challenges such as insufficient semantic understanding, unreasonable image structure, insufficient picture detail, and low image quality.

Tencent began exploring AI-generated images in advertising scenarios long ago and has accumulated considerable experience. The text-to-image capability in this Hunyuan upgrade aims squarely at the three problems of "semantics, content, and texture".

According to reports, compared with other large models, Tencent Hunyuan's text-to-image capability has clear advantages in the realism of portraits and scenes. It also performs well in generating Chinese-style landscapes, animation, and game scenes.

Hands-on test: what is different about Hunyuan text-to-image?

To do text-to-image well, a full understanding of the "text" is crucial.

In terms of semantic understanding, the Hunyuan text-to-image model adopts fine-grained bilingual modeling in Chinese and English, achieves understanding of both languages, and improves the model's perception of detail and its generation quality through optimized algorithms.

Before this, although popular models such as Stable Diffusion supported Chinese to some extent, their core dataset LAION-5B consisted mainly of Western-oriented content, so they did not understand Chinese language, food, culture, and customs well enough.

The Hunyuan text-to-image model is a native Chinese text-to-image model. Whether the user inputs a Chinese poem or an idiom, the model can paint directly from it.

In terms of content rationality, Hunyuan text-to-image strengthens the model's perception of positions in two-dimensional image space and introduces prior information such as human skeleton and hand structure into the generation process. This makes the generated image structure more reasonable and mitigates the unreasonable human bodies and hands that AI often generates.

In terms of picture texture, Hunyuan text-to-image uses a multi-model fusion method to improve the generated texture. After optimization, the effect of the portrait model (hair, wrinkles, etc.) improved by 30%, and the effect of the scene model (vegetation, ripples, etc.) improved by 25%.

These technical advantages in three areas have clearly improved the product experience of Hunyuan text-to-image.

To verify these capabilities, this site set some questions and put the Hunyuan large model through a thorough test right away.

Since Hunyuan is a native Chinese model, it naturally understands classical Chinese better than similar products. We first asked it to paint from ancient poems.

We chose a highly atmospheric line of ancient poetry, "When you are drunk, you don't know the sky is in the water, and the boat is full of clear dreams and the stars are overwhelming", to test whether the Hunyuan large model can generate an extremely picturesque image.

In the poem "Boat at Guazhou", the line "The spring breeze turns green again on the south bank of the river, when will the bright moon shine back on me?" expresses the homesickness of countless wanderers. In Hunyuan's result, images such as "spring light", "water bank", and "bright moon" are extracted and combined organically, so that the viewer feels placed inside the poetic scene:

Then comes the fun "Chinese food painting" session. Let's run a classic test with "Fish-Flavored Shredded Pork":

From Chinese food paintings that drove people crazy to today's images that look good enough to eat, we can feel the continuous evolution of text-to-image technology.

Let's see how Hunyuan handles "realistic portraits", an industry-recognized problem:

We know that Midjourney first became popular because of the photo of the couple below, which people could not tell was generated by AI.

Next, let's examine the Hunyuan large model's ability to generate images that pass for real. The prompt used is:

How realistic does it feel? In our opinion, the details mentioned in the prompt are all adequately rendered.

This is what Tencent emphasizes: the Hunyuan large model improves its perception of detail and its generation quality through optimized algorithms. This ability also shows in many specific scenes.

For example, in an animation scene, we used the prompt "a deer is running in the forest, causing fallen leaves to fly up, the moon is very bright and big, and birds are flying in the sky, creating a sense of atmosphere, CG style, side view".

Does it look like a scene from the animations you watched as a kid?

In addition, text-to-image has huge application potential in animation creation.

The prompt we gave the Hunyuan large model was "Generate 3D, anime style, 1 girl, blond hair, smile, short hair, city background":

What do you think of the result? Could it be used directly as wallpaper?

What self-developed technologies are behind the text-to-image capability?

A craftsman who wants to do good work must first sharpen his tools, and the same is true for large models.

We learned that, beyond innovative model algorithms, the Tencent Hunyuan large model's ability to produce text-to-image results that fit the local Chinese context is also inseparable from high-quality image-text matching data, a self-developed machine learning framework, and powerful computing infrastructure.

Tencent's Hunyuan large model has formed a fully self-developed technology path from model algorithms to machine learning framework to AI infrastructure. This multi-level accumulation reflects that large models must evolve one step at a time, starting from practice and improving through practice.

First let’s look at the data engineering that supports model training.

For any AI system, especially large models, data is one of the three indispensable elements. The same holds for a large model's text-to-image capability: image and text data, especially image-text matching data, has a decisive impact on generation quality.

However, not all existing data on the Internet can be used directly. The main problem is that the text description of an image may be inaccurate, so most image-text matching data is of relatively poor quality. If such data is used, generation quality will fall short of expectations even after very long training, and the stability of generation quality and the efficiency of later iterations will also suffer.

Therefore, improving the quality of image-text data is the "first hurdle" in ensuring text-to-image quality. This usually calls for engineering methods that improve data quality, support model training, optimization, and upgrades, and build a moat for the algorithm model.

Faced with the image-text matching data problem, the Tencent Hunyuan text-to-image team responded as follows. First, it refined the Chinese prompts at a fine-grained level to improve image-text relevance and maximize data quality. Then it adopted a layered, graded strategy for the training data to optimize the model step by step and maximize the data's effect. Finally, it built a data flywheel, which is the key to rapid iteration of large models: based on feedback from online users of the model, the team automatically builds training data to speed up model iteration and maximize data efficiency.
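To make the "layered, graded" idea concrete, here is a minimal sketch of how image-text pairs might be split into quality tiers and fed to training in stages. It is an illustration under stated assumptions, not Tencent's pipeline: the `relevance` score, the thresholds, the `Pair` type, and the "broad data first, cleanest data last" schedule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pair:
    caption: str
    image_id: str
    relevance: float  # hypothetical image-text relevance score in [0, 1]


def tier_pairs(pairs, thresholds=(0.8, 0.5)):
    """Split pairs into tiers by relevance; tier 0 is the highest quality."""
    high, low = thresholds
    tiers = {0: [], 1: [], 2: []}
    for p in pairs:
        if p.relevance >= high:
            tiers[0].append(p)
        elif p.relevance >= low:
            tiers[1].append(p)
        else:
            tiers[2].append(p)  # too noisy; left out of training in this sketch
    return tiers


def training_schedule(tiers):
    """Yield (stage name, data) pairs: broader data first, cleanest data last (an assumption)."""
    yield "stage 1: medium and high quality", tiers[1] + tiers[0]
    yield "stage 2: high quality only", tiers[0]


pairs = [
    Pair("ants climbing a tree, Sichuan dish", "img-1", 0.92),
    Pair("a plate of food", "img-2", 0.61),
    Pair("click here to buy", "img-3", 0.10),
]
tiers = tier_pairs(pairs)
stages = list(training_schedule(tiers))
for name, data in stages:
    print(name, [p.image_id for p in data])
```

In a data flywheel along the lines described above, user feedback would produce new scored pairs that pass through the same tiering step before the next training round.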

With data quality, effect, and efficiency improved, the foundation for good text-to-image results is in place. The machine learning framework discussed next is equally important.

A powerful machine learning framework or platform greatly improves the speed and efficiency with which developers build, train, and deploy models. Tencent has developed its own Angel machine learning platform for large model training and inference scenarios, which mainly includes AngelPTM for training and AngelHCF for inference.

AngelPTM adopts the ZeRO-Cache optimization strategy, making it a powerful tool for training very large models. It expands the model capacity of a single machine through storage management, improves resource utilization through multi-stream asynchrony, and improves memory efficiency through GPU memory management. In addition, 4D parallelism raises the upper limit of available GPU memory, reduces communication pressure at the thousand-GPU scale, and releases computing potential. An automatic training-resume mechanism provides automatic fault tolerance for failures at the thousand-GPU scale and reduces interruption time. Model training is also monitored in real time, working together with the algorithms to optimize the training direction.

Currently, based on the industry-first ZeRO-Cache mechanism and 4D parallelism, AngelPTM achieves high-speed training of the hundred-billion-scale Hunyuan base model. Training speed is 1x higher than that of the mainstream open source framework (DeepSpeed-Chat), in other words about double.

Figure: ZeRO-Cache overview.

AngelHCF improves the inference performance of large models at five levels: customized, diversified service strategies; parallel strategies; framework acceleration (covering common GPU acceleration methods); model compression (supporting compression methods commonly used in the industry); and efficient model debugging capabilities. Its inference speed is 1.3 times higher than that of the industry's mainstream framework (FasterTransformer).
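The speed figures above are easier to compare once they are turned into run times. The small sketch below reads "N times higher" as the baseline speed plus N times the baseline. That reading is an assumption, especially for the 1.3 figure; under the looser reading the ratio would simply be 1.3. The 100-hour baseline is hypothetical and does not come from the article.

```python
def speed_ratio(improvement_times: float) -> float:
    """Read "N times higher" as baseline + N x baseline (an assumed interpretation)."""
    return 1.0 + improvement_times


def time_after(baseline_hours: float, improvement_times: float) -> float:
    """Run time on the faster system for work that takes baseline_hours on the baseline."""
    return baseline_hours / speed_ratio(improvement_times)


# Hypothetical 100-hour workloads on the baseline frameworks.
training_hours = time_after(100.0, 1.0)   # training claim: 1x higher
inference_hours = time_after(100.0, 1.3)  # inference claim: 1.3 times higher

print(f"training:  {training_hours:.1f} h")
print(f"inference: {inference_hours:.1f} h")
```

Under this reading, the same training job would take about half the time, and the same inference workload a little under half.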

Tencent said its Angel machine learning platform delivers leading performance, provides a better infrastructure system, and helps large models run at high speed. This lets the Hunyuan large model generate high-quality images while also greatly improving generation speed.

Even with high-quality data and an efficient machine learning framework, the continuous operation of large models still faces the test of computing power. After all, in the era of large models, computing power is king.

Tencent Hunyuan's text-to-image capability is inseparable from the powerful computing infrastructure provided by Tencent Cloud. In April 2023, Tencent Cloud released a new generation of HCC high-performance computing clusters, which use the latest generation of self-developed Xinghai servers and, based on a self-developed network and storage architecture, achieve 3.2T ultra-high interconnect bandwidth, TB-level throughput, and ten-million-level IOPS. The computing performance of the new-generation cluster is improved by 3 times compared with the previous generation and by more than 12 times compared with traditional computing cluster solutions.

While the underlying hardware is strengthened, the upper-layer software must keep pace. The new-generation HCC cluster integrates Tencent Cloud's self-developed TACO training acceleration engine, with extensive system-level optimizations across network protocols, communication strategies, AI frameworks, and model compilation. This comprehensive training acceleration solution helps customers lower the threshold for AI optimization and improve AI training performance, and also greatly reduces training tuning and computing costs.

It seems that the three major factors that constrain large models, namely algorithms, data, and computing power, are no longer a problem for Tencent's Hunyuan large model. Naturally, the quality of its text-to-image results is also assured.

Realistic enough to pass for real: text-to-image capability is already embedded in Tencent's advertising scenarios

The Hunyuan text-to-image capability we see today was not achieved overnight; it is the result of a real process of evolution.

At the 2023 Tencent Global Digital Ecosystem Conference held last month, Tencent's Hunyuan large model was officially unveiled. Jiang Jie, vice president of Tencent Group, said at the time that Hunyuan is always on the road: Tencent will keep evolving Hunyuan's capabilities and hopes to bring everyone surprises every month.

Currently, 180 internal Tencent businesses are connected to the Hunyuan large model, including Tencent Meeting, Tencent Docs, WeCom, Tencent Advertising, and WeChat Search. At the same time, customers from industries such as retail, education, finance, medical care, media, transportation, and government affairs also call the Tencent Hunyuan API through Tencent Cloud, in application areas including intelligent question answering, content creation, data analysis, and code assistance.

The newly opened text-to-image capability is the biggest surprise Tencent's Hunyuan model has brought us, demonstrating its leading capabilities in automatic image generation. Of course, Tencent Hunyuan text-to-image is still evolving, and more text-to-image and related features will be developed in the future. We can look forward to them.

Currently, Hunyuan's text-to-image capability has been embedded in Tencent's advertising scenarios, for example to generate product advertisements or advertising images. In multiple rounds of evaluation within the advertising business, the case excellence rate and advertiser adoption rate of Tencent Hunyuan text-to-image reached 86% and 26% respectively, both higher than similar models.

Let's first look at the following example, which asks the Hunyuan large model to generate a hotel room. Judging from the results, Hunyuan text-to-image is clearly better after the upgrade: design and quality are greatly improved, and details are richer. Even compared with Midjourney, the results are comparable.

Character generation scenes show a similar effect. After the upgrade, the portraits generated by Hunyuan are more realistic in details such as facial skin tone and wrinkles.

Beyond advertising, Tencent is also exploring other scenarios for text-to-image, such as generating game elements and game characters in game scenarios, generating accompanying images and illustrations for novels in content scenarios, and opening Hunyuan capabilities to customers in different industries in cloud business scenarios.

However powerful a model is, it must be used by more people and keep receiving feedback in order to make further progress.

It is foreseeable that Hunyuan text-to-image capabilities will spread rapidly across Tencent products in the future, and users will experience more of the appeal of AIGC.

Statement

This article is reproduced from 机器之心. If there is any infringement, please contact admin@php.cn to request deletion.
