Tencent's Hunyuan large model upgraded again: text-to-image capability makes a stunning debut, with a full hands-on test and analysis

In 2023, large-model launches hit the accelerator, and text-to-image generation became one of the hottest application areas.

Since the birth of Stable Diffusion, text-to-image models have kept emerging at home and abroad, and for a while it felt like a "battle of the gods". Each technology iteration brings rapid improvements in generation quality and speed.

Just today, Tencent's Hunyuan model announced its latest progress: its text-to-image capability is officially launched.

As soon as we tried it out, we saw the Hunyuan model's grasp of China's broad and profound food culture. We chose the dish "Ants Climbing a Tree", which trips up many large models, and Hunyuan generated it with ease:


The question is: with so many text-to-image models already available, what special advantages does the Hunyuan large model offer?

According to the official introduction, in terms of algorithms and models, current text-to-image models still face challenges such as insufficient semantic understanding, unreasonable image structure, and insufficient detail and low texture quality.

Tencent began exploring AI-generated images in advertising scenarios long ago and has accumulated considerable experience. The text-to-image capability in this Hunyuan upgrade aims to solve exactly these three problems: semantics, content, and texture.

According to reports, compared with other large models, Tencent Hunyuan text-to-image has a clear advantage in the realism of portraits and scenes. It also performs well when generating Chinese-style landscapes, animation, and game scenes.

Hands-on test: what sets Hunyuan text-to-image apart?

To do text-to-image well, a full understanding of the text is crucial.

In terms of semantic understanding, the Hunyuan text-to-image model adopts a fine-grained Chinese-English bilingual model, achieves bilingual understanding through joint Chinese and English modeling, and improves the model's perception of detail and its generation quality through optimized algorithms.

Before this, although popular models like Stable Diffusion supported Chinese to some extent, their core dataset, LAION-5B, still consisted mainly of Western-oriented content, so they did not understand Chinese language, food, culture, and customs well enough.

The Hunyuan text-to-image model is a native Chinese text-to-image model. Whether the user enters a Chinese poem or an idiom, the model can be asked to paint from it directly.

In terms of content plausibility, Hunyuan text-to-image strengthens the model's perception of two-dimensional spatial position in the image and introduces prior information, such as the structure of the human skeleton and hands, into the generation process. This makes the structure of generated images more reasonable and mitigates the implausible bodies and hands that AI often produces.

In terms of image texture, Hunyuan text-to-image uses a multi-model fusion method to improve the generated texture. After optimization, the quality of its portrait model (hair, wrinkles, etc.) improved by 30%, and the quality of its scene model (vegetation, ripples, etc.) improved by 25%.

The technical advantages in these three areas have clearly improved the product experience of Hunyuan text-to-image.

To verify the capabilities above, we set some test questions and put the Hunyuan large model through a thorough test as soon as it was available.

Since Hunyuan is a native Chinese model, it naturally understands classical Chinese better than similar products. We first had it paint from ancient poems.

We selected a highly atmospheric line of ancient poetry, "When you are drunk, you don't know the sky is in the water, and the boat is full of clear dreams and the stars are overwhelming", to see whether the Hunyuan large model can generate vivid, painting-like images.
In the poem "Boat at Guazhou", the line "The spring breeze turns green again on the south bank of the river, when will the bright moon shine back on me?" expresses the homesickness of countless wanderers. In Hunyuan's output, images such as "spring light", "water bank", and "bright moon" are extracted and combined organically, so that viewers feel they have stepped into the poem's scene:
Then comes the fun "painting Chinese food" session. Let's run the classic test, "Fish-Flavored Shredded Pork":
From early Chinese food images that were maddening to look at, to today's images that whet the appetite on sight, we can feel the continuous evolution of text-to-image technology.

Let’s take a look at how Hunyuan does on the industry-recognized problem of “realistic portraits”:

We know that Midjourney first became popular because of a photo of a couple that people could not tell was generated by AI.
Next, let's examine the Hunyuan large model's ability to generate images that pass for real photos. The prompt used is:

How does the realism strike you? In our opinion, the details mentioned in the prompt are all adequately rendered.
This is what Tencent emphasizes: the Hunyuan large model improves its perception of detail and its generation quality through optimized algorithms. This ability becomes apparent in many specific scenes.
For example, take an animation scene with the prompt "a deer running in the forest, kicking up fallen leaves, a very bright and big moon, birds flying in the sky, atmospheric, CG style, side view".

Does it look like the scene in the animation you watched when you were a kid?
In addition, text-to-image has huge application potential in animation creation.
The prompt we gave to the Hunyuan large model is "Generate 3D, anime style, 1 girl, blond hair, smile, short hair, city background":
What do you think of the generation effect? Can it be used directly as wallpaper?
What self-developed technologies are behind Hunyuan text-to-image?

If a worker wants to do his job well, he must first sharpen his tools, and the same is true for large models.
We learned that, beyond its innovative model algorithms, the Tencent Hunyuan large model owes its distinctly Chinese text-to-image results to high-quality image-text matching data, a self-developed machine learning framework, and powerful computing infrastructure.
The Tencent Hunyuan large model follows a fully self-developed technology path, from model algorithms to the machine learning framework to AI infrastructure. This multi-layer accumulation reflects the view that large models must evolve one step at a time, starting from practice and improving through practice.
First let’s look at the data engineering that supports model training.
For any AI, and especially for large models, data is one of the three indispensable elements. The same is true for a large model's text-to-image function. Image and text data, especially image-text matching data, has a decisive impact on generation quality.
However, not all existing data on the Internet can be used as is. A major problem is that the text description of an image may not be accurate, so most image-text matching data is of relatively poor quality. If such data is used, the model's output will still fall short of expectations even after very long training, and the stability of generation quality and the efficiency of later iterations will suffer as well.
Therefore, improving the quality of image-text data has become the "first hurdle" in ensuring good text-to-image results. At this stage, data quality usually has to be improved through engineering methods, which supports model training, optimization, and upgrades, and builds a moat for the algorithm and model.
To address the image-text matching problem, the Tencent Hunyuan text-to-image team takes three steps. First, it refines Chinese prompts at a fine-grained level to improve the correlation between images and text and maximize data quality. Second, it layers and grades the training data so the model is optimized gradually, maximizing the effect of the data. Finally, it builds a data flywheel, which is the key to rapid iteration of large models: based on feedback from online users of the model, the team automatically builds training data to speed up model iteration and maximize data efficiency.
With data quality, effect, and efficiency improved, the foundation for good text-to-image results is in place. The machine learning framework discussed next is equally important.
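The layering-and-grading step described above can be illustrated with a small sketch. This is not Tencent's pipeline: the `Pair` record, the `relevance` score, the tier names, and the thresholds are all hypothetical, and the sketch assumes each image-text pair already carries a relevance score in [0, 1] produced by some image-text matching model. It buckets pairs into tiers, drops pairs below every threshold, and orders training stages from the broader, noisier tier to the smaller, cleaner one.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Pair:
    """Hypothetical image-text record; relevance is an assumed score in [0, 1]."""
    image_id: str
    caption: str
    relevance: float


# Assumed tier boundaries, highest first; these values are not from the article.
TIERS: List[Tuple[str, float]] = [("gold", 0.8), ("silver", 0.5)]


def grade(pairs: List[Pair]) -> Dict[str, List[Pair]]:
    """Put each pair in the first tier whose threshold it meets; drop the rest."""
    graded: Dict[str, List[Pair]] = {name: [] for name, _ in TIERS}
    for pair in pairs:
        for name, threshold in TIERS:
            if pair.relevance >= threshold:
                graded[name].append(pair)
                break
    return graded


def schedule(graded: Dict[str, List[Pair]]) -> List[Tuple[str, List[Pair]]]:
    """Order training stages from the noisiest kept tier to the cleanest."""
    return [(name, graded[name]) for name, _ in reversed(TIERS)]


if __name__ == "__main__":
    pairs = [
        Pair("a", "glass noodles with minced pork on a plate", 0.92),
        Pair("b", "a dish", 0.61),
        Pair("c", "IMG_0001.jpg", 0.12),  # filename as caption: too noisy to keep
    ]
    for stage, data in schedule(grade(pairs)):
        print(stage, [p.image_id for p in data])
```

In a data flywheel, user feedback would feed new scored pairs back into `grade`, so each iteration retrains on a refreshed set of tiers.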

A powerful machine learning framework or platform will greatly improve the speed and efficiency of developers in building, training and deploying models. Tencent has developed its own Angel machine learning platform for large model training and inference scenarios, which mainly includes AngelPTM for training and AngelHCF for inference.

AngelPTM adopts the ZeRO-Cache optimization strategy, making it a powerful tool for training very large models. It expands the model capacity of a single machine through storage management, improves resource utilization through multi-stream asynchrony, and improves memory efficiency through GPU memory management. In addition, 4D parallelism raises the upper limit of available GPU memory, reduces communication pressure at thousand-GPU scale, and releases computing potential. An automatic training-resumption mechanism provides fault tolerance for failures in thousand-GPU clusters and reduces interruption time. Model training is also monitored in real time, working together with the algorithms to optimize the training direction.

Currently, AngelPTM, built on the industry's first ZeRO-Cache mechanism and 4D parallelism, trains the hundred-billion-parameter Hunyuan base model at high speed. Its training speed is up by 1x (a 100% increase) compared with the mainstream open-source framework DeepSpeed-Chat.
ZeRO-Cache Overview.

AngelHCF improves the inference performance of large models at five levels: customized and diversified serving strategies, parallel strategies, framework acceleration (covering common GPU acceleration methods), model compression (supporting the compression methods commonly used in the industry), and efficient model debugging. Its inference speed is improved by 1.3x over the industry's mainstream framework, FasterTransformer.
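The two speed comparisons quoted for AngelPTM and AngelHCF are stated as increases over a baseline, which is easy to misread. The short calculation below makes one reading explicit. It assumes, and this is an assumption rather than something the article states, that an increase of N times means throughput becomes (1 + N) times the baseline, so a fixed workload takes 1 / (1 + N) of the baseline time. The function names are illustrative only.

```python
def speedup_factor(increase_times: float) -> float:
    """Throughput relative to baseline, ASSUMING 'increased by N times' means 1 + N."""
    if increase_times < 0:
        raise ValueError("increase must be non-negative")
    return 1.0 + increase_times


def relative_time(increase_times: float) -> float:
    """Fraction of the baseline wall-clock time needed for the same workload."""
    return 1.0 / speedup_factor(increase_times)


if __name__ == "__main__":
    # Figures as quoted in the article; the interpretation is the assumption above.
    comparisons = [
        ("AngelPTM vs DeepSpeed-Chat (training)", 1.0),
        ("AngelHCF vs FasterTransformer (inference)", 1.3),
    ]
    for name, increase in comparisons:
        print(f"{name}: {speedup_factor(increase):.1f}x throughput, "
              f"{relative_time(increase):.0%} of baseline time")
```

If the quoted figures were instead meant as plain ratios (for example, 1.3 times the baseline speed), the factor would simply be N rather than 1 + N.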

Tencent said that its Angel machine learning platform has leading performance and can help provide a better infrastructure system and help large models run at high speed. This allows the Hunyuan large model to generate high-quality images while also greatly improving the generation speed.

With high-quality data and efficient machine learning framework, the continuous operation of large models still faces the test of computing power. After all, in the era of large models, computing power is king.

Tencent Hunyuan text-to-image also depends on the powerful computing infrastructure provided by Tencent Cloud. In April 2023, Tencent Cloud released a new generation of its HCC high-performance computing cluster, which uses the latest generation of self-developed Xinghai servers and, based on a self-developed network and storage architecture, delivers 3.2T ultra-high interconnect bandwidth, TB-level throughput, and 10-million-level IOPS. The computing performance of the new cluster is 3 times that of the previous generation and more than 12 times that of a traditional computing cluster.
While the underlying hardware is strengthened, the upper-layer software must keep pace. The new generation HCC cluster integrates Tencent Cloud's self-developed TACO training acceleration engine and includes many system-level optimizations at the network protocol, communication strategy, AI framework, and model compilation levels. This complete training acceleration solution can help customers lower the barrier to AI optimization and improve AI training performance, and it can also greatly reduce training, tuning, and computing costs.

It seems that the three major factors that constrain large models (algorithms, data, and computing power) are no longer a problem for the Tencent Hunyuan large model. The quality of its text-to-image output is assured as a result.

Results real enough to pass for genuine: text-to-image is already embedded in Tencent advertising scenarios

The text-to-image capability of the Hunyuan large model that we saw today was not achieved overnight; it is the result of a real process of evolution.

At the 2023 Tencent Global Digital Ecosystem Conference held last month, Tencent’s Hunyuan large model was officially unveiled. Jiang Jie, vice president of Tencent Group, said at the time that Hunyuan is always on the road. Tencent will continue to evolve Hunyuan’s capabilities and hopes to bring surprises to everyone every month.

Currently, 180 of Tencent's internal businesses are connected to the Hunyuan large model, including Tencent Meeting, Tencent Docs, WeCom (Enterprise WeChat), Tencent Advertising, and WeChat Search. At the same time, customers from industries such as retail, education, finance, medical care, media, transportation, and government affairs call the Tencent Hunyuan API through Tencent Cloud. Application areas include intelligent question answering, content creation, data analysis, and code assistance.

The newly opened text-to-image capability is the biggest surprise that Tencent's Hunyuan model has brought us, demonstrating its leading capabilities in automatic image generation. Of course, Tencent Hunyuan text-to-image is also evolving gradually, and more text-to-image-related functions will be developed in the future. We can look forward to them.

Currently, Hunyuan's text-to-image capability has been embedded in Tencent's advertising scenarios, such as generating product advertisements or advertising images. In multiple rounds of evaluation within the advertising business, the excellent-case rate and the advertiser adoption rate of Tencent Hunyuan text-to-image reached 86% and 26% respectively, both higher than those of similar models.

Let's first look at the following example, which asks the Hunyuan large model to generate a hotel room. Judging from the results, Hunyuan text-to-image is clearly better after the upgrade: the design and quality are greatly improved, and the details are richer. Even compared with Midjourney, the results are comparable.
Character generation shows a similar improvement. After the upgrade, the portraits generated by Hunyuan are more realistic in details such as facial skin tone and wrinkles.
Beyond advertising, Tencent is also exploring other use cases for text-to-image, such as generating game elements and game characters in game scenarios, generating accompanying images and illustrations for novels in content scenarios, and opening Hunyuan capabilities to customers in different industries in cloud business scenarios.

No matter how powerful the model is, it must be used by more people and continue to receive feedback, so that it can make further progress.

It is foreseeable that Hunyuan text-to-image capabilities will spread rapidly across Tencent products in the future, and users will experience more of the appeal of AIGC.


Statement
This article is reproduced from 机器之心.