Kuaishou proposes a billion-level multi-modal short video encyclopedia system - Kuaipedia

王林

May 20, 2023 pm 05:10 PM

Overview

More and more short video users want not only to fill their spare moments with leisure and entertainment but also to gain knowledge on short video platforms. In 2021, playback of pan-knowledge content on Kuaishou increased by 58.11% year-on-year, and the platform hosted more than 33 million pan-knowledge live broadcasts over the year [1]. To better understand and organize pan-knowledge videos, Kuaishou MMU teamed up with Harbin Institute of Technology and others to propose the industry's first multi-modal short video encyclopedia, "Kuaipedia". It uses multi-modal and knowledge graph technology to mine large-scale, high-quality knowledge videos from massive short videos and structure them into a systematic short video encyclopedia knowledge base. The goal is to give users a better knowledge acquisition experience while inspiring creators to produce high-quality knowledge content and building a healthy knowledge-sharing ecosystem.

Paper link: https://www.php.cn/link/b0da9d8dd88178e3bb138e08742eb2e2

Project homepage: https://www.php.cn/link/1a725948eb0c738707b5c026a65ba618

The team mined hundreds of millions of knowledge videos from Kuaishou's massive short video collection, structured them, and built a video encyclopedia system with tens of millions of entries and knowledge points. Kuaipedia helps the academic community advance AI that understands world knowledge through multi-modal information, and it leaves considerable room for applications in industry.

Introduction

The encyclopedia dates back to ancient Greece and Rome and was also an outstanding achievement of the French Enlightenment in the 17th and 18th centuries. An encyclopedia usually refers to a reference work or compendium that briefly introduces all of human knowledge or a specific field or subject. With the rapid development of the Internet, online encyclopedias such as Wikipedia and Baidu Encyclopedia have become a new carrier of knowledge. However, these encyclopedias usually rely on pictures, text, and tables, which makes it difficult to express knowledge that needs vivid demonstration, such as tutorial (how-to) knowledge. Figure 1 shows the difficulty of using pictures and text to convey the "how to draw" knowledge point for "Shiba Inu". Short videos let us explain and learn this kind of knowledge very well.

For the specific video, see https://www.php.cn/link/70e9dbe24ba303f2d25ac34d3ae945c5.

Figure 1: The difficulty of conveying how-to knowledge with pictures and text. The pictures and text are frame screenshots taken from short videos.

As the content industry and media formats keep evolving, short videos have increasingly become the main medium for people who spread knowledge, and they have natural advantages for conveying skills and expertise in particular. Although some public online encyclopedias include video content, it usually takes the form of brief introductions (such as Encyclopedia of Instant Understanding), and short videos are not used to their full extent. The expressive power of short videos in knowledge encyclopedias has therefore been severely underestimated. For example, when people talk about "Shiba Inu", they care not only about the "introduction" but also about "how to choose", "how to groom", "how to correct food guarding", and so on. We therefore believe that organizing knowledge-based short videos into a structured short video encyclopedia is an effective way to understand world knowledge and to help people spread knowledge more efficiently.

With reference to the national standard [2], we distinguish popular science knowledge from tutorial knowledge, classing skill (how-to) knowledge as tutorial knowledge, and we discover high-quality knowledge videos among Kuaishou's massive video collection. In addition, we present the subject of the knowledge extracted from a short video as an entry (such as Shiba Inu) and extract the specific knowledge points explained in the video (such as Shiba Inu - how to choose, Shiba Inu - food guarding correction, etc.), ultimately forming a short video encyclopedia knowledge system, as shown in Figure 2.

Figure 2: Kuaipedia, an overview of the multi-modal short video encyclopedia
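The structure described above (entries, knowledge points under each entry, and videos attached to knowledge points) can be pictured as a small data model. The following Python sketch is only illustrative: the class names `Entry` and `KnowledgePoint`, the `add_video` method, and the video IDs are assumptions made for this example, not Kuaipedia's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class KnowledgePoint:
    """One aspect of an entry that a video can explain, e.g. "how to choose"."""
    name: str
    video_ids: List[str] = field(default_factory=list)


@dataclass
class Entry:
    """An encyclopedia entry, e.g. "Shiba Inu", with its knowledge points."""
    name: str
    points: Dict[str, KnowledgePoint] = field(default_factory=dict)

    def add_video(self, point_name: str, video_id: str) -> None:
        # Create the knowledge point on first use, then attach the video to it.
        point = self.points.setdefault(point_name, KnowledgePoint(point_name))
        point.video_ids.append(video_id)


# Hypothetical video IDs, used only to show how the structure fills up.
shiba = Entry("Shiba Inu")
shiba.add_video("how to choose", "video_001")
shiba.add_video("food guarding correction", "video_002")
shiba.add_video("how to choose", "video_003")

for point in shiba.points.values():
    print(f"{shiba.name} - {point.name}: {point.video_ids}")
```

Keeping videos under knowledge points rather than directly under the entry mirrors the idea that one entry can be explained from several angles, each with its own set of videos.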

Kuaipedia makes the following contributions:

Definition of Kuaipedia: We pioneer a new kind of multi-modal knowledge encyclopedia composed of entries, knowledge points, knowledge-based short videos, and the relationships between them. This is the industry's first structured multi-modal short video encyclopedia.

Methods for building a large-scale short video encyclopedia: We propose combining knowledge video recognition, entry and knowledge point mining, and multi-modal knowledge linking to build a large-scale short video encyclopedia. We also introduce the task of "multi-modal knowledge linking" as an extension of traditional entity linking.

Applications full of potential and imagination: Academically, Kuaipedia organizes short videos by knowledge point, a new form that can raise the current ceiling on machine understanding of world knowledge, which so far relies only on text-and-image knowledge graphs (KGs). It has great potential for downstream KG tasks such as entity linking and entity classification, and for content understanding tasks in NLP, CV, and related fields. In industry, a form such as Kuaipedia can help short video platforms operate and organize content efficiently and improve how efficiently users understand and consume knowledge.

Technical Overview

To build the short video encyclopedia structure described above, the core technology consists of three main steps, as shown in Figure 3.

Knowledge video recognition: A multi-modal video pre-training model is used to understand and identify knowledge-based videos among massive videos.

Entry and knowledge point mining: The entry system is built "top-down" by integrating multi-source knowledge bases, and the relationships between entries and knowledge points are then built "bottom-up" by mining user search queries, forming a tree of entries and knowledge points.

Multi-modal knowledge linking: We extend the traditional "entity linking" task and propose the "multi-modal knowledge linking" task, which uses multi-modal content understanding technology to link a video to a specific knowledge point (such as food guarding correction) of an entry (such as Shiba Inu).

Figure 3: The Kuaipedia construction pipeline
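The three steps can be sketched end to end with toy stand-ins. This is not the method from the paper: the article describes a multi-modal pre-trained model and multi-modal content understanding, whereas the sketch below uses plain keyword and word-overlap heuristics on captions purely so that it runs. The function names, the cue list, and the sample queries and captions are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

# Step 1 stand-in: a keyword test in place of a multi-modal recognition model.
HOW_TO_CUES = ("how to", "tutorial", "guide")


def is_knowledge_video(caption: str) -> bool:
    text = caption.lower()
    return any(cue in text for cue in HOW_TO_CUES)


def mine_knowledge_points(entries: Set[str], queries: List[str]) -> Dict[str, Set[str]]:
    """Step 2, bottom-up half: a query "<entry> <rest>" yields the knowledge
    point "<rest>" under that entry. `entries` stands in for the entry system
    built top-down from knowledge bases."""
    tree: Dict[str, Set[str]] = defaultdict(set)
    for query in queries:
        q = query.lower().strip()
        for entry in entries:
            prefix = entry.lower() + " "
            if q.startswith(prefix) and len(q) > len(prefix):
                tree[entry].add(q[len(prefix):])
    return dict(tree)


def link_video(caption: str, tree: Dict[str, Set[str]]) -> Optional[Tuple[str, str]]:
    """Step 3 stand-in: choose the (entry, knowledge point) pair whose words
    overlap most with the caption; return None if nothing matches."""
    words = set(caption.lower().split())
    best, best_score = None, 0
    for entry, points in tree.items():
        if not set(entry.lower().split()) <= words:
            continue  # the caption never mentions this entry
        for point in sorted(points):
            score = len(set(point.split()) & words)
            if score > best_score:
                best, best_score = (entry, point), score
    return best


entries = {"Shiba Inu"}
queries = [
    "shiba inu how to choose",
    "Shiba Inu food guarding correction",
    "weather today",
]
captions = [
    "How to choose a Shiba Inu puppy",
    "My Shiba Inu at the beach",
    "Shiba Inu food guarding correction tutorial",
]

tree = mine_knowledge_points(entries, queries)
linked = {c: link_video(c, tree) for c in captions if is_knowledge_video(c)}
for caption, target in linked.items():
    print(caption, "->", target)
```

In a real system each stand-in would be replaced by a learned model, but the data flow stays the same: filter knowledge videos, build the entry and knowledge point tree, then attach each video to one node of that tree.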

Extensive and detailed manual evaluation shows that the knowledge points and videos mined by Kuaipedia are of high accuracy and quality. For more detailed algorithms and experimental data, please refer to the paper or our GitHub project homepage (see the beginning of the article).

Applications

First, multi-modal short video encyclopedia systems such as Kuaipedia have great potential in academia for advancing AI that understands world knowledge. On the one hand, Kuaipedia goes beyond the limits of pictures, text, and tables and describes an entity or concept through richer knowledge points and short videos, which can promote the development of multi-modal knowledge graph technology. On the other hand, these knowledge points and short videos help AI better understand world knowledge, especially how-to knowledge that is difficult to express in pictures and text. This kind of multi-modal knowledge can strengthen AI's understanding of the world and is very helpful for downstream applications in KG, NLP, CV, and other fields. On the CCKS entity linking task, we have shown that simply introducing Kuaipedia's multi-modal knowledge can effectively improve BERT's performance on entity linking and entity classification.

In addition, Kuaipedia leaves a great deal of room for applications in industry. As the short video ecosystem expands into "pan-knowledge" content, existing forms constrain how that content spreads; Kuaipedia can improve the platform's operation and distribution efficiency through structured content and better meet users' demand for knowledge. We first tried the technology in the health category. The Kuaishou Health team had previously mined a batch of high-quality PUGC content purely by hand, organized by disease type. However, there were considerable pain points in the completeness of the disease knowledge system and in the volume of authoritative knowledge videos, which made it difficult to efficiently build a complete, large-scale, structured disease video system. With Kuaipedia's technology, a batch of high-quality knowledge points and knowledge videos with Kuaishou characteristics was mined automatically, enriching the disease content and making construction dozens of times more efficient than a purely manual process. This content is now live on the Selected page of the Kuaishou app: tapping the "bottom bar" of a disease-related video in the Selected video stream brings up the "Kuaishou Health" half-screen page, where users can consume the related knowledge points and knowledge videos under the entry to which the video belongs, as shown in Figure 4.

Figure 4: Kuaipedia deployed in the health scenario

Beyond health, Kuaipedia also covers knowledge content in many other fields, such as education, food, agriculture, rural areas and farmers, parenting, law, technology, and finance, and has great application potential.

Conclusion

In view of the prospects for pan-knowledge content in the short video industry, we proposed Kuaipedia, a multi-modal short video encyclopedia system. Starting from massive short video content, we mined hundreds of millions of high-quality knowledge videos using multi-modal knowledge graph construction technology and structured the knowledge content to build the industry's first large-scale, systematic short video encyclopedia knowledge base, which holds great potential and room for imagination in both academia and industry.

About the authors

First author: Pan Haojie

Member of the Kuaishou MMU Knowledge Graph Center and leader of the Kuaipedia project. Holds a bachelor's degree from Zhejiang University and a master's degree from Hong Kong University of Science and Technology. Previously responsible for large-scale NLP algorithms and frameworks at Alibaba Cloud PAI. Has published more than 10 papers in top conferences and journals such as ACL, EMNLP, KDD, and AIJ, and holds a number of domestic and US patents; see Zhihu for details. Joined Kuaishou in 2021.

Corresponding author: Fu Ruiji

Head of the Kuaishou MMU Knowledge Graph Center. He received his bachelor's, master's, and Ph.D. degrees from Harbin Institute of Technology and was a postdoctoral fellow at the University of Science and Technology of China. He previously served as deputy director of the iFlytek AI Research Institute and won the first prize of the Wu Wenjun Artificial Intelligence Technology Progress Award. He has published many academic papers in international conferences and journals such as ACL, EMNLP, COLING, IJCAI, and TASLP, and has applied for (or been granted) more than 40 national invention patents. He joined Kuaishou in 2021.

Cooperating teacher: Liu Ming

Professor and doctoral supervisor, Department of Computing, Harbin Institute of Technology. He has presided over many funded projects, including the National Key R&D Program Project, the National Natural Science Foundation, the China Postdoctoral Science Foundation Special Grant, the China Postdoctoral Science Foundation General Grant First Class Grant, and the Heilongjiang Provincial General Fund. He has won the first prize of the Heilongjiang Province Science and Technology Award, Harbin City Science and Technology Achievements, and the first prize of the 6th National Youth Artificial Intelligence Innovation and Entrepreneurship Conference. In recent years, he has published more than 20 CCF A/B papers as first or corresponding author, participated in the editing of one textbook, and translated one into English. He has served as knowledge graph area chair of NLPCC 2020, CCKS 2020, and COLING 2022, publication chair of CCKS 2019, evaluation chair of CCKS 2021, and workshop chair of CCKS 2022.

References

[1] Kuaishou, 2022 Kuaishou Pan-Knowledge Content Ecosystem Report.

[2] National Standards Committee: Knowledge Management Framework, GB/T 23703.

Statement

This article is reproduced from 51CTO.COM. In case of any infringement, please contact admin@php.cn to have it removed.
