Five-minute technical talk | AIGC introduction and application selection evaluation

WBOY

Jun 04, 2023 pm 01:31 PM

Part 01 Introduction to AIGC

AIGC (AI-Generated Content) refers to a production method that uses AI technology to generate content automatically, or to assist in generating it, in various forms such as text, code, images, voice, video, and 3D objects. AIGC represents a new trend in the development of AI technology: from perceiving and understanding the world to generating and creating it, and from analytical capabilities to creative capabilities. AIGC has also changed content creation by improving the quality, efficiency, and diversity of content.

1.1 Text generation

Text generation refers to using AI technology to automatically generate grammatical, logically coherent text from a given input (such as keywords, pictures, or voice). It is an important aspect of AIGC.

Text generation has a rich set of application scenarios, including news writing, novel creation, marketing copywriting, customer service Q&A, chatbots, educational coaching, knowledge graphs, and summary generation.

➤ Wenxin Yiyan (文心一言): A large AI model launched by Baidu that supports multi-modal output and can perform literary creation, business copywriting, mathematical and logical calculation, Chinese understanding, multi-modal generation, and more.

➤ ChatGPT: A chat application launched by OpenAI and based on the GPT series of models. The GPT-4 model has now been released; ChatGPT based on GPT-4 can analyze images and interact using both text and pictures.

1.2 Code generation

Code generation includes code completion, code refactoring, code optimization, code annotation, and so on, and can cover a variety of programming languages and fields. With OpenAI's GPT-4 model, it is even possible to generate the corresponding website code from a hand-drawn product prototype sketch.

➤ GitHub Copilot: An AI-assisted programming tool built on the OpenAI Codex model. It supports dozens of programming languages and, working in real time from existing code or comments, provides code suggestions and entire functions in the editor. It can also offer a pair-programming experience through chat interaction.

➤ Cursor: A standalone IDE that integrates OpenAI's GPT models. Like GitHub Copilot, Cursor can write code, edit code, and chat through AI.

1.3 Image generation

Image generation refers to using AI technology to automatically generate semantically appropriate and aesthetically pleasing images from a given input (such as natural language, images, or video). It is an important aspect of AIGC. Image generation has a wide range of application scenarios, including artistic creation, entertainment media, education and training, e-commerce marketing, and medical diagnosis.

➤ Wenxin Yige: An AI art and creative assistance platform launched by Baidu. It automatically generates paintings from a text description and a selected style.

➤ DALL-E 2: A generative model launched by OpenAI, based on an adaptive multi-modal encoder, that automatically generates high-quality images from a given input (such as text or pictures).

➤ Midjourney: An AI painting tool released in March 2022. It can generate pictures from natural language, apply the artistic styles of different painters, and recognize specific lens or photography terms. Paintings generated by this tool have won first prize in art competitions.

1.4 Video generation

Video generation falls into two main types: video editing and autonomous video generation. Video editing can be used for video super-resolution, restoration, and editing. Autonomous video generation can convert images into video or generate a matching video from descriptive text. The following are some related applications:

➤ Deepfake: An AI video generation technology based on GANs that enables face swapping, voice conversion, expression imitation, and other functions. Users only need to upload a picture or a video as a reference, and the video is generated automatically.

➤ Make-A-Video: An AI system launched by Meta that converts text into video. It can create one-of-a-kind videos filled with vibrant colors, people, and scenery from just a few words or lines of text.

1.5 3D modeling

AIGC-based 3D modeling refers to using AI technology to automatically generate semantically appropriate and aesthetically pleasing 3D models from a given input (such as natural language or images). This area is currently at an early stage of exploration. The following are some related applications or models:

➤ AICommand: An open-source AI command plug-in for Unity that can generate 3D scenes from text descriptions and adjust and optimize those scenes through text. (https://github.com/keijiro/AICommand)

➤ ICON: An open-source AI model that generates 3D character models from pictures of people (https://github.com/YuliangXiu/ICON). You can try it and download the generated 3D model online: https://huggingface.co/spaces/Yuliang/ICON

Part 02 AIGC Application and Model Evaluation

After OpenAI launched ChatGPT at the end of 2022, its cumulative number of users exceeded 100 million in just two months, and it quickly became popular all over the world. The "iPhone moment" of AI had arrived, and major IT vendors quickly followed. The following introduces some relevant applications and models as of April 2023.

➤ Wenxin Yiyan: See above.
➤ ChatGPT: See above.
➤ Bard: An NLP model launched by Google, based on a lightweight version of LaMDA.
➤ New Bing: An intelligent search engine launched by Microsoft and based on the GPT-4 model. It interacts with users in natural language and combines real-time search results to provide information, entertainment, creation, and other functions.
➤ ChatGLM: An open-source conversational large language model launched by Tsinghua University, based on the GLM architecture and supporting both Chinese and English. A minimal model can be deployed at low cost on a CPU, and the model can also be further developed and fine-tuned.
➤ Poe: A free AI chatbot application developed by Quora. It integrates six mainstream AI chatbots, including ChatGPT and GPT-4.

The applications above (except Poe) are evaluated and compared on the following aspects:

Natural language processing
Logical reasoning
Code generation
Multi-modal support

PS:

The ChatGPT participating in the evaluation is based on the GPT-3.5 model.
The ChatGLM participating in the evaluation is only the minimized model, chatglm-6b-int4-qe. For practical applications, the chatglm-6b model, which requires GPU memory, should be deployed instead; the quality of its answers is much better.

2.1 Natural language processing

Evaluation content:

➪ Multi-turn dialogue: Let's create a children's story together. The rule is that I say a sentence first and you say the next one, alternating. It ends when I say "I'm done with the story." Do you understand?

➪ Language understanding: My boss said 1+1=3. Everything my boss says is right, so 1+1=3, right?

➪ Language translation: Translate this passage into English: One flower blooming alone is not spring, but a hundred flowers blooming together fill the garden.

➪ Sentiment analysis: Analyze the sentiment of this passage: I like this new movie very much. It made me laugh many times, and it also moved me to tears.

[Figures: responses from ChatGPT, Wenxin Yiyan (文心一言), Bard, NewBing, and ChatGLM]

The scores are as follows:

[Table: scores for natural language processing]

2.2 Logical reasoning

Evaluation content:

➪ There are five books on a shelf: a red book, a green book, a blue book, an orange book, and a yellow book. The green book is to the left of the yellow book, the yellow book is the third from the left, the red book is the second from the left, and the blue book is on the far right. What is the order of these books?

➪ There are three points A, B, and C on a 100-meter-long straight line. The position of A is not fixed. The distance between A and B is 5 meters, and the distance between A and C is 10 meters. What are the possible distances between B and C?

➪ If 2
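
The first two questions have answers that can be checked mechanically. The following is a minimal sketch and not part of the original evaluation: the function names `shelf_orders` and `bc_distances` are hypothetical, and the second function assumes the 100-meter line is long enough for B and C to lie on either side of A.

```python
from itertools import permutations

BOOKS = ("red", "green", "blue", "orange", "yellow")


def shelf_orders():
    """Return every left-to-right order that satisfies the four clues."""
    valid = []
    for order in permutations(BOOKS):
        pos = {book: i for i, book in enumerate(order)}  # 0 = leftmost
        if (
            pos["green"] < pos["yellow"]          # green is left of yellow
            and pos["yellow"] == 2                # yellow is third from the left
            and pos["red"] == 1                   # red is second from the left
            and pos["blue"] == len(BOOKS) - 1     # blue is on the far right
        ):
            valid.append(order)
    return valid


def bc_distances(ab=5, ac=10):
    """Possible B-C distances for collinear points with |AB| = ab and |AC| = ac."""
    # B and C are either on the same side of A or on opposite sides of A.
    return sorted({abs(ac - ab), ac + ab})


if __name__ == "__main__":
    print(shelf_orders())
    print(bc_distances())
```

Under these assumptions, the brute-force search finds a single shelf order (green, red, yellow, orange, blue), and the distance between B and C is either 5 or 15 meters.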

[Figures: responses from ChatGPT, Wenxin Yiyan (文心一言), Bard, NewBing, and ChatGLM]

The scores are as follows:

[Table: scores for logical reasoning]
  
2.3 Coding ability

Evaluation content:

➪ Code generation: Write a Python function that accepts an integer as input and determines whether it is a palindrome.

➪ Code explanation: Explain this line of Python code: my_list = [x for x in my_list if x % 2 == 0]

➪ Bug detection: Where is the bug in this line of code: my_list = [x for x in my_list if x % 2 = 0]
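
For reference, the three prompts can be answered along the following lines. This is one possible sketch rather than the output of any evaluated model: the name `is_palindrome`, the sample list, and the choice to treat negative integers as non-palindromes are assumptions made for illustration.

```python
def is_palindrome(n: int) -> bool:
    """Return True if the decimal digits of n read the same in both directions."""
    if n < 0:
        return False  # assumption: the minus sign rules out negative numbers
    digits = str(n)
    return digits == digits[::-1]


# Code explanation: the comprehension keeps only the even elements and
# rebinds my_list to the newly built list.
my_list = [1, 2, 3, 4, 5, 6]  # sample data, chosen for illustration
my_list = [x for x in my_list if x % 2 == 0]

# Bug detection: "=" is assignment, not comparison, so the buggy line does
# not even compile. Compiling it from a string shows the SyntaxError.
buggy = "my_list = [x for x in my_list if x % 2 = 0]"
try:
    compile(buggy, "<bug>", "exec")
    bug_is_syntax_error = False
except SyntaxError:
    bug_is_syntax_error = True
```

The fix for the third prompt is to replace `=` with the comparison operator `==`, which gives the line from the second prompt.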
  
[Figures: responses from ChatGPT, Wenxin Yiyan (文心一言), Bard, NewBing, and ChatGLM]

The scores are as follows:

[Table: scores for coding ability]
  
2.4 Multi-modal support

Multi-modal support refers to the ability to handle multiple data types, such as text, images, audio, and video. For example, given text input, the model generates pictures, audio, or video that match the text requirements; given picture, audio, or video input, it outputs a text summary of the content.

ChatGPT

ChatGPT based on the GPT-3.5 model does not support multi-modal input or output, while ChatGPT based on the GPT-4 model can analyze images and return its analysis as text.

Wenxin Yiyan

Wenxin Yiyan can currently generate images and speech from text descriptions. Video generation was demonstrated at the launch event, but in actual use no video could be generated.

Bard

Google Bard does not support multi-modal capabilities.

NewBing

NewBing's creative mode supports generating pictures from text descriptions.

ChatGLM

Tsinghua's ChatGLM does not support multi-modal capabilities.

The scores are as follows:

[Table: scores for multi-modal support]
  
Part 03 Evaluation summary and selection evaluation

Combining the comparison scores above, a comprehensive evaluation is made for two stages: demo and production (commercial use).

The overall evaluation scores are as follows:

[Table: overall evaluation scores]

The selection evaluation is as follows:

[Table: selection evaluation]
  
Part 04 Summary

➢ Demo stage: Wenxin Yiyan is the first choice, NewBing and ChatGPT are the alternatives, and ChatGLM serves as an exploration direction for self-developed AIGC (requires GPU resources).

➢ Production and commercial stage: several routes are available:

In mainland China, seek to introduce AI through business-facing (B-side) cooperation with Wenxin Yiyan;

In the Hong Kong, Macao, and Taiwan regions, consider introducing AI through OpenAI's official GPT-4 API;

Build and fine-tune an independently developed AI based on Tsinghua's ChatGLM model.

Part 05 Conclusion

Statement

This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it removed.
