
Five-minute technical talk | AIGC introduction and application selection evaluation

WBOY (reposted)
2023-06-04 13:31:40


Part 01 Introduction to AIGC

AIGC (AI-Generated Content) refers to a production method that uses AI technology to generate, either automatically or as an aid to human creators, various forms of content such as text, code, images, voice, video, and 3D objects. AIGC represents a new trend in the development of AI technology: from perceiving and understanding the world to generating and creating it, and from analytical capabilities to creative capabilities. AIGC has also changed content creation, improving the quality, efficiency, and diversity of content.

1.1 Text generation

Text generation refers to using AI technology to automatically generate grammatical and logically coherent text from a given input (such as keywords, pictures, or voice). It is an important aspect of AIGC.

Text generation has a rich set of application scenarios, including news writing, novel creation, marketing copywriting, customer service Q&A, chatbots, educational coaching, knowledge graphs, summary generation, and so on.

➤ Wenxin Yiyan: A large AI model launched by Baidu that supports multi-modal output and can perform literary creation, business copywriting, mathematical and logical calculation, Chinese understanding, multi-modal generation, and more.

➤ ChatGPT: A chat application launched by OpenAI based on the GPT series of models. The GPT-4 model has now been released, and ChatGPT based on GPT-4 can analyze images and interact through both text and pictures.

1.2 Code generation

Code generation includes code completion, code refactoring, code optimization, code annotation, and so on, and can cover a variety of programming languages and fields. With OpenAI's GPT-4 model, it is even possible to generate website code from a hand-drawn product prototype sketch.

➤ GitHub Copilot: An AI-assisted programming tool built on the OpenAI Codex model. It supports dozens of programming languages and works in real time from code or comments, suggesting lines of code and entire functions in the editor. It can also provide a pair-programming experience through chat interaction.

➤ Cursor: A standalone IDE that integrates OpenAI's GPT models. Like GitHub Copilot, Cursor can write code, edit code, and chat through AI.

1.3 Image generation

Image generation refers to using AI technology to automatically generate semantically and aesthetically appropriate images from a given input (such as natural language, images, or video). It is an important aspect of AIGC. Image generation has a wide range of application scenarios, including artistic creation, entertainment media, education and training, e-commerce marketing, and medical diagnosis.

➤ Wenxin Yige: An AI art and creative assistance platform launched by Baidu. It can automatically generate paintings from a text description and a chosen style.

➤ DALL-E 2: A generative model launched by OpenAI, based on an adaptive multi-modal encoder, that automatically generates high-quality images from input such as text and pictures.

➤ Midjourney: An AI painting tool released in March 2022. It can generate pictures from natural language, imitate the artistic styles of different painters, and recognize specific lens or photography terms. Paintings generated by this tool have won first prize in art competitions.

1.4 Video generation

Video generation falls into two main types: video editing and autonomous video generation. Video editing can be used for video super-resolution, repair, and editing. Autonomous video generation can be used for image-to-video conversion, or for generating a matching video from descriptive text. The following are some related applications:

➤ Deepfake: An AI video generation platform based on GAN technology that supports face swapping, voice conversion, expression imitation, and other functions. Users only need to upload a picture or a video as a reference, and the video is generated automatically.

➤ Make-A-Video: An AI system launched by Meta that converts text into video. It can create one-of-a-kind videos filled with vibrant colors, people, and scenery from just a few words or lines of text.

1.5 3D modeling

AIGC-based 3D modeling refers to using AI technology to automatically generate semantically and aesthetically appropriate 3D models from a given input (such as natural language or images). This area is still at an early stage of exploration. The following are some related applications or models:

➤ AICommand: An open source AI command plug-in for Unity that can generate 3D scenes from text descriptions and adjust and optimize those scenes through text. (https://github.com/keijiro/AICommand)

➤ ICON: An open source AI model that generates 3D character models from pictures of people (https://github.com/YuliangXiu/ICON). The generated 3D models can be tried out and downloaded online: https://huggingface.co/spaces/Yuliang/ICON


Part 02 AIGC Application and Model Evaluation

After OpenAI launched ChatGPT at the end of 2022, its cumulative number of users exceeded 100 million in just two months, and it quickly became popular all over the world. This has been described as AI's "iPhone moment", and major IT vendors quickly followed suit. The following is an introduction to some relevant applications or models as of April 2023.

  • Wenxin Yiyan: See above.
  • ChatGPT: See above.
  • Bard: A lightweight NLP model launched by Google based on LaMDA.
  • New Bing: An intelligent search engine launched by Microsoft based on the GPT-4 model. It can interact with users in natural language and combines real-time search results to provide information, entertainment, creation, and other functions.
  • ChatGLM: An open source conversational large language model launched by Tsinghua University, based on the GLM architecture and supporting both Chinese and English. A minimal model can be deployed at low cost on a CPU, and the model can also be further developed and fine-tuned.
  • Poe: A free AI chatbot application developed by Quora. It integrates six mainstream AI chatbots, including ChatGPT and GPT-4.

These applications (except Poe) are evaluated and compared on the following aspects:

  • Natural language processing
  • Logical reasoning
  • Code generation
  • Multi-modal support

PS:

  • The ChatGPT used in the evaluation is based on the GPT-3.5 model.
  • The ChatGLM used in the evaluation is only the minimized model, chatglm-6b-int4-qe. For practical applications, the chatglm-6b model, which requires GPU memory, should be deployed instead; the quality of its answers is much higher.
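
A rough back-of-the-envelope sketch of why the INT4 variant is so much cheaper to host than the full model. It estimates memory for the weights only and assumes a nominal 6 billion parameters (read off the "6b" in the model name) stored at 16 bits versus 4 bits per parameter. The helper name `weight_memory_gib` is hypothetical, and real memory usage is higher because of activations and runtime overhead.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


# Assumption: nominal parameter count taken from the "6b" in the model name.
NOMINAL_PARAMS = 6e9

fp16_gib = weight_memory_gib(NOMINAL_PARAMS, 16)  # half-precision weights
int4_gib = weight_memory_gib(NOMINAL_PARAMS, 4)   # 4-bit quantized weights

print(f"FP16 weights: {fp16_gib:.1f} GiB, INT4 weights: {int4_gib:.1f} GiB")
```

Under these assumptions the quantized weights take a quarter of the space, which is consistent with the note that the minimized model is the low-cost option.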

2.1 Natural language processing

Evaluation content:

➪ Multi-turn dialogue: Let's create a children's story together. The rule is that I say a sentence first and you say the next one, alternating. It ends when I say "I'm done with the story." Do you understand?

➪ Language understanding: My boss said 1+1=3. Everything my boss says is right, so 1+1=3, right?

➪ Language translation: Translate this passage into English: One flower blooming alone is not spring, but a hundred flowers blooming together fill the garden with spring.

➪ Sentiment analysis: Analyze the sentiment of this passage: I like this new movie very much. It made me laugh many times and moved me to tears.


[Figures: response screenshots for ChatGPT, Wenxin Yiyan, Bard, NewBing, and ChatGLM]

The scores are as follows:

[Figure: natural language processing scores]


2.2 Logical reasoning

Evaluation content:

➪ There are five books on a shelf: a red book, a green book, a blue book, an orange book, and a yellow book. The green book is to the left of the yellow book, the yellow book is third from the left, the red book is second from the left, and the blue book is on the far right. What is the order of these books?

➪ There are three points A, B, and C on a 100-meter-long straight line. The position of A is uncertain. The distance between A and B is 5 meters, and the distance between A and C is 10 meters. What are the possible distances between B and C?

➪ If 2
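
For reference, the first two questions have definite answers that can be checked by brute force. The sketch below is not part of the original evaluation. The function names `shelf_orders` and `bc_distances` are made up for illustration, and the second function assumes the three points fit within the 100-meter line, which holds because the largest span involved is 15 meters.

```python
from itertools import permutations

BOOKS = ("red", "green", "blue", "orange", "yellow")


def shelf_orders():
    """Return every left-to-right order that satisfies all four clues."""
    valid = []
    for order in permutations(BOOKS):
        pos = {book: i for i, book in enumerate(order)}  # 0 = leftmost
        if (
            pos["green"] < pos["yellow"]  # green is left of yellow
            and pos["yellow"] == 2        # yellow is third from the left
            and pos["red"] == 1           # red is second from the left
            and pos["blue"] == 4          # blue is on the far right
        ):
            valid.append(order)
    return valid


def bc_distances(ab=5, ac=10):
    """Possible B-C distances when A, B, and C lie on one straight line."""
    # B and C are either on the same side of A or on opposite sides of A.
    return sorted({abs(ac - ab), ac + ab})


print(shelf_orders())   # the only order consistent with the clues
print(bc_distances())   # the two possible distances between B and C
```

The clues pin down a single order (green, red, yellow, orange, blue), while the distance question has two valid answers (5 or 15 meters), so a complete response should mention both.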


[Figures: response screenshots for ChatGPT, Wenxin Yiyan, Bard, NewBing, and ChatGLM]

The scores are as follows:

[Figure: logical reasoning scores]


2.3 Coding ability

Evaluation content:

  • Code generation: Write a Python function that accepts an integer as input and determines whether it is a palindrome.
  • Code explanation: Explain this line of Python code: my_list = [x for x in my_list if x % 2 == 0]
  • Bug detection: Where is the bug in this line of code: my_list = [x for x in my_list if x % 2 = 0]
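
As a point of comparison for the three prompts above, here is a minimal sketch of what a correct answer could look like. It is not taken from any of the evaluated models. The function name `is_palindrome` and the decision to treat negative numbers as non-palindromes are assumptions, since the prompt does not specify either.

```python
def is_palindrome(n: int) -> bool:
    """Return True if the decimal digits of n read the same in both directions."""
    if n < 0:
        # Assumption: the minus sign makes a negative number a non-palindrome.
        return False
    digits = str(n)
    return digits == digits[::-1]


# Code explanation: the comprehension keeps only the even elements of the list.
my_list = [1, 2, 3, 4, 5, 6]
my_list = [x for x in my_list if x % 2 == 0]

# Bug detection: `x % 2 = 0` uses assignment (`=`) where comparison (`==`) is
# needed, so Python rejects the line with a SyntaxError before running it.
try:
    compile("my_list = [x for x in my_list if x % 2 = 0]", "<prompt>", "exec")
    buggy_line_compiles = True
except SyntaxError:
    buggy_line_compiles = False

print(is_palindrome(12321), my_list, buggy_line_compiles)
```

Converting the integer to a string keeps the check short. A purely arithmetic version that reverses the digits with `%` and `//` would be an equally valid answer.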


[Figures: response screenshots for ChatGPT, Wenxin Yiyan, Bard, NewBing, and ChatGLM]

The scores are as follows:

[Figure: coding ability scores]


2.4 Multi-modal support

Multi-modal support refers to the ability to handle multiple data types, such as text, images, audio, and video. For example, given text input, the model generates pictures, audio, or video that match the text requirements; given picture, audio, or video input, it outputs a text summary of the content.


  • ChatGPT: ChatGPT based on the GPT-3.5 model does not support multi-modal input or output, while ChatGPT based on the GPT-4 model can analyze pictures and respond with text.
  • Wenxin Yiyan: Wenxin Yiyan can currently generate images and voice from text descriptions. Video generation was demonstrated at the press conference, but in actual use no video could be generated.
    [Figure: Wenxin Yiyan screenshot]
  • Bard: Google Bard does not support multi-modal capabilities.
  • NewBing: NewBing's creative mode supports generating pictures from text descriptions.
  • ChatGLM: Tsinghua's ChatGLM does not support multi-modal capabilities.

The scores are as follows:

[Figure: multi-modal support scores]


Part 03 Evaluation Summary and Selection Evaluation

Based on the comparison scores above, an overall evaluation is made for two stages: Demo and production (commercial use).

The overall evaluation score is as follows:

[Figure: overall evaluation scores]

The selection evaluation is as follows:

[Figure: selection evaluation]
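
The article does not spell out how the per-aspect scores are combined into an overall score. One plausible approach, sketched below purely as an assumption, is a weighted average over the four evaluated aspects, with different weights for the Demo and production stages. The function `overall_score`, the aspect keys, the weights, and the example numbers are all hypothetical placeholders and are not the article's scores.

```python
ASPECTS = ("nlp", "reasoning", "code", "multimodal")


def overall_score(scores: dict, weights: dict) -> float:
    """Weighted average of per-aspect scores; weights need not sum to 1."""
    total_weight = sum(weights[a] for a in ASPECTS)
    return sum(scores[a] * weights[a] for a in ASPECTS) / total_weight


# Placeholder numbers for illustration only; they are NOT the article's scores.
example_scores = {"nlp": 4, "reasoning": 3, "code": 5, "multimodal": 1}

# Hypothetical weights: equal for a Demo, language and reasoning favored in production.
demo_weights = {"nlp": 1, "reasoning": 1, "code": 1, "multimodal": 1}
production_weights = {"nlp": 2, "reasoning": 2, "code": 1, "multimodal": 1}

print(overall_score(example_scores, demo_weights))
print(overall_score(example_scores, production_weights))
```

Keeping the weights separate from the scores makes it easy to re-rank the same candidates for a different stage or a different set of priorities.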

Part 04 Summary

Demo stage: Wenxin Yiyan is the first choice, NewBing and ChatGPT are the alternatives, and ChatGLM serves as an exploration direction for self-developed AIGC (requires GPU resources).

In the production (commercial) stage, several routes are available:

• In mainland China, introduce AI through B-side (enterprise) cooperation with Wenxin Yiyan;
• In Hong Kong, Macao, and Taiwan, consider introducing AI through OpenAI's official GPT-4 API;
• Build and fine-tune an independent AI based on Tsinghua's ChatGLM model.

Part 05 Conclusion


Statement:
This article is reproduced from 51cto.com. In case of any infringement, please contact admin@php.cn for removal.