A single GPU runs a ChatGPT-scale model, and ControlNet is another powerful tool for AI drawing
Paper 1: Transformer models: an introduction and catalog
Abstract: Since it was proposed in 2017, the Transformer model has demonstrated unprecedented strength in fields such as natural language processing and computer vision, triggering technological breakthroughs such as ChatGPT, and researchers have proposed many variants based on the original model.
As academia and industry continue to propose new models based on the Transformer attention mechanism, it can be difficult to keep track of this direction. Recently, a comprehensive article by Xavier Amatriain, head of AI product strategy at LinkedIn, may help solve this problem.
Recommendation: The goal of this article is to provide a comprehensive yet simple catalog and classification of the most important Transformer models, while also introducing their key aspects and innovations.
Paper 2: High-throughput Generative Inference of Large Language Models with a Single GPU
Abstract: Traditionally, the high computational and memory requirements of large language model (LLM) inference have necessitated multiple high-end AI accelerators. This study explores how to reduce the requirements of LLM inference to a single consumer-grade GPU while achieving practical performance. Recently, new research from Stanford University, UC Berkeley, ETH Zurich, Yandex, the Higher School of Economics, Meta, Carnegie Mellon University, and other institutions proposed FlexGen, a high-throughput generation engine for running LLMs with limited GPU memory. The figure below shows the design idea of FlexGen: it uses block scheduling to reuse weights and overlap I/O with computation, as shown in figure (b), while other baseline systems use inefficient row-by-row scheduling, as shown in figure (a).
Recommendation: From now on, running a ChatGPT-scale model requires only one GPU: here comes a method that delivers a hundredfold speedup.
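To make the scheduling contrast concrete, here is a minimal Python sketch of the two loop orders. It only illustrates why block scheduling amortizes weight I/O and is not the actual FlexGen implementation (which additionally overlaps transfers with computation on separate streams); `batches`, `layers`, `load_weights`, and `forward` are hypothetical placeholders.

```python
def row_by_row(batches, layers, load_weights, forward):
    # Baseline (figure (a)): for every batch, walk through all layers,
    # reloading each layer's weights from CPU/disk or over PCIe every time.
    outputs = []
    for batch in batches:
        h = batch
        for layer in layers:
            w = load_weights(layer)   # repeated I/O: len(batches) * len(layers) loads
            h = forward(w, h)
        outputs.append(h)
    return outputs

def block_schedule(batches, layers, load_weights, forward):
    # FlexGen-style (figure (b)): load each layer's weights once and push
    # the whole block of batches through it, amortizing the I/O cost.
    hidden = list(batches)
    for layer in layers:
        w = load_weights(layer)       # I/O only len(layers) times per block
        for i in range(len(hidden)):
            hidden[i] = forward(w, hidden[i])  # compute can overlap the next load
    return hidden
```

With, say, 8 batches and 32 layers, the baseline performs 256 weight loads versus 32 for the block schedule, which is where the throughput gain comes from when weights live off-GPU.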
Paper 3: Temporal Domain Generalization with Drift-Aware Dynamic Neural Networks
Abstract: In the domain generalization (DG) task, when the domain distribution changes continuously with the environment, accurately capturing the change and its impact on the model is a very important but extremely challenging problem. To this end, Professor Liang Zhao's team from Emory University proposed DRAIN, a temporal domain generalization framework based on Bayesian theory. It uses a recurrent network to learn the drift of the domain distribution along the time dimension, and combines dynamic neural networks with graph generation techniques to maximize the expressive ability of the model, achieving generalization and prediction on unseen future domains. This work was accepted as an ICLR 2023 Oral (top 5% of accepted papers). The following is a schematic diagram of the overall DRAIN framework.
Recommendation: Backed by a drift-aware dynamic neural network, this new temporal domain generalization framework far exceeds existing domain generalization & adaptation methods.
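For intuition, here is a hedged PyTorch sketch of the drift-aware idea: a recurrent hypernetwork tracks how the domain drifts over time and emits the parameters of a small predictor for each time step. All module names and sizes are illustrative assumptions, not the DRAIN code.

```python
import torch
import torch.nn as nn

class DriftAwarePredictor(nn.Module):
    def __init__(self, in_dim=8, hidden=32, state=16):
        super().__init__()
        self.in_dim, self.hidden = in_dim, hidden
        self.n_w = hidden * in_dim + hidden        # weights + bias of one layer
        self.rnn = nn.LSTMCell(self.n_w, state)    # latent state tracks the drift
        self.head = nn.Linear(state, self.n_w)     # decodes next-step parameters

    def step(self, prev_params, hc=None):
        # Advance the drift state given the previous domain's parameters,
        # then decode the parameters to use on the next time-step domain.
        h, c = self.rnn(prev_params.unsqueeze(0), hc)
        return self.head(h).squeeze(0), (h, c)

    def apply_params(self, params, x):
        # Run the dynamically generated one-layer network on inputs x.
        w = params[: self.hidden * self.in_dim].view(self.hidden, self.in_dim)
        b = params[self.hidden * self.in_dim:]
        return torch.relu(x @ w.T + b)

# model = DriftAwarePredictor()
# params, hc = model.step(torch.zeros(model.n_w))  # roll forward one time step
# y = model.apply_params(params, torch.rand(4, 8))
```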
Paper 4: Large-scale physically accurate modeling of real proton exchange membrane fuel cell with deep learning
Abstract: To secure energy supply and combat climate change, the world's focus has shifted from fossil fuels to clean and renewable energy. Hydrogen, with its high energy density and clean, low-carbon attributes, can play an important role in this energy transition. Hydrogen fuel cells, especially proton exchange membrane fuel cells (PEMFC), are key to this green revolution due to their high energy conversion efficiency and zero-emission operation.
PEMFCs convert hydrogen into electricity through an electrochemical process, with pure water as the only by-product. However, a PEMFC can become inefficient if water cannot drain out of the cell properly and subsequently "floods" the system. Until now, it has been difficult for engineers to understand precisely how water drains or accumulates inside fuel cells, because they are so small and complex.
Recently, a research team from the University of New South Wales in Sydney developed a deep learning algorithm (DualEDSR) to improve understanding of the internal conditions of PEMFCs: it generates high-resolution modeling images from lower-resolution X-ray micro-computed tomography. The process has been tested on a single hydrogen fuel cell, allowing its interior to be accurately modeled and potentially improving its efficiency. The figure below shows the PEMFC domains generated in this study.
Recommendation: Deep learning enables large-scale, physically accurate modeling of the interior of fuel cells, helping improve cell performance.
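As a rough illustration of the super-resolution family that DualEDSR builds on, here is a minimal EDSR-style PyTorch sketch: residual blocks without batch normalization followed by pixel-shuffle upsampling. The channel width, depth, scale factor, and single-channel (grayscale CT) input are assumptions for the example, not the paper's configuration.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)          # residual connection, no batch norm

class TinyEDSR(nn.Module):
    def __init__(self, ch=64, n_blocks=4, scale=2):
        super().__init__()
        self.head = nn.Conv2d(1, ch, 3, padding=1)   # grayscale CT slice in
        self.body = nn.Sequential(*[ResBlock(ch) for _ in range(n_blocks)])
        self.tail = nn.Sequential(                   # pixel-shuffle upsampling
            nn.Conv2d(ch, ch * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, lr):
        x = self.head(lr)
        return self.tail(x + self.body(x))

# lr = torch.rand(1, 1, 64, 64)
# sr = TinyEDSR()(lr)   # -> (1, 1, 128, 128), a 2x upscaled slice
```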
Paper 5: A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT
Abstract: This nearly 100-page review traces the evolution of pretrained foundation models, showing how ChatGPT became successful step by step.
Recommendation: From BERT to ChatGPT, a hundred-page review traces the evolution of large pretrained models.
Paper 6: Adding Conditional Control to Text-to-Image Diffusion Models
Abstract: This paper proposes ControlNet, an end-to-end neural network architecture that controls a diffusion model (such as Stable Diffusion) by adding extra conditions, improving image-to-image generation: it can generate full-color images from line drawings, generate images that preserve a given depth structure, and optimize hand generation via hand keypoints.
Recommendation: AI outclasses human painters: bringing ControlNet into text-to-image generation to make full use of depth and edge information.
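The core trick described in the paper is easy to sketch: a frozen pretrained block is paired with a trainable copy whose condition input and output pass through "zero convolutions", 1×1 convolutions initialized to zero so that training starts from exactly the pretrained behavior. The minimal PyTorch sketch below uses a stand-in convolution in place of Stable Diffusion's real U-Net blocks, so shapes and modules are illustrative assumptions.

```python
import torch
import torch.nn as nn

def zero_conv(ch):
    # 1x1 convolution initialized to zero: contributes nothing at the start
    # of training, so the pretrained model's behavior is preserved exactly.
    conv = nn.Conv2d(ch, ch, kernel_size=1)
    nn.init.zeros_(conv.weight)
    nn.init.zeros_(conv.bias)
    return conv

class ControlledBlock(nn.Module):
    def __init__(self, frozen_block, ch):
        super().__init__()
        self.frozen = frozen_block
        for p in self.frozen.parameters():
            p.requires_grad_(False)                  # the base model stays locked
        self.copy = nn.Conv2d(ch, ch, 3, padding=1)  # stand-in trainable copy
        self.zin, self.zout = zero_conv(ch), zero_conv(ch)

    def forward(self, x, cond):
        # At initialization both zero convs output 0, so this equals frozen(x);
        # training then gradually lets the condition steer the output.
        y = self.frozen(x)
        c = self.copy(x + self.zin(cond))
        return y + self.zout(c)

# block = ControlledBlock(nn.Conv2d(4, 4, 3, padding=1), ch=4)
# out = block(torch.rand(1, 4, 32, 32), torch.rand(1, 4, 32, 32))
```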
Paper 7: EVA3D: Compositional 3D Human Generation from 2D Image Collections
Abstract: At ICLR 2023, the S-Lab team from the Nanyang Technological University–SenseTime Joint Research Center proposed EVA3D, the first method to learn high-resolution 3D human body generation from a collection of 2D images. Thanks to the differentiable rendering provided by NeRF, recent 3D generative models have achieved stunning results on stationary objects. However, for a more complex and deformable category such as the human body, 3D generation still poses great challenges.
This paper proposes an efficient compositional NeRF representation of the human body, achieving high-resolution (512×256) 3D human body generation without using a super-resolution model. EVA3D significantly surpasses existing solutions on four large-scale human body datasets, and the code has been open-sourced.
Recommendation: ICLR 2023 Spotlight | Hallucinating 3D human bodies from 2D images, with arbitrary clothing and changeable poses.
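To illustrate what "compositional" means here, the loose sketch below assembles a human NeRF from per-part sub-networks, each responsible for a bounding box, with overlapping predictions averaged. The part layout, box handling, and network sizes are invented for the example and are not EVA3D's actual design.

```python
import torch
import torch.nn as nn

class PartNeRF(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),            # per-point (r, g, b, density)
        )

    def forward(self, pts):
        return self.mlp(pts)

class ComposedHumanNeRF(nn.Module):
    def __init__(self, boxes):
        super().__init__()
        self.boxes = boxes                   # {part name: (min_xyz, max_xyz)}
        self.parts = nn.ModuleDict({k: PartNeRF() for k in boxes})

    def forward(self, pts):
        # For simplicity every part is evaluated on all points; a real system
        # would query each sub-network only on points inside its box.
        out = torch.zeros(pts.shape[0], 4)
        weight = torch.zeros(pts.shape[0], 1)
        for name, (lo, hi) in self.boxes.items():
            inside = ((pts >= lo) & (pts <= hi)).all(dim=-1, keepdim=True).float()
            out = out + inside * self.parts[name](pts)
            weight = weight + inside
        return out / weight.clamp(min=1.0)   # average overlapping parts

# boxes = {"torso": (torch.tensor([-0.3, -0.5, -0.2]), torch.tensor([0.3, 0.5, 0.2]))}
# rgbd = ComposedHumanNeRF(boxes)(torch.rand(128, 3) - 0.5)
```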
Heart of Machine, in cooperation with ArXiv Weekly Radiostation initiated by Chu Hang, Luo Ruotian, and Mei Hongyuan, has selected more important papers from this week beyond the 7 Papers above, covering the NLP, CV, and ML fields, with audio abstract introductions provided. The details are as follows:
7 NLP Papers
This week’s 7 selected NLP papers are:
1. Active Prompting with Chain-of-Thought for Large Language Models. (from Tong Zhang)
2. Prosodic features improve sentence segmentation and parsing. (from Mark Steedman)
3. ProsAudit, a prosodic benchmark for self-supervised speech models. (from Emmanuel Dupoux)
4. Exploring Social Media for Early Detection of Depression in COVID-19 Patients. (from Jie Yang)
5. Federated Nearest Neighbor Machine Translation. (from Enhong Chen)
6. SPINDLE: Spinning Raw Text into Lambda Terms with Graph Attention. (from Michael Moortgat)
7. A Neural Span-Based Continual Named Entity Recognition Model. (from Qingcai Chen)
10 CV Papers
This week’s 10 selected CV papers are:
1. MERF: Memory-Efficient Radiance Fields for Real-time View Synthesis in Unbounded Scenes. (from Richard Szeliski, Andreas Geiger)
2. Designing an Encoder for Fast Personalization of Text-to-Image Models. (from Daniel Cohen-Or)
3. Teaching CLIP to Count to Ten. (from Michal Irani)
4. Evaluating the Efficacy of Skincare Product: A Realistic Short-Term Facial Pore Simulation. (from Weisi Lin)
5. Real-Time Damage Detection in Fiber Lifting Ropes Using Convolutional Neural Networks. (from Moncef Gabbouj)
6. Embedding Fourier for Ultra-High-Definition Low-Light Image Enhancement. (from Chen Change Loy)
7. Region-Aware Diffusion for Zero-shot Text-driven Image Editing. (from Changsheng Xu)
8. Side Adapter Network for Open-Vocabulary Semantic Segmentation. (from Xiang Bai)
9. VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion. (from Sanja Fidler)
10. Object-Centric Video Prediction via Decoupling of Object Dynamics and Interactions. (from Sven Behnke)
10 ML Papers
This week’s 10 selected ML papers are:
1. normflows: A PyTorch Package for Normalizing Flows. (from Bernhard Schölkopf)
2. Concept Learning for Interpretable Multi-Agent Reinforcement Learning. (from Katia Sycara)
3. Random Teachers are Good Teachers. (from Thomas Hofmann)
4. Aligning Text-to-Image Models using Human Feedback. (from Craig Boutilier, Pieter Abbeel)
5. Change is Hard: A Closer Look at Subpopulation Shift. (from Dina Katabi)
6. AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving. (from Zhifeng Chen)
7. Diverse Policy Optimization for Structured Action Space. (from Hongyuan Zha)
8. The Geometry of Mixability. (from Robert C. Williamson)
9. Does Deep Learning Learn to Abstract? A Systematic Probing Framework. (from Nanning Zheng)
10. Sequential Counterfactual Risk Minimization. (from Julien Mairal)