A single GPU runs a ChatGPT-scale model, and ControlNet is another powerful tool for AI drawing
Paper 1: Transformer models: an introduction and catalog
Abstract: Since it was proposed in 2017, the Transformer model has demonstrated unprecedented strength in fields such as natural language processing and computer vision, triggering technological breakthroughs such as ChatGPT, and researchers have proposed many variants based on the original model.
As academia and industry continue to propose new models based on the Transformer attention mechanism, it can be difficult to keep track of this direction. Recently, a comprehensive article by Xavier Amatriain, head of AI product strategy at LinkedIn, may help solve this problem.
Recommendation: The goal of this article is to provide a comprehensive yet simple catalog and classification of the most important Transformer models, while also introducing their key aspects and innovations.
Paper 2: High-throughput Generative Inference of Large Language Models with a Single GPU
Abstract: Traditionally, the high computational and memory requirements of large language model (LLM) inference have necessitated multiple high-end AI accelerators. This study explores how to reduce the requirements of LLM inference to a single consumer-grade GPU while achieving practical performance. Recently, new research from Stanford University, UC Berkeley, ETH Zurich, Yandex, the Higher School of Economics, Meta, Carnegie Mellon University, and other institutions proposed FlexGen, a high-throughput generation engine for running LLMs with limited GPU memory. The figure below shows the design idea of FlexGen: it uses block scheduling to reuse weights and overlap I/O with computation, as shown in figure (b), while other baseline systems use inefficient row-by-row scheduling, as shown in figure (a).
Recommendation: From now on, running a ChatGPT-scale model requires only one GPU: here comes a method that delivers a hundredfold speedup.
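To make the scheduling contrast concrete, here is a minimal Python sketch of the two loop orders. It only illustrates why block scheduling amortizes weight I/O and is not the actual FlexGen implementation (which additionally overlaps transfers with computation on separate streams); `batches`, `layers`, `load_weights`, and `forward` are hypothetical placeholders.

```python
def row_by_row(batches, layers, load_weights, forward):
    # Baseline (figure (a)): for every batch, walk through all layers,
    # reloading each layer's weights from CPU/disk or over PCIe every time.
    outputs = []
    for batch in batches:
        h = batch
        for layer in layers:
            w = load_weights(layer)   # repeated I/O: len(batches) * len(layers) loads
            h = forward(w, h)
        outputs.append(h)
    return outputs

def block_schedule(batches, layers, load_weights, forward):
    # FlexGen-style (figure (b)): load each layer's weights once and push
    # the whole block of batches through it, amortizing the I/O cost.
    hidden = list(batches)
    for layer in layers:
        w = load_weights(layer)       # I/O only len(layers) times per block
        for i in range(len(hidden)):
            hidden[i] = forward(w, hidden[i])  # compute can overlap the next load
    return hidden
```

With, say, 8 batches and 32 layers, the baseline performs 256 weight loads versus 32 for the block schedule, which is where the throughput gain comes from when weights live off-GPU.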
Paper 3: Temporal Domain Generalization with Drift-Aware Dynamic Neural Networks
Abstract: In the domain generalization (DG) task, when the domain distribution changes continuously with the environment, accurately capturing the change and its impact on the model is a very important but extremely challenging problem. To this end, Professor Liang Zhao's team from Emory University proposed DRAIN, a temporal domain generalization framework based on Bayesian theory. It uses a recurrent network to learn the drift of the domain distribution along the time dimension, and combines dynamic neural networks with graph generation techniques to maximize the expressive ability of the model, achieving generalization and prediction on unseen future domains. This work was accepted as an ICLR 2023 Oral (top 5% of accepted papers). The following is a schematic diagram of the overall DRAIN framework.
Recommendation: Backed by a drift-aware dynamic neural network, this new temporal domain generalization framework far exceeds existing domain generalization & adaptation methods.
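For intuition, here is a hedged PyTorch sketch of the drift-aware idea: a recurrent hypernetwork tracks how the domain drifts over time and emits the parameters of a small predictor for each time step. All module names and sizes are illustrative assumptions, not the DRAIN code.

```python
import torch
import torch.nn as nn

class DriftAwarePredictor(nn.Module):
    def __init__(self, in_dim=8, hidden=32, state=16):
        super().__init__()
        self.in_dim, self.hidden = in_dim, hidden
        self.n_w = hidden * in_dim + hidden        # weights + bias of one layer
        self.rnn = nn.LSTMCell(self.n_w, state)    # latent state tracks the drift
        self.head = nn.Linear(state, self.n_w)     # decodes next-step parameters

    def step(self, prev_params, hc=None):
        # Advance the drift state given the previous domain's parameters,
        # then decode the parameters to use on the next time-step domain.
        h, c = self.rnn(prev_params.unsqueeze(0), hc)
        return self.head(h).squeeze(0), (h, c)

    def apply_params(self, params, x):
        # Run the dynamically generated one-layer network on inputs x.
        w = params[: self.hidden * self.in_dim].view(self.hidden, self.in_dim)
        b = params[self.hidden * self.in_dim:]
        return torch.relu(x @ w.T + b)

# model = DriftAwarePredictor()
# params, hc = model.step(torch.zeros(model.n_w))  # roll forward one time step
# y = model.apply_params(params, torch.rand(4, 8))
```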
Paper 4: Large-scale physically accurate modeling of real proton exchange membrane fuel cell with deep learning
Abstract: To secure energy supply and combat climate change, the world's focus has shifted from fossil fuels to clean and renewable energy. Hydrogen, with its high energy density and clean, low-carbon attributes, can play an important role in this energy transition. Hydrogen fuel cells, especially proton exchange membrane fuel cells (PEMFC), are key to this green revolution due to their high energy conversion efficiency and zero-emission operation.
PEMFCs convert hydrogen into electricity through an electrochemical process, with pure water as the only by-product. However, a PEMFC can become inefficient if water cannot drain out of the cell properly and subsequently "floods" the system. Until now, it has been difficult for engineers to understand precisely how water drains or accumulates inside fuel cells, because they are so small and complex.
Recently, a research team from the University of New South Wales in Sydney developed a deep learning algorithm (DualEDSR) to improve understanding of the internal conditions of PEMFCs: it generates high-resolution modeling images from lower-resolution X-ray micro-computed tomography. The process has been tested on a single hydrogen fuel cell, allowing its interior to be accurately modeled and potentially improving its efficiency. The figure below shows the PEMFC domains generated in this study.
Recommendation: Deep learning enables large-scale, physically accurate modeling of the interior of fuel cells, helping improve cell performance.
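As a rough illustration of the super-resolution family that DualEDSR builds on, here is a minimal EDSR-style PyTorch sketch: residual blocks without batch normalization followed by pixel-shuffle upsampling. The channel width, depth, scale factor, and single-channel (grayscale CT) input are assumptions for the example, not the paper's configuration.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)          # residual connection, no batch norm

class TinyEDSR(nn.Module):
    def __init__(self, ch=64, n_blocks=4, scale=2):
        super().__init__()
        self.head = nn.Conv2d(1, ch, 3, padding=1)   # grayscale CT slice in
        self.body = nn.Sequential(*[ResBlock(ch) for _ in range(n_blocks)])
        self.tail = nn.Sequential(                   # pixel-shuffle upsampling
            nn.Conv2d(ch, ch * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, lr):
        x = self.head(lr)
        return self.tail(x + self.body(x))

# lr = torch.rand(1, 1, 64, 64)
# sr = TinyEDSR()(lr)   # -> (1, 1, 128, 128), a 2x upscaled slice
```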
Paper 5: A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT
Abstract: This nearly 100-page review traces the evolution of pretrained foundation models, showing how ChatGPT became successful step by step.
Recommendation: From BERT to ChatGPT, a hundred-page review traces the evolution of large pretrained models.
Paper 6: Adding Conditional Control to Text-to-Image Diffusion Models
Abstract: This paper proposes ControlNet, an end-to-end neural network architecture that controls a diffusion model (such as Stable Diffusion) by adding extra conditions, improving image-to-image generation: it can generate full-color images from line drawings, generate images that preserve a given depth structure, and optimize hand generation via hand keypoints.
Recommendation: AI outclasses human painters: bringing ControlNet into text-to-image generation to make full use of depth and edge information.
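The core trick described in the paper is easy to sketch: a frozen pretrained block is paired with a trainable copy whose condition input and output pass through "zero convolutions", 1×1 convolutions initialized to zero so that training starts from exactly the pretrained behavior. The minimal PyTorch sketch below uses a stand-in convolution in place of Stable Diffusion's real U-Net blocks, so shapes and modules are illustrative assumptions.

```python
import torch
import torch.nn as nn

def zero_conv(ch):
    # 1x1 convolution initialized to zero: contributes nothing at the start
    # of training, so the pretrained model's behavior is preserved exactly.
    conv = nn.Conv2d(ch, ch, kernel_size=1)
    nn.init.zeros_(conv.weight)
    nn.init.zeros_(conv.bias)
    return conv

class ControlledBlock(nn.Module):
    def __init__(self, frozen_block, ch):
        super().__init__()
        self.frozen = frozen_block
        for p in self.frozen.parameters():
            p.requires_grad_(False)                  # the base model stays locked
        self.copy = nn.Conv2d(ch, ch, 3, padding=1)  # stand-in trainable copy
        self.zin, self.zout = zero_conv(ch), zero_conv(ch)

    def forward(self, x, cond):
        # At initialization both zero convs output 0, so this equals frozen(x);
        # training then gradually lets the condition steer the output.
        y = self.frozen(x)
        c = self.copy(x + self.zin(cond))
        return y + self.zout(c)

# block = ControlledBlock(nn.Conv2d(4, 4, 3, padding=1), ch=4)
# out = block(torch.rand(1, 4, 32, 32), torch.rand(1, 4, 32, 32))
```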
Paper 7: EVA3D: Compositional 3D Human Generation from 2D Image Collections
Abstract: At ICLR 2023, the S-Lab team from the Nanyang Technological University–SenseTime Joint Research Center proposed EVA3D, the first method to learn high-resolution 3D human body generation from a collection of 2D images. Thanks to the differentiable rendering provided by NeRF, recent 3D generative models have achieved stunning results on stationary objects. However, for a more complex and deformable category such as the human body, 3D generation still poses great challenges.
This paper proposes an efficient compositional NeRF representation of the human body, achieving high-resolution (512×256) 3D human body generation without using a super-resolution model. EVA3D significantly surpasses existing solutions on four large-scale human body datasets, and the code has been open-sourced.
Recommendation: ICLR 2023 Spotlight | Hallucinating 3D human bodies from 2D images, with arbitrary clothing and changeable poses.
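To illustrate what "compositional" means here, the loose sketch below assembles a human NeRF from per-part sub-networks, each responsible for a bounding box, with overlapping predictions averaged. The part layout, box handling, and network sizes are invented for the example and are not EVA3D's actual design.

```python
import torch
import torch.nn as nn

class PartNeRF(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),            # per-point (r, g, b, density)
        )

    def forward(self, pts):
        return self.mlp(pts)

class ComposedHumanNeRF(nn.Module):
    def __init__(self, boxes):
        super().__init__()
        self.boxes = boxes                   # {part name: (min_xyz, max_xyz)}
        self.parts = nn.ModuleDict({k: PartNeRF() for k in boxes})

    def forward(self, pts):
        # For simplicity every part is evaluated on all points; a real system
        # would query each sub-network only on points inside its box.
        out = torch.zeros(pts.shape[0], 4)
        weight = torch.zeros(pts.shape[0], 1)
        for name, (lo, hi) in self.boxes.items():
            inside = ((pts >= lo) & (pts <= hi)).all(dim=-1, keepdim=True).float()
            out = out + inside * self.parts[name](pts)
            weight = weight + inside
        return out / weight.clamp(min=1.0)   # average overlapping parts

# boxes = {"torso": (torch.tensor([-0.3, -0.5, -0.2]), torch.tensor([0.3, 0.5, 0.2]))}
# rgbd = ComposedHumanNeRF(boxes)(torch.rand(128, 3) - 0.5)
```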
Heart of Machine, in cooperation with ArXiv Weekly Radiostation initiated by Chu Hang, Luo Ruotian, and Mei Hongyuan, has selected more important papers from this week beyond the 7 Papers above, covering the NLP, CV, and ML fields, with audio abstract introductions provided. The details are as follows:
7 NLP Papers
This week’s 7 selected NLP papers are:
1. Active Prompting with Chain-of-Thought for Large Language Models. (from Tong Zhang)
2. Prosodic features improve sentence segmentation and parsing. (from Mark Steedman)
3. ProsAudit, a prosodic benchmark for self-supervised speech models. (from Emmanuel Dupoux)
4. Exploring Social Media for Early Detection of Depression in COVID-19 Patients. (from Jie Yang)
5. Federated Nearest Neighbor Machine Translation. (from Enhong Chen)
6. SPINDLE: Spinning Raw Text into Lambda Terms with Graph Attention. (from Michael Moortgat)
7. A Neural Span-Based Continual Named Entity Recognition Model. (from Qingcai Chen)
10 CV Papers
This week’s 10 selected CV papers are:
1. MERF: Memory-Efficient Radiance Fields for Real-time View Synthesis in Unbounded Scenes. (from Richard Szeliski, Andreas Geiger)
2. Designing an Encoder for Fast Personalization of Text-to-Image Models. (from Daniel Cohen-Or)
3. Teaching CLIP to Count to Ten. (from Michal Irani)
4. Evaluating the Efficacy of Skincare Product: A Realistic Short-Term Facial Pore Simulation. (from Weisi Lin)
5. Real-Time Damage Detection in Fiber Lifting Ropes Using Convolutional Neural Networks. (from Moncef Gabbouj)
6. Embedding Fourier for Ultra-High-Definition Low-Light Image Enhancement. (from Chen Change Loy)
7. Region-Aware Diffusion for Zero-shot Text-driven Image Editing. (from Changsheng Xu)
8. Side Adapter Network for Open-Vocabulary Semantic Segmentation. (from Xiang Bai)
9. VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion. (from Sanja Fidler)
10. Object-Centric Video Prediction via Decoupling of Object Dynamics and Interactions. (from Sven Behnke)
10 ML Papers
This week’s 10 selected ML papers are:
1. normflows: A PyTorch Package for Normalizing Flows. (from Bernhard Schölkopf)
2. Concept Learning for Interpretable Multi-Agent Reinforcement Learning. (from Katia Sycara)
3. Random Teachers are Good Teachers. (from Thomas Hofmann)
4. Aligning Text-to-Image Models using Human Feedback. (from Craig Boutilier, Pieter Abbeel)
5. Change is Hard: A Closer Look at Subpopulation Shift. (from Dina Katabi)
6. AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving. (from Zhifeng Chen)
7. Diverse Policy Optimization for Structured Action Space. (from Hongyuan Zha)
8. The Geometry of Mixability. (from Robert C. Williamson)
9. Does Deep Learning Learn to Abstract? A Systematic Probing Framework. (from Nanning Zheng)
10. Sequential Counterfactual Risk Minimization. (from Julien Mairal)