Pre-training without attention; In-Context Learning driven by GPT

Paper 1: ClimateNeRF: Physically-based Neural Rendering for Extreme Climate Synthesis

  • Author: Yuan Li et al.
  • Paper address: https://arxiv.org/pdf/2211.13226.pdf

Abstract: This paper introduces a new method that fuses physical simulations with NeRF models of scenes to generate realistic movies of physical phenomena in those scenes. In concrete terms, the method can realistically simulate the possible effects of climate change: what would a playground look like after a small-scale flood? After a major flood? After a blizzard?


Recommendation: Fog, winter, and floods: a new NeRF model renders physically realistic blockbusters.

Paper 2: Pretraining Without Attention

  • Author: Junxiong Wang et al.
  • Paper address: https://arxiv.org/pdf/2212.10544.pdf
Abstract:

This paper proposes the Bidirectional Gated SSM (BiGS) model, which combines a routing layer based on the State Space Model (SSM) with a model architecture based on multiplicative gating. Without using attention, it can replicate BERT's pre-training results, and it can be extended to long-range pre-training on 4096 tokens without the need for approximation.
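To make the idea of "SSM routing plus multiplicative gating" concrete, here is a minimal sketch in plain Python. It is not the paper's architecture: the scalar parameters `a`, `b`, and `w_gate` and all function names are hypothetical, and the layers in the paper are learned and considerably richer. The sketch only shows how a linear recurrence, run in both directions, can mix information along a sequence and then be modulated by an input-dependent gate, with no pairwise token comparison.

```python
import math

def ssm_scan(xs, a=0.9, b=0.1):
    """Scalar linear state space recurrence: h_t = a * h_{t-1} + b * x_t."""
    h, out = 0.0, []
    for x in xs:
        h = a * h + b * x
        out.append(h)
    return out

def bidirectional_ssm(xs, a=0.9, b=0.1):
    """Sum a left-to-right and a right-to-left scan so every position sees both sides."""
    fwd = ssm_scan(xs, a, b)
    bwd = ssm_scan(xs[::-1], a, b)[::-1]
    return [f + r for f, r in zip(fwd, bwd)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_block(xs, w_gate=1.0, a=0.9, b=0.1):
    """Multiply the SSM-mixed signal elementwise by a gate computed from the input."""
    mixed = bidirectional_ssm(xs, a, b)
    return [sigmoid(w_gate * x) * m for x, m in zip(xs, mixed)]

# Toy input: a single impulse at the first position (assumed data, for illustration only).
mixed = bidirectional_ssm([1.0, 0.0, 0.0])
gated = gated_block([1.0, 0.0, 0.0])
```

Each scan touches every position once, so the cost of this toy block grows linearly with sequence length, which is the property that makes long inputs tractable for recurrence-based mixing.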

Recommendation:

Pre-training requires no attention, and scaling to 4096 tokens is not a problem, comparable to BERT.

Paper 3: One Model to Edit Them All: Free-Form Text-Driven Image Manipulation with Semantic Modulations

  • Author: Yiming Zhu et al.
  • Paper address: https://arxiv.org/pdf/2210.07883.pdf
Abstract:

Recently, text-guided image editing has made great progress and attracted wide attention, especially with denoising diffusion models such as Stable Diffusion and DALL·E. However, GAN-based text-driven image editing still has open problems. For example, in the classic StyleCLIP, a separate model must be trained for each text prompt, and this one-text-one-model approach is inconvenient in practical applications. This paper proposes FFCLIP to solve the problem: for flexible, varied text inputs, FFCLIP needs only one model to edit the image accordingly, without retraining the model for each text, and it achieves very good results on multiple datasets. The paper has been accepted by NeurIPS 2022.

Recommendation:

A new paradigm for text and picture editing, a single model enables multi-text guided image editing.

Paper 4: SELF-INSTRUCT: Aligning Language Model with Self Generated Instructions

  • Author: Yizhong Wang et al.
  • Paper address: https://arxiv.org/pdf/2212.10560v1.pdf
Abstract:

The University of Washington and other institutions recently published a joint paper. The proposed framework, SELF-INSTRUCT, improves the instruction-following ability of pre-trained language models by bootstrapping from the model's own generations. SELF-INSTRUCT is a semi-automated process that performs instruction tuning on a pre-trained LM using instruction signals from the model itself.
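The bootstrapping loop described above can be sketched roughly as follows. This is an illustrative outline, not the authors' pipeline: `toy_generate` is a hypothetical stand-in for prompting a language model, and the word-overlap similarity and the `max_sim=0.7` threshold are assumptions chosen for the example. The fine-tuning step on the resulting pool is not shown.

```python
def word_overlap(a, b):
    """Jaccard overlap between word sets; a stand-in for a real text similarity metric."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0

def bootstrap_instructions(seed_tasks, generate, rounds=3, max_sim=0.7):
    """Grow an instruction pool from the model's own outputs, skipping near-duplicates."""
    pool = list(seed_tasks)
    for _ in range(rounds):
        for candidate in generate(pool):
            # Keep a candidate only if it is sufficiently different from everything kept so far.
            if all(word_overlap(candidate, kept) < max_sim for kept in pool):
                pool.append(candidate)
    return pool

def toy_generate(pool):
    """Hypothetical stand-in for sampling new instructions from a language model."""
    return [
        "Translate the sentence into French.",
        "Summarize the paragraph in one sentence.",
        "Translate the sentence into French.",  # exact duplicate, filtered out
    ]

seeds = ["Classify the sentiment of the review."]
pool = bootstrap_instructions(seeds, toy_generate, rounds=2)
```

The filter is what keeps the pool from collapsing into repeated instructions: without it, every round would append the same candidates again.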

Recommendation: No manual annotation needed; the self-generated instruction framework breaks the cost bottleneck of LLMs such as ChatGPT.

Paper 5: Ab Initio Calculation of Real Solids via Neural Network Ansatz


  • Author: Xiang Li et al
  • Paper address: https://www.nature.com/articles/s41467-022-35627-1

Abstract: Machine learning can process massive amounts of data, solve scientific problems in complex scenarios, and lead scientific exploration into new areas that were out of reach in the past. For example, DeepMind used the artificial intelligence software AlphaFold to make highly accurate predictions of almost all protein structures known to the scientific community. The deep-learning-based particle image velocimetry (PIV) method proposed by Christian Lagemann greatly expands the application scope of the model compared with the original, purely manual setting of parameters, which is of vital significance to research in many fields such as automobiles, aerospace, and biomedical engineering.

Recently, the work "Ab initio calculation of real solids via neural network ansatz" by the ByteDance AI Lab Research team and Chen Ji's research group at the School of Physics at Peking University offered a new approach to studying condensed matter physics. This work proposes the industry's first neural network wave function suitable for solid systems, realizes first-principles calculations of solids, and pushes the calculation results to the thermodynamic limit. It provides strong evidence that neural networks are efficient tools for studying solid-state physics, and it also indicates that deep learning will play an increasingly important role in condensed matter physics. The research results were published in the international journal Nature Communications on December 22, 2022.

Recommendation: The industry’s first neural network wave function suitable for solid systems was published in a Nature sub-journal.

Paper 6: Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers

  • Author: Damai Dai et al
  • Paper address: https://arxiv.org/pdf/2212.10559v2.pdf

Abstract: In-Context Learning (ICL) has achieved great success on large pre-trained language models, but its working mechanism remains an open question. In this paper, researchers from Peking University, Tsinghua University, and Microsoft interpret ICL as a kind of implicit fine-tuning, and provide empirical evidence that ICL and explicit fine-tuning behave similarly at multiple levels.
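One way to see why attention over demonstrations can look like fine-tuning is the following small numerical check. It uses unnormalized linear attention (softmax removed), which is a simplifying assumption for this sketch, and all names and toy numbers are hypothetical. Gradient descent on a linear layer adds updates that are sums of outer products, and linear attention over the demonstration tokens can be rewritten in exactly that form, as an implicit weight change `delta_w` applied to the query.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(m, v):
    return [dot(row, v) for row in m]

def outer(u, v):
    return [[a * b for b in v] for a in u]

def mat_add(m1, m2):
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def linear_attention(keys, values, query):
    """Unnormalized linear attention: sum_i values[i] * (keys[i] . query)."""
    out = [0.0] * len(values[0])
    for k, v in zip(keys, values):
        score = dot(k, query)
        out = [o + score * x for o, x in zip(out, v)]
    return out

def implicit_update(keys, values):
    """delta_w = sum_i outer(values[i], keys[i]), the same shape as a gradient step on a linear layer."""
    delta = [[0.0] * len(keys[0]) for _ in values[0]]
    for k, v in zip(keys, values):
        delta = mat_add(delta, outer(v, k))
    return delta

# Toy "zero-shot" weights and two demonstration tokens (assumed data).
w0 = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [1.0, 1.0]]
values = [[2.0, 0.0], [0.0, 3.0]]
query = [1.0, 2.0]

# View 1: zero-shot output plus attention over the demonstrations.
attn_view = [a + b for a, b in zip(matvec(w0, query), linear_attention(keys, values, query))]
# View 2: the same query passed through implicitly updated weights.
tuned_view = matvec(mat_add(w0, implicit_update(keys, values)), query)
```

Both views give the same vector, which is the sense in which the demonstrations act like a weight update even though no parameter is actually changed.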

Recommendation: Why does In-Context Learning, driven by GPT, work? The model performs gradient descent secretly.

Paper 7: Experimental Indications of Non-classical Brain Functions

  • Author: Christian Matthias Kerskens et al
  • Paper address: https://iopscience.iop.org/article/10.1088/2399-6528/ac94be

Abstract: For decades, scientists have been exploring how the human brain computes and thinks. However, the structure of the human brain is extremely complex, containing tens of billions of neurons, equivalent to trillions of chips, so its mechanisms are difficult to pin down. Roger Penrose, who won the Nobel Prize in Physics for his contribution to the study of black holes, once boldly proposed the idea of "quantum consciousness": that the human brain itself is a quantum structure, or a quantum computer. This view, however, has been questioned.

A recent study from Trinity College Dublin suggests that our brains perform quantum computations, arguing that there is entanglement in the human brain mediated by brain functions related to consciousness. If these brain functions must operate in a non-classical way, then consciousness is non-classical, i.e. the brain's cognitive processes involve quantum computations.

Recommendation: The brain’s thinking is quantum computing. There is new evidence for this speculation.

ArXiv Weekly Radiostation

Heart of Machine cooperates with ArXiv Weekly Radiostation, initiated by Chu Hang and Luo Ruotian, to select more important papers this week on top of the 7 Papers, including 10 selected papers in each of the NLP, CV, and ML fields, and provides audio summaries of the papers. The details are as follows:

This week's 10 selected NLP papers are:

1. Does unsupervised grammar induction need pixels?. (from Serge Belongie, Kilian Q. Weinberger, Jitendra Malik, Trevor Darrell)

2. Understanding Stereotypes in Language Models: Towards Robust Measurement and Zero-Shot Debiasing. (from Bernhard Schölkopf)

3. Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation. (from Cordelia Schmid, Ivan Laptev)

4. Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment. (from Ruslan Salakhutdinov, Louis-Philippe Morency)

5. Original or Translated? On the Use of Parallel Data for Translation Quality Estimation. (from Dacheng Tao)

6. Toward Human-Like Evaluation for Natural Language Generation with Error Analysis. (from Dacheng Tao)

7. Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?. (from Kyunghyun Cho)

8. On the Blind Spots of Model-Based Evaluation Metrics for Text Generation. (from Kyunghyun Cho)

9. Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval. (from William W. Cohen)

10. The Impact of Symbolic Representations on In-context Learning for Few-shot Reasoning. (from Li Erran Li, Eric Xing)

This week's 10 selected CV papers are:

1. Revisiting Residual Networks for Adversarial Robustness: An Architectural Perspective. (from Kalyanmoy Deb)

2. Benchmarking Spatial Relationships in Text-to-Image Generation. (from Eric Horvitz)

3. A Brief Survey on Person Recognition at a Distance. (from Rama Chellappa)

4. MetaCLUE: Towards Comprehensive Visual Metaphors Research. (from Leonidas Guibas, William T. Freeman)

5. Aliasing is a Driver of Adversarial Attacks. (from Antonio Torralba)

6. Reversible Column Networks. (from Xiangyu Zhang)

7. Hi-LASSIE: High-Fidelity Articulated Shape and Skeleton Discovery from Sparse Image Ensemble. (from Ming-Hsuan Yang)

8. Learning Object-level Point Augmentor for Semi-supervised 3D Object Detection. (from Ming-Hsuan Yang)

9. Unleashing the Power of Visual Prompting At the Pixel Level. (from Alan Yuille)

10. From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models. (from Dacheng Tao, Steven C.H. Hoi)

Statement
This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it deleted.