


Can neural networks do without activation functions? Layer Normalization is itself nonlinear!

The authors of the paper are all from the team of Associate Professor Huang Lei at the School of Artificial Intelligence, Beihang University, and the National Key Laboratory of Complex Critical Software Environment. The first author, Ni Yunhao, is a first-year graduate student; the second author, Guo Yuxin, is a third-year graduate student; and the third author, Jia Junlong, is a second-year graduate student. The corresponding author is Associate Professor Huang Lei (homepage: https://huangleibuaa.github.io/).
The paper "On the Nonlinearity of Layer Normalization", recently published at ICML 2024 by Associate Professor Huang Lei's team at the School of Artificial Intelligence of Beihang University, points out that layer normalization (Layer Normalization, LN) and its computationally simplified variant RMSNorm have nonlinear representation capacity, and it discusses the universal approximate classification ability of LN in detail.
Paper address: https://arxiv.org/abs/2406.01255
To study this further, the authors split LN into two steps: centering and scaling. Centering is mathematically a linear transformation, so the nonlinearity of LN lies mainly in the scaling operation (called spherical projection in the paper, and the operation that RMSNorm performs). Taking XOR, the simplest linearly inseparable dataset, as an example, the authors correctly classify its four points using only linear transformations and spherical projection.
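The decomposition can be illustrated with a minimal pure-Python sketch. It is not the construction from the paper: the function names, the weights in `affine` (a linear map plus a bias), and the threshold 1.2 are assumptions chosen for this illustration, and the learnable gain and bias of a real LN layer are left out.

```python
import math

def center(x):
    """Centering: subtract the mean of the vector (a linear map)."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]

def scale(x):
    """Scaling (spherical projection): divide by the root mean square,
    as RMSNorm does without its learnable gain. No epsilon is added,
    so an all-zero input is not handled in this sketch."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return [v / rms for v in x]

def layer_norm(x):
    """LN without learnable parameters: centering followed by scaling."""
    return scale(center(x))

def add(a, b):
    return [u + v for u, v in zip(a, b)]

def close(a, b):
    return all(math.isclose(u, v, abs_tol=1e-12) for u, v in zip(a, b))

# Centering is additive; scaling is not.
a, b = [1.0, 2.0, 4.0], [0.0, 3.0, -1.0]
centering_is_additive = close(center(add(a, b)), add(center(a), center(b)))
scaling_is_additive = close(scale(add(a, b)), add(scale(a), scale(b)))

def affine(x):
    """Hypothetical affine map (assumed weights, not from the paper):
    first output is x1 + x2 - 1, second output is the constant 1."""
    x1, x2 = x
    return [x1 + x2 - 1.0, 1.0]

def classify(x):
    """Affine map, then spherical projection, then a linear threshold
    on the second coordinate (1.2 is an assumed value)."""
    y = scale(affine(x))
    return 1 if y[1] > 1.2 else 0

xor_data = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
predictions = {point: classify(point) for point in xor_data}

print(centering_is_additive, scaling_is_additive)
print(predictions == xor_data)
```

In this sketch the affine map sends the two class-1 points to (0, 1) and the two class-0 points to (-1, 1) and (1, 1). The projection rescales all of them to the same norm, which pushes the second coordinate of the class-1 points to about 1.41 and leaves that of the class-0 points at 1, so one threshold separates the classes.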




