
How to improve storage and transmission efficiency? Parameter-efficient masking networks prove highly effective

王林 (forwarded)
2023-04-12 17:10:03

To handle increasingly complex tasks, neural networks have grown steadily in scale in recent years, making efficient storage and transmission of networks increasingly important. At the same time, since the Lottery Ticket Hypothesis (LTH) was proposed, random sparse neural networks have shown strong potential, and how to exploit this potential to improve the storage and transmission efficiency of networks is also worth exploring.

Researchers from Northeastern University and Rochester Institute of Technology proposed Parameter-Efficient Masking Networks (PEMN). The authors first explore the representational capability of random networks generated from a limited number of random numbers. Experiments show that even when a network is generated from only a limited number of random numbers, it still achieves good representational capability through the choice of different subnetwork structures.

Based on this exploratory experiment, the authors naturally propose to express a neural network with a limited set of random numbers serving as a prototype, combined with a set of masks. Because a small set of random numbers plus binary masks occupies very little storage space, the authors use this to propose a new approach to network compression. The paper has been accepted at NeurIPS 2022, and the code is open-sourced.


  • Paper: https://arxiv.org/abs/2210.06699
  • Code: https://github.com/yueb17/PEMN
1. Related research

  • MIT researchers proposed the Lottery Ticket Hypothesis (ICLR'19): a randomly initialized network contains a "winning ticket" subnetwork that achieves good performance when trained in isolation. The hypothesis explores the trainability of random sparse networks.
  • Uber researchers proposed Supermasks (NeurIPS'19): a randomly initialized network contains a subnetwork that can be used directly for inference without any training. Supermasks explore the usability of random sparse networks.
  • Researchers at the University of Washington proposed Edge-Popup (CVPR'20), which learns the subnetwork mask through backpropagation, greatly improving the usability of random sparse networks.
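The core mechanism behind Edge-Popup can be sketched in a few lines of PyTorch: every frozen random weight gets a learnable score, the forward pass keeps only the top-scoring weights, and a straight-through estimator lets gradients update the scores. The sketch below is a simplified illustration of that idea; the class names, the 0.01 score scale, and the 50% default sparsity are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GetSubnet(torch.autograd.Function):
    @staticmethod
    def forward(ctx, scores, sparsity):
        # keep the top (1 - sparsity) fraction of weights, ranked by score
        k = max(1, int((1 - sparsity) * scores.numel()))
        threshold = torch.topk(scores.flatten(), k).values.min()
        return (scores >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        # straight-through estimator: gradients flow to the scores unchanged
        return grad_output, None


class MaskedLinear(nn.Linear):
    """Linear layer with frozen random weights and one learnable score per weight."""

    def __init__(self, in_features, out_features, sparsity=0.5):
        super().__init__(in_features, out_features, bias=False)
        self.weight.requires_grad = False                  # weights are never trained
        self.scores = nn.Parameter(torch.randn_like(self.weight) * 0.01)
        self.sparsity = sparsity

    def forward(self, x):
        mask = GetSubnet.apply(self.scores, self.sparsity)
        return F.linear(x, self.weight * mask)
```

Training such a layer updates only `scores`; the random weights themselves stay fixed, which is exactly why the subnetwork selection, rather than weight learning, carries the representational ability.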

2. Research motivation/process

The related work above explores the potential of random sparse networks from different angles, such as trainability and usability, where usability can also be understood as representational ability. In this work, the authors are interested in how well a neural network generated from random numbers can represent data without its weights ever being trained. Following this question, the authors propose Parameter-Efficient Masking Networks (PEMN). Naturally, the authors then use PEMN to provide a new approach to network compression, as an example of PEMN's potential application scenarios.

3. Exploring the representation ability of neural networks composed of random numbers

Given a random network, the authors use the Edge-Popup algorithm to select subnetworks in order to explore its representational ability. The difference is that, instead of randomly initializing the entire network, the authors propose three parameter-efficient network generation strategies that build the random network from a prototype:

  • One-layer: select the weights of one repeated structural block in the network as the prototype and copy them to fill the other layers with the same structure.
  • Max-layer padding (MP): select the layer with the largest number of parameters as the prototype, and truncate it as needed to fill the other layers.
  • Random vector padding (RP): select a random vector of a given length as the prototype and tile it to fill the entire network (see the sketch below).
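As a rough illustration of the finest-grained strategy, random vector padding, the sketch below tiles a short prototype vector to fill every parameter tensor of a model, so the whole network contains at most as many unique values as the prototype has entries. The function name, the toy model, and the prototype length are illustrative assumptions, not code from the PEMN repository.

```python
import torch
import torch.nn as nn


def fill_with_prototype(model: nn.Module, prototype: torch.Tensor):
    """Tile a short random prototype vector to fill every parameter tensor."""
    with torch.no_grad():
        for param in model.parameters():
            n = param.numel()
            repeats = (n + prototype.numel() - 1) // prototype.numel()
            param.copy_(prototype.repeat(repeats)[:n].view_as(param))


# Example: a tiny MLP whose few thousand parameters all come from 100 random numbers.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
prototype = torch.randn(100)
fill_with_prototype(model, prototype)

unique_vals = torch.unique(torch.cat([p.flatten() for p in model.parameters()]))
print(unique_vals.numel())  # at most 100 unique values in the whole network
```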

These three strategies progressively reduce the number of unique values in the network. The authors then select subnetworks from the random networks obtained with each strategy, thereby exploring the representational potential of random networks generated from a limited number of random numbers.

[Figure: accuracy of subnetworks selected from random networks built with different generation strategies (ConvMixer and ViT on CIFAR-10)]

The figure above shows image classification results on CIFAR-10 using ConvMixer and ViT networks. The Y-axis is accuracy, and the X-axis shows random networks obtained with the different strategies; from left to right, the number of unique values in the network decreases.

From these results, we observe that even when the random network contains only a very limited number of non-repeating random numbers (e.g., RP_1e-3), the selected subnetworks still maintain good representational ability. In short, through different random network generation strategies, the authors explored the representational ability of neural networks composed of a limited number of random numbers, and observed that even when the non-repeating random numbers are very few, the corresponding random network can still represent the data well.

Based on these random network generation strategies, combined with the learned subnetwork masks, the authors propose a new type of neural network, called Parameter-Efficient Masking Networks (PEMN).

4. A new network compression idea

This article takes neural network compression as an example to illustrate the potential applications of PEMN. Specifically, the random network generation strategies proposed here can efficiently represent a complete random network with a prototype, especially the most fine-grained strategy, random vector padding (RP).

The authors use the random-vector prototype from the RP strategy together with a corresponding set of subnetwork masks to represent a random network. The prototype must be stored in floating-point format, while the masks only need to be stored in binary format. Because the prototype in RP can be very short (a limited number of non-repeating random numbers still provides strong representational ability), the cost of representing a neural network becomes very small: one short floating-point random vector plus a set of binary masks. Compared with traditional sparse networks that store the floating-point values of their subnetworks, this provides a new way to efficiently store and transmit neural networks.
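To make the storage argument concrete, the back-of-the-envelope sketch below compares storing all float32 weights against storing a short float32 prototype plus one bit per weight for the binary mask. The parameter count and prototype length are illustrative assumptions, not numbers reported in the paper.

```python
import numpy as np


def storage_bytes_dense(num_weights: int) -> int:
    return num_weights * 4                     # float32: 4 bytes per weight


def storage_bytes_pemn(num_weights: int, prototype_len: int) -> int:
    mask_bytes = (num_weights + 7) // 8        # 1 bit per weight, bit-packed
    return prototype_len * 4 + mask_bytes


num_weights = 11_000_000      # roughly ResNet-18-scale parameter count (assumption)
prototype_len = 1_000         # a short random-vector prototype (assumption)

dense = storage_bytes_dense(num_weights)
pemn = storage_bytes_pemn(num_weights, prototype_len)
print(f"dense fp32: {dense / 1e6:.1f} MB, prototype + mask: {pemn / 1e6:.1f} MB")
# dense fp32: 44.0 MB, prototype + mask: 1.4 MB  (about a 32x reduction)

# The binary mask itself can be bit-packed, e.g. with numpy:
mask = np.random.rand(num_weights) > 0.5
packed = np.packbits(mask)    # stores 8 mask entries per byte
```

Under these assumptions the mask dominates the storage cost, so the achievable compression ratio approaches the 32x gap between a float32 weight and a single mask bit, before any further mask compression.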

[Figure: compression results of PEMN compared with traditional pruning methods (ResNet on CIFAR)]

In the figure above, the authors use PEMN to compress networks and compare it with traditional network pruning methods, using ResNet for image classification on the CIFAR datasets. The new compression scheme generally outperforms traditional pruning; in particular, at very high compression rates PEMN still maintains good accuracy.

5. Conclusion

Inspired by the potential recently demonstrated by random networks, this paper proposes different parameter-efficient strategies for constructing random neural networks, explores the representational potential of random networks generated from only a limited number of non-repeating random numbers, and proposes Parameter-Efficient Masking Networks (PEMN). The authors apply PEMN to the network compression scenario to explore its practical potential and to provide a new idea for network compression. Extensive experiments show that even when a random network contains only a very limited number of non-repeating random numbers, it still achieves good representational ability through subnetwork selection. In addition, compared with traditional pruning algorithms, the proposed method achieves better compression, verifying PEMN's application potential in this scenario.

