A snapshot can restore a video! AAAI 2023 paper proposes a new algorithm for snapshot compressive imaging
This article is reprinted with the authorization of AI New Media Qubit (public account ID: QbitAI). Please contact the source for reprinting.
With the development of optical algorithms, we can now "capture" high-dimensional signals using low-dimensional sensors.
For example, this is a "photo" we took with a 2D sensor, which looks full of noisy data:
However, it is through the data contained in this "photo" that we can restore a dynamic video!
It sounds amazing, but a method called Snapshot Compressive Imaging (SCI) really can do this.
This method can sample high-dimensional data as two-dimensional measurements, thereby achieving efficient acquisition of high-dimensional visual signals.
Take a camera as an example. Its sensor is only 2D, but if a digital micromirror device (DMD, a device that can precisely control light) is added behind the camera lens, an ordinary camera can take dimensionality-reducing measurements of high-dimensional data, record simple 2D data, and later restore the high-dimensional 3D visual signal from it.
For example, an ordinary camera has a very low frame rate and can only take a few dozen photos per second (say, 30).
When we want to shoot fast-moving objects, adding this digital micromirror device to an ordinary camera compresses the video signal along the time dimension. Each photo we take can then be restored into several or even dozens of frames (that is, into a video).
Assume the compression rate preset for the digital micromirror device is 10. Taking one photo can then restore 10 frames (a 10-frame video), so the camera's effective frame rate also increases 10-fold, to 300 frames per second.
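The time-dimension compression described above can be sketched in a few lines of NumPy. This is only a toy illustration, not the paper's code: it assumes the commonly used video SCI model in which each frame is multiplied by its own binary mask and the masked frames are summed into one snapshot. The function name `sci_measure`, the random masks, and the 8x8 frame size are all made up for the example.

```python
import numpy as np

def sci_measure(video, masks, noise_std=0.0, rng=None):
    """Compress T frames into one 2D snapshot: y = sum_t masks[t] * video[t] + noise."""
    if video.shape != masks.shape:
        raise ValueError("video and masks must both have shape (T, H, W)")
    y = (masks * video).sum(axis=0)
    if noise_std > 0:
        rng = np.random.default_rng() if rng is None else rng
        y = y + rng.normal(0.0, noise_std, size=y.shape)
    return y

rng = np.random.default_rng(0)
T, H, W = 10, 8, 8                      # compression rate 10, tiny toy frames
video = rng.random((T, H, W))           # stand-in for 10 high-speed frames
# Assumption: random binary patterns play the role of the DMD modulation.
masks = rng.integers(0, 2, size=(T, H, W)).astype(float)
snapshot = sci_measure(video, masks)

camera_fps = 30                         # the ordinary camera from the text
effective_fps = camera_fps * T          # one snapshot stands for T frames
print(snapshot.shape, effective_fps)    # (8, 8) 300
```

Ten frames go in and a single 8x8 image comes out, which is why the snapshot looks like noise even though it still carries information about every frame.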
Now the question is: how can the original high-dimensional signal be recovered as efficiently as possible from these compressed, noisy, low-dimensional measurements?
With the development of deep learning, various reconstruction algorithms have been proposed. However, the signals these algorithms reconstruct are still not accurate or stable enough.
To this end, researchers from the University of Hong Kong, the Chinese Academy of Sciences, and Westlake University proposed a Deep Equilibrium Models (DEQ) method for video snapshot compressive imaging, which has been accepted to AAAI 2023:
This method not only improves reconstruction accuracy and stability, but also further optimizes the memory footprint:
The algorithm requires only constant memory during training and testing. That is, with deep learning, the memory it consumes does not grow with the depth of the network (and with traditional optimization methods, the memory consumed does not grow with the number of iterations).
Let's take a look.
Thanks to novel optical hardware and imaging algorithms, a Snapshot Compressive Imaging (SCI) system can sample high-dimensional data as a two-dimensional measurement in a single snapshot, achieving efficient acquisition of high-dimensional visual signals.
As shown in Figure 1, the SCI system can be divided into two parts, hardware encoding and software decoding:
Taking video capture as an example: through hardware encoding, the SCI system samples the video data and compresses it along the time dimension; algorithms are then used to reconstruct the original high-dimensional video data.
Consider a video SCI system, as shown in Video 1. The upper part of the video shows the compressed measurements obtained by the hardware part of the SCI system, and the lower part shows the video recovered using the algorithm proposed in the paper.
Clearly, the entire imaging process requires solving an inverse problem: how to recover video from noisy compressed measurements.
Although there are many reconstruction methods that can solve the inverse problem of SCI imaging, these methods each have their own shortcomings, as shown in Figure 2:
△Figure 2. Existing methods and main issues of SCI reconstruction
Among them, traditional optimization algorithms (a) are limited in performance.
With the development of deep learning, end-to-end deep networks (b) and unfolding methods (c) can improve performance, but their memory requirements inevitably grow as the network gets deeper, and they require careful model design.
The Plug-and-Play (PnP) framework (d) enjoys the advantages of data-driven regularization and flexible iterative optimization, but it needs appropriate parameter settings to guarantee accurate results, and sometimes even complex strategies to reach satisfactory performance.
In comparison, the paper proposes two new algorithms, DE-RNN and DE-GAP, which ensure the accuracy and stability of the reconstruction results, with performance converging to a higher level, as shown in Figure 3:
△Figure 3. Comparison of reconstruction results between DE-GAP and other methods
Generally speaking, the reconstruction results of earlier methods such as RNN and PnP are unstable, and performance can even deteriorate over long runs of iterations.
In contrast, the DE-GAP reconstruction keeps improving as the number of iterations increases, and eventually converges to a stable result.
How is this done?
To solve the problems of previous methods and achieve more advanced SCI reconstruction, this paper proposes a new idea for the first time:
Use the DEQ model to solve the inverse problem of video SCI reconstruction.
The DEQ model was first proposed in 2019 and has mainly been applied to large-scale, long-sequence tasks in natural language processing.
As shown in Figure 4, the DEQ model directly solves for fixed points in both forward and backward propagation using root-finding methods such as Newton's method. It therefore uses only constant memory while effectively implementing an infinitely deep network:
△Figure 4. Fixed point method for solving the DEQ model (left) and constant-level memory usage (right)
(Figure 4 is from the paper: S. Bai et al, "Deep equilibrium models", NeurIPS 2019.)
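To make the fixed-point idea concrete, here is a minimal sketch in NumPy. It is not the DEQ implementation: it swaps the Newton-style root finder mentioned above for plain forward iteration, and the toy layer `tanh(W z + x)`, the helper `solve_fixed_point`, and the scaling of `W` are assumptions chosen so that the iteration is guaranteed to converge.

```python
import numpy as np

def solve_fixed_point(f, z0, tol=1e-8, max_iter=500):
    """Iterate z <- f(z) until the update is smaller than tol."""
    z = z0
    for k in range(1, max_iter + 1):
        z_next = f(z)
        if np.linalg.norm(z_next - z) < tol:
            return z_next, k
        z = z_next                      # only the latest iterate is kept
    return z, max_iter

rng = np.random.default_rng(0)
n = 16
W = rng.standard_normal((n, n))
W *= 0.5 / np.linalg.norm(W, 2)         # spectral norm 0.5, so the layer is a contraction
x = rng.standard_normal(n)              # fixed input injected at every "layer"

def layer(z):
    """One weight-tied layer; applying it forever is the 'infinitely deep' network."""
    return np.tanh(W @ z + x)

z_star, iters = solve_fixed_point(layer, np.zeros(n))
residual = np.linalg.norm(layer(z_star) - z_star)
print(iters, residual)
```

The loop stores only the current iterate, so the memory used in this forward pass does not depend on how many times the layer is applied. In this sketch that is the whole story; a trainable DEQ additionally needs gradients at the fixed point, which the original DEQ paper obtains through implicit differentiation rather than by storing every intermediate layer.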
Specifically, this paper applies the DEQ model to two existing video SCI reconstruction frameworks: RNN and PnP.
The results are also very good. For RNN, this is equivalent to realizing an infinitely deep network using only constant memory. For PnP, it is equivalent to running infinitely many iterative optimization steps, with the fixed point solved for directly during the iterative optimization.
As shown in Figure 5, the paper designs iterative functions combined with the DEQ model for RNN and PnP respectively, where x is the reconstruction result, y is the compression measurement, and Φ is the measurement matrix:
△Figure 5. The iterative functions of RNN and PnP combined with the DEQ model respectively
(For the detailed derivation and the forward and backward propagation, please see the paper)
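For readers who want a feel for what such an iterative function looks like, below is a rough sketch of a plug-and-play iteration in the GAP style, using the same symbols x, y and Φ. It is an assumption-laden illustration rather than the paper's formulation: it uses the standard GAP projection x + Φᵀ(ΦΦᵀ)⁻¹(y − Φx), replaces the learned denoiser with a hypothetical 5-point averaging filter (`box_denoise`), and simply repeats the update until it stops changing instead of using the paper's solver.

```python
import numpy as np

def forward(x, masks):
    """Phi x: mask every frame and sum over time, giving one 2D snapshot."""
    return (masks * x).sum(axis=0)

def gap_step(x, y, masks, denoise):
    """One update: x <- D(x + Phi^T (Phi Phi^T)^-1 (y - Phi x))."""
    phi_phi_t = (masks ** 2).sum(axis=0)                  # diagonal for video SCI
    phi_phi_t = np.where(phi_phi_t > 0, phi_phi_t, 1.0)   # guard pixels no mask opens
    residual = (y - forward(x, masks)) / phi_phi_t
    return denoise(x + masks * residual)

def box_denoise(x):
    """Hypothetical stand-in denoiser: average each pixel with its 4 spatial neighbours."""
    p = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="edge")
    return (p[:, 1:-1, 1:-1] + p[:, :-2, 1:-1] + p[:, 2:, 1:-1]
            + p[:, 1:-1, :-2] + p[:, 1:-1, 2:]) / 5.0

rng = np.random.default_rng(0)
T, H, W = 8, 16, 16
masks = rng.integers(0, 2, size=(T, H, W)).astype(float)  # assumed random binary masks
truth = np.tile(np.linspace(0.0, 1.0, W), (T, H, 1))      # smooth toy video
y = forward(truth, masks)                                 # noise-free measurement

x = masks * y                                             # start from Phi^T y
for k in range(200):
    x_next = gap_step(x, y, masks, box_denoise)
    step = np.linalg.norm(x_next - x)
    x = x_next
    if step < 1e-6:                                       # update has (nearly) stopped
        break

# With an identity "denoiser" the step is a pure projection onto {x : Phi x = y}.
projected = gap_step(x, y, masks, lambda v: v)
print(np.abs(forward(projected, masks) - y).max())
```

The point of the sketch is the shape of the computation: the same function is applied over and over to the current estimate, so the reconstruction can be viewed as the fixed point of that function, which is the quantity a DEQ-style solver targets directly.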
The paper ran experiments on six classic SCI datasets and on real data. Compared with previous methods, the overall reconstruction results are better.
As shown in Table 1, on average this method improves PSNR by approximately 0.1 dB and SSIM by approximately 0.04. The SSIM improvement indicates that the method can reconstruct images with relatively fine structures:
△Table 1. PSNR (dB) and SSIM of different algorithms on six classic datasets for video SCI reconstruction
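As a reminder of what the PSNR column measures, here is a small sketch of the usual definition, 10·log10(peak² / MSE). The function name `psnr` and the assumption that pixel values lie in [0, 1] are choices made for this example; the paper's evaluation script may differ in details such as the peak value or per-frame averaging.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / mean squared error)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")             # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

clean = np.zeros((4, 4))
off_by_tenth = np.full((4, 4), 0.1)     # every pixel is wrong by 0.1
value = psnr(clean, off_by_tenth)       # MSE = 0.01, so about 20 dB
print(round(value, 6))
```

Because the scale is logarithmic, a 0.1 dB gain corresponds to a mean squared error that is roughly 2.3% lower (a factor of 10^(-0.01) ≈ 0.977).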
Figure 6 compares the reconstruction results of different algorithms on the classic datasets; some details are rendered more smoothly and clearly:
△Figure 6
Figure 7 compares the reconstruction results of different algorithms on real data, where the method also performs better:
△Figure 7
More experimental results can be found in the paper.
At present, the code of the paper has been open sourced, and interested friends can use it~
(The author’s explanation video is also attached at the end of the article to explain it in simple terms)
Paper address:
https://www.php.cn/link/b8002139cdde66b87638f7f91d169d96
Code address:
https://www.php.cn/link/fa95123aa5f89781ed4e89a55eb2edcc
Paper explanation video by author:
English: https://www.bilibili.com/video/BV1X54y1g7D9/
Chinese: https://www.bilibili.com/video/BV1V54y137QK/
Cantonese: https://www.bilibili.com/video/BV1224y1G7ee/