


Today I will introduce a paper published by NTU in April this year. It examines how independent prediction (channel independent) and joint prediction (channel dependent) differ in effect on multivariate time series forecasting problems, the reasons behind the difference, and the optimization methods.
Paper title: The Capacity and Robustness Trade-off: Revisiting the Channel Independent Strategy for Multivariate Time Series Forecasting
Download address: https://arxiv.org/pdf/2304.05206v1.pdf
1. Independent forecasting and joint forecasting
In multivariate time series forecasting, there are two ways to handle the variable dimension. One is independent prediction (channel independent, CI), which treats a multivariate sequence as multiple univariate prediction problems and models each variable separately. The other is joint prediction (channel dependent, CD), which models multiple variables together and takes the relationships between variables into account. The accompanying figure illustrates the difference between the two.
The two methods have their own characteristics. The CI method considers only a single variable, so the model is simpler, but its ceiling is also lower: because it ignores the relationships between sequences, it loses some key information. The CD method takes more comprehensive information into account, but the model is also more complex.
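To make the two strategies concrete, here is a minimal sketch using plain least-squares linear forecasters in NumPy. The helper names (make_windows, fit_ci, fit_cd, and so on) are hypothetical and are not taken from the paper; the sketch only illustrates how the input and output shapes differ between CI and CD.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice a (T, C) array into (N, lookback, C) inputs and (N, horizon, C) targets."""
    n = series.shape[0] - lookback - horizon + 1
    X = np.stack([series[i:i + lookback] for i in range(n)])
    Y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, Y

def fit_ci(X, Y):
    """CI: one least-squares map per channel, from its own history to its own future."""
    n_channels = X.shape[2]
    return [np.linalg.lstsq(X[:, :, c], Y[:, :, c], rcond=None)[0]
            for c in range(n_channels)]

def predict_ci(weights, X):
    """Apply each channel's map to that channel only; returns (N, horizon, C)."""
    return np.stack([X[:, :, c] @ W for c, W in enumerate(weights)], axis=2)

def fit_cd(X, Y):
    """CD: a single map from the history of all channels to the future of all channels."""
    n = X.shape[0]
    return np.linalg.lstsq(X.reshape(n, -1), Y.reshape(n, -1), rcond=None)[0]

def predict_cd(W, X, horizon):
    """Flatten all channels' history, apply the joint map, reshape to (N, horizon, C)."""
    n, _, n_channels = X.shape
    return (X.reshape(n, -1) @ W).reshape(n, horizon, n_channels)
```

With a lookback of 8, a horizon of 4 and 3 channels, the CI weights total 3 x 8 x 4 = 96 parameters, while the single CD matrix has (8 x 3) x (4 x 3) = 288. This is one simple way to see the higher capacity of the CD model.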
2. Which method is better
The paper first runs a detailed comparative experiment, using linear models to observe the effects of the CI and CD methods on multiple datasets and determine which is better. A main conclusion of these experiments is that the CI method performs better on most tasks and gives more stable results. As the accompanying figure shows, CI's MAE, MSE, and other metrics are generally lower than CD's on each dataset, and they also fluctuate less.
The experimental results also show that, compared with CD, CI improves the results for most prediction window lengths and datasets.
Why is the CI method better and more stable than CD in practical applications? The paper gives some theoretical proofs, and the core conclusion is that real data often exhibits distribution drift, and that using the CI method helps alleviate this problem and improves model generalization. The accompanying figure shows the ACF (autocorrelation function, which reflects the relationship between the future and historical values of a sequence) of the training set and the test set for each dataset. Distribution drift is widespread across the datasets: the ACF of the training set differs from that of the test set, which means the relationship between history and future differs between the two.
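The train/test ACF comparison described above can be reproduced in a few lines. The following is a rough sketch, not the paper's code: acf computes the standard sample autocorrelation, and acf_gap is a hypothetical helper that summarizes drift as the mean absolute difference between the two ACF curves.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    out = np.empty(max_lag + 1)
    for k in range(max_lag + 1):
        # Correlate the series with a copy of itself shifted by k steps.
        out[k] = np.dot(x[:len(x) - k], x[k:]) / denom
    return out

def acf_gap(series, split, max_lag):
    """Mean absolute ACF difference between series[:split] and series[split:]."""
    train, test = series[:split], series[split:]
    return np.abs(acf(train, max_lag) - acf(test, max_lag)).mean()

# Illustrative data (an assumption, not from the paper): the period changes
# from 20 to 7 at the train/test boundary, so the two ACF curves disagree.
t = np.arange(200)
drifting = np.concatenate([np.sin(2 * np.pi * t / 20), np.sin(2 * np.pi * t / 7)])
print(acf_gap(drifting, split=200, max_lag=20))
```

A gap near zero suggests that history relates to the future in the same way in both splits; a larger gap is the kind of drift the paper points to.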
The paper shows theoretically that CI is effective in mitigating distribution drift. The choice between CI and CD is a trade-off between model capacity and model robustness. Although the CD model is more complex, it is also more sensitive to distribution shift. This is similar to the familiar relationship between model capacity and generalization: the more complex the model, the more closely it fits the training samples, but the worse it may generalize. Once the distributions of the training set and the test set differ substantially, performance degrades.
3. How to optimize
To address the problems of CD modeling, the paper proposes several optimization methods that help make the CD model more robust.
Regularization: introduce a regularization loss. The sequence minus its most recent sample point is used as the historical input to the model, and a smoothness term constrains the prediction so that it does not deviate too much from the most recent observation, which makes the predictions flatter;
Low-rank decomposition: decompose the fully connected parameter matrix into two low-rank matrices, which is equivalent to reducing model capacity, alleviating over-fitting and improving model robustness;
Loss function: use MAE instead of MSE to reduce the model's sensitivity to outliers;
Historical input sequence length: for the CD model, a longer input history may reduce performance, because the longer the history, the more susceptible the model is to distribution shift. For the CI model, increasing the history length improves prediction performance fairly steadily.
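The first three ideas in the list above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, and the exact penalty (a mean squared predicted residual weighted by lam) is assumed here, since the article does not specify the form of the smoothness term.

```python
import numpy as np

def low_rank_forward(x, A, B):
    """Linear layer whose (d_in, d_out) weight is factorized as A @ B.
    x: (n, d_in), A: (d_in, rank), B: (rank, d_out)."""
    return (x @ A) @ B

def param_counts(d_in, d_out, rank):
    """Parameter count of the full matrix versus the two low-rank factors."""
    return d_in * d_out, rank * (d_in + d_out)

def mae(pred, target):
    return np.abs(pred - target).mean()

def mse(pred, target):
    return ((pred - target) ** 2).mean()

def residual_objective(pred_residual, last_value, target, lam=0.1):
    """The model sees the history minus its last value and predicts a residual.
    The forecast adds the last value back; the penalty (assumed form) keeps
    the forecast from straying far from that last observation."""
    forecast = last_value + pred_residual
    return mae(forecast, target) + lam * np.mean(pred_residual ** 2)

# One outlier among ten targets moves MSE far more than MAE.
pred = np.zeros(10)
target = np.zeros(10)
target[0] = 10.0
print(mae(pred, target), mse(pred, target))
```

For example, factorizing a 24 x 12 weight matrix at rank 2 cuts the parameter count from 288 to 72, which is the capacity reduction the low-rank item refers to.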
4. Experimental results
The paper tests the above methods for improving the CD model on multiple datasets. Compared with plain CD, they achieve a fairly consistent improvement, indicating that these methods clearly help the robustness of multivariate sequence prediction. The paper also reports experimental results on how factors such as low-rank decomposition, historical window length, and loss function type affect the results.
