Integrated multi-graph neural network-AI-php.cn

Home

Technology peripherals

Integrated multi-graph neural network

王林

May 10, 2023 pm 09:10 PM

gnngraph neural network

Integrated multi-graph neural network

1. GNN from a unified perspective

1. Existing GNN propagation paradigm

How does GNN propagate in the airspace? As shown in the figure below, take node A as an example:

Integrated multi-graph neural network

##First it will identify its neighbors The information of node N (A) is aggregated into a hN(A)(1 ), and then combine it with A to represent the upper layer h# #N(A)₍₁₎^{is combined, and then through a transformation function (i.e. Trans(·) in the formula), the next level representation of A is obtained} hN(A)₍₂₎^{. This is the most basic GCN propagation paradigm.}

Integrated multi-graph neural network

In addition, there is a decoupled propagation process:

Integrated multi-graph neural network

#These two What's the difference?

In the decoupled propagation paradigm, a feature extractor, that is, the transformation function, is first used to extract the initial features, and then the extracted features are put into the aggregation function for aggregation. You can see this This method separates feature extraction and aggregation, that is, decoupling is achieved. The advantage of this is:

The above are the two main paradigms, and the embedding output of the node can use the last layer of the network or the residual of the middle layer. .

Through the above review we can see that there are two basic information sources in GNN:

Based on the propagation mechanism of GNN, we can find that, Existing GNNs have two common goals:

Encode useful information from the characteristics of nodes.
#Use the smoothing capabilities of topology.

So can we use mathematical language to describe these two goals? Someone proposed the following GNN optimization unified framework expressed by the formula:

Integrated multi-graph neural network

## The first item in the optimization goal:

Integrated multi-graph neural network

is the feature fitting item, and its goal is to make the learned nodes Indicates that Z is as close as possible to the original feature H, and F1, F2 is a graph convolution that can be designed freely nuclear. When the convolution kernel is the identity matrix I, it is equivalent to an all-pass filter; when the convolution kernel is a 0 matrix, it is a low-pass filter, and when the convolution kernel is the Laplacian matrix L, it is a high-pass filter.

The second term of the optimization goal is formally the trace of a matrix, and its function is the regular term on the graph. What is the difference between the trace and the regular term? What about relationships? In fact, the second item is expanded into the following form:

Integrated multi-graph neural network

##The meaning is that on the captured image The degree of difference in features between any two adjacent nodes represents the smoothness of a graph. Minimizing this goal is equivalent to making me and my neighbor more similar.

3. Use a unified optimization framework to understand existing GNN

Most GNNs are optimizing this goal. We can discuss it under different situations:

When the parameters are:

Integrated multi-graph neural network

When the optimization goal becomes:

Integrated multi-graph neural network

## Find the partial derivative and get:

Integrated multi-graph neural network

Let’s take the above The result can be further expanded to get:

Integrated multi-graph neural network

## Its meaning means the Kth layer All node representations are equal to the propagation process of the K-1th layer node representation on the adjacency matrix. After derivation until the end, it will be found that it is equal to the initial feature X propagating K times on the adjacency matrix after completing the feature transformation W*. In fact, this is the model of GCN or SGC with the nonlinear layer removed.

When the parameter F

=F2=I, ζ=1, ξ=1/α-1, α∈(0,q], and when selecting an all-pass filter, the optimization goal becomes:

Integrated multi-graph neural network

At this time, we also seek the partial derivative of Z, and make the partial derivative equal to 0 to obtain the closed-form solution of the optimization objective:

Integrated multi-graph neural network

By slightly converting the result, you can get the following formula:

Integrated multi-graph neural network

We can find that the above formula represents the process of node features propagating on personalized PageRank. That is the PPNP model.

Integrated multi-graph neural network

The same is such a model. If you use gradient descent to find it, and set The step size is b, and the iteration term is the partial derivative of the objective function at time k-1 with respect to Z.

Integrated multi-graph neural network

when

Integrated multi-graph neural network

When you can get:

Integrated multi-graph neural network

##This is the APPNP model. The background of the emergence of the APPNP model is that the inverse operation of the matrix in the PPNP model is too complicated, and APPNP uses iterative approximation to solve it. It can also be understood that APPNP can converge to PPNP because both come from the same framework.

Integrated multi-graph neural network

4. New GNN framework

As long as we design a new fitting term O##fit, and Design a corresponding graph regular term Oreg_{, and add a new solution process to get a new GNN model.}

① Example 1: From all-pass filtering to low-pass filtering

As mentioned before, all-pass filtering Convolution kernel under the device F1##=F2=I , when the convolution kernel is the Laplacian matrix L It is a high pass filter. If the GNN obtained by weighting these two situations can encode low-pass information:

Integrated multi-graph neural network

##when

can get the exact solution:

Integrated multi-graph neural network

## Similarly, we can solve it iteratively:

Integrated multi-graph neural network

#5, Elastic GNN

The regular term mentioned in the previous unified framework is equivalent to the L2 regular, which is equivalent to the difference information between any two points on the calculation graph. Some researchers feel that L2 regularization is too global and will cause the smoothness of the entire graph to tend to be the same, which is not entirely consistent with reality. Therefore, it was proposed to add an L1 regular term. The L1 regular term would penalize relatively large changes in the graph.

Integrated multi-graph neural network

##The L1 regular term part is:

Integrated multi-graph neural network

In short, the unified framework above tells us:

However, this unified framework can only be adapted to homogeneous graph structures. Next, let’s take a look at the more common multi-relationship graph structure.

2. Relational GNN model

1, RGCN

## The so-called multi-relationship graph refers to a graph with an edge type greater than 1, as shown in the figure below.

Integrated multi-graph neural network

This kind of multi-relationship diagram is very widespread in the real world, such as in chemical molecules Multiple types of molecular bonds, such as the different relationships between people in a social relationship diagram. For such graphs we can use Relational Graph Neural Networks to model. The main idea is to individually aggregate graphs with N relationships to obtain N aggregation results, and then aggregate the N results.

Integrated multi-graph neural network

expressed as a formula:

Integrated multi-graph neural network

You can see that aggregation is performed in two steps. First, select a relationship r from all relationships R, and then find the relationship that contains this relationship. All nodes

Nr ^{are aggregated, among which}Wr is the weight used to weight various relationships. Therefore, it can be seen that as the number of relationships in the graph increases, the weight matrix Wr will also increase, which will lead to the problem of over-parameterization ( Over-parameterization). In addition, splitting the topological relationship diagram according to relationships can also lead to over-smoothing.

2, CompGCN

#In order to solve the problem of over-parameterization, CompGCN uses a vectorized relation encoder to replace N relations Matrix:

Integrated multi-graph neural network

The encoder contains relationships in three directions: forward, reverse, and self-loop:

Integrated multi-graph neural network

Every iteration will also update the embedding of the relation.

Integrated multi-graph neural network

But this heuristic design and such a parametric encoder are also Will cause excessive parameterization. Then, based on the above considerations, we get the starting point of our work: Can we design a more reliable GNN from the perspective of optimization goals, and at the same time solve the problems of existing GNNs.

3. Integrated Multi-Relationship Graph Neural Network

Our EMR GNN was published this year. Next, we will mainly focus on Three aspects are discussed on how we design a GCN suitable for multi-relationship graphs:

How to design a suitable integrated optimization algorithm
Message delivery mechanism
How the GNN model is designed

1. Integrated optimization algorithm

This optimization algorithm needs to meet two requirements:

Be able to capture multiple relationships on the graph at the same time
Be able to model the importance of different relationships on the graph

The integrated multi-relationship graph regular term we propose on the multi-relationship graph is as follows:

Integrated multi-graph neural network

This regular term is also to capture the smoothing ability of the graph signal, but this adjacency matrix is captured under the relationship r and is subject to normalization Constrained parameters μ_r is to model the importance of a certain relationship. The second term is the second normal form regularization of the coefficient vector, which is to make the coefficient vector more uniform.

#In order to solve the problem of over-smoothing, we added a fitting term to ensure that the original feature information is not lost. The sum of the fitting term and the regular term is:

Integrated multi-graph neural network

and what was mentioned in the previous chapter Compared with the unified framework, the objective function we design here includes two variables: node correction Z and relationship matrix parameter μ. So it is also a challenge to derive a message propagation mechanism based on such an optimization goal.

2. Derivation of message passing mechanism

Here we adopt an iterative optimization strategy:

First fix the node representation Z, and then optimize the parameter μ
Then optimize the node based on the result μ of the previous iteration Represent Z

When the fixed node represents Z, the entire optimization objective degenerates into an objective function only related to μ, but this is a band Constrained objective function:

Integrated multi-graph neural network

##This is actually a simplex constraint (constraint in a standard A convex function of μ on simplex). For this type of problem, we can use the Mirror Entropic Descent Algorithm to solve it. We will first find a constant, and then update the weight coefficient under each relationship. The entire update process is similar to the exponential gradient descent algorithm.

Integrated multi-graph neural network

Fixed the relationship coefficient μ to update Z, then the optimization goal degenerates into the following This form:

Integrated multi-graph neural network

In this way we find the partial derivative of the objective function with respect to Z, and let If the partial derivative is equal to 0, we can get:

Integrated multi-graph neural network

##Then the closed-form solution of Z is:

Integrated multi-graph neural network

Similarly, we can use iteration to get an approximate solution. This process can be expressed as follows :

Integrated multi-graph neural network

From the derived message passing mechanism we can prove that the design can avoid excessive Smooth and avoid over-parameterization. Below we can look at the proof process.

The original multi-relationship PageRank matrix is defined as follows:

Integrated multi-graph neural network

The personalized multi-relationship PageRank matrix adds a probability of returning its own node on this basis:

Integrated multi-graph neural network

By solving the above circular equation, you can get the multi-relationship personalized PageRank matrix:

Integrated multi-graph neural network

Let us:

You can get:

Integrated multi-graph neural network

##this That is the closed-form solution obtained by our proposed solution. That is to say, our propagation mechanism can be equivalent to propagating feature H on the personalized PageRank matrix of the node. Because in this propagation mechanism, a node can return to its own node with a certain probability, which means that its own information will not be lost during the information transmission process, thereby preventing the problem of over-smoothing.

In addition, our model also alleviates the phenomenon of over-parameterization, because as can be seen from the formula, for each relationship our model only has A learnable coefficient μ_r, Compared with the number of parameters of the previous encoder or weight matrix w#r, the magnitude of our parameters is almost negligible. The following figure is the model architecture we designed:

Integrated multi-graph neural network

where RCL is the parameter learning step, and the Pro step is the feature propagation step. These two steps together form our messaging layer. So how do we integrate our messaging layer into the DNN without introducing more additional parameters? We also follow the decoupling design idea: first use an MLP to extract input features, and then pass through multiple layers of message passing layers we designed. Superimposing multiple layers will not lead to over-smoothing. The final transfer result is processed by MLP to complete node classification and can be used for downstream tasks. Use the formula to express the above process as follows:

Integrated multi-graph neural network

##f(X;W) means that The input features are extracted through MLP, and the following EnMP(K) means that the extraction results are passed through the K layer message, gθ represents a classified MLP.

In back propagation we only need to update the parameters in the two MLPs, while the parameters in our EnMP are learned during the forward propagation process Yes, there is no need to update any parameters of EnMP during the backward propagation process.

We can compare the parameters of different mechanisms. We can see that the parameters of our EMR-GNN mainly come from the two MLPs before and after, and the relation coefficient. When the number of layers is greater than 3, we can know that the number of parameters of EMR-GNN is less than that of GCN, and even less than other heterogeneous graphs.

Integrated multi-graph neural network

With such a small number of parameters, our EMR-GNN operates at different nodes as follows: The best level can still be achieved under classification tasks.

Integrated multi-graph neural network

In addition, we also compared the changes in classification accuracy of different network structures after the number of layers increased. As shown in the figure below, when the number of layers increased to 64, our model can still maintain a high level Accuracy, while the original RGCN will encounter insufficient memory when the number of layers increases to more than 16 layers, and it is impossible to superimpose more layers. This is due to too many parameters. The performance of the GAT model is degraded due to over-smoothing.

Integrated multi-graph neural network

In addition, we also found that our EMR-GNN performed better under smaller data sizes. It can achieve the classification accuracy of the whole sample, while RGCN drops a lot.

We also analyzed whether the relationship coefficient μr learned by EMR-GNN is really meaningful, so what is meaningful? We hope that the relationship coefficient μ_rrelationship coefficient Give more weight to important relationships and less weight to unimportant relationships. The results of our analysis are shown in the figure below:

Integrated multi-graph neural network

##The green histogram represents a The effect of classification under a certain relationship. If the classification accuracy is higher under a certain relationship, we can think that the relationship is important. The blue columns represent the relationship coefficients learned by our EMR-GNN. From the blue-green comparison, we can see that our relationship coefficient can reflect the importance of the relationship.

Finally we also made a visual display as shown below:

Integrated multi-graph neural network

You can see that the node embedding trained by EMR-GNN can carry the structured information of the node, which can make the nodes of the same type more cohesive and different types as much as possible. Separate, the segmentation boundary is clearer than other networks.

##4. Summary

1. We use a unified perspective to understand GNN

① From this perspective, we can easily see what problems there are with existing GNN

②This unified perspective can give us a way to redesign GNN Basics

#2. We try to design a new multi-relationship GNN from the perspective of the objective function

## ① We First, an integrated optimization framework was designed

## ② Based on such an optimization framework, we derived a message passing mechanism

③ This message passing mechanism with a small number of parameters is simply combined with MLP to get EMR-GNN

3. EMR-GNN has What are the benefits?

## ① It relies on a reliable optimization goal, so the results obtained are credible, and its underlying principles can be explained mathematically

② It can solve the over-smoothing problem of the existing Relation GNN

## ③ Solve the over-parameterization problem

④ Easy to train, better results can be obtained with smaller parameter quantities

5. Q&A session

Q1: Is there any difference between the learning of relationship coefficients and the attention mechanism?

A1: The relationship coefficient learning here is an update process derived through the optimization framework, and attention is a process that needs to be learned based on backpropagation, so in terms of optimization, both There is a fundamental difference.

#Q2: How applicable is the model on large-scale data sets?

A2: We analyzed the complexity of the model in the appendix. In terms of complexity, we are on the same level as RGCN, but the number of parameters will be smaller than RGCN. , so our model is even more suitable for large-scale data sets.

#Q3: Can this framework incorporate edge information?

#A3: Can be integrated into the fitting term or regular term.

#Q4: Where should I learn these mathematical basics?

#A4: Part of it is based on previous work, and the other part of the mathematical theory related to optimization is also based on some classic optimization papers.

#Q5: What is the difference between a relationship diagram and a heterogeneous diagram?

#A5: The relationship graph is a heterogeneous graph, but we usually think of heterogeneous graphs as those whose node types or edge types are greater than 1. In the relationship diagram, we are particularly concerned about the relationship category greater than 1. It can be understood that the latter includes the former.

Q6: Can mini-batch training be supported?

A6: Supported.

Q7: Is the future research direction of GNN more inclined to strict, interpretable mathematical derivation rather than heuristic design?

#A7: We ourselves feel that strict interpretable mathematical derivation is a reliable design method.

The above is the detailed content of Integrated multi-graph neural network. For more information, please follow other related articles on the PHP Chinese website!

Statement

This article is reproduced at:51CTO.COM. If there is any infringement, please contact admin@php.cn delete

Are You At Risk Of AI Agency Decay? Take The Test To Find OutApr 21, 2025 am 11:31 AM

This article explores the growing concern of "AI agency decay"—the gradual decline in our ability to think and decide independently. This is especially crucial for business leaders navigating the increasingly automated world while retainin

How to Build an AI Agent from Scratch? - Analytics VidhyaApr 21, 2025 am 11:30 AM

Ever wondered how AI agents like Siri and Alexa work? These intelligent systems are becoming more important in our daily lives. This article introduces the ReAct pattern, a method that enhances AI agents by combining reasoning an

Revisiting The Humanities In The Age Of AIApr 21, 2025 am 11:28 AM

"I think AI tools are changing the learning opportunities for college students. We believe in developing students in core courses, but more and more people also want to get a perspective of computational and statistical thinking," said University of Chicago President Paul Alivisatos in an interview with Deloitte Nitin Mittal at the Davos Forum in January. He believes that people will have to become creators and co-creators of AI, which means that learning and other aspects need to adapt to some major changes. Digital intelligence and critical thinking Professor Alexa Joubin of George Washington University described artificial intelligence as a “heuristic tool” in the humanities and explores how it changes

Understanding LangChain Agent FrameworkApr 21, 2025 am 11:25 AM

LangChain is a powerful toolkit for building sophisticated AI applications. Its agent architecture is particularly noteworthy, allowing developers to create intelligent systems capable of independent reasoning, decision-making, and action. This expl

What are the Radial Basis Functions Neural Networks?Apr 21, 2025 am 11:13 AM

Radial Basis Function Neural Networks (RBFNNs): A Comprehensive Guide Radial Basis Function Neural Networks (RBFNNs) are a powerful type of neural network architecture that leverages radial basis functions for activation. Their unique structure make

The Meshing Of Minds And Machines Has ArrivedApr 21, 2025 am 11:11 AM

Brain-computer interfaces (BCIs) directly link the brain to external devices, translating brain impulses into actions without physical movement. This technology utilizes implanted sensors to capture brain signals, converting them into digital comman

Insights on spaCy, Prodigy and Generative AI from Ines MontaniApr 21, 2025 am 11:01 AM

This "Leading with Data" episode features Ines Montani, co-founder and CEO of Explosion AI, and co-developer of spaCy and Prodigy. Ines offers expert insights into the evolution of these tools, Explosion's unique business model, and the tr

A Guide to Building Agentic RAG Systems with LangGraphApr 21, 2025 am 11:00 AM

This article explores Retrieval Augmented Generation (RAG) systems and how AI agents can enhance their capabilities. Traditional RAG systems, while useful for leveraging custom enterprise data, suffer from limitations such as a lack of real-time dat

See all articles

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Assassin's Creed Shadows: Seashell Riddle Solution

3 weeks agoByDDD

What's New in Windows 11 KB5054979 & How to Fix Update Issues

2 weeks agoByDDD

Where to find the Crane Control Keycard in Atomfall

3 weeks agoByDDD

Assassin's Creed Shadows - How To Find The Blacksmith And Unlock Weapon And Armour Customisation

1 months agoByDDD

Roblox: Dead Rails - How To Complete Every Challenge

3 weeks agoByDDD

Hot Tools

SecLists

SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

DVWA

Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

SAP NetWeaver Server Adapter for Eclipse

Integrate Eclipse with SAP NetWeaver application server.

MinGW - Minimalist GNU for Windows

This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.

Safe Exam Browser

Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

Hot Topics

Where is the login entrance for gmail email?

7627

CakePHP Tutorial

1389

What is the format of the account name of steam

win11 activation key permanent

nyt connections hints and answers

140