An Introduction to and Recent Progress on Implicit Graph Neural Networks
Juncheng Liu
National University of Singapore
Advisor: Xiaokui Xiao

Background
A homogeneous graph/network G = (V, E) contains a node set V and an edge set E, along with a node attribute matrix X. Given G and X, several tasks are usually considered in this area:
1) Node classification (e.g., classifying the research area of each paper in a citation network)
2) Graph classification (e.g., molecular property prediction)
3) Link prediction (e.g., predicting missing links in a social network for friendship recommendation)

Background
Explicit Layers vs. Implicit Layers in Deep Learning
Consider an input x and an output z.
Explicit layers specify how to compute the output from the input, e.g., z = σ(Wx + b).
Implicit layers, instead of specifying how to compute the output from the input, specify the desirable condition that the output should satisfy, e.g., find z such that f(x, z) = 0.
[Zico Kolter, David Duvenaud, and Matt Johnson. Deep Implicit Layers: Neural ODEs, Deep Equilibrium Models, and Beyond. NeurIPS 2020 Tutorial]

Implicit Deep Learning
A typical k-layer deep network
applies z^(i+1) = σ(W^(i) z^(i) + b^(i)) layer by layer. An infinitely deep model with tied weights amounts to a fixed-point iteration z^(i+1) = σ(W z^(i) + U x + b), whose output is an equilibrium point z* = σ(W z* + U x + b). This gives a class of implicit-layer models: the Deep Equilibrium (DEQ) model.

DEQ Model
The goal of a DEQ model is to directly find this equilibrium point, without necessarily performing the forward iteration itself. For backpropagation, implicit differentiation directly yields the gradients of the parameters from z*, so there is no need to store hidden states. This achieves a constant O(1) memory cost that does not grow with the number of layers.
[Bai, Shaojie, J. Zico Kolter, and Vladlen Koltun. Deep equilibrium models. NeurIPS 2019]

Implicit Models on Graphs
Implicit Graph Neural Networks (NeurIPS 2020) obtain node representations through the fixed-point solution of a graph propagation. To train the model, a condition is imposed on the parameters W to ensure convergence.
[Gu F, Chang H, Zhu W, Sojoudi S, El Ghaoui L. Implicit graph neural networks. NeurIPS 2020]

EIGNN: Efficient Infinite-Depth Graph Neural Networks
Juncheng Liu, Kenji Kawaguchi, Bryan Hooi, Yiwei Wang, Xiaokui Xiao
National University of Singapore (published at NeurIPS 2021)

Graph Neural Networks (GNNs)
Applications: node classification, link prediction, knowledge graph completion.

Graph Neural Networks (GNNs)
GNNs follow the message-passing mechanism: aggregate the representations of each node and its neighbors
at each layer (e.g., Graph Attention Networks). But can they capture long-range dependencies?

Limitations
With finitely many aggregation layers, existing GNNs cannot effectively capture long-range dependencies in graphs.
GCN: H^(l+1) = σ(S H^(l) W^(l))
SGC: Y = softmax(S^K X W)
where S = D^(-1/2) (A + I) D^(-1/2) is the normalized adjacency matrix, A is the adjacency matrix, and D is the degree matrix with self-loops.
Can we simply stack many layers to capture long-range dependencies? No: that causes oversmoothing [1].
[1] Deeper insights into graph convolutional networks for semi-supervised learning. AAAI 2018.
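As a concrete sketch of the propagation matrices above (the 4-node path graph and K = 2 are illustrative choices; A + I follows the usual self-loop convention):

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetrically normalized adjacency with self-loops:
    S = D^{-1/2} (A + I) D^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

# A 4-node path graph: 0 - 1 - 2 - 3
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

S = normalized_adjacency(A)
X = np.eye(4)  # one-hot node features

# SGC-style K-hop propagation: S^K X (the linear part of SGC,
# before the softmax classifier), here with K = 2.
H = X
for _ in range(2):
    H = S @ H

# The eigenvalues of S lie in [-1, 1], so repeated propagation
# cannot blow up -- but it smooths node features toward each other,
# which is the oversmoothing issue noted on the slide.
eigvals = np.linalg.eigvalsh(S)
print(np.abs(eigvals).max())
```

Since the spectral radius of S is exactly 1, stacking more propagation steps never diverges; the information simply mixes until node representations become nearly indistinguishable.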
Methodology
In this paper, we propose Efficient Infinite-Depth Graph Neural Networks (EIGNN), whose propagation is
Z^(l+1) = γ g(F) Z^(l) S + X,
with an arbitrary γ ∈ (0, 1) and g(F) = F F^T / (||F F^T||_F + ε) for an arbitrarily small ε > 0; F and B are trainable weight parameters. The added X acts as a residual connection, and g(F) provides learnable weights for propagation. But how can we compute this infinite-depth GNN?

Methodology
First, we show that the infinite sequence Z^(0), Z^(1), ... is guaranteed to converge. After obtaining the limit of this sequence, we can compute the model without the iterative solvers on which IGNN [1] heavily relies. Iterative solvers have some commonly known issues, e.g., approximate solutions and sensitive convergence criteria.
[1] Implicit Graph Neural Networks. NeurIPS 2020.

Methodology
Both the forward computation and the backward pass can thus be carried out without iterative solvers. However, the resulting form requires us to compute and store large dense matrices, which is very costly. To reduce the memory complexity, we use the eigendecompositions g(F) = Q Λ Q^T and S = U Λ_S U^T. The forward and backward computation are then obtained as follows:
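The eigendecomposition trick can be illustrated with a minimal NumPy sketch. This is not the paper's exact implementation: it assumes the propagation Z = γ g(F) Z S + X from the earlier slides, uses a random symmetric matrix standing in for the normalized graph matrix S, and toy dimensions. Diagonalizing g(F) and S makes the equilibrium solvable entrywise, which we check against naive fixed-point iteration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 4  # n nodes, d feature dimensions

# Random symmetric matrix rescaled to spectral norm 1,
# standing in for the normalized graph matrix S.
M = rng.standard_normal((n, n))
M = (M + M.T) / 2
S = M / np.abs(np.linalg.eigvalsh(M)).max()

# g(F) = F F^T / (||F F^T||_F + eps): symmetric PSD, spectral norm < 1.
F = rng.standard_normal((d, d))
FFt = F @ F.T
G = FFt / (np.linalg.norm(FFt) + 1e-5)

gamma = 0.8
X = rng.standard_normal((d, n))

# Closed form: with G = Q diag(lam) Q^T and S = U diag(mu) U^T,
# the equilibrium of Z = gamma * G Z S + X satisfies, in those bases,
#   Zt_ij = Xt_ij / (1 - gamma * lam_i * mu_j),  Xt = Q^T X U.
lam, Q = np.linalg.eigh(G)
mu, U = np.linalg.eigh(S)
X_t = Q.T @ X @ U
Z_t = X_t / (1.0 - gamma * np.outer(lam, mu))
Z = Q @ Z_t @ U.T

# The same equilibrium reached by naive fixed-point iteration
# (convergent because gamma * ||G|| * ||S|| < 1).
Z_iter = np.zeros_like(X)
for _ in range(500):
    Z_iter = gamma * G @ Z_iter @ S + X

print(np.allclose(Z, Z_iter, atol=1e-6))  # True
```

Solving in the eigenbases replaces repeated dense matrix products with a single elementwise division, which is the source of the efficiency gain claimed on the slide.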
where the remaining terms are expressed through the eigenvalues and eigenvectors above.

Methodology
However, we still need to deal with an n × n matrix, which is costly; depending on the data dimensions, the memory complexity can be reduced further.

Experimental Results
Node classification on chain datasets, which directly examine the ability to capture long-range dependencies. Efficiency comparison on the same datasets.

Experimental Results
Observation: all GNN models with finitely many layers perform badly. But why does IGNN fail to reach perfect test accuracy? It uses iterative solvers. Comparing finite-depth IGNN with infinite-depth IGNN, finite-depth IGNN performs better as the chain length increases: iterative solvers can accumulate errors in both the forward and backward passes.

Experimental Results
Node classification on real-world datasets (heterophilic graphs). Robustness against adversarial perturbations.

Conclusion
We propose EIGNN to capture long-range dependencies. EIGNN is trained with a closed-form solution rather than iterative solvers, and eigendecomposition makes training more efficient. EIGNN outperforms the baselines in both synthetic and real-world experiments.

What's Next on Implicit GNNs?
Implicit GNNs: EIGNN [1], IGNN [2], CGS [3].
[1] Juncheng Liu, Kenji Kawaguchi, Bryan Hooi, Yiwei Wang, and Xiaokui Xiao. EIGNN: Efficient infinite-depth graph neural networks. In NeurIPS, 2021.
[2] Fangda Gu, Heng Chang, Wenwu Zhu, Somayeh Sojoudi, and Laurent El Ghaoui. Implicit graph neural networks. In NeurIPS, 2020.
[3] Junyoung Park, Jinhyun Choo, and Jinkyoo Park. Convergent graph solvers. In ICLR, 2022.

MGNNI: Multiscale Graph Neural Networks with Implicit Layers
Juncheng Liu, Bryan Hooi, Kenji Kawaguchi, Xiaokui Xiao
National University of Singapore (published at NeurIPS 2022)

What's Next on Implicit GNNs?
Implicit GNNs (e.g., EIGNN [1], IGNN [2], CGS [3]) have been proposed to capture long-range dependencies on graphs. They generally define an aggregation step; in EIGNN, for example, it is Z^(l+1) = γ g(F) Z^(l) S + X. Implicit GNNs require the aggregation to be a contraction mapping, iterate it until convergence, and use the equilibrium as the final representations.

Recap: Overview
In this paper, we identify and justify two limitations of these implicit GNNs:
1. Constrained expressiveness due to their limited effective range for capturing long-range dependencies.
2. A lack of ability to capture multiscale information on graphs at multiple resolutions.
Our main contributions:
We introduce the concept of effective range for implicit GNNs and provide theoretical analyses of the limited effective range of previous works.
We propose a new implicit
GNN model with multiscale propagation to expand the effective range and capture underlying graph information at various scales.

Effective Range of Previous Implicit GNNs
The effective range of previous implicit GNNs is limited.

MGNNI Model
To mitigate these limitations, we propose a new model, MGNNI, with multiscale propagation. In MGNNI, we define a single-scale propagation module as
Z^(l+1) = γ g(F) Z^(l) S^m + X,
where γ ∈ (0, 1) and m denotes the graph scale to capture in each propagation (i.e., m-hop neighbors are aggregated in each propagation).

MGNNI Model
Multiscale propagation with an attention mechanism: given a set of scales {m_1, ..., m_K}, we obtain equilibriums at the different scales Z_{m_1}, ..., Z_{m_K}. To combine information from multiple scales, we propose a scale-aggregation mechanism that learns the attention
values of different scales for each node:
1. Calculate attention values for each scale.
2. Combine the equilibriums at the different scales.
This expands the effective range via multiscale propagation.

Experiments
We conduct experiments on synthetic and real-world datasets for both node classification and graph classification: results on the synthetic dataset, results on real-world heterophilic graph datasets, and results for graph classification.

Ablation Study
The effectiveness of multiscale modeling, and the effectiveness of the attention mechanism.
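The scale-aggregation step (attention over scale-specific equilibriums) can be sketched as follows; the scoring vector w, the scale set, and the random stand-in equilibriums are illustrative assumptions, not the paper's exact parameterization:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 5, 8          # nodes, hidden dimension
scales = [1, 2, 4]   # hypothetical scale set, e.g. S^1, S^2, S^4

# Stand-ins for the equilibria of the single-scale propagations
# (in the real model these are the fixed points at each scale).
Z_scales = [rng.standard_normal((n, d)) for _ in scales]

# Hypothetical per-node scoring vector for the attention values.
w = rng.standard_normal(d)

# 1. Per-node attention logits for each scale, softmax across scales.
logits = np.stack([Z @ w for Z in Z_scales], axis=1)  # (n, num_scales)
logits -= logits.max(axis=1, keepdims=True)           # numerical stability
att = np.exp(logits)
att /= att.sum(axis=1, keepdims=True)

# 2. Node-wise convex combination of the scale-specific equilibria.
Z_stack = np.stack(Z_scales, axis=1)                  # (n, num_scales, d)
Z_out = (att[:, :, None] * Z_stack).sum(axis=1)       # (n, d)

print(Z_out.shape)  # (5, 8)
```

Because the attention weights form a convex combination per node, each node can favor a short-range or long-range view independently, which is what lets the multiscale model expand its effective range.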