A Preliminary Exploration of Graph Foundation Models
Chuan Shi, Professor
shichuan@bupt.edu.cn
Beijing University of Posts and Telecommunications

Outline
- Graph foundation models
- Progress in related work
- Our work
- Summary

Foundation Models
"A foundation model is a model that is trained on broad data and can be adapted to a wide range of downstream tasks." [1]
- Language: language foundation models (e.g., GPT-4) show early signs of general AI ability.
- Vision: vision foundation models show strong image-understanding ability.
- Speech: the USM speech foundation model can recognize over a hundred languages.
Foundation models are already a reality in language, vision, and speech.
[1] R. Bommasani, D. A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskill, et al. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258, 2021.

Characteristics of Foundation Models
Foundation models have two defining characteristics: emergence and homogenization. [2]
- Emergence: as a foundation model scales up, it may spontaneously exhibit novel capabilities.
- Homogenization: the model is versatile enough to be deployed across many applications, e.g., machine translation, question answering, text generation, and information extraction.
[2] J. Wei, Y. Tay, R. Bommasani, et al. Emergent abilities of large language models. arXiv preprint arXiv:2206.07682, 2022.

Large Language Models
Large language models (LLMs) are pre-trained language models with very large parameter counts, and are the archetypal foundation models. [3] They have grown from early models such as ELMo, with millions of parameters, to models like GPT-4, with on the order of a trillion parameters. LLMs exhibit the core AI abilities of understanding, generation, reasoning, and memory, offering a glimpse of general artificial intelligence.
[3] W. X. Zhao, K. Zhou, J. Li, et al. A survey of large language models. arXiv preprint arXiv:2303.18223, 2023.

Graphs
Graphs (networks) are a universal language for describing and modeling complex systems: financial networks, social networks, neuronal networks, information networks, biomedical networks, and the Internet.
A graph G is an ordered pair (V, E), where V is the vertex set and E is the edge set. Graph machine learning applies machine learning to graph data; it is also called graph learning, and the resulting models are graph models.

A Brief History of Graph Machine Learning
- Graph theory: Euler and the Seven Bridges of Königsberg; shortest-path algorithms (Dijkstra).
- Network science: long-tailed degree distributions (Barabási).
- Graph embedding: DeepWalk (2013-2014).
- Graph signal processing: Shuman et al.
- Graph neural networks: GCN (2017).

Network Representation Learning
Network representation learning embeds every node of a network into a low-dimensional vector space. The resulting representations are easy to compute with, parallelize well, and plug directly into classic machine learning algorithms.
Applications: node classification, link prediction, community detection, network evolution, and graph generation.

Development and Taxonomy of Graph Machine Learning
- Shallow models: matrix-factorization-based (e.g., Laplacian eigenmaps) and random-walk-based (e.g., DeepWalk, LINE, node2vec); a minimal random-walk sketch follows this list.
- Deep models: autoencoder-based (e.g., DNGR and SDNE) and graph-neural-network-based (e.g., GCN, GraphSAGE, GAT).
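To make the random-walk branch concrete, here is a minimal DeepWalk-style sketch, assuming nothing beyond NumPy: truncated random walks over a toy graph feed a from-scratch skip-gram update with negative sampling. The graph, walk length, and hyperparameters are all illustrative choices, not the original DeepWalk implementation.

```python
# DeepWalk-style sketch: random walks + skip-gram with negative sampling.
import random
import numpy as np

# Toy undirected graph as an adjacency list (arbitrary example).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}

def random_walk(start, length=8):
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(graph[walk[-1]]))
    return walk

walks = [random_walk(v) for v in graph for _ in range(20)]

dim, lr, window, n_neg = 16, 0.025, 2, 3
emb = {v: np.random.randn(dim) * 0.1 for v in graph}  # "input" vectors
ctx = {v: np.zeros(dim) for v in graph}               # "context" vectors

def sgns_step(u, v, label):
    """One skip-gram-with-negative-sampling gradient step on pair (u, v)."""
    score = 1 / (1 + np.exp(-emb[u] @ ctx[v]))
    g = lr * (label - score)
    emb[u], ctx[v] = emb[u] + g * ctx[v], ctx[v] + g * emb[u]

for walk in walks:
    for i, u in enumerate(walk):
        for v in walk[max(0, i - window): i + window + 1]:
            if v == u:
                continue
            sgns_step(u, v, 1.0)                      # positive pair
            for _ in range(n_neg):                    # random negatives
                sgns_step(u, random.choice(list(graph)), 0.0)

print(emb[0])  # a 16-d node embedding usable by any classic ML model
```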
When Graph Models Meet Large Models
- Large models do not solve graph problems: they struggle to model graph structural semantics and to handle the diversity of graph tasks (graph data carries rich structural semantics and rich tasks).
- Graph models lack the abilities of large models: limited expressive power, the over-smoothing and over-squashing problems of deep GNNs (an information bottleneck that degrades deep-GNN performance), no emergent abilities, and difficulty supporting multiple tasks.

Graph Foundation Models
A graph foundation model (GFM) is a model pre-trained on broad graph data that can be adapted to a variety of downstream graph tasks. A GFM is expected to have the two defining characteristics:
- Emergence: as the model scales up, it spontaneously exhibits novel capabilities.
- Homogenization: the model can adapt to different types of graph tasks.
Jiawei Liu, Cheng Yang, Zhiyuan Lu, Junze Chen, Yibo Li, Mengmei Zhang, Ting Bai, Yuan Fang, Lichao Sun, Philip S. Yu, Chuan Shi. Towards Graph Foundation Models: A Survey and Beyond. arXiv 2023.

Key Techniques of Graph Foundation Models
- Pre-training: a neural network is trained on large-scale graph data in a self-supervised way. Representative methods: generative pre-training and contrastive pre-training (a contrastive sketch follows below).
- Adaptation: the pre-trained model is adapted to a specific downstream task or domain to improve performance. Representative methods: fine-tuning-based and prompting-based approaches.
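As a concrete instance of contrastive pre-training, the following toy sketch follows the general GraphCL recipe: two augmented views of a graph are encoded by a shared one-layer GNN and pulled together by an InfoNCE loss. The encoder, the edge-dropping augmentation, and all dimensions are simplified assumptions, not any paper's code.

```python
# Contrastive graph pre-training sketch (GraphCL-style, toy scale).
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 8
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.triu(A, 1); A = A + A.T                    # random undirected graph
X = rng.standard_normal((n, d))                   # node features
W = rng.standard_normal((d, d)) * 0.1             # shared encoder weights

def encode(A, X):
    """One mean-aggregation GNN layer followed by L2 normalization."""
    A_hat = A + np.eye(len(A))                    # add self-loops
    H = np.tanh((A_hat / A_hat.sum(1, keepdims=True)) @ X @ W)
    return H / np.linalg.norm(H, axis=1, keepdims=True)

def drop_edges(A, p=0.2):
    """Augmentation: randomly remove a fraction p of edges."""
    mask = np.triu(rng.random(A.shape) > p, 1)
    M = A * mask
    return M + M.T

# Two views of the same graph; matching rows are positive pairs.
Z1, Z2 = encode(drop_edges(A), X), encode(drop_edges(A), X)
logits = Z1 @ Z2.T / 0.5                          # temperature tau = 0.5
loss = -np.mean(np.diag(logits) - np.log(np.exp(logits).sum(1)))
print("InfoNCE loss:", loss)
```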
Graph Foundation Models vs. Language Foundation Models
- Similarities: the same overall vision and similar learning paradigms.
- Differences: (1) the uniqueness of graph data and tasks; (2) differences in the underlying techniques.

Outline
- Graph foundation models
- Progress in related work
- Our work
- Summary

Related Work
There is no definitive recipe yet for designing and implementing a graph foundation model, but there are relevant explorations. Based on how they rely on graph neural networks (GNNs) and large language models (LLMs), existing explorations fall into three categories.

GNN-based Models
These aim to strengthen existing graph learning through innovations in GNN architectures, pre-training, and adaptation.
- Better backbones: Graph Transformers. Representative work: Graph-BERT, GROVER.
- Better pre-training: graph pre-training. Representative work: GCC, GraphCL, PT-HGNN.
- Better adaptation: graph prompting. Representative work: GraphPrompt, All in One.
LLM-based Models
These build on an LLM and explore its feasibility as a graph foundation model by converting graphs into text or tokens (see the serialization sketch below).
- Graph-to-Token: convert the graph into tokens and feed them to the LLM. Representative work: InstructGLM.
- Graph-to-Text: convert the graph into text and feed it to the LLM. Representative work: NLGraph, LLM4Mol.
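A hedged sketch of the Graph-to-Text idea: the prompt template below is invented for illustration (NLGraph and LLM4Mol each define their own), but it shows how a graph is flattened into plain text that an off-the-shelf LLM can reason over.

```python
# Graph-to-Text sketch: serialize edges into a natural-language prompt.
edges = [("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Dave")]

def graph_to_text(edges, question):
    lines = [f"{u} is connected to {v}." for u, v in edges]
    return "\n".join(lines) + f"\nQuestion: {question}\nAnswer:"

prompt = graph_to_text(edges, "Is there a path from Alice to Dave?")
print(prompt)  # this string would be sent to an LLM such as GPT-4
```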
GNN+LLM-based Models
These combine a GNN with an LLM and explore ways for the two to work together to strengthen graph learning.
- GNN-centric architectures: use LLM outputs as enriched features for the GNN. Representative work: SimTeG, TAPE.
- Symmetric architectures: align the outputs of the GNN and the LLM. Representative work: ConGrat, G2P2.
- LLM-centric architectures: use a GNN to improve the LLM. Representative work: Graph-Toolformer.

Outline
- Graph foundation models
- Progress in related work
- Our work
- Summary

Our Work
- Pre-training on Large-Scale Heterogeneous Graph (PT-HGNN, KDD 2021)
- Spectral Graph Neural Networks Meet Transformers (Specformer, ICLR 2023)
- GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks (GraphTranslator, WWW 2024)
Xunqiang Jiang, Tianrui Jia, Yuan Fang, Chuan Shi, Zhe Lin, Hui Wang. Pre-training on Large-Scale Heterogeneous Graph. KDD 2021.
Deyu Bo, Chuan Shi, Lele Wang, Renjie Liao. Specformer: Spectral Graph Neural Networks Meet Transformers. ICLR 2023.
Mengmei Zhang, Mingwei Sun, Peng Wang, Shen Fan, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Cheng Yang, Chuan Shi. GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks. WWW 2024.

Motivation of PT-HGNN
- How to capture the semantic and structural properties of a heterogeneous graph during pre-training?
- How to efficiently pre-train GNNs on a large-scale heterogeneous graph?
A heterogeneous graph (HG, or heterogeneous information network, HIN) contains multiple object types and/or multiple link types.
- Network schema: a meta-level description of a network.
- Meta-path: a relation sequence connecting object pairs.

Basic Idea of PT-HGNN
- Preserve heterogeneous semantic and structural properties as transferable knowledge.
- Sparsify the large-scale heterogeneous graph for efficient pre-training.
- Design node- and schema-level pre-training tasks, plus relation-based sparsification.

PT-HGNN: Schema-level Pre-training Task
- Model pairwise relations between different types of nodes.
- Negative sample selection: unlinked nodes that are sufficiently different.

PT-HGNN: Edge Sparsification
- Preserve the more meaningful edges (lower noise in the graph) and improve time efficiency on large graphs.
- Method: Relation-based Personalized PageRank (R-PPR).
- Acceleration: a random-walk formulation (forward search) and keeping only the top-K entries (sparsification); see the sketch below.
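The sketch below illustrates the sparsification step with plain personalized PageRank; PT-HGNN's R-PPR additionally conditions on relation types, which is omitted here. Each node's PPR vector is computed by power iteration and only its top-K entries are kept as edges. Graph size, alpha, and K are arbitrary toy values.

```python
# PPR-based edge sparsification sketch: keep top-K neighbors per node.
import numpy as np

rng = np.random.default_rng(1)
n, alpha, K = 8, 0.15, 2
A = (rng.random((n, n)) < 0.35).astype(float)
A = np.triu(A, 1); A = A + A.T                    # random undirected graph
P = A / np.maximum(A.sum(1, keepdims=True), 1)    # row-stochastic transition

def ppr(seed, iters=50):
    """Power iteration for the personalized PageRank vector of one seed."""
    e = np.zeros(n); e[seed] = 1.0
    pi = e.copy()
    for _ in range(iters):
        pi = alpha * e + (1 - alpha) * P.T @ pi
    return pi

sparse_edges = {}
for v in range(n):
    s = ppr(v)
    s[v] = 0.0                                    # do not keep self-loops
    sparse_edges[v] = np.argsort(-s)[:K].tolist()
print(sparse_edges)  # each node keeps its K most "important" neighbors
```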
Experimental Setup
- Dataset: the Open Academic Graph (OAG), which unifies two academic graphs, the Microsoft Academic Graph (MAG) and AMiner. (The slides show OAG statistics and its network schema.)
- Tasks: ordinary experiments on paper-field prediction and paper-venue prediction (node classification) and author name disambiguation (link prediction); a transfer experiment; and an efficiency experiment.

Experiments: Transfer Setting
Pre-train on field A, then fine-tune on field B.
- Knowledge transfer from pre-training to fine-tuning does not guarantee a performance gain.
- A positive correlation between the two graphs yields positive transfer, and vice versa.
Background of Specformer
GNNs can be divided into two categories: spatial and spectral methods.
- Spatial GNNs aggregate information in the spatial (vertex) domain.
- Spectral GNNs filter signals using eigenvalues in the spectral (frequency) domain, as illustrated below.
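The textbook form of spectral filtering, y = U g(Λ) Uᵀ x, fits in a few lines; this shows the generic operation that spectral GNNs learn, not any specific model.

```python
# Spectral graph filtering sketch: filter a signal via the Laplacian spectrum.
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(1)
L = np.eye(4) - A / np.sqrt(np.outer(d, d))       # normalized Laplacian
lam, U = np.linalg.eigh(L)                        # eigenvalues lie in [0, 2]

g = lambda lam: np.exp(-2.0 * lam)                # one low-pass filter choice
x = np.array([1.0, -1.0, 2.0, 0.5])               # a graph signal
y = U @ (g(lam) * (U.T @ x))                      # y = U g(Lambda) U^T x
print(y)
```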
Motivation of Specformer
- Graph Transformers have been used in the spatial domain; what about the spectral domain?
- Current spectral GNNs use each eigenvalue of the graph spectrum in isolation, ignoring set-level information among eigenvalues, yet that set information is also important. Can we employ the fully connected attention of a Transformer to capture it?

Basic Idea of Specformer
- Leverage a Transformer to capture dependencies among eigenvalues, and learn powerful graph filters for graph convolution.
- Encoder: eigenvalue encoding + Transformer. Decoder: channel-wise graph filters.

Specformer Encoder
- Eigenvalue encoding (captures relative information).
- Transformer encoder (permutation-invariant). LN: layer normalization; MHA: multi-head attention; FFN: feed-forward network.

Specformer Decoder
- A channel-wise decoder with M heads learns new eigenvalues and uses them to construct new graph filters.
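A sketch in the spirit of the Specformer encoder, assuming a sinusoidal-style eigenvalue encoding and a single self-attention layer (the paper's exact constants, layer stack, and channel-wise filter construction are not reproduced): each eigenvalue is lifted to a feature vector, attends to the whole spectrum, and is decoded into a new eigenvalue from which filters can be built.

```python
# Specformer-flavored sketch: encode eigenvalues, let them attend to each
# other, and decode new eigenvalues.
import numpy as np

def eigenvalue_encoding(lam, d=8, scale=100.0):
    """Map each scalar eigenvalue to a d-dim sinusoidal feature vector."""
    freqs = scale ** (-np.arange(0, d, 2) / d)
    ang = np.outer(lam, freqs)
    return np.concatenate([np.sin(ang), np.cos(ang)], axis=1)  # (n, d)

def self_attention(Z):
    """Single-head attention: every eigenvalue attends to all the others,
    capturing the set-level dependencies among eigenvalues."""
    scores = Z @ Z.T / np.sqrt(Z.shape[1])
    attn = np.exp(scores) / np.exp(scores).sum(1, keepdims=True)
    return attn @ Z

lam = np.array([0.0, 0.3, 0.9, 1.4, 2.0])         # spectrum of some graph
Z = self_attention(eigenvalue_encoding(lam))
w = np.random.randn(Z.shape[1]) * 0.1             # toy decoder head
new_lam = Z @ w                                   # one new eigenvalue per
print(new_lam)                                    # input eigenvalue
```

The new eigenvalues would then replace g(Λ) in the spectral-filtering operation shown earlier, yielding a learned graph filter.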
Specformer Experiments
- Synthetic data (node regression) and real data (node classification).
- Visualization: Specformer learns interpretable spectrum dependencies and complex graph filtering functions.

Motivation of GraphTranslator
- LLMs showcase impressive emergent abilities on open-ended, instruction-driven tasks.
- Graph models (GMs) achieve state-of-the-art performance on a wide range of pre-defined graph tasks.
- Can we build a model that solves both pre-defined and open-ended tasks?
GraphTranslator
We propose a novel framework, GraphTranslator, to align graph models (GMs) to an LLM.
- Producer: we employ the LLM to construct high-quality node description text with chain-of-thought (CoT) prompting.
- Translator: aligns the GM and the LLM by converting learned node embeddings into token representations.
- Stage 1: obtain text embeddings, then train the Translator through contrastive learning.
- Stage 2: a linear layer projects the Translator's output into the same dimension as the LLM's word embeddings (a toy rendering of both stages follows).
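A hypothetical toy rendering of the two training stages (all dimensions, names, and loss details are invented for illustration; this is not the paper's code): Stage 1 aligns translated node embeddings with text embeddings contrastively, and Stage 2 linearly projects them to the LLM's word-embedding width so they can serve as soft prompt tokens.

```python
# GraphTranslator-style alignment sketch, toy scale.
import numpy as np

rng = np.random.default_rng(2)
n, d_gm, d_txt, d_llm = 4, 32, 64, 128
node_emb = rng.standard_normal((n, d_gm))         # frozen GM node embeddings
text_emb = rng.standard_normal((n, d_txt))        # embeddings of node texts

W_tr = rng.standard_normal((d_gm, d_txt)) * 0.05  # "Translator" (toy: linear)

def normalize(M):
    return M / np.linalg.norm(M, axis=1, keepdims=True)

# Stage 1: contrastive loss between translated nodes and their texts.
Z = normalize(node_emb @ W_tr)
T = normalize(text_emb)
logits = Z @ T.T / 0.1                            # temperature tau = 0.1
stage1_loss = -np.mean(np.diag(logits) - np.log(np.exp(logits).sum(1)))

# Stage 2: project Translator output to the LLM's word-embedding width, so
# the resulting vectors can be prepended to a prompt as soft tokens.
W_proj = rng.standard_normal((d_txt, d_llm)) * 0.05
soft_tokens = (node_emb @ W_tr) @ W_proj          # (n, d_llm)
print(stage1_loss, soft_tokens.shape)
```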
Experiments
- We conducted experiments on the Taobao and ArXiv datasets in the zero-shot scenario.
- We conducted a QA experiment on the Taobao dataset: GraphTranslator captures the preferences of a user and the user's friends more accurately.

Outline
- Graph foundation models
- Progress in related work
- Our work
- Summary

References
More material is available on the personal homepage: www.shichuan.org (including a book with a foreword by Jiawei Han, recommended by Binxing Fang, Huan Liu, Jian Pei, Jie Tang, and Jingren Zhou).

Future Directions
1. Improve data quantity and quality: graph augmentation, feature augmentation, and label augmentation; design augmentation schemes for LLM-based models.
2. Improve backbone architectures and training strategies: better performance and interpretability; knowledge distillation and model editing.
3. Model evaluation and killer applications: human evaluation and meta-evaluation; drug discovery and urban computing.

Thanks. Q&A
www.shichuan.org