GraphGPT: Graph Instruction Tuning for Large Language Models
Jiabin Tang
Musketeers Foundation Institute of Data Science, The University of Hong Kong
Homepage: https://tjb-tech.github.io/

About Me
- First-year Ph.D. student majoring in Data Science at The University of Hong Kong, supervised by Dr. Chao Huang.
- Research interests:
  - Large Language Models and other AIGC techniques
  - Graph Learning, Trustworthy Machine Learning
  - Deep Learning applications, e.g., spatio-temporal mining and recommendation

Background and Challenge
- Graphs
  - Consist of nodes and edges that model relationships between entities.
  - Have extensive applications, e.g., recommendation systems and social network analysis.
  - Are typically modeled by Graph Neural Networks (GNNs).
- Large Language Models (LLMs)
  - Are trained on vast corpora with large-scale model parameters.
  - Achieve astonishing results on many Natural Language Processing (NLP) tasks.
  - Show impressive transfer capability and generalization.

Motivation
- Let large language models understand graphs and directly perform downstream graph tasks, e.g., node classification and link prediction.

Research Questions
- Illustrative exchange:
  - Graph Information: Central Node: 2, Edge index: [src node, dst node, ...], Node list: [...]
  - Human Question: Given a citation graph where the 0th node is the target paper, with the following information: Abstract: ..., Title: ..., which arXiv CS sub-category does this paper belong to?
  - GraphGPT Response: cs.IT, cs.LG, cs.SP, cs.CV, cs.NA. The paper discusses the Restricted Isometry, so it is likely to belong to cs.IT.
- RQ1: How can we feed graph structures into LLMs?
- RQ2: How can we empower LLMs to understand graph structures?
- RQ3: How can we endow LLMs with the ability to reason step by step on zero-shot, complex graph learning tasks?
RQ1: How Can We Feed Graph Structures into LLMs?
Comparison as a zero-shot graph learner:
- Without graph structure: fails on interdisciplinary fields.
- Text-based graph description: fails on interdisciplinary fields, and the token length becomes unacceptably long.
- GraphGPT: learns effectively from the graph, with a controllable token length.
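To make the third option concrete, the sketch below shows one way a subgraph could enter an LLM as a short run of projected "graph tokens". It is a minimal PyTorch sketch with assumed component names (`graph_encoder`, `projector`, a `<graph>` placeholder position), not the released GraphGPT implementation:

```python
import torch
import torch.nn as nn

class GraphTokenInjector(nn.Module):
    """Sketch: turn a subgraph into LLM-space "graph tokens" and splice
    them into an embedded prompt. All names here are illustrative."""

    def __init__(self, graph_encoder: nn.Module, graph_dim: int, llm_dim: int):
        super().__init__()
        self.graph_encoder = graph_encoder              # e.g., a GNN / graph transformer
        self.projector = nn.Linear(graph_dim, llm_dim)  # graph space -> LLM embedding space

    def forward(self, node_feats, edge_index, prompt_embeds, graph_slot):
        # One embedding per node: [num_nodes, graph_dim].
        node_embeds = self.graph_encoder(node_feats, edge_index)
        # Project into the LLM's token-embedding space: [num_nodes, llm_dim].
        graph_tokens = self.projector(node_embeds)
        # Replace the single <graph> placeholder embedding (at index
        # graph_slot) with the run of graph tokens.
        return torch.cat(
            [prompt_embeds[:graph_slot], graph_tokens, prompt_embeds[graph_slot + 1:]],
            dim=0,
        )
```

Because each node contributes exactly one token, the prompt length scales with the sampled subgraph size rather than with a verbose textual serialization of every edge.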
Overall Architecture
- RQ1: How can we feed graph structures into LLMs? A graph becomes a sequence of "graph tokens".

Text-Graph Grounding
- Initialize the graph encoder with natural language alignment.
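One standard way to realize such alignment is a symmetric, CLIP-style contrastive objective between node embeddings and embeddings of each node's text (e.g., paper titles and abstracts). The sketch below assumes that formulation; it is not necessarily the exact GraphGPT grounding loss:

```python
import torch
import torch.nn.functional as F

def text_graph_grounding_loss(node_embeds, text_embeds, temperature=0.07):
    """Contrastive alignment of N nodes with their N text descriptions.

    node_embeds: [N, d] from the graph encoder.
    text_embeds: [N, d] from a text encoder; row i describes node i.
    """
    node_embeds = F.normalize(node_embeds, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)
    # Pairwise cosine similarities; the diagonal holds matched pairs.
    logits = node_embeds @ text_embeds.T / temperature
    targets = torch.arange(logits.size(0), device=logits.device)
    # Symmetric InfoNCE: classify the text per node and the node per text.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))
```

Training the encoder (or just a projection head) against such a loss yields graph tokens that already live near the language they describe, which is what the instruction-tuning stages build on.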
Self-Supervised Instruction Tuning (RQ2)
- Goal: let the LLM match the graph tokens with the corresponding natural language content in the prompt.
- Example (graph matching):
  - Graph Information: Central Node: 68442, Edge index: [src node, dst node, ...], Node list: [...]
  - Human Question: Given a sequence of graph tokens that constitute a subgraph of a citation graph, ... Here is a list of paper titles: 1. ..., 2. ..., please reorder the list of papers according to the order of the graph tokens.
  - GraphGPT Response: Based on the given graph tokens and the list of paper titles, we obtain the matching of graph tokens and papers: Graph token 1 corresponds to "smt based induction methods for timed systems". Graph token 2 corresponds to ...
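Constructing one of these self-supervised samples mostly amounts to shuffling the node texts and keeping the original graph-token order as the target. A small sketch with a hypothetical helper, not the released data pipeline:

```python
import random

def make_graph_matching_sample(node_titles):
    """Build one self-supervised graph-matching instruction sample.

    node_titles: titles in graph-token order, i.e. node_titles[i]
    is the text of graph token i+1.
    """
    shuffled = node_titles[:]   # copy, then shuffle the candidate list
    random.shuffle(shuffled)
    title_list = "\n".join(f"{i + 1}. {t}" for i, t in enumerate(shuffled))
    prompt = (
        "Given a sequence of graph tokens <graph> that constitute a subgraph "
        "of a citation graph, here is a list of paper titles:\n"
        f"{title_list}\n"
        "Please reorder the list of papers according to the order of graph tokens."
    )
    # The target answer restates the original (unshuffled) order.
    answer = " ".join(
        f"Graph token {i + 1} corresponds to {t}." for i, t in enumerate(node_titles)
    )
    return {"prompt": prompt, "answer": answer}
```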
Task-Specific Instruction Tuning (RQ2)
- Lightweight alignment: instruction tuning for downstream tasks.
- Example (node classification):
  - Graph Information: Central Node: 2, Edge index: [src node, dst node, ...], Node list: [...]
  - Human Question: Given a citation graph where the 0th node is the target paper, with the following information: Abstract: ..., Title: ..., which arXiv CS sub-category does this paper belong to?
  - GraphGPT Response: cs.IT, cs.LG, cs.SP, cs.CV, cs.NA. The paper discusses the Restricted Isometry, so it is likely to belong to cs.IT.
- Example (link prediction):
  - Graph Information: Central Node 1: 8471, Edge index 1: [src node, dst node, ...], Node list 1: [...]; Central Node 2: 19368, Edge index 2: [src node, dst node, ...], Node list 2: [...]
  - Human Question: Given a sequence of graph tokens that constitute a subgraph of a citation graph, with Abstract: ..., Title: ..., and another sequence of graph tokens, with Abstract: ..., Title: ..., are these two central nodes connected? Give me an answer of yes or no.
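Templated, the two prompts above might look like the following (hypothetical helpers; the actual instruction templates ship with the released dataset):

```python
def node_classification_prompt(abstract, title):
    # <graph> marks where the projected graph tokens are spliced in.
    return (
        "Given a citation graph <graph> where the 0th node is the target paper, "
        f"with the following information: Abstract: {abstract} Title: {title}\n"
        "Question: Which arXiv CS sub-category does this paper belong to?"
    )

def link_prediction_prompt(abstract1, title1, abstract2, title2):
    return (
        "Given a sequence of graph tokens <graph> that constitute a subgraph of "
        f"a citation graph, with Abstract: {abstract1} Title: {title1}, and "
        f"another sequence of graph tokens <graph>, Abstract: {abstract2} "
        f"Title: {title2}, are these two central nodes connected? "
        "Give me an answer of yes or no."
    )
```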
Chain-of-Thought (CoT) Distillation (RQ3)
- What is CoT? "Please think step by step."
- ChatGPT is powerful, yet closed-source and costly; a 7B model is lightweight, yet not as "smart".
- CoT distillation: distill reasoning capabilities from the powerful model (ChatGPT) through Chain-of-Thought prompting.
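A minimal sketch of how such rationales could be collected, assuming the `openai` Python client (v1+) with GPT-3.5 as the teacher; the exact teacher prompts used by GraphGPT are not reproduced here:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def distill_cot_rationale(question: str) -> str:
    """Ask the closed-source teacher for a step-by-step rationale."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            # "Please think step by step" elicits the chain of thought
            # that is then kept as the training target.
            "content": f"{question}\nPlease think step by step.",
        }],
    )
    return resp.choices[0].message.content

# Each (instruction, rationale) pair is then added to the instruction-tuning
# data so the 7B student learns to reason step by step as well.
```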
Experimental Results
- GraphGPT outperforms SOTA baselines not only in the supervised but also in the zero-shot setting, ranking first in both.

Experimental Results: Generalization Ability Investigation
- More data boosts the model's transfer ability.
- More data, yet no forgetting.
- Generalization toward a multitasking graph learner.

Experimental Results: Ablation and Efficiency
- Module ablation study:
  - Effect of graph instruction tuning.
  - Effect of LLM-enhanced semantic reasoning.
- Model efficiency study:
  - Training efficiency with graph instruction tuning.
  - Model inference efficiency.

Experimental Results: Model Case Study
One More Thing
- Towards graph foundation models:
  - How to encode unified semantic information across different graphs?
  - Generalization and emergent abilities.
- Towards data-centric graph learning:
  - How to alleviate catastrophic forgetting in traditional GNNs?
  - More effective data-centric workflows and continuous updates.
- Efficient training within two RTX 3090 GPUs (24 GB).

More Details
- Project page: https://graphgpt.github.io/
- Paper: https://arxiv.org/abs/2310.13023
- Code: https://...
- Hugging Face: Jiabin99/GraphGPT-7B-mix-all
- Don't be stingy with your stars!

Q&A
Thank You!
Jiabin Tang. Site: https://tjb-tech.github.io/; E-mail: ...; GitHub: https://...
Chao Huang. Site: https://...