
RELATIONAL REASONING WITH RULE DISCOVERY
Bang Liu, Assistant Professor, University of Montreal & Mila
2022/06/25

CONTENT
01 Introduction: What is relational reasoning? Why is it important?
02 Existing Research: Graph Neural Networks; Inductive Logic Programming; Neural-Symbolic Reasoning
03 R5 Framework: Rule Discovery with Reinforced and Recurrent Relational Reasoning, which treats reasoning over graphs as sequential decision making
04 Experiments: the strong generalization ability of R5, and its extension to larger-scale graphs

01 Introduction
What is relational reasoning? Why is it important?

Relational Reasoning
The ability to consider relationships between multiple mental representations. It is directly linked to the capacity to think logically and solve problems in novel situations. "The one thing that makes humans far superior to other beings is the ability to reason and think logically."

Reasoning in Different Applications
Reasoning in Natural Language Understanding (NLU): there are growing concerns regarding the ability of NLU systems to generalize in a systematic and robust way. Example task: given a story, infer the relationship between two family members whose relationship is not explicitly mentioned, by extracting relationships, inducing logical rules, and inferring the relationship with those rules.
Reasoning in Visual Question Answering: learning to understand relations between different objects (ideas) is considered an essential characteristic of intelligence. The model has to look at objects of different shapes/sizes/colors and be able to answer questions that relate multiple such objects.
Automated Theorem Proving: using computers to prove or disprove mathematical or logical statements.
Knowledge Graph Completion: completing missing nodes or relations by reasoning over existing knowledge.
More applications involving reasoning: a) speech recognition, b) image classification, c) machine translation, d) game playing.

The Systematicity of Reasoning
Systematicity is the ability to recombine known parts and rules to form new sequences while reasoning over relational data. There is a debate over the problem of systematicity in connectionist models (i.e., deep neural networks). The kinship problem calls for logical generalization: inducing logical rules and generalizing by combining these rules in novel ways after training.

Systematicity is a Component of Compositionality
*Systematically recombine known parts and rules
*Extend predictions to unseen, longer sequences
*Whether a model's composition operations are local or global
*Predictions are robust to synonym substitutions
*Whether models favour rules or exceptions during training
(Hupkes et al., Compositionality Decomposed: How do Neural Networks Generalise?, JAIR 2020)

02 Existing Research
Graph Neural Networks; Inductive Logic Programming; Neural-Symbolic Reasoning

Reasoning with Graph Neural Networks
Representation and composition: a representation module learns the embeddings of relations, and a composition module takes as input the learned embeddings and a query (g, u, v) and predicts whether relation g holds between u and v.

Pipeline (Sinha et al., Evaluating logical generalization in graph neural networks, 2020): a big graph is generated with a set of rules; subgraphs are sampled; each subgraph is augmented with "relation nodes"; relation representations are learned; a composition function predicts the relation.

Various Graph Neural Networks
Relational Graph Convolutional Networks (RGCN):
*Extends GCN
*Relation-specific propagation matrix
Edge-based Graph Attention Network (Edge-GAT):
*Extends GAT
*Incorporates gating via an LSTM
*Attention conditioned on both nodes and relations
(Sinha et al., Evaluating logical generalization in graph neural networks, 2020)
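For concreteness, the relation-specific propagation in RGCN (from Schlichtkrull et al., 2018, which these slides build on but do not list in the references) updates node i's representation as

    h_i^{(l+1)} = \sigma\Big( W_0^{(l)} h_i^{(l)} + \sum_{r \in \mathcal{R}} \sum_{j \in \mathcal{N}_i^r} \frac{1}{c_{i,r}} W_r^{(l)} h_j^{(l)} \Big)

where each relation r has its own weight matrix W_r^{(l)} (the relation-specific propagation matrix mentioned above), N_i^r is the set of r-neighbors of node i, and c_{i,r} is a normalization constant such as |N_i^r|.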

Experimental Observations and Limits
These GNNs show some capability of systematicity. However, when a model is trained on new worlds, its performance on previously seen worlds degrades rapidly (the forgetting effect in continual learning). Moreover, the rules are implicitly encapsulated in the neural networks, so these models lack interpretability. We need to discover the underlying rules!
(Sinha et al., Evaluating logical generalization in graph neural networks, 2020)

Inductive Logic Programming
Examples (logic program) + background knowledge (logic program) -> ILP -> hypothesis (logic program).

The ILP problem
Given:
-background knowledge B
-positive examples E+
-negative examples E-
Find: a hypothesis H such that:
-H ∧ B entails E+
-H ∧ B does not entail E-
In other words, given a set of positive examples and a set of negative examples, an ILP system constructs a logic program that entails all the positive examples but does not entail any of the negative examples.

Example: learning Fizz-Buzz from background axioms and positive and negative examples, producing learned rules (shown in the slide figure).
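To make the Given/Find condition concrete, here is a minimal Python sketch of the entailment check, echoing the deck's kinship theme (the facts, rule, and examples below are hypothetical illustrations, not taken from the slides):

# Minimal ILP-style entailment check (illustrative sketch).
background = {("parent", "rick", "beth"), ("parent", "beth", "morty")}

def hypothesis(facts):
    """Candidate rule H: grandparent(X, Y) :- parent(X, Z), parent(Z, Y)."""
    derived = set(facts)
    for (p1, x, z1) in facts:
        for (p2, z2, y) in facts:
            if p1 == p2 == "parent" and z1 == z2:
                derived.add(("grandparent", x, y))
    return derived

positives = {("grandparent", "rick", "morty")}
negatives = {("grandparent", "morty", "rick")}

closure = hypothesis(background)
assert positives <= closure          # H ∧ B entails every example in E+
assert not (negatives & closure)     # H ∧ B entails no example in E-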

Differentiable Inductive Logic Programming (∂ILP)
A reimplementation of ILP in an end-to-end differentiable architecture. It combines the advantages of ILP and neural networks: it can learn explicit, human-readable symbolic rules while remaining robust to noisy and ambiguous data.
[Architecture diagram: language axioms are converted into an initial valuation; a program template generates clauses with learnable clause weights; inference produces a conclusion valuation, from which a predicted label for a target atom is extracted and compared to the true label via a cross-entropy loss. The legend distinguishes differentiable from non-differentiable functions and paths.]
(Evans et al., Learning Explanatory Rules from Noisy Data, JAIR 2018)

Pros and Cons of ILP
Advantages:
*The learned program can be inspected, understood, and verified
*ILP systems tend to be impressively data-efficient and generalize well
*They support continual and transfer learning
Disadvantages:
*Classical ILP cannot handle noisy, erroneous, or ambiguous data
*The search space of compositional rules grows exponentially with the number of relations, making it hard to scale beyond small rule sets

Neuro-Symbolic Reasoning
Combining neural models and symbolic reasoning given their complementary strengths and weaknesses. Neural Theorem Provers (NTPs) are a family of neuro-symbolic reasoning models: continuous relaxations of the backward-chaining reasoning algorithm that replace discrete symbols with continuous embedding representations.
(Rocktäschel T, Riedel S. End-to-end differentiable proving, NIPS 2017)

Reasoning with Backward Chaining
Backward chaining can be seen as a type of and/or search: OR, because the goal can be proven by any rule in the KB; AND, because all the conjuncts in the premise of a rule must be proven.

Reasoning with Backward Chaining: Example
Consider a KB composed of the facts p(RICK, BETH) and p(BETH, MORTY) and the rule g(X, Y) :- p(X, Z), p(Z, Y), where p and g denote the relationships parent and grandparent. To prove the goal G = g(RICK, MORTY): unify G with the head of the rule g(X, Y) under the substitution {X/RICK, Y/MORTY}, then recursively prove the subgoals p(RICK, Z) and p(Z, MORTY) with and/or search; both hold under the substitution {Z/BETH}.
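For reference, the and/or search just described fits in a few lines of Python; this is an illustrative sketch of classical backward chaining, not the NTP implementation discussed next (variables are strings starting with '?', and rule variables are not renamed, which is fine for this single-rule example):

FACTS = [("p", "rick", "beth"), ("p", "beth", "morty")]
RULES = [(("g", "?x", "?y"), [("p", "?x", "?z"), ("p", "?z", "?y")])]

def subst(atom, theta):
    return tuple(theta.get(t, t) for t in atom)

def match(pattern, ground, theta):
    """Bind the pattern's variables so it equals the ground atom."""
    theta = dict(theta)
    for p, g in zip(subst(pattern, theta), ground):
        if p.startswith("?"):
            theta[p] = g
        elif p != g:
            return None
    return theta

def prove(goals, theta):
    """AND over the goal list; OR over facts and rules for each goal."""
    if not goals:
        yield theta
        return
    goal = subst(goals[0], theta)
    for fact in FACTS:                   # OR branch: goal matches a fact
        t = match(goal, fact, theta)
        if t is not None:
            yield from prove(goals[1:], t)
    for head, body in RULES:             # OR branch: goal matches a rule head
        t = match(head, goal, theta)
        if t is not None:                # AND: prove the whole rule body
            yield from prove(body + goals[1:], t)

print(next(prove([("g", "rick", "morty")], {}), None))
# -> a substitution containing {'?z': 'beth'}, so the goal is proven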

Neural Theorem Prover
NTPs make this reasoning process more flexible and end-to-end differentiable. They replace the comparison between symbols with a soft matching of their embeddings, and recursively build a neural network that enumerates all possible proof paths. Three modules: a unification module compares sub-symbolic representations of logic atoms, while the mutually recursive OR and AND modules jointly enumerate all possible proof paths, before a final aggregation selects the highest-scoring one.
(Rocktäschel T, Riedel S. End-to-end differentiable proving, NIPS 2017)
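The soft matching can be pictured as replacing the exact-equality test of hard unification with a similarity score between symbol embeddings. A minimal sketch, assuming toy embeddings and a Gaussian kernel (a common choice; the embeddings and kernel here are assumptions for illustration, not the paper's exact setup):

import numpy as np

# Soft unification sketch (illustrative; embeddings are hypothetical).
EMB = {"parent": np.array([0.9, 0.1]), "father": np.array([0.8, 0.2])}

def soft_match(sym_a, sym_b, mu=1.0):
    """Similarity in (0, 1]: the exact-equality test of hard unification
    becomes a graded score, so 'father' can partially unify with
    'parent' instead of failing outright."""
    d = np.linalg.norm(EMB[sym_a] - EMB[sym_b])
    return float(np.exp(-d**2 / (2 * mu**2)))

print(soft_match("parent", "father"))  # close to 1: a near-match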

Conditional Theorem Prover (CTP)
An extension to NTPs that selects a subset of rules to consider at each reasoning step. This is achieved by a select module that, given a goal, produces the rules needed for proving it.
(Minervini et al., Learning reasoning strategies in end-to-end differentiable proving, ICML 2020)

Weaknesses of Existing Methods
Graph Neural Networks (GNNs): no explicit rules, hard to interpret.
Inductive Logic Programming (ILP): hard to scale up to many relations.
Neural-symbolic learning (e.g., NTP and CTP): systematicity is still not good enough.
R5 is able to perform relation prediction while preserving systematicity and interpretability.

03 R5 Framework
Rule Discovery with Reinforced and Recurrent Relational Reasoning: treat reasoning over graphs as sequential decision making.

Overview of R5
Short definite clause: a rule of the form r3(X, Y) :- r1(X, Z), r2(Z, Y). Long Horn clause: the full inference path between the queried entities. The inference path (a long Horn clause) is decomposed into a number of short definite clauses via decision making. At each step of the decision-making process, a short definite clause, which is an action, is applied, and the paths are reduced accordingly. At the end of an episode, only one relation between X and Y is left, and it is output as the predicted relationship between them.
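In pseudocode, one episode of this decision process might look like the following minimal sketch (the function names are hypothetical; in R5 the action selection is driven by MCTS and a policy-value network, described below):

# One R5-style episode as a sketch: repeatedly rewrite an adjacent
# relation pair until a single relation is left (names illustrative).

def run_episode(path, choose_action):
    """path: list of relation ids along a sampled path from X to Y.
    choose_action: picks a position and a replacement relation,
    i.e., applies a short definite clause r_new <- r_i, r_j."""
    while len(path) > 1:
        pos, r_new = choose_action(path)          # one decision step
        path = path[:pos] + [r_new] + path[pos + 2:]
    return path[0]                                # predicted relation

# Example: a trivial policy that always rewrites the leftmost pair
# using some composition table `compose` (hypothetical):
# run_episode(["r1", "r2", "r1"], lambda p: (0, compose[(p[0], p[1])]))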

Path Sampling
R5 transforms a relation graph into a set of paths connecting the queried node pair, where each path consists of relations only. That is, we neglect the nodes in our model, since node identity usually contributes little to the rules along the path.
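A minimal sketch of this step, assuming a hypothetical adjacency-list graph with labeled edges:

# Sample relation-only paths between a queried node pair (sketch).
# graph: dict mapping node -> list of (relation, neighbor) pairs.

def relation_paths(graph, src, dst, max_len):
    """Enumerate simple paths from src to dst up to max_len edges,
    keeping only the relation labels, as R5's path sampling discards
    node identities."""
    stack = [(src, [], {src})]
    while stack:
        node, rels, seen = stack.pop()
        if node == dst and rels:
            yield rels
            continue
        if len(rels) == max_len:
            continue
        for rel, nxt in graph.get(node, []):
            if nxt not in seen:
                stack.append((nxt, rels + [rel], seen | {nxt}))

g = {"rick": [("parent", "beth")], "beth": [("parent", "morty")]}
print(list(relation_paths(g, "rick", "morty", 3)))
# -> [['parent', 'parent']]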

Recurrent Relational Reasoning
[Figure: the policy-value network and MCTS interact over states (the features of relation pairs), actions, action probabilities from the search and from the policy network, a state value, and the final reward, i.e., whether the target relation is successfully predicted.]
Both the MCTS and the policy-value network output an action probability distribution. The network is updated by minimizing the difference between these two distributions. The policy-value network also outputs a state value, and the model is additionally updated by minimizing the difference between the state value and the final reward z.

State: the features of relation pairs. If there are m kinds of relations in the dataset and we define n kinds of invented relations, then there are in total (m + n)^2 possible relation pairs, and k predefined features are collected for each pair according to the sampled paths.
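This training signal matches the familiar AlphaZero-style objective; written out as a sketch of the two losses just described (the exact regularization used in the R5 paper may differ):

    \ell(\theta) = \big(z - v_\theta(s)\big)^2 - \pi_{\text{MCTS}}(s)^\top \log p_\theta(s)

where \pi_{\text{MCTS}} is the action distribution produced by the search, p_\theta and v_\theta are the policy and value heads of the network, and z is the final reward of the episode.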

Rule Induction with a Dynamic Rule Memory
An action looks up a relation pair (the key) in the rule memory and retrieves the relation it should be replaced with. If the relation pair is not in the memory, it is replaced with an invented relation that has not been used yet, and this replacement is memorized as a new rule in the dynamic rule memory. During training, at the end of an episode, R5 checks whether the obtained relation matches the answer in the dataset; if not, the dynamic memory is updated with the backtrack rewriting algorithm described in Algorithm 1.

Example (figure-only slide: an episode rewriting a sampled path step by step).
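The lookup-or-invent behaviour of the memory can be sketched as follows (a hypothetical structure; the backtrack rewriting of Algorithm 1 is more involved and is not reproduced here):

# Dynamic rule memory sketch: maps a relation pair to the relation it
# rewrites to, inventing fresh relations for unseen pairs.

class RuleMemory:
    def __init__(self, num_known):
        self.rules = {}                 # (r_i, r_j) -> r_k
        self.next_invented = num_known  # ids above the known relations

    def lookup_or_invent(self, pair):
        if pair not in self.rules:
            self.rules[pair] = self.next_invented  # invent a fresh relation
            self.next_invented += 1
        return self.rules[pair]

mem = RuleMemory(num_known=4)
print(mem.lookup_or_invent((0, 1)))  # unseen pair -> invented relation 4
print(mem.lookup_or_invent((0, 1)))  # seen pair -> the memorized rule, 4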

04 Experiments
Strong generalization ability of R5; extension to larger-scale graphs.

Experiments on the CLUTRR Dataset
CLUTRR is a clean dataset: all the samples are correct. The inference path lengths of the training samples are {2, 3} or {2, 3, 4}, while the inference path lengths of the testing samples are {4, 5, 6, 7, 8, 9, 10} or {5, 6, 7, 8, 9, 10}, respectively. In other words, we train on small-scale tasks and generalize to larger tasks.
(Sinha et al., CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text. EMNLP 2019)

[Results figures omitted: figure-only slides.]

Experiments on the GraphLog Dataset
GraphLog is a noisy dataset: it contains more wrong examples (following the true inference path may not give the correct prediction); its graphs are larger than CLUTRR's; and the underlying inference paths may not be shared between a train set and its corresponding test set. GraphLog also provides the rules used to generate the datasets, which can be used to validate the rule recall rate.
(Sinha et al., Evaluating logical generalization in graph neural networks, 2020)

On GraphLog, the inference paths can be as long as 15 steps. Other models, such as probabilistic models or ILP models, would need to consider more than 6.5 x 10^20 candidate rules when there are 20 relation types (note that 20^16 ≈ 6.5 x 10^20, i.e., one relation per slot in a rule with a head and a 15-relation body), which leads to memory issues. Besides, R5 achieves a nearly 100% recall rate on most of these datasets.

Ablation Study
Invented relations speed up convergence; the policy-value network guarantees the accuracy.

Conclusion and Future Work
Conclusion:
*R5 exhibits high accuracy for relation prediction
*R5 has a high recall rate for rule discovery
*R5 has strong systematicity and is robust to data noise
Future work:
*Extend to large-scale graphs
*Reasoning beyond Horn rules
*Reasoning without explicit graph structures

References
Hupkes D, Dankers V, Mul M, Bruni E. Compositionality Decomposed: How do Neural Networks Generalise? Journal of Artificial Intelligence Research. 2020;67:757-795.
Evans R, Grefenstette E. Learning Explanatory Rules from Noisy Data. Journal of Artificial Intelligence Research. 2018;61:1-64.
Sinha K, Sodhani S, Dong J, Pineau J, Hamilton WL. CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text. In Proceedings of EMNLP-IJCNLP 2019, pp. 4506-4515.
Sinha K, Sodhani S, Pineau J, Hamilton WL. Evaluating Logical Generalization in Graph Neural Networks. arXiv preprint arXiv:2003.06560. 2020.
Rocktäschel T, Riedel S. End-to-End Differentiable Proving. Advances in Neural Information Processing Systems. 2017;30.
Minervini P, Riedel S, Stenetorp P, Grefenstette E, Rocktäschel T. Learning Reasoning Strategies in End-to-End Differentiable Proving. In International Conference on Machine Learning 2020, pp. 6938-6949. PMLR.
Lu S, Liu B, Mills KG, Jui S, Niu D. R5: Rule Discovery with Reinforced and Recurrent Relational Reasoning. In International Conference on Learning Representations 2022.

Thank you for watching!
