
Tsinghua University: 2024 Large Model Tool Learning Report (English Edition) (48 pages).pdf


THUNLP
Tool Learning
秦禹嘉 (Yujia Qin)

Background

Tools and Intelligence
- Tools are extensions of human capabilities designed to enhance productivity, efficiency, and problem-solving.
- Throughout history, humans have been the primary agents in the invention and manipulation of tools.
- Question: can artificial intelligence be as capable as humans in tool use?
- The answer is yes, with foundation models:
  - Strong semantic understanding
  - Extensive world knowledge
  - Powerful reasoning and planning capabilities
- Tool learning [1]: foundation models can follow human instructions and manipulate tools for task solving.
[1] Qin, Yujia, et al. Tool Learning with Foundation Models. arXiv preprint arXiv:2304.08354 (2023).

Categorization of Tool Learning
- Tool-augmented learning: augment foundation models with the execution results from tools. Tools are viewed as complementary resources that aid in the generation of high-quality outputs.
- Tool-oriented learning: utilize models to govern tools and make sequential decisions in place of humans, exploiting foundation models' vast world knowledge and reasoning ability for complex reasoning and planning.

Framework

Framework
- Tool set: a collection of tools with different functionalities.
- Environment: provides the platform where tools operate.
- Perceiver: summarizes feedback to the controller.
- Controller: provides feasible plans to fulfill user requests.

Intent Understanding
- Comprehending the underlying purpose of an instruction.
- Learning a mapping from the instruction space to the model's cognition space.
- Instruction tuning:
  - Wrap tasks with diverse instructions
  - Supervised fine-tuning
  - Extraordinary generalization capability
[1] Finetuned Language Models Are Zero-Shot Learners
[2] Multitask Prompted Training Enables Zero-Shot Task Generalization
[3] OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization
- Scaling up the model size and the diversity of instruction-tuning datasets enhances generalization capability.
- Challenges:
  - Understanding vague instructions: vagueness and ambiguity in the user query.
  - Theoretically infinite instruction space: infinite expression and personalized instructions.

Tool Understanding
- Eliciting tool understanding with prompting:
  - Zero-shot prompting: describe API functionalities, their input/output formats, possible parameters, etc. This allows the model to understand the tasks that each API can tackle.
  - Few-shot prompting: provide concrete tool-use demonstrations to the model. By mimicking human behaviors from these demonstrations, the model can learn how to utilize these tools.

Planning and Reasoning
- Introspective reasoning: generate a static plan without interacting with the environment.
- Extrospective reasoning: generate a dynamic plan that considers changes in the environment and feedback.
- Introspective reasoning: if prompted appropriately, PLMs can effectively decompose high-level tasks into mid-level plans without any further training.
  Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
- Extrospective reasoning ("Do as I can, not as I say"):
  - Challenge: foundation models are not embodied or grounded in the physical world.
  - Solution: constrain the model to propose natural language actions that are both feasible and contextually appropriate.
  Ahn, Michael, et al. Do as I can, not as I say: Grounding language in robotic affordances. arXiv preprint arXiv:2204.01691 (2022).
- Extrospective reasoning, Inner Monologue [1]: injecting information from various sources of feedback into model planning.
  [1] Huang, Wenlong, et al. Inner monologue: Embodied reasoning through planning with language models. arXiv preprint arXiv:2207.05608 (2022).
- Multi-step, multi-tool scenarios: humans won't stick to one scenario and one tool.
  - Understanding the interplay among different tools: models should not only understand individual tools, but also learn how to combine them and order them logically.
  - From sequential execution to parallel execution: tools do not have to be executed sequentially; parallel execution leads to superimposed effects.
  - From single-agent problem-solving to multi-agent collaboration: complex tasks often necessitate collaboration among multiple agents, each with its own expertise.

Training Strategies
- Learning from demonstrations: often involves (human) annotations.
- Learning from feedback: often involves reinforcement learning.

WebGPT
- Supervised learning: clone human behavior to use search engines.
- Supervised fine-tuning + reinforcement learning.
- Needs only 6,000 annotated examples.
Nakano, Reiichiro, et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).

WebCPM
- Motivation: WebGPT is not public, and its inner workings remain opaque.
- Our efforts (WebCPM):
  - Open-source interactive web search interface.
  - The first public QA dataset that involves interactive web search, and also the first Chinese LFQA dataset.
  - Framework and model implementation.
- Interface (search mode) and pre-defined actions.
- Our framework consists of two models:
  1. Search model, consisting of:
     - Action prediction module
     - Search query generation module
     - Supporting fact extraction module
  2. Information synthesis model
- For an action sequence of T steps, the search model executes actions to collect supporting facts, which are sent to the synthesis model for answer generation.
- Holistic pipeline evaluation (based on human preference): model-generated answer vs. human annotation. Three sources of supporting facts are sent to the synthesis model: (1) pipeline-collected, (2) human-collected, (3) non-interactive search (TF-IDF).

WebShop
- Learning to perform online shopping.

Toolformer
- Self-supervised tool learning.
- Pre-defined tool APIs.
- Encourage models to call and execute tool APIs.
- Design a self-supervised loss to see whether the tool execution can help language modeling.
- If the tool execution reduces the LM loss, save the instances as training data.

Tool Creation
- From tool user to tool creator:
  - Humans are the primary agents that create and use tools, from the Stone Age to the 21st century.
  - Most tools are created for humans, not AI.
- Tools made for models:
  - Modularized: compose tools into smaller units.
  - New input and output formats: more computable and suitable for AI.
- Limitations of existing works:
  - Most existing work tends to concentrate on a limited number of tools.
  - The reasoning process employed by models for determining the optimal utilization of tools is inherently complex.
  - The current pipelines lack an error-handling mechanism after retrieving execution results.
- Instead of letting LLMs act as the users of tools, we enable them to be the creators [1].
  [1] Qian, Cheng, et al. CREATOR: Disentangling Abstract and Concrete Reasonings of Large Language Models through Tool Creation.
- Four procedures: creation, decision, execution, rectification.
- Experiments:
  - Datasets: MATH, TabMWP.
  - Significant improvements over PoT and pure CoT.

Application

ChatGPT Plugins
- OpenAI's official tool library.
- Empowers ChatGPT with broader applications.
- By simply providing APIs with descriptions, ChatGPT is enabled to call applications and complete more complex tasks.

Open-source Solutions
- BMTools: an open-source repository that extends language models to use tools and serves as a platform for the community to build and share tools.
  https:/
  - Users can easily build a new plugin by writing Python functions, and can use external ChatGPT-Plugins.
  - Users can host their local models (e.g., LLaMA, CPM) to use tools.
  - 30+ tools supported, contributions welcome: database, Weather API, PPT, Google Scholar, Huggingface Models, Image Generation.
  - Supports BabyAGI and AutoGPT.
  - 100k+ tool-use SFT data on the way.
- ToolBench: open-source, large-scale, high-quality instruction-tuning SFT data to facilitate general tool-use capability.
  https:/
  - We provide the dataset, the corresponding training and evaluation scripts, and a capable model, ToolLLaMA, fine-tuned on ToolBench.
  - Features:
    - Both single-tool and multi-tool scenarios are supported.
    - ToolBench provides responses that not only include the final answer but also incorporate the model's chain-of-thought process, tool execution, and tool execution results.
    - Multi-step decision making and tool execution.
    - Another notable advantage is the diversity of our API, which is designed for real-world scenarios.
    - 98k instances, 312k API calls.
  - Construction process: all the data is automatically generated by the OpenAI API and then filtered; the whole data creation process is easy to scale up.
  - Evaluation: ToolLLaMA matches ChatGPT's capabilities in tool use, as auto-evaluated by ChatGPT (higher is better).

- Traditional language tasks are (almost) well solved: syntactic parsing, entity recognition, sentiment analysis.
- We are facing more challenging tasks!
- Foundation models can be leveraged in complex scenarios by using language, and the performance may largely rely on LLMs' effectiveness.
- Theoretical issues still exist.
- Practical issues still exist.
- Explore leveraging tool learning in complex scenarios.

Tool Learning Paper List
https:/
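
As an illustration of the Framework slide (tool set, environment, perceiver, controller), the following is a minimal sketch of how the four parts could interact in a loop. It is not taken from the report: the names `run_episode`, `perceive`, `toy_controller`, and `TOOLS` are hypothetical, the tools are stubs, and the controller is a hand-written rule standing in for a foundation model.

```python
from typing import Callable, Dict, List, Optional, Tuple

Action = Tuple[str, str]                    # (tool name, tool input)
ToolSet = Dict[str, Callable[[str], str]]   # tool name -> callable tool


def perceive(raw_feedback: str, max_chars: int = 60) -> str:
    """Perceiver: condense raw feedback before it reaches the controller."""
    text = " ".join(raw_feedback.split())   # collapse whitespace
    return text[:max_chars]                 # crude "summary" by truncation


def toy_controller(request: str, observations: List[str]) -> Optional[Action]:
    """Controller stand-in (assumption): a real system would query a model here."""
    if not observations:
        return ("lookup", request)
    if len(observations) == 1:
        return ("word_count", observations[0])
    return None                             # plan finished


def run_episode(
    request: str,
    tools: ToolSet,
    controller: Callable[[str, List[str]], Optional[Action]],
    max_steps: int = 5,
) -> List[str]:
    """Alternate between planning (controller) and acting (tools)."""
    observations: List[str] = []
    for _ in range(max_steps):
        action = controller(request, observations)
        if action is None:
            break
        tool_name, tool_input = action
        raw = tools[tool_name](tool_input)  # environment: where the tool runs
        observations.append(perceive(raw))
    return observations


# Hypothetical tool set made of two stub tools.
TOOLS: ToolSet = {
    "lookup": lambda q: f"  stub   answer for: {q}  ",
    "word_count": lambda text: str(len(text.split())),
}

if __name__ == "__main__":
    print(run_episode("tool learning", TOOLS, toy_controller))
```

Passing the controller in as a function keeps the loop independent of any particular model, so the rule-based stand-in could be swapped for a model call without changing `run_episode`.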

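The Toolformer slide states that an instance is saved as training data if the tool execution reduces the LM loss. A rough sketch of that filtering rule follows, under stated assumptions: `Candidate`, `keep_useful_calls`, `toy_loss`, and the `min_gain` threshold are hypothetical names introduced here, and `toy_loss` is a word-overlap stand-in for a real language-model loss, not the loss used in Toolformer.

```python
from typing import Callable, List, NamedTuple


class Candidate(NamedTuple):
    prefix: str        # text before the position of the tool call
    api_call: str      # textual form of the call
    api_result: str    # what the tool returned
    continuation: str  # text the model has to predict next


def keep_useful_calls(
    candidates: List[Candidate],
    lm_loss: Callable[[str, str], float],
    min_gain: float = 0.0,
) -> List[Candidate]:
    """Keep candidates whose tool result lowers the loss on the continuation."""
    kept = []
    for c in candidates:
        loss_plain = lm_loss(c.prefix, c.continuation)
        augmented = f"{c.prefix} [ {c.api_call} -> {c.api_result} ]"
        loss_with_tool = lm_loss(augmented, c.continuation)
        if loss_plain - loss_with_tool > min_gain:
            kept.append(c)
    return kept


def toy_loss(context: str, continuation: str) -> float:
    """Toy stand-in (assumption): share of continuation words absent from context."""
    seen = set(context.lower().split())
    words = continuation.lower().split()
    missing = sum(1 for w in words if w not in seen)
    return missing / max(len(words), 1)


if __name__ == "__main__":
    helpful = Candidate(
        "Out of 1400 participants, 400 passed, that is",
        "Calculator(400/1400)", "0.29", "0.29 of them",
    )
    unhelpful = Candidate(
        "The capital of France is", "Calculator(1+1)", "2", "Paris of course",
    )
    for c in keep_useful_calls([helpful, unhelpful], toy_loss):
        print(c.api_call)
```

In practice `lm_loss` would be the model's own loss on the continuation; the toy function only makes the sketch runnable without a model.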