Tsinghua University: 2024 Large Model Tool Learning Report (English edition) (48 pages)
THUNLP
Tool Learning
Yujia Qin (秦禹嘉)

Background

Tools and Intelligence
- Tools are extensions of human capabilities designed to enhance productivity, efficiency, and problem-solving.
- Throughout history, humans have been the primary agents in the invention and manipulation of tools.
- Question: can artificial intelligence be as capable as humans in tool use?
- The answer is yes, with foundation models:
  - Strong semantic understanding
  - Extensive world knowledge
  - Powerful reasoning and planning capabilities

Categorization of Tool Learning
- Tool learning [1]: foundation models can follow human instructions and manipulate tools for task solving.
- Tool-augmented learning: augment foundation models with the execution results from tools. Tools are viewed as complementary resources that aid in the generation of high-quality outputs.
- Tool-oriented learning: utilize models to govern tools and make sequential decisions in place of humans, exploiting foundation models' vast world knowledge and reasoning ability for complex reasoning and planning.
[1] Qin, Yujia, et al. Tool Learning with Foundation Models. arXiv preprint arXiv:2304.08354 (2023).

Framework
- Tool set: a collection of tools with different functionalities.
- Environment: provides the platform where tools operate.
- Perceiver: summarizes feedback to the controller.
- Controller: provides feasible plans to fulfill user requests.

Intent Understanding
- Comprehending the underlying purpose of an instruction
- Learning a mapping from the instruction space to the model's cognition space
- Instruction tuning:
  - Wrap tasks with diverse instructions
  - Supervised fine-tuning
  - Extraordinary generalization capability
- Scaling up the model size and the diversity of instruction-tuning datasets enhances generalization capability.
- Challenges:
  - Understanding vague instructions: vagueness and ambiguity in the user query
  - Theoretically infinite instruction space: infinite expression and personalized instructions
[1] Finetuned Language Models Are Zero-Shot Learners
[2] Multitask Prompted Training Enables Zero-Shot Task Generalization
[3] OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

Tool Understanding
- Eliciting tool understanding with prompting:
  - Zero-shot prompting: describe API functionalities, their input/output formats, possible parameters, etc. This allows the model to understand the tasks that each API can tackle.
  - Few-shot prompting: provide concrete tool-use demonstrations to the model. By mimicking human behaviors from these demonstrations, the model can learn how to utilize these tools.

Planning and Reasoning
- Introspective reasoning: generate a static plan without interacting with the environment.
- Extrospective reasoning: generate a dynamic plan that accounts for changes in the environment and for feedback.

Planning and Reasoning: Introspective Reasoning
- If prompted appropriately, PLMs can effectively decompose high-level tasks into mid-level plans without any further training.
Reference: Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents

Planning and Reasoning: Extrospective Reasoning
- Challenge: foundation models are not embodied or grounded to the physical world.
- Solution: constrain the model to propose natural language actions that are both feasible and contextually appropriate ("Do As I Can, Not As I Say!").
- Inner Monologue [1]: injecting information from various sources of feedback into model planning.
Ahn, Michael, et al. Do as i can, not as i say: Grounding language in robotic affordances. arXiv preprint arXiv:2204.01691 (2022).
[1] Huang, Wenlong, et al. Inner monologue: Embodied reasoning through planning with language models. arXiv preprint arXiv:2207.05608 (2022).

Planning and Reasoning: Multi-step Multi-tool Scenarios
- Humans won't stick to one scenario and one tool.
- Understanding the interplay among different tools: models should not only understand individual tools, but also learn how to combine them and order them logically.
- From sequential execution to parallel execution: tools do not have to be executed sequentially; parallel execution leads to superimposed effects.
- From single-agent problem-solving to multi-agent collaboration: complex tasks often necessitate collaboration among multiple agents, each with its own unique expertise.

Training Strategies
- Learning from demonstrations: often involves (human) annotations
- Learning from feedback: often involves reinforcement learning

WebGPT
- Supervised learning
- Clone human behavior to use search engines
- Supervised fine-tuning + reinforcement learning
- Only 6,000 annotated examples are needed
Nakano, Reiichiro, et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).

WebCPM
- Motivation: WebGPT is not public, and its inner workings remain opaque.
- Our efforts (WebCPM):
  - Open-source interactive web search interface
  - The first public QA dataset that involves interactive web search, and also the first Chinese LFQA dataset
  - Framework and model implementation
- Interface (search mode) and pre-defined actions
- Our framework consists of two models:
  1. Search model, consisting of:
     - Action prediction module
     - Search query generation module
     - Supporting fact extraction module
  2. Information synthesis model
- For an action sequence of T steps, the search model executes actions to collect supporting facts, which are sent to the synthesis model for answer generation.
- Holistic pipeline evaluation (based on human preference): model-generated answer vs. human annotation.
- Three sources of supporting facts are sent to the synthesis model: (1) pipeline-collected, (2) human-collected, (3) non-interactive search (TF-IDF).

WebShop
- Learning to perform online shopping

Toolformer
- Self-supervised tool learning
- Pre-defined tool APIs
- Encourage models to call and execute tool APIs
- Design a self-supervised loss to see if the tool execution can help language modeling
- If the tool execution reduces LM loss, save the instances as training data

Tool Creation
- From tool user to tool creator:
  - Humans are the primary agents that create and use tools, from the Stone Age to the 21st century
  - Most tools are created for humans, not AI
- Tools made for models:
  - Modularized: decompose tools into smaller units
  - New input and output formats: more computable and suitable for AI
- Limitations of existing works:
  - Most existing work tends to concentrate on a limited number of tools
  - The reasoning process employed by models for determining the optimal utilization of tools is inherently complex
  - The current pipelines lack an error-handling mechanism after retrieving execution results
- Instead of letting LLMs act as the users of tools, we enable them to be the creators [1]
- Four procedures: creation, decision, execution, rectification
- Experiments:
  - Datasets: MATH, TabMWP
  - Significant improvements over PoT and pure CoT
[1] Qian, Cheng, et al. CREATOR: Disentangling Abstract and Concrete Reasonings of Large Language Models through Tool Creation.

Application

ChatGPT Plugins
- OpenAI's official tool library
- Empower ChatGPT with broader applications
- By simply providing APIs with descriptions, ChatGPT is enabled to call applications and complete more complex tasks

Open-source Solutions: BMTools
- An open-source repository that extends language models to use tools and serves as a platform for the community to build and share tools
- Features:
  - Users can easily build a new plugin by writing Python functions, and can use external ChatGPT-Plugins
  - Users can host their local models (e.g., LLaMA, CPM) to use tools
  - 30+ tools supported, contributions welcome, including: database, Weather API, PPT, Google Scholar, Huggingface Models, Image Generation
  - Support for BabyAGI and AutoGPT
  - 100k+ tool-use SFT data on the way!
https:/

Open-source Solutions: ToolBench
- An open-source, large-scale, high-quality instruction-tuning SFT dataset to facilitate general tool-use capability
- We provide the dataset, the corresponding training and evaluation scripts, and a capable model, ToolLLaMA, fine-tuned on ToolBench
- Features:
  - Both single-tool and multi-tool scenarios are supported
  - ToolBench provides responses that not only include the final answer but also incorporate the model's chain-of-thought process, tool execution, and tool execution results
  - Multi-step decision making and tool execution
  - Another notable advantage is the diversity of our APIs, which are designed for real-world scenarios
  - 98k instances, 312k API calls
- Construction process: all the data is automatically generated by the OpenAI API and then filtered, so the whole data creation process is easy to scale up
- Evaluation:
  - ToolLLaMA matches ChatGPT's capabilities in tool use
  - Auto-evaluated by ChatGPT (higher is better)
https:/

- Traditional language tasks are (almost) well solved: syntactic parsing, entity recognition, sentiment analysis
- We are facing more challenging tasks!
- Foundation models can be leveraged in complex scenarios by using language, and the performance may largely rely on LLMs' effectiveness
- Theoretical issues still exist
- Practical issues still exist
- Explore leveraging tool learning in complex scenarios

Tool Learning Paper List
https:/
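
The framework described earlier (tool set, environment, perceiver, controller) can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the class names, the two toy tools, and the scripted controller are all hypothetical. In particular, the controller here just replays a fixed plan where a real system would query a foundation model.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Assumption: a tool is modeled as a plain function from a string to a string.
Tool = Callable[[str], str]


def add_numbers(expression: str) -> str:
    """Hypothetical toy tool: sums whitespace-separated integers."""
    return str(sum(int(token) for token in expression.split()))


def to_upper(text: str) -> str:
    """Hypothetical toy tool: upper-cases its input."""
    return text.upper()


# Tool set: a collection of tools with different functionalities.
TOOL_SET: Dict[str, Tool] = {"add": add_numbers, "upper": to_upper}


class Environment:
    """Platform where tools operate: runs a named tool and returns raw output."""

    def __init__(self, tools: Dict[str, Tool]) -> None:
        self.tools = tools

    def execute(self, tool_name: str, argument: str) -> str:
        if tool_name not in self.tools:
            return f"error: unknown tool {tool_name!r}"
        return self.tools[tool_name](argument)


class Perceiver:
    """Summarizes raw tool output into feedback for the controller."""

    def summarize(self, tool_name: str, raw_output: str) -> str:
        return f"{tool_name} -> {raw_output}"


class Controller:
    """Scripted stand-in for a foundation model: replays a fixed plan."""

    def __init__(self, plan: List[Tuple[str, str]]) -> None:
        self.plan = list(plan)

    def next_action(self, request: str, history: List[str]) -> Optional[Tuple[str, str]]:
        # One action per completed step; stop when the plan is exhausted.
        step = len(history)
        return self.plan[step] if step < len(self.plan) else None


def run(request: str, controller: Controller, environment: Environment,
        perceiver: Perceiver, max_steps: int = 5) -> List[str]:
    """Controller picks an action, the environment runs it, the perceiver reports back."""
    history: List[str] = []
    for _ in range(max_steps):
        action = controller.next_action(request, history)
        if action is None:
            break
        tool_name, argument = action
        raw_output = environment.execute(tool_name, argument)
        history.append(perceiver.summarize(tool_name, raw_output))
    return history


if __name__ == "__main__":
    scripted = Controller([("add", "2 3 4"), ("upper", "done")])
    for line in run("sum then shout", scripted, Environment(TOOL_SET), Perceiver()):
        print(line)
```

Because the plan is fixed in advance, this controller corresponds to the introspective (static) planning case; an extrospective variant would inspect `history` before choosing each next action.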