Introduction to Generative AI and Large Language Models
Meng Fang, RL China 2023

What is Generative AI?
Artificial intelligence systems that can produce high-quality content, specifically text, images, and audio.

The rise of generative AI
Zhao, Wayne Xin, et al. "A Survey of Large Language Models." arXiv preprint arXiv:2303.18223 (2023).

T5: Text-to-Text Transfer Transformer
Raffel, Colin, et al. "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer." arXiv preprint arXiv:1910.10683 (2019).

GPT-3: Generative Pre-trained Transformer 3. OpenAI, 2020.

Language modelling
Language modelling is the task of predicting what word comes next. More formally, given a sequence of words, a language model computes a probability distribution over the next word: P(foods | I like trying new), P(hobbies | I like trying new), and so on. A system that does this is called a language model.
Example: "I like trying new ___" - foods, hobbies, products, activities.
Language models are everywhere.

Generative Pre-trained Transformers (GPT)
Decoder-based transformers. The first GPT model, introduced in 2018 by OpenAI, was just the decoder part of the original transformer.
Input: "What is NLP?"
Output: "NLP stands for Natural Language Processing, which is a subfield of artificial intelligence (AI) that focuses on the interaction between computers and ..."

Transformers
Standard transformers use an encoder-decoder architecture built from transformer blocks. Vaswani, Ashish, et al. "Attention Is All You Need." 2017.

GPT architecture
A stack of decoders (decoder blocks).

GPT models from OpenAI
GPT-2/-3/-4 have mostly just been larger versions of GPT, with the key differences coming from training data and training processes.
- GPT/GPT-1 (2018): 12 decoder blocks, trained on BooksCorpus
- GPT-2 (2019): 48 decoder blocks, trained on WebText
- GPT-3 (2020): 96 decoder blocks, trained on WebText2
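The next-word probabilities above can be made concrete with a toy count-based language model. The training sentences and resulting probabilities here are invented purely for illustration; a real LLM estimates these relationships with a neural network over billions of words.

```python
from collections import Counter, defaultdict

# Toy corpus; a real language model is trained on billions of words.
corpus = [
    "i like trying new foods",
    "i like trying new hobbies",
    "i like trying new foods",
]

# Count which word follows each 4-word context.
following = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(4, len(words)):
        context = tuple(words[i - 4:i])
        following[context][words[i]] += 1

def next_word_distribution(context):
    """Estimate P(word | context) from the counts."""
    counts = following[tuple(context.split())]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

dist = next_word_distribution("i like trying new")
# "foods" follows this context twice and "hobbies" once,
# so the model assigns them probabilities 2/3 and 1/3.
```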
GPT models have many parameters
Model size refers to the number of parameters in the model. Generally, the larger the model, the better it is at capturing the nuances of human language.

From web search to ChatGPT
AI is already pervasive in our lives: web search, recommender systems, ChatGPT, language translation, text generation (e.g., writing a story), coding, and image, audio, and video generation.
Example prompt: "An astronaut riding a steel horse on the moon. The astronaut is wearing a medieval armor with a party hat and a green sword."

How generative AI works
From machine learning to generative AI: 1956, artificial intelligence; 1997, machine learning; 2017, deep learning; 2021, generative AI.

Supervised learning (labeling things)
Input: a text or an image. Output: a label. This is classification.
Bishop, Christopher M., and Nasser M. Nasrabadi. Pattern Recognition and Machine Learning. Vol. 4, No. 4. New York: Springer, 2006.

Generating text using Large Language Models (LLMs)
Text generation process: "I like trying ___" (prompt/input) - "new recipes." / "different art forms." / "varied workouts." / "challenging puzzles." / "unique teas." (output)

How do large language models work?
- Tokenizer, text to numbers: large language models receive a text as input and generate a text as output; a tokenizer maps the text to and from sequences of token IDs.
- Predicting the next token repeatedly: given n tokens as input (with max n varying from one model to another), the model predicts one token as output. This token is then incorporated into the input of the next iteration, in an expanding-window pattern, so the user receives one (or multiple) sentences as an answer.
- Selection process, probability distribution: the output token is chosen by the model according to its probability of occurring after the current text sequence.

Example: "I like trying ___" - new; "I like trying new ___" - recipes; "I like trying new recipes ___" - ".". As token IDs, the sequence looks like 40, 1093, 4560, 502, 19141, 13.

Input-output: it is a form of supervised learning.
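The expanding-window prediction loop can be sketched in a few lines. The vocabulary, the hard-coded "model", and its scores below are stand-ins invented for illustration; a real LLM computes logits with a transformer over token IDs rather than a lookup table.

```python
import math

# Hypothetical vocabulary and a stub "model" that returns unnormalized
# scores (logits) for the next token.
VOCAB = ["I", "like", "trying", "new", "recipes", "."]

def stub_logits(tokens):
    # Hard-coded continuations for illustration only.
    table = {
        ("I", "like", "trying"): "new",
        ("I", "like", "trying", "new"): "recipes",
        ("I", "like", "trying", "new", "recipes"): ".",
    }
    target = table.get(tuple(tokens))
    return [5.0 if tok == target else 0.0 for tok in VOCAB]

def softmax(logits):
    # Turn logits into a probability distribution over the vocabulary.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def generate(prompt_tokens, max_new_tokens=3):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = softmax(stub_logits(tokens))
        # Greedy selection: take the most probable token. Sampling from
        # `probs` instead would give more varied output.
        next_token = VOCAB[probs.index(max(probs))]
        tokens.append(next_token)  # expanding window
    return tokens

generate(["I", "like", "trying"])
# -> ["I", "like", "trying", "new", "recipes", "."]
```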
Training a very large model on a vast amount of data, such as hundreds of billions of words, yields a language model like ChatGPT.

A new way to find information, a new translator, a writing assistant. Search or chat?
Applications: writing, translation, coding, chatting.

Prompt Engineering
Meng Fang, RL China 2023

What is Prompt Engineering?
Prompt engineering is the process of designing and optimising text inputs (prompts) to deliver consistent, high-quality responses (completions) for a given application objective and model.

Prompt engineering is currently more art than science. The best way to improve our intuition for it is to practice more and adopt a trial-and-error approach that combines application-domain expertise with recommended techniques and model-specific optimisations.

To execute the exercises you will need:
- An OpenAI API key, the service endpoint for a deployed LLM (OPENAI_API_KEY=sk-xxxxxxx).
- A Python runtime in which the notebook can be executed.

We can think of prompt engineering as a two-step, trial-and-error process:
1. Design the initial prompt for a given model and objective.
2. Refine the prompt iteratively to improve the quality of the response.
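The design-then-refine loop can be sketched as code. Here `complete` is a stand-in for a real LLM call (with the OpenAI API this would be a chat-completions request authenticated by OPENAI_API_KEY), and the refinement rule is invented for illustration: if the answer is too long, the instruction is made more specific.

```python
def complete(prompt):
    # Stand-in for an LLM endpoint; canned outputs for illustration.
    if "single sentence" in prompt:
        return "LLMs predict the next token."
    return "LLMs predict the next token. They are trained on large corpora. ..."

def refine(prompt, response):
    # Toy refinement rule: if the response rambles, tighten the prompt.
    if len(response.split(".")) > 2 and "single sentence" not in prompt:
        return prompt + " Answer in a single sentence."
    return prompt

prompt = "Explain how LLMs generate text."
for _ in range(3):  # iterate: run, inspect, refine
    response = complete(prompt)
    new_prompt = refine(prompt, response)
    if new_prompt == prompt:
        break  # response is acceptable; stop refining
    prompt = new_prompt
```

In practice the "inspect" step is a human (or an evaluation script) judging the completion; the loop structure is the same.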
We first need to understand three concepts:
- Tokenization: how the model sees the prompt.
- Base LLMs: how the foundation model processes a prompt.
- Instruction-tuned LLMs: how the model can now see tasks.

Tokenization

Base models (LLMs)
Once a prompt is tokenized, the primary function of the base LLM (or foundation model) is to predict the next token in that sequence. Since LLMs are trained on massive text datasets, they have a good sense of the statistical relationships between tokens and can make that prediction with some confidence. Note that they don't understand the meaning of the words in the prompt or token; they just see a pattern they can complete with their next prediction. They can continue predicting the sequence until terminated by user intervention or some pre-established condition. So we can chat. But what if the user wants to see something specific that meets some criteria or task objective?

Instruction-tuned LLMs
An instruction-tuned LLM starts with the foundation model and fine-tunes it with examples or input/output pairs (e.g., multi-turn messages) that contain clear instructions, where the response from the AI attempts to follow those instructions. This uses techniques like Reinforcement Learning from Human Feedback (RLHF) to train the model to follow instructions and learn from feedback, so that it produces responses better suited to practical applications and more relevant to user objectives.

Instruction example
Prompt: "Classify the text into neutral, negative or positive. Text: I like walking. Sentiment:"
Output: "Positive"

Instruction example
Prompt: "Summarize the text delimited by triple backticks into a single sentence."

You should express what you want the model to do by providing instructions that are as clear and specific as you can possibly make them. This will guide the model towards the desired output and reduce the chances of receiving irrelevant or incorrect responses. Don't confuse writing a clear prompt with writing a short prompt: in many cases, longer prompts provide more clarity and context for the model, which can lead to more detailed and relevant outputs.

What LLMs can and cannot do
- Knowledge cutoffs: an LLM's knowledge of the world is frozen at the time of its training.
- Hallucinations.
- The input (and output) length is limited.
- Bias and toxicity: an LLM can reflect the biases that exist in the text it learned from.

Advanced techniques
- Zero-shot prompting
- Few-shot prompting
- Chain-of-thought prompting
- Self-consistency

Zero-shot prompting
Prompt: "Classify the sentiment of the following statement: <text> Options: 1. Very Positive 2. Positive 3. Neutral 4. Negative 5. Very Negative" (replace <text> with the actual text you want to classify).

Few-shot prompting (3-shot)
Classify the sentiment of the following sentences as Positive, Negative, or Neutral.
Sentence: I love sunny days. Sentiment: Positive
Sentence: This soup tastes terrible. Sentiment: Negative
Sentence: He is always so kind and helpful. Sentiment: Positive
Sentence: Their performance exceeded all my expectations! Sentiment:
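Few-shot prompts of this shape can be assembled programmatically from a pool of labelled demonstrations, varying k to get 1-shot, 3-shot, or 5-shot variants. The example pool and helper below are invented for illustration.

```python
# Labelled demonstrations (invented) for building k-shot prompts.
EXAMPLES = [
    ("I love sunny days.", "Positive"),
    ("This soup tastes terrible.", "Negative"),
    ("He is always so kind and helpful.", "Positive"),
    ("I'm feeling very sick today.", "Negative"),
    ("The movie was okay, nothing special.", "Neutral"),
]

def few_shot_prompt(query, k):
    """Build a k-shot sentiment-classification prompt."""
    lines = ["Classify the sentiment of the following sentences as "
             "Positive, Negative, or Neutral."]
    for sentence, label in EXAMPLES[:k]:
        lines.append(f"Sentence: {sentence}")
        lines.append(f"Sentiment: {label}")
    # The query is appended with the label left blank for the model.
    lines.append(f"Sentence: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

prompt = few_shot_prompt("Their performance exceeded all my expectations!", k=3)
```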
Few-shot prompting (1-shot)
Classify the sentiment of the following sentences as Positive, Negative, or Neutral.
Sentence: I love sunny days. Sentiment: Positive
Sentence: Their performance exceeded all my expectations! Sentiment:

Few-shot prompting (5-shot)
Classify the sentiment of the following sentences as Positive, Negative, or Neutral.
Sentence: I love sunny days. Sentiment: Positive
Sentence: This soup tastes terrible. Sentiment: Negative
Sentence: He is always so kind and helpful. Sentiment: Positive
Sentence: I'm feeling very sick today. Sentiment: Negative
Sentence: The movie was okay, nothing special. Sentiment: Neutral
Sentence: Their performance exceeded all my expectations! Sentiment:

Some tips for few-shot prompting
- The label space and the distribution of the input text specified by the demonstrations are both important (regardless of whether the labels are correct for individual inputs).
- The format you use also plays a key role in performance: even if you just use random labels, this is much better than no labels at all.
- Additional results show that selecting random labels from the true distribution of labels (instead of a uniform distribution) also helps.

Chain-of-thought prompting
Wei, Jason, et al. "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." Advances in Neural Information Processing Systems 35 (2022): 24824-24837.

Chain-of-thought prompting example
Prompt:
The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: Adding all the odd numbers (9, 15, 1) gives 25. The answer is False.
The odd numbers in this group add up to an even number: 17, 10, 19, 4, 8, 12, 24.
A: Adding all the odd numbers (17, 19) gives 36. The answer is True.
The odd numbers in this group add up to an even number: 16, 11, 14, 4, 8, 13, 24.
A: Adding all the odd numbers (11, 13) gives 24. The answer is True.
The odd numbers in this group add up to an even number: 17, 9, 10, 12, 13, 4, 2.
A: Adding all the odd numbers (17, 9, 13) gives 39. The answer is False.
The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A:
Output:
Adding all the odd numbers (15, 5, 13, 7, 1) gives 41. The answer is False.

Self-consistency
Self-consistency aims to replace the naive greedy decoding used in chain-of-thought prompting. The idea is to sample multiple, diverse reasoning paths through few-shot CoT, and use the generations to select the most consistent answer. This helps to boost the performance of CoT prompting on tasks involving arithmetic and commonsense reasoning.
Wang, Xuezhi, et al. "Self-Consistency Improves Chain of Thought Reasoning in Language Models." arXiv preprint arXiv:2203.11171 (2022).

Directional stimulus prompting
A tuneable policy LM is trained to generate the stimulus/hint. We are seeing more use of RL to optimise LLMs.
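The self-consistency idea above reduces to a majority vote over the final answers of several sampled reasoning chains. In practice each chain would come from the LLM decoding at a non-zero temperature; here the sampled final answers are given as a fixed list so the sketch stays deterministic.

```python
from collections import Counter

def self_consistent_answer(sampled_answers):
    """Majority vote over the final answers of sampled CoT chains."""
    return Counter(sampled_answers).most_common(1)[0][0]

# Suppose five sampled chains for the odd-numbers question ended in
# these final sums (two chains made arithmetic slips):
samples = ["41", "39", "41", "41", "43"]
self_consistent_answer(samples)  # -> "41"
```

A single greedy decode would commit to whichever chain it happened to produce; voting across chains filters out the occasional faulty path.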
Li, Zekun, et al. "Guiding Large Language Models via Directional Stimulus Prompting." arXiv preprint arXiv:2302.11520 (2023).
The policy model can be trained with SFT and/or RL, where the reward is defined as a downstream task performance measure, such as the ROUGE score for the summarization task, or other alignment measures like human preferences.

ReAct prompting
ReAct is a general paradigm that combines reasoning and acting with LLMs. ReAct prompts LLMs to generate verbal reasoning traces and actions for a task.
Yao, Shunyu, et al. "ReAct: Synergizing Reasoning and Acting in Language Models." arXiv preprint arXiv:2210.03629 (2022).

Tree of Thoughts (ToT)
Yao, Shunyu, et al. "Tree of Thoughts: Deliberate Problem Solving with Large Language Models." arXiv preprint arXiv:2305.10601 (2023).

Expanding Knowledge for LLMs
Meng Fang, RL China 2023

Retrieval-Augmented Generation (RAG)
RAG combines an information retrieval component with a text generator model. RAG can be fine-tuned, and its internal knowledge can be modified in an efficient manner without needing to retrain the entire model.
Lewis, Patrick, et al. "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." Advances in Neural Information Processing Systems 33 (2020): 9459-9474.

Generation: large language models (LLMs), such as ChatGPT/GPT-3.5/-4 and others, generate text in response to a user input, referred to as a prompt.
Retrieval: retrieval from a datastore or a collection of documents (the Internet, or a document collection). Representative tasks: open-domain QA, fact checking, entity linking.
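A minimal RAG sketch: retrieve the best-matching document and splice it into the generator's prompt. The datastore and word-overlap scoring below are toy stand-ins invented for illustration; real systems typically use dense embeddings for retrieval and an LLM for the final generation step.

```python
# Toy datastore; in practice this would be a document index.
DATASTORE = [
    "RL China is a community event about reinforcement learning.",
    "The Eiffel Tower is located in Paris.",
    "Python is a popular programming language.",
]

def retrieve(question, k=1):
    """Rank documents by crude word overlap with the question."""
    def overlap(doc):
        return len(set(question.lower().split()) & set(doc.lower().split()))
    return sorted(DATASTORE, key=overlap, reverse=True)[:k]

def build_prompt(question):
    # Assemble: instruction + retrieved content + question.
    context = "\n".join(retrieve(question))
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_prompt("Where is the Eiffel Tower located?")
```

The resulting prompt would then be sent to the LLM; updating the datastore changes the model's effective knowledge without any retraining.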
Retrieval + LLM
Prompting with: an instruction, the retrieved content, and the question.
Asai, Akari, et al. "Retrieval-based Language Models and Applications." ACL tutorial (2023).

Why retrieval + LLMs?
- LLMs can't memorize all (long-tail) knowledge in their parameters (e.g., a database such as https://dblp.org/).
- LLMs' knowledge is easily outdated and hard to update; the datastore, by contrast, can be easily updated and expanded, even without retraining.
- LLMs' output is challenging to interpret and verify.
- LLMs are shown to easily leak private training data.
- LLMs are *large* and expensive to train and run.
- Output that is challenging to interpret and verify: retrieval lets us provide references.
- Leaking private training data: danger - copyrighted text; use a private datastore with data access control.
- Large and expensive to train and run: can we train LLMs every week? Can we possibly reduce the training and inference costs, and scale down the size of LLMs? Use a datastore and retrieval.

Fine-tuning
- To carry out a task that isn't easy to define in a prompt.
- To help the LLM gain specific knowledge.
- To get a smaller model to perform a task.

Pretraining an LLM
Pretrain general-purpose LLMs by learning from internet text. For building a specific application, pretraining is the option of last resort; it could help if you have a highly specialized domain.

Choosing a model
Model size:
- ~1B parameters: GPT-Neo (1.3B), Pythia-1B, OPT-1.3B
- ~10B parameters: LLaMA-13B, Mistral, ...
- ~100B+ parameters: GPT-3.5, GPT-4

Closed or open source?
- Closed-source models: easy to use in applications; more large/powerful models; relatively inexpensive; some risk of vendor lock-in.
- Open-source models: full control over the model; can run on your own device; full control over data privacy/access.

How do chat systems learn to follow instructions? Fine-tuning.
Example: "Help me brainstorm some fun museums in London." - "Here are some suggestions ..."
Input-output training pairs (next-token targets):
- "Help me brainstorm some fun museums in London." - "Here"
- "Help me brainstorm some fun museums in London. Here" - "are"
- "Help me brainstorm some fun museums in London. Here are" - "some"
- "Help me brainstorm some fun museums in London. Here are some" - "suggestions"

Reinforcement learning from human feedback
Step 1: Train a reward (quality) model.
Step 2: Have the LLM generate a lot of answers, then further train it to generate more responses that get high scores.
Input - output (response) | (score/reward):
- "I'm happy to help. Here are some steps ..." | 5
- "Just try your best!" | 3
- "It is hopeless!" | 1

Tools
Tools for reasoning: decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction.
Schick, Timo, et al. "Toolformer: Language Models Can Teach Themselves to Use Tools." arXiv preprint arXiv:2302.04761 (2023).

Language Agents
Meng Fang, RL China 2023

Agents
Use an LLM to choose and carry out complex sequences of actions. This is a cutting-edge area of AI research.
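The expanding next-token training pairs shown in the museums example can be constructed mechanically from a (prompt, response) pair. The helper below is a word-level sketch; a real pipeline would use the model's tokenizer rather than whitespace splitting.

```python
def next_token_pairs(prompt, response):
    """Turn one (prompt, response) pair into next-token training examples:
    each prefix of the response becomes context, the next word the target."""
    tokens = response.split()  # word-level stand-in for a real tokenizer
    pairs = []
    for i, target in enumerate(tokens):
        context = prompt + " " + " ".join(tokens[:i]) if i else prompt
        pairs.append((context, target))
    return pairs

pairs = next_token_pairs(
    "Help me brainstorm some fun museums in London.",
    "Here are some suggestions",
)
# -> [("Help me brainstorm some fun museums in London.", "Here"),
#     ("Help me brainstorm some fun museums in London. Here", "are"), ...]
```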
Example: help me research BurgerKing's competitors.
1. Search top competitors.
2. Visit the website of each competitor.
3. For each competitor, write a summary based on homepage content.
Actions: SEARCH("BurgerKing competitors"); VISIT(https:/ ...); summarize the following text: "At BurgerKing, we pride ourselves on customizing their meals ..."

Agent system
- Planning
- Decomposition
- Reasoning
- Tool use: the agent learns to call external APIs for extra information that is missing from the model weights (often hard to change after pre-training).

Planning
Huang, Wenlong, et al. "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents" (ICML 2022).
Decompose high-level tasks into sensible mid-level action plans, where each step is an admissible action. Large language models such as GPT-3 and Codex can plan actions for embodied agents, even without any additional training.
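The competitor-research task above can be sketched as an agent loop over tools. All tool implementations and returned data below are invented stubs; in a real agent the LLM itself would decide which tool to call next and the tools would hit live search and web APIs.

```python
def search(query):
    # Stub search tool returning invented competitor names.
    return ["McDonald's", "Wendy's"]

def visit(name):
    # Stub browsing tool returning an invented homepage snippet.
    return f"Welcome to {name}, a fast food chain."

def summarize(text):
    # Stand-in for an LLM summarization call.
    return text.split(",")[0]

def research_competitors(company):
    plan = search(f"{company} competitors")          # step 1: find competitors
    summaries = {}
    for competitor in plan:                          # step 2: visit each site
        homepage = visit(competitor)
        summaries[competitor] = summarize(homepage)  # step 3: summarize
    return summaries

result = research_competitors("BurgerKing")
```

The fixed plan-then-execute order here is the simplest case; ReAct-style agents instead interleave a reasoning step before each tool call, letting the observation from one tool shape the next action.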
Decomposition
Zhou, Denny, et al. "Least-to-Most Prompting Enables Complex Reasoning in Large Language Models" (ICLR 2023).

Planning and reasoning
Synthesize programs to compose various tools and execute them sequentially to get final answers.
Lu, Pan, et al. "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models" (NeurIPS 2023).
Planning, reasoning, and tool-use examples.

Concerns about LLMs
Understanding biases and fairness in LLMs:
- Stereotypical bias: LLMs have the potential to generate text that upholds prevailing stereotypes regarding specific communities, thereby sustaining societal prejudices.
- Gender bias: bias related to gender can result in an imbalanced portrayal and differential treatment of genders within the text produced.
- Cultural bias: prejudice originating from cultural presumptions can lead to misinterpretations or inaccurate depictions of diverse cultural settings.
- Political bias: LLMs may display a tendency to show partiality toward specific political beliefs, impacting the impartiality of how information is conveyed.
Example: "___ is a CEO" = man? woman? Tom? Kate?

Techniques for bias mitigation
- Pre-processing mitigation: operate on the training data.
- In-training mitigation: e.g., RLHF.
- Intra-processing mitigation.
- Post-processing mitigation, e.g., rewriting: rewriting strategies detect harmful words and replace them with more positive or representative terms, using a rule- or neural-based rewriting algorithm.

Other concerns
- Recognizing distinct social groups: research should meticulously investigate diverse sources of bias, discern variations in mechanisms among social groups, and devise evaluation and mitigation strategies specific to historical and structural influences, avoiding the simplistic approach of erasing social group identities as a sufficient debiasing strategy.
- Understanding mechanisms of bias within LLMs: research into how and in which components LLMs encode bias, and in what ways bias mitigations affect these, remains an understudied problem.
We still have a lot of work to do.

Summary
- Introduction to generative AI and LLMs
- Prompt engineering
- Expanding knowledge for LLMs
- Language agents and fairness
Thank you. Q&A.

References
- Generative AI for Everyone. Coursera, 2023.
- Retrieval-based Language Models and Applications. ACL tutorial, 2023.
- Some technique blogs.