上海品茶

您的当前位置:上海品茶 > 报告分类 > PDF报告下载

邓小铁_watermark.pdf

编号:155527 PDF 41页 1.18MB 下载积分:VIP专享
下载报告请您先登录!

邓小铁_watermark.pdf

1、PreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryProvably Bound of Nash EquilibriumApproximatorXiaotie Deng Peking University2023.11.24based onZhaohua Chen,Xiaotie Deng,Wenhan Huang,Hanyu Li,Yuhao Li:On tightness of Tsaknakis-Spirakisdescent methods

2、 for approximate Nash equilibria.Inf.Comput.293:105046(2023)Zhijian Duan,Wenhan Huang,Dinghuai Zhang,Yali Du,Jun Wang,Yaodong Yang,Xiaotie Deng:Is NashEquilibrium Approximator Learnable?AAMAS 2023:233-241Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash E

3、quilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummary1Preliminaries2Nash Equilibrium Approximator for One GameLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm3Nash Equilibrium Function ApproximatorNash Equilibrium Function Approxima

4、torApplicationGeneralization Bound4SummaryXiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryFinite Normal-Form GamesIn a finite normal-form game,there are several players.Ea

5、ch player has a set of pure strategies.Their joint choices,called their strategy profiles,determine a payoff function for each player.At a play,each player chooses a mixed(i.e.,randomized)strategy,simultaneously and without communication.The goal of each player is to maximize its expected payoff.Xia

6、otie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryNash Equilibrium(NE)Nash equilibrium:a strategy profile that every player cangain no extra payoff by unilterally deviating fro

7、m it.Example:Paper-Scissors-Rock.Payoff functions:PSRP0,0-1,11,-1S1,-10,0-1,1R-1,11,-10,0Unique Nash equilibrium:both players play(P,S,R)withprobability(1/3,1/3,1/3)T.Theorem(Nash,1950)Every finite normal-form game has at least one Nash equilibrium.Xiaotie Deng Peking UniversityProvably Bound of Nas

8、h Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryApproximate Nash EquilibriumIn practice,we may not be able to reach a Nash equilibrium.Nash equilibrium requires players to be perfectly rational.Infinite computation power,t

9、ime,etc.We may want to find a strategy profile that is close to a Nashequilibrium.approximate Nash equilibrium:a strategy profile thatevery player can gain at most extra payoff by unilterallydeviating from it.We term as the approximation.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibri

10、um ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm1Preliminaries2Nash Equilibrium Approximator for One GameLower Bound and Upper Bound ResultsTS Al

11、gorithmTight InstancesTightness of DFM Algorithm3Nash Equilibrium Function ApproximatorNash Equilibrium Function ApproximatorApplicationGeneralization Bound4SummaryXiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash

12、 Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmReference1Zhaohua Chen,Xiaotie Deng,Wenhan Huang,Hanyu Li,andYuhao Li.On tightness of the Tsaknakis-Spirakis algorithmfor approximate Nash equilibrium.SAGT 2021.2Zhaohua C

13、hen,Xiaotie Deng,Wenhan Huang,Hanyu Li,andYuhao Li.On tightness of Tsaknakis-Spirakis descentmethods for approximate Nash equilibria.Information andComputation 293(2023):105046.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for

14、 One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm1Preliminaries2Nash Equilibrium Approximator for One GameLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm3Nash Equilib

15、rium Function ApproximatorNash Equilibrium Function ApproximatorApplicationGeneralization Bound4SummaryXiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Uppe

16、r Bound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmLower Bound ResultsComputing(approximate)Nash equilibrium is a well-studiedtopic in complexity theory.Computing 1/poly(n)-approximate Nash equilibrium isPPAD-complete for k-player games when k 2 DGP06,CD06.For small but constant,com

17、puting-approximate Nashequilibrium requires quasi-polynomial time for 2-player gamesunder plausible complexity assumptions(ETH for PPAD)Rubinstein,2016.Computing approximate Nash equilibrium is hard evenfor 2-player games and small but constantapproximations.Xiaotie Deng Peking UniversityProvably Bo

18、und of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmUpper Bound ResultsThe other direction:find the minimum (after normalization

19、)admitting a polynomial-time algorithm.Almost all known algorithms only work for 2-player games,and follow the same framework:Search-and-mix methods:Search for several strategies x1,xsof player 1 andy1,ytof player 2 in polynomial time.Make certain convex combinations on these strategies,obtaining st

20、rategy profiles(x1,y1),.,(xu,yu).Calculate i=argmin1iuapprox(xi,yi);output(xi,yi).Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS Alg

21、orithmTight InstancesTightness of DFM AlgorithmUpper Bound ResultsSummary of polynomial-time algorithms computing-NE withconstant.Author initialsPublish yearApproximationKPS20060.75DMP20060.5DMP20070.38+BBM20070.36TS20070.3393+CDFFJS20150.38DFM20221/3+Xiaotie Deng Peking UniversityProvably Bound of

22、Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm1Preliminaries2Nash Equilibrium Approximator for One GameLower Bound and Upper Boun

23、d ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm3Nash Equilibrium Function ApproximatorNash Equilibrium Function ApproximatorApplicationGeneralization Bound4SummaryXiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator fo

24、r One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmTS AlgorithmThe previous state-of-the-art approximation,0.3393+,isattained by the TS algorithm.In WINE 2007,H.Tsaknakis and P.G.Spirakis proposed it.An optim

25、ization approach-descent procedure.The current state-of-the-art approximation is 1/3+DFM22,obtained by elaborately modifying the TS algorithm.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function

26、ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmTS AlgorithmTS algorithm is hard to analyze and improve.0.3393 was only known as the upper bound of the algorithm.Experiment studies show that TS algorithm perform well inpractice.The observed

27、worst approximation was 0.3385 FIS2015.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algor

28、ithmMain ContributionsWe give a comprehensive analysis and further understandings forthe worst cases of the algorithm with the following results:0.3393 is the lower bound for TS algorithm,even when wearbitrarily modify its mixing phase(called generalized TSalgorithms).Characterize all tight instance

29、s for generalized TS algorithmsas LP feasible solutions.1/3 is the lower bound of the DFM algorithm.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper B

30、ound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmTwo-Player Normal-Form GamesThe two players are called row player and column player.They have m and n pure strategies,respectively.Their payoff functions can be described by m n matrices,R and C,respectively.Their mixed strategies are

31、x,y,respectively.Let k=(x1,.,xk)Rk:Pixi=1,xi 0.x m,y n.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightn

32、ess of DFM AlgorithmApproximate Nash EquilibriumNotice that all Nash equilibria will not change if we scale and shiftpayoff functions uniformly.So w.l.o.g.we assume Rij,Cij 0,1.A strategy profile(x,y)is called an approximate Nashequilibrium if every player can gain at most extra payoff bydeviating f

33、rom the present strategy,i.e.,fR(x,y):=max(Ry)xTRy .fC(x,y):=max(CTx)xTCy .Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmT

34、ight InstancesTightness of DFM AlgorithmLoss Function Minimizationf(x,y):=maxfR(x,y),fC(x,y).Clearly,f(x,y)is the approximation of(x,y).Optimization formalization:minimizef(x,y)subject tox m,y n.Now we can use optimization algorithms to solve this problem.Xiaotie Deng Peking UniversityProvably Bound

35、 of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM AlgorithmTS AlgorithmSearch:Use a linear-programming-based descent method to find astati

36、onary point(x,y),together with a dual solution,w,z.Mix:The adjusted strategy pair is(x,y):=?11+w+1+x,z?,?w,11+z+1+y?,0(r2lnN,1(H,r)m+2r)+4r2ln(4/)mLD(h):=EuDE(h(u),u)is the expected approximation of h.LS(h):=1|S|PuSE(h(u),u)is the empirical averageapproximation of h over dataset S.N,1(H,r)is the cov

37、ering number of H that characterizes itscapacity.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummaryNash Equilibrium Function ApproximatorApplicationGeneralization BoundGener

38、alization BoundFor a hypothesis class H of NE function approximators and a gamedistribution D,with probability at least 1 over the draw ofdataset S from D,for all h H,the following generalizationbound holds:LD(h)LS(h)2 infr0(r2lnN,1(H,r)m+2r)+4r2ln(4/)mLD(h):=EuDE(h(u),u)is the expected approximatio

39、n of h.LS(h):=1|S|PuSE(h(u),u)is the empirical averageapproximation of h over dataset S.N,1(H,r)is the covering number of H that characterizes itscapacity.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibr

40、ium Function ApproximatorSummary1Preliminaries2Nash Equilibrium Approximator for One GameLower Bound and Upper Bound ResultsTS AlgorithmTight InstancesTightness of DFM Algorithm3Nash Equilibrium Function ApproximatorNash Equilibrium Function ApproximatorApplicationGeneralization Bound4SummaryXiaotie

41、 Deng Peking UniversityProvably Bound of Nash Equilibrium ApproximatorPreliminariesNash Equilibrium Approximator for One GameNash Equilibrium Function ApproximatorSummarySummaryPolynomial Time Algorithm Design for a Bi-matrix Game isDifficult in General.Any known algorithm with a provable approximation bound is1/3 up to now.Warmstarting approximation algorithm performs well in termsof sampling complexity.Xiaotie Deng Peking UniversityProvably Bound of Nash Equilibrium Approximator

友情提示

1、下载报告失败解决办法
2、PDF文件下载后,可能会被浏览器默认打开,此种情况可以点击浏览器菜单,保存网页到桌面,就可以正常下载了。
3、本站不支持迅雷下载,请使用电脑自带的IE浏览器,或者360浏览器、谷歌浏览器下载即可。
4、本站报告下载后的文档和图纸-无水印,预览文档经过压缩,下载后原文更清晰。

本文(邓小铁_watermark.pdf)为本站 (张5G) 主动上传,三个皮匠报告文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。 若此文所含内容侵犯了您的版权或隐私,请立即通知三个皮匠报告文库(点击联系客服),我们立即给予删除!

温馨提示:如果因为网速或其他原因下载失败请重新下载,重复下载不扣分。
会员购买
客服

专属顾问

商务合作

机构入驻、侵权投诉、商务合作

服务号

三个皮匠报告官方公众号

回到顶部