TextBugger: Generating Adversarial Text Against Real-world Applications
Jinfeng Li, Shouling Ji, Tianyu Du, Bo Li, Ting Wang
Lehigh University, Zhejiang University, Berkeley. INFORSEC 2020 (2020/8/21)

Machine Learning for Natural Language Processing
Machine learning is used for multiple NLP tasks, including sentiment analysis, information extraction, information retrieval, machine translation, and question answering.

Machine Learning as a Service (MLaaS) for NLP
Many cloud platforms expose NLP models as services: Microsoft Azure, Google Cloud Platform, Amazon AWS, IBM Watson, Facebook fastText, ParallelDots, Google Perspective, TheySay, Aylien, and Mashape.

Breaking Things Is Easy
Recent works have revealed the vulnerabilities of DNNs in the image and speech domains:
- DNNs for image classification are vulnerable to adversarial images (Goodfellow et al., ICLR '15).
- Automatic speech recognition systems can be broken by adversarial audio in the physical world, e.g., audio transcribed as "open the door" (Yuan et al., USENIX Security '18).
This raises two questions:
- Do adversarial examples also exist in the text domain?
- Is MLaaS for NLP also vulnerable to adversarial examples?

Preliminaries

Adversarial Text
What is adversarial text? Text carefully generated by adding small perturbations to a legitimate input so that the classifier's prediction changes.
Task: sentiment analysis. Classifier: Amazon AWS. Original label: 100% Negative. Adversarial label: 89% Positive (perturbed words shown as original → bug):
"I watched this movie recently, mainly because I am a huge fan of Jodie Foster's. I saw this movie was made right … I thought the movie was terrible → terrib1e and I'm still left wondering how she was ever persuaded to make this movie. The Script is really weak."

What are the challenges in generating adversarial texts?
- The discrete nature of text makes it hard to optimize.
- Small perturbations in text are usually clearly perceptible.
- Replacing a single word may drastically alter the semantics of the sentence.
Related Works for Generating Adversarial Texts
Gradient-based methods:
- Modifying an input text repeatedly until it is misclassified (Papernot et al., MILCOM '16).
- Changing one token to another via gradient-based optimization (Ebrahimi et al., NAACL '18).
- Perturbing important words, determined by embedding gradients, with hand-crafted synonyms (Samanta et al., arXiv '17).
Out-of-vocabulary words:
- Breaking machine learning systems with random character manipulations (Belinkov et al., ICLR '18).
- Attacking black-box models by applying random character perturbations (Gao et al., SPW '18).
- Changing the toxicity score of texts by adding spaces or dots between characters (Hosseini et al., arXiv '17).
Replacement with semantically/syntactically similar words:
- Replacing words only with semantically similar ones (Alzantot et al., arXiv '18).
- Replacing tokens with random words of the same POS tag, with probability proportional to embedding similarity (Ribeiro et al., ACL '18).
Other methods:
- Attacking reading-comprehension systems by adding distracting sentences to the input document (Jia et al., EMNLP '17).
- Generating adversarial sequences with Generative Adversarial Networks (GANs) (Zhao et al., ICLR '18).

Limitations
These works are limited in practice due to at least one of the following reasons:
- Limited to short texts.
- Significantly affect the original meaning.
- Need hand-crafted synonyms and typos.
- Require manual intervention to polish the added sentences.
- Not computationally efficient.

TextBugger

Framework of TextBugger
Given a text classifier, either an online platform or an offline model, TextBugger runs in one of two modes: a white-box attack that uses gradient information from word embeddings, and a black-box attack that queries the model and uses the returned confidence values as feedback to craft the adversarial text.

Threat Model
- White-box: complete knowledge of the targeted model.
- Black-box: no knowledge of the model architecture, parameters, or training data; only capable of querying the targeted model and observing the predicted label or confidence scores (e.g., a sentiment-analysis API or an abusive-content classifier).
Step 1: Finding Important Words (White-box)
Find important words via gradient information:

  C_{x_i} = J_F(i, y) = ∂F_y(x) / ∂x_i

Notation: x is the input text; x_i is the i-th word in x; F_j(x) is the confidence value of the j-th class; C_{x_i} is the importance of word x_i; N is the total number of words in x; K is the total number of classes.

Step 1: Finding Important Words (Black-box)
- Find important sentences: C_sentence(s_i) = F_y(s_i), i.e., each sentence's confidence for the predicted class y.
- Sort the sentences according to C_sentence into S_ordered, filtering out sentences whose predicted label differs from y.
- Find the important words in each sentence of S_ordered by measuring the confidence drop when a word is removed:

  C_{w_j} = F_y(w_1, ..., w_j, ..., w_m) - F_y(w_1, ..., w_{j-1}, w_{j+1}, ..., w_m)

Example sentence: "It is so laddish and juvenile, only teenage boys could possibly find it funny."
Notation: s_i is the i-th sentence in the input text x; F_y(s_i) is s_i's confidence value for the predicted class y; S_ordered is the set of important sentences; C_sentence(s_i) is the importance of sentence s_i; C_{w_j} is the importance of the j-th word in s_i.
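In the black-box setting, word importance reduces to confidence deltas obtained by querying the model. A minimal sketch of the word-scoring step, assuming a hypothetical `predict_confidence` callable that queries the target model and returns the confidence of the originally predicted class (the lexicon-based toy model below is an illustration, not the paper's setup):

```python
def word_importance(words, predict_confidence):
    """Importance of word j = confidence drop when it is removed:
    C_wj = F_y(w1..wj..wm) - F_y(w1..w_{j-1}, w_{j+1}..wm)."""
    base = predict_confidence(" ".join(words))
    return [base - predict_confidence(" ".join(words[:j] + words[j + 1:]))
            for j in range(len(words))]

# Toy stand-in for the target model: confidence that a snippet is
# negative, driven by a small sentiment lexicon (illustration only).
NEGATIVE = {"terrible", "weak", "awful"}

def toy_confidence(text):
    ws = text.lower().split()
    return 0.5 + 0.5 * sum(w in NEGATIVE for w in ws) / max(len(ws), 1)

words = "the script is really weak".split()
scores = word_importance(words, toy_confidence)
print(words[scores.index(max(scores))])  # -> weak
```

Deleting "weak" is the only ablation that lowers the toy model's negative confidence, so it receives the highest importance score; against a real API, each score costs one extra query.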
Step 2: Bug Generation
Character-level perturbations (exploiting the out-of-vocabulary phenomenon):
- Insert: insert a space into the word.
- Delete: delete a random character of the word.
- Swap: swap two random adjacent letters in the word.
- Substitute-C (Sub-C): replace characters with visually similar characters (e.g., "l" → "1", "o" → "0") or with adjacent characters on the keyboard.
Word-level perturbation (nearest-neighbor search in the embedding space):
- Substitute-W (Sub-W): replace a word with one of its top-k nearest neighbors in a context-aware word vector space.

Original   Insert    Delete  Swap     Sub-C    Sub-W
foolish    f oolish  folish  fooilsh  foOlish  silly
awfully    awfull y  awfuly  awfluly  awfu1ly  terribly
cliches    clich es  clichs  clcihes  c1iches  cliche
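The five bug generators and the confidence-driven bug selection can be sketched together as follows (a minimal sketch: the Sub-W lookup is stubbed with a hand-made `synonyms` map rather than a real embedding space, and `toy_confidence` is a hypothetical stand-in for the target model):

```python
import random

def gen_bugs(word, synonyms):
    """Generate candidate bugs: Insert, Delete, Swap, Sub-C, Sub-W."""
    bugs, n = [], len(word)
    if n > 2:
        mid = random.randrange(1, n)
        bugs.append(word[:mid] + " " + word[mid:])            # Insert a space
        i = random.randrange(1, n - 1)
        bugs.append(word[:i] + word[i + 1:])                  # Delete a character
    if n > 3:
        j = random.randrange(1, n - 2)
        bugs.append(word[:j] + word[j + 1] + word[j] + word[j + 2:])  # Swap adjacent
    visual = {"o": "0", "l": "1", "i": "1", "a": "@", "e": "3"}
    for i, c in enumerate(word):                              # Sub-C: similar-looking char
        if c in visual:
            bugs.append(word[:i] + visual[c] + word[i + 1:])
            break
    bugs.extend(synonyms.get(word, []))                       # Sub-W: neighbors (stubbed)
    return bugs

def best_bug(words, idx, predict_confidence, synonyms):
    """candidate(k) = x with word w replaced by bug b_k;
    score(k) = F_y(x) - F_y(candidate(k)); keep the top scorer."""
    base = predict_confidence(" ".join(words))
    scored = [(base - predict_confidence(" ".join(words[:idx] + [b] + words[idx + 1:])), b)
              for b in gen_bugs(words[idx], synonyms)]
    return max(scored)

random.seed(0)  # deterministic bug positions for the example
NEGATIVE = {"terrible", "weak"}

def toy_confidence(text):
    ws = text.lower().split()
    return 0.5 + 0.5 * sum(w in NEGATIVE for w in ws) / max(len(ws), 1)

drop, bug = best_bug("the movie was terrible".split(), 3, toy_confidence,
                     {"terrible": ["bad"]})
print(drop, bug)
```

Every bug pushes "terrible" out of the toy model's vocabulary, so all candidates score the same positive drop here; against a real model the scores differ and the greedy choice matters.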
Step 3: Replacing Important Words with Generated Bugs
Optimal bug selection: choose the optimal bug according to the change of the confidence value:

  candidate(k) = x with word w replaced by bug b_k
  score(k) = F_y(x) - F_y(candidate(k))

Important word replacement: replace the important word with the highest-scoring bug. Repeat until "convergence":
- the semantic similarity drops below the threshold, or
- the new text is misclassified by the classifier.

Attack Evaluation

Case Study
Case studies: sentiment analysis and toxic content detection.

Attack Evaluation: Sentiment Analysis
Datasets:
- IMDB: 50,000 positive and negative movie reviews.
- Rotten Tomatoes Movie Reviews (MR): 5,331 positive and 5,331 negative snippets.
Targeted models:
- White-box models: LR, CNN, LSTM.
- Real-world online platforms: Microsoft Azure, Google Cloud Platform, Amazon AWS, IBM Watson, Facebook fastText, TheySay, Aylien, ParallelDots.
Baseline algorithms:
- White-box: Random, FGSM+NNS (Nearest Neighbor Search), DeepFool+NNS.
- Black-box: DeepWordBug.

Evaluation Metrics
- Edit distance: the minimum number of single-character edits needed to transform one text into the other.
- Jaccard similarity coefficient: J(A, B) = |A ∩ B| / |A ∪ B|.
- Euclidean distance: d(p, q) = sqrt((p_1 - q_1)^2 + (p_2 - q_2)^2 + ... + (p_n - q_n)^2).
- Semantic similarity: cosine similarity, S(p, q) = p · q / (‖p‖ ‖q‖).

Important Words Selected by TextBugger
[Word cloud of the words most often selected as important in movie reviews, e.g. bad, worst, awful, terrible, horrible, stupid, waste, poor, nothing, supposed, budget, director.]
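The four utility metrics above can be sketched as follows (a minimal sketch: the paper computes semantic similarity on sentence embeddings, while here cosine similarity is applied to plain bag-of-words count vectors as a stand-in):

```python
from collections import Counter
import math

def edit_distance(a, b):
    """Levenshtein distance via a single-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

def jaccard(a_tokens, b_tokens):
    a, b = set(a_tokens), set(b_tokens)
    return len(a & b) / len(a | b)

def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def cosine(p, q):
    dot = sum(pi * qi for pi, qi in zip(p, q))
    return dot / (math.sqrt(sum(x * x for x in p)) *
                  math.sqrt(sum(x * x for x in q)))

def bow_vectors(a_tokens, b_tokens):
    """Bag-of-words count vectors over the joint vocabulary
    (a stand-in for real sentence embeddings)."""
    vocab = sorted(set(a_tokens) | set(b_tokens))
    ca, cb = Counter(a_tokens), Counter(b_tokens)
    return [ca[w] for w in vocab], [cb[w] for w in vocab]

orig = "the script is really weak".split()
adv = "the script is really wea k".split()
p, q = bow_vectors(orig, adv)
print(edit_distance(" ".join(orig), " ".join(adv)),   # 1: one inserted space
      round(jaccard(orig, adv), 2),                   # 0.57
      round(cosine(p, q), 2))                         # 0.73
```

A single Insert bug costs edit distance 1 yet already moves the token-set and vector metrics noticeably, which is why the attack also checks semantic similarity before accepting a perturbation.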
Generated Adversarial Texts: Successful Attack Examples

Task: sentiment analysis. Classifier: CNN. Original label: 99.8% Negative. Adversarial label: 81.0% Positive.
Text: "I love these awful 80s summer camp movies. The best part about Party Camp is the fact that it literally → literaly has no plot. The cliches → clichs here are limitless: the nerds vs. the jocks, the secret camera in the girls' locker room, the hikers happening upon a nudist colony, the contest at the conclusion, the secretly horny camp administrators, and the embarrassingly → embarrassingty foolish sexual innuendo littered throughout. This movie will make you laugh, but never intentionally. I repeat, never."

Task: sentiment analysis. Classifier: Amazon AWS. Original label: 100% Negative. Adversarial label: 89% Positive.
Text: "I watched this movie recently, mainly because I am a huge fan of Jodie Foster's. I saw this movie was made right … I thought the movie was terrible → terrib1e and I'm still left wondering how she was ever persuaded to make this movie. The Script is really weak."

Attack Performance: Effectiveness and Efficiency
White-box Attack
TABLE II. Results of the white-box attacks on the IMDB and MR datasets (SR = attack success rate; PW = ratio of perturbed words):

Model  Dataset  Accuracy  Random SR/PW  FGSM+NNS SR/PW  DeepFool+NNS SR/PW  TextBugger SR/PW
LR     MR       73.7%     2.1% / 10%    32.4% / 4.3%    35.2% / 4.9%        92.7% / 6.1%
LR     IMDB     82.1%     2.7% / 10%    41.1% / 8.7%    30.0% / 5.8%        95.2% / 4.9%
CNN    MR       78.1%     1.5% / 10%    25.7% / 7.5%    28.5% / 5.4%        85.1% / 9.8%
CNN    IMDB     89.4%     1.3% / 10%    36.2% / 10.6%   23.9% / 2.7%        90.5% / 4.2%
LSTM   MR       80.1%     1.8% / 10%    25.0% / 6.6%    24.4% / 11.3%       80.2% / 10.2%
LSTM   IMDB     90.7%     0.8% / 10%    31.5% / 9.0%    26.3% / 3.6%        86.7% / 6.9%

Remarks:
- Choosing important words to modify is necessary: the Random baseline rarely succeeds.
- Effective: TextBugger has a high attack success rate on all models and performs better than the baselines.
- Evasive: TextBugger perturbs only a few words to fool the models.

Black-box Attack
TABLE III. Results of the black-box attack on IMDB, comparing DeepWordBug and TextBugger on original accuracy, success rate, ratio of perturbed words, and time (s) against ten platforms: Google Cloud NLP, IBM Watson, Microsoft Azure, Amazon AWS, Facebook fastText, ParallelDots, TheySay, Aylien Sentiment, TextProcessing, and Mashape Sentiment. [Per-cell values not recoverable.]
Remarks:
- Effective: TextBugger has a higher attack success rate against all online platforms than DeepWordBug.
- Evasive: TextBugger perturbs fewer words than DeepWordBug.
- Efficient: TextBugger spends less time than DeepWordBug.

Attack Performance: Change of Confidence
[Figure: distributions of sentiment scores for original vs. perturbed IMDB texts on Google, Watson, AWS, Azure, and fastText]
Remarks:
- TextBugger greatly changes the confidence value of the classification results.
- IBM Watson is more sensitive to the adversarial texts generated by TextBugger.

Utility Analysis: White-box Attack
[Figure: CDFs of edit distance, Jaccard coefficient, Euclidean distance, and semantic similarity for LR, CNN, and LSTM on IMDB]
Remarks: the generated adversarial texts preserve good word-level and vector-level utility.

Utility Analysis: Black-box Attack
[Figure: CDFs of the same four metrics for TextBugger vs. DeepWordBug on IMDB]
Remarks: TextBugger generates higher-quality adversarial texts than DeepWordBug.

The Impact of Document Length on Attack Performance
[Figure: attack success rate, sentiment score, and time vs. document length (25 to 200 words) for Google Cloud NLP, Microsoft Azure, and IBM Watson]
Remarks:
- Length has little impact on the success rate, but may decrease the change of the negative class's confidence value.
- The time required for generating one adversarial text increases slightly
as the length grows.

The Impact of Document Length on the Utility of Generated Adversarial Texts
[Figure: number of perturbed words and semantic similarity vs. document length (25 to 200 words) for Google Cloud NLP, Microsoft Azure, and IBM Watson]
Remarks:
- A longer document leads to more perturbed words.
- The increasing number of perturbed words does not decrease the semantic similarity of the adversarial texts.

Bug Distribution
[Figure: proportion of Insert, Delete, Swap, Sub-C, and Sub-W bugs used against AWS, fastText, Google, Watson, and Azure]
Remarks:
- Azure and AWS are sensitive to the Insert bug.
- Watson and fastText are sensitive to Sub-C.
- Delete and Sub-W are used less than the others.

Further Analysis: Transferability and User Study

Transferability
TABLE VII. Transferability on the IMDB and MR datasets: adversarial texts generated against each white-box model (LR, CNN, LSTM) are tested against the other white-box models and the black-box APIs (IBM, Azure, Google, fastText, AWS). Attacking a model with its own adversarial texts succeeds on roughly 80% to 95% of inputs, while transferred attacks typically succeed on about 15% to 30%. [Full per-cell values not recoverable.]
Remarks:
- Transferability also exists in adversarial texts, across models and online platforms.
- Transferability can be used to attack online platforms even when they have call limits.

Vulnerability Report
[Screenshots of vulnerability reports to the affected platforms, e.g., AWS and IBM Cloud]
Summary
We proposed TextBugger, a framework for generating adversarial texts effectively and efficiently:
- Effective: it outperforms state-of-the-art attacks in terms of attack success rate under both white-box and black-box settings.
- Evasive: it preserves the utility of benign text.
- Efficient: it generates adversarial text with computational complexity sub-linear in the text length.
We evaluated TextBugger on 15 real-world online applications:
- Datasets: IMDB, MR, and Kaggle.
- Applications: sentiment analysis and toxic content detection.
- Utility-preserving: TextBugger has little impact on human understanding.
We further discussed two potential defense strategies against such attacks.

Q&A