Explainability of Graph-based Image Classification
Jindong Gu, University of Munich

Content:
1. Motivation
2. Graph Neural Networks
3. Graph Capsule Networks
4. A Graph-based View of Vision Transformer
5. Conclusion

1. Motivation

Image Classification with Convolutional Neural Networks (ResNet)
He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.

Convolutional Neural Networks:
Advantage: powerful at capturing correlations and abstract concepts from image pixels.
Disadvantage: difficult to capture pairwise relations, global context, and attribute features.

2. Graph Neural Networks

Kipf, T.N. and Welling, M. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.
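The graph convolutional layer of Kipf and Welling referenced above can be sketched in a few lines. This is an illustrative NumPy implementation of the propagation rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W); the function and variable names are ours, not from the slides:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step (Kipf & Welling, 2016).
    A: (n, n) adjacency matrix, H: (n, d_in) node features,
    W: (d_in, d_out) learnable weight matrix."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    deg = A_hat.sum(axis=1)                   # node degrees of A_hat
    D_inv_sqrt = np.diag(deg ** -0.5)         # D^(-1/2)
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)    # aggregate, transform, ReLU

# Toy graph: 3 nodes on a path 0-1-2, e.g. three superpixels of an image
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.random.default_rng(0).normal(size=(3, 4))  # 4-dim node features
W = np.random.default_rng(1).normal(size=(4, 2))  # project to 2 dims
H_next = gcn_layer(A, H, W)
print(H_next.shape)  # (3, 2)
```

For image classification, the nodes would typically be superpixels and the edges their spatial adjacency, as in the superpixel-based works cited in this section.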
Monti, Federico, et al. Geometric deep learning on graphs and manifolds using mixture model CNNs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
Dwivedi, Vijay Prakash, et al. Benchmarking graph neural networks. arXiv preprint arXiv:2003.00982, 2020.
Benchmarked architectures: GCN / GAT / Gated GCN / MoNet / Spline CNN -> Predictions.
Knyazev, B., Lin, X., Amer, M.R. and Taylor, G.W. Image classification with hierarchical multigraph networks. arXiv preprint arXiv:1907.09000, 2019.
Vasudevan, Varun, et al. Image classification using graph neural network and multiscale wavelet superpixels. arXiv preprint arXiv:2201.12633, 2022.

3. Graph Capsule Networks

Capsule Network pipeline: Primary Capsules -> Voting (with transformation matrices Wi, Wj) -> Routing -> Output Capsules (Ci, Cj) -> Masking -> Reconstruction.
Sabour, S., Frosst, N. and Hinton, G.E. Dynamic routing between capsules. NeurIPS, 2017.
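The routing step in the pipeline above is the routing-by-agreement procedure of Sabour et al. A minimal NumPy sketch (the shapes and names here are illustrative, not from the slides):

```python
import numpy as np

def squash(s, eps=1e-8):
    """Capsule non-linearity: preserves direction, maps length into [0, 1)."""
    norm2 = (s ** 2).sum(-1, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

def dynamic_routing(u_hat, iters=3):
    """Routing-by-agreement (Sabour et al., 2017).
    u_hat: votes of shape (n_in, n_out, d) -- the prediction of each
    primary capsule i for each output capsule j (u_hat_ij = W_ij u_i)."""
    n_in, n_out, d = u_hat.shape
    b = np.zeros((n_in, n_out))                          # routing logits
    for _ in range(iters):
        c = np.exp(b) / np.exp(b).sum(1, keepdims=True)  # coupling coefficients
        s = (c[..., None] * u_hat).sum(0)                # weighted vote sum per output
        v = squash(s)                                    # output capsule vectors
        b = b + (u_hat * v[None]).sum(-1)                # reward agreeing votes
    return v

votes = np.random.default_rng(0).normal(size=(8, 3, 4))  # 8 primary, 3 output capsules
v = dynamic_routing(votes)
print(v.shape)  # (3, 4)
```

The iterative update concentrates each primary capsule's coupling on the output capsule its vote agrees with, which is the mechanism the graph capsule work below revisits.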
Graph Capsule Network (GraCapsNet): the primary capsules are treated as nodes of graphs, and attention over these graphs produces the class capsules; the attention weights over input capsules serve as built-in explanations of the classification.
Gu, J. Interpretable graph capsule networks for object recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35, No. 2, pp. 1469-1477, 2021.

[Figure: Graph attention in GraCapsNet-based image classification on MNIST]
[Figure: Graph attention in GraCapsNet-based image classification on CIFAR10]

4. A Graph-based View of Vision Transformer

Dosovitskiy, Alexey, et al. An image is worth 16x16 words: Transformers for image recognition at scale. ICLR, 2021.
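The attention maps visualized in the GraCapsNet figures come from attention-based pooling over the primary-capsule nodes. The following is only a loose sketch of that idea under our own simplified assumptions (single graph, single head, made-up weight names), not the paper's exact formulation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(nodes, W_att, W_out):
    """Pool primary-capsule nodes into one class capsule via attention.
    nodes: (n, d) capsule features; W_att: (d, 1) scoring weights;
    W_out: (d, d_out) output projection. The returned attention scores
    indicate which capsules (image regions) support the prediction."""
    scores = softmax((nodes @ W_att).squeeze(-1))  # (n,) attention over nodes
    pooled = (scores[:, None] * nodes).sum(0)      # attention-weighted sum
    return pooled @ W_out, scores

rng = np.random.default_rng(0)
nodes = rng.normal(size=(6, 4))      # 6 primary capsules, 4-dim each
W_att = rng.normal(size=(4, 1))
W_out = rng.normal(size=(4, 3))
class_caps, att = attention_pool(nodes, W_att, W_out)
print(att.shape)  # (6,) -- one interpretable weight per input capsule
```

Unlike iterative routing, the attention scores are computed in one pass and can be visualized directly, which is what the MNIST/CIFAR10 figures show.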
Self-Attention Operation
[Figure: https://graphdeeplearning.github.io/post/transformers-are-gnns/attention-block.jpg]
Graph Convolutional Operation
[Figure: https://graphdeeplearning.github.io/post/transformers-are-gnns/gnn-block.jpg]
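The correspondence in the two figures above can be made concrete: single-head self-attention is one round of message passing on a complete graph whose (input-dependent) adjacency is the attention matrix. A NumPy sketch, with our own variable names:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over n patch tokens.
    A[i, j] = softmax_j(q_i . k_j / sqrt(d)) acts as a dense soft
    adjacency matrix; each token aggregates the values of all others
    weighted by it -- a GNN layer whose graph is learned on the fly."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d))  # (n, n) attention = soft adjacency
    return A @ V, A                    # aggregate messages along all edges

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))            # 5 patch tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, A = self_attention(X, Wq, Wk, Wv)
print(A.sum(axis=1))  # each row sums to 1, like a normalized adjacency
```

In this view, a ViT treats image patches as graph nodes, which is why patch perturbations (next slides) can be read as node perturbations in a graph classification problem.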
Gu, Jindong, Volker Tresp, and Yao Qin. Are Vision Transformers Robust to Patch Perturbations? arXiv preprint arXiv:2111.10659, 2021.
[Figure: attacking or corrupting an image patch corresponds to perturbing a node in a graph classification problem]

Two kinds of patch perturbation are studied: Natural Patch Corruption [1] and Adversarial Patch Attack [2].
[1] Srinadh Bhojanapalli, Ayan Chakrabarti, Daniel Glasner, Daliang Li, Thomas Unterthiner, and Andreas Veit. Understanding robustness of transformers for image classification. arXiv preprint arXiv:2103.14586, 2021.
[2] Rulin Shao, Zhouxing Shi, Jinfeng Yi, Pin-Yu Chen, and Cho-Jui Hsieh. On the adversarial robustness of visual transformers. arXiv preprint arXiv:2103.15670, 2021.

How to create natural corruptions? Apply a common corruption [1] to the chosen patch.
How to generate an adversarial patch perturbation [2]?
Step 1: specify the patch position and replace its pixel values with randomly initialized noise;
Step 2: update the noise to minimise the probability of the ground-truth class.
[1] Dan Hendrycks and Thomas Dietterich. Benchmarking neural network robustness to common corruptions and perturbations. In ICLR, 2019.
[2] Karmon, Danny, Daniel Zoran, and Yoav Goldberg. LaVaN: Localized and visible adversarial noise. In International Conference on Machine Learning, PMLR, 2018.

How to evaluate model robustness? Evaluation metric: Fooling Rate (FR).
1) Collect N correctly classified images.
2) Perturb the images with natural corruptions or adversarial perturbations.
3) M of the N perturbed images are misclassified.
Fooling Rate: FR = M/N, reported in %.

Result: DeiT is more robust to naturally corrupted patches than ResNet (lower FR = more robust).
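The two-step patch attack and the FR metric above can be sketched end to end. This toy uses a linear softmax classifier standing in for DeiT/ResNet, with the cross-entropy gradient written out analytically; all names, sizes, and hyperparameters here are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def patch_attack(x, y, W, pos, size, steps=50, lr=0.5):
    """LaVaN-style patch attack sketch on a linear softmax classifier.
    Step 1: replace a size x size patch at `pos` with random noise.
    Step 2: gradient-ascend the noise to minimise p(ground-truth class)."""
    r, c = pos
    img = x.reshape(28, 28).copy()
    rng = np.random.default_rng(0)
    img[r:r+size, c:c+size] = rng.uniform(0, 1, (size, size))  # Step 1
    mask = np.zeros_like(img)
    mask[r:r+size, c:c+size] = 1.0                             # only patch is updated
    onehot = np.eye(W.shape[0])[y]
    for _ in range(steps):                                     # Step 2
        p = softmax(W @ img.ravel())
        g = W.T @ (p - onehot)            # d(-log p_y)/d(pixels) for a linear model
        img = np.clip(img + lr * g.reshape(28, 28) * mask, 0.0, 1.0)
    return img.ravel()

def fooling_rate(W, X, y, pos=(0, 0), size=8):
    """FR = M / N, reported in %."""
    preds = (X @ W.T).argmax(1)
    keep = preds == y                        # 1) keep correctly classified images
    N = int(keep.sum())
    M = 0
    for x, label in zip(X[keep], y[keep]):   # 2) perturb each of them
        x_adv = patch_attack(x, label, W, pos, size)
        M += int((W @ x_adv).argmax() != label)  # 3) count misclassifications
    return 100.0 * M / N

rng = np.random.default_rng(1)
W = 0.1 * rng.normal(size=(3, 784))   # 3-class linear "model" on 28x28 images
X = rng.uniform(0, 1, size=(5, 784))
y = (X @ W.T).argmax(1)               # use the model's predictions as labels, so N = 5
fr = fooling_rate(W, X, y)
```

The same FR computation applies unchanged to natural corruptions; only the perturbation function differs.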
Result: DeiT is more vulnerable than ResNet against adversarial patches (higher FR = more vulnerable).

Summary: DeiT is more robust than ResNet to naturally corrupted patches; however, DeiT is significantly more vulnerable than ResNet against adversarial patches. The attention mechanism in ViT can effectively ignore naturally corrupted patches, whereas it is forced to focus on adversarial patches.

Comparing ViT and ResNet with vanilla gradient visualization.
Gradient visualization: the adversarial patches on DeiT attract attention, while the ones on ResNet hardly do.

5. Conclusion

1. GNNs achieve unsatisfying performance when applied directly to graph representations of images, although they can provide explanations for image classifications.
2. It is effective to model visual concepts in feature spaces instead of the input space; the explanations can then be created in the feature spaces.
3. State-of-the-art vision models implicitly represent images as graphs; explanations can be created from the perspective of this graph representation.

Thank you for your attention! Q&A