ACI Troubleshooting: Expand your toolset with Nexus Dashboard Insights
Ivan Kovaevi
BRKDCN-2626
#CiscoLive
© 2023 Cisco and/or its affiliates. All rights reserved. Cisco Public

Cisco Webex App
Questions? Use the Cisco Webex App to chat with the speaker after the session.
How:
1. Find this session in the Cisco Live Mobile App
2. Click "Join the Discussion"
3. Install the Webex App or go directly to the Webex space
4. Enter messages/questions in the Webex space
Webex spaces will be moderated by the speaker until June 9, 2023.

Agenda
- ND and NDI Intro
- Scenario 1: BD enigma
- Scenario 2: Broken backup
- Scenario 3: VM affair
- Scenario 4: Resource trouble
- Instead of Conclusion

ND and NDI Intro

Cisco Nexus Dashboard: any apps, all site types
[Diagram labels: Cisco Nexus Dashboard; Cisco Nexus Dashboard Orchestrator; Cisco Nexus Dashboard Insights 6.0; Data broker; Third-party applications; ITSM; Splunk; ServiceNow; Colocation/edge]

Cisco Nexus Dashboard: A unified platform
The admin view and the operator view:
- Consume service(s) from a single place
- Frictionless navigation across multiple services and sites
- Customize views and workflows
- Consistent one-time onboarding of domains and services
- Consistent user management and access control
- Single dashboard for lifecycle management of services and ops infra

Cisco Nexus Dashboard platform under the hood
- Proactive notifications
- Configuration compliancy
- Realtime telemetry collection
- Faster resolution through correlation
- Single dashboard to view health
- Single UI to consume app services
- Secure cluster and app management
- Hardened secure container OS
- Infrastructure for secure K8s bringup
[Diagram labels: Cisco Nexus Dashboard; Application Services (Cisco Nexus Dashboard Orchestrator, Cisco Nexus Dashboard Insights); Shared Services (OpenSearch, Kafka); Infra Services (SSO, APIGW, Kubernetes, Key Vault, Cluster); Service Lifecycle; System Services; Secure Container OS (Atomix OS); Frictionless access; Standardized service access; Industry standard container management]

Nexus Dashboard Services
- Orchestrator
- Fabric Controller
- Insights
- Data Broker

Nexus Dashboard Services: Insights

Data Center Operations Challenges
Operations: single pane of glass
- Can I get visibility across datacenters?
- Single point for monitor and control?
Troubleshooting
- Where is the problem and what's the blast radius?
- How do I reduce MTTR?
- How do I prove the network is healthy?
Proactive advisories
- Was the issue preventable?
- Is the network exposed to known vulnerabilities?
- Can I get proactive advice?
Assurance
- Am I doing correct configuration?
- Are interdependencies known?
- Does the change impact something I am not aware of?

Nexus Dashboard Insights: Benefits
- Avoid outages with precautionary and proactive advisory
- Keep the network state compliant with your intent
- Control multiple datacenters with a single pane of glass
- Rapidly remediate with automated, correlated insights

Nexus Dashboard Insights: How Does It Work?
[Diagram labels: Sources of Telemetry Data; Ingest and Process; Derive Insights; Notify and Recommend Action; Reduce Time to Problem Awareness, Action and Resolution]
[Diagram labels, continued: Process; SW-Version/PSIRTs; Network Protocols; Config/Scale/Hardening; Complex correlation; Correlate against Database; Consistency Checkers; Mac Table; Event History; Streaming Telemetry; Environmentals; Cores; CLI; FIB; Debug Logs; Accounting Logs; Syslog; RIB; Tech-Support; Config File; Topology; PSIRTs; Field notices; Metadata extraction]

Nexus Dashboard Insights: How Can It Help?
[Diagram labels: Availability; Identify, locate, root cause, remediate; Error detection, latency, packet drops; Control plane issue; Automated alerts; Visibility; Pre-change analysis; Compliance alerts; End-to-end workflows; Automated remediation; Mitigate; Prevent outages; Upgrade impact advisories; Hardening checks; Software/hardware recommendations; PSIRT notices; EoS/EoL notices; TAC assist; Topology checker]

Traditional troubleshooting workflow (User; Ops: Network and others)
9:00  User raised a trouble ticket for erratic access to ERP
9:30  NetOps checked connectivity, e.g., ping, trace, routing
11:00 Performed hop-by-hop diagnostics and found everything OK
12:00 User reported recurrence
14:00 NetOps worked with multiple teams/tools but found nothing wrong with the network
15:00 NetOps suspected application misbehaving. There are back and forth calls with app team(s)
16:00 RCA: Misbehaving process on server is causing application performance issue

Now with Nexus Dashboard Insights (User; Ops: Network and others; Nexus Dashboard Insights)
9:00  User raised a trouble ticket for erratic access to ERP
9:03  NetOps checks flow health information, and it is green
9:05  NetOps looks at application anomalies and notices performance impacting events detected
9:07  NetOps cross verifies with Flow information and time series-based app performance data
9:10  RCA: Misbehaving process on server is causing application performance issue

Scenario 1
"I added a new server to a pre-created EPG and it doesn't work"

What exactly "doesn't work"? Let's collect some facts!
- Server needs to talk to a service outside of ACI fabric
- Someone created a new BD + Subnet + EPG earlier
- Server can ping the gateway and connectivity inside ACI seems to be fine
- There is a dynamic routing protocol between ACI and external router

No faults, and internal connectivity is fine. Must be something with the external network.

A day later (if you are lucky)
- Net Infra team checks and says everything is OK on their side
- Only one Tenant is complaining on a shared L3Out

    EMEA-STAC-A1K-F1#show ip route 192.168.10.0 255.255.255.0
    % Network not in table
    EMEA-STAC-A1K-F1#

- There is dynamic routing and ACI is not advertising our Subnet!
- But we checked our ACI and everything is fine?!?!

Meanwhile on NDI
[Screenshots: NDI UI walkthrough (double-click, scroll down)]

Problem fixed!
Our colleague forgot to associate the L3Out to the BD.

How did NDI pick this up?
- NDI collects snapshots (datasets) every X minutes
- A snapshot is a list of CLI and API outputs from the APICs and all switches in the fabric
- It contains config and current state of ports, routing tables, protocols, HW programming, etc.
- All this data is processed and analyzed; detected anomalies are presented to the user in the UI
- In our case it was a deviation from a standard, well-known config model
[Diagram: NDI snapshots along a time axis, each covering config, if state, routing tables, protocols, HW programming]

Scenario 2
"Our external backup platform cannot reach servers hosted on ACI fabric"

Let's collect some facts!
- This used to work until last week
- Backup platform is on a dedicated subnet outside of ACI
- All servers on ACI are impacted, spanning multiple EPGs
- Connectivity inside ACI fabric is fine and everything else seems to work fine
- There is a dynamic routing protocol between ACI and external router (same fabric as before)

What next?
- We are pretty sure that route should be there
- We cannot find anything wrong on ACI
- Let's contact the Net Infra team (again)
Two days later
- Net Infra team says there was not supposed to be any change
- They don't want to spend time on this, it must be something wrong with ACI

In doubt? Let's check what NDI has to say!

What is Delta analysis?
- Shows the difference in the policy, run time state, and the health of the network between two snapshots you specify
- Health Delta: analyses the difference in the health of the fabric across the two snapshots
- Policy Delta: analyses the differences in the policy between the two snapshots and provides a correlated view of what has changed in the ACI fabric
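
The idea of comparing two snapshots can be sketched in a few lines of Python. This is only an illustration of the concept, not how NDI implements Delta analysis: the Snapshot class, the delta function, the resource names, and the prefixes are all hypothetical and chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical, simplified snapshot: a label plus named sets of items."""
    label: str
    resources: dict  # e.g. {"routes": {"203.0.113.0/24"}, "epgs": {"web"}}


def delta(earlier, later):
    """Return only the resources that differ between two snapshots."""
    result = {}
    for name in sorted(set(earlier.resources) | set(later.resources)):
        before = set(earlier.resources.get(name, ()))
        after = set(later.resources.get(name, ()))
        if before != after:
            result[name] = {"removed": before - after, "added": after - before}
    return result


# Illustrative data only (documentation prefixes, made-up EPG names).
last_week = Snapshot("last week", {
    "routes": {"198.51.100.0/24", "203.0.113.0/24"},
    "epgs": {"web", "db"},
})
today = Snapshot("today", {
    "routes": {"203.0.113.0/24"},
    "epgs": {"web", "db"},
})

changes = delta(last_week, today)
print(changes)
```

Restricting the result to resources that differ mirrors the "only mismatch" style of view: unchanged resources (the EPGs here) drop out, and the one route that disappeared between the two snapshots stands out.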

Delta Analysis
Creating new Analysis
[Screenshots: NDI UI walkthrough (double-click, scroll down)]
No new anomalies!

Delta by Resource, Only Mismatch
It looks like we lost a route!

Back to Net Infra team
Ah yes, we had a maintenance last week and due to a config change error the route got dropped. Sorry.

Scenario 3
"My VM doesn't work"

Let's collect some facts!
- VM is on a Hypervisor hosted and integrated with ACI
- VM is not accessible
- This is a "passwordless" VM accessible via SSH and public key (console login is not possible)
- Other VMs on the same EPG are accessible
- Affected VM is also not accessible from other VMs in the same EPG
- VM name is "ubuntu4.1" and assigned IP is 192.168.10.14

IP seems to be there!

What do we do next?
- Restart the VM? (Same result)
- Talk to Server/Virtualization Team? (They are very busy)
- What else?

You know where this is going

Let's browse some EPs
[Screenshots: NDI UI walkthrough (double-click, scroll down)]
Fault is raised!
But this one would not be that helpful on ACI side
NDI integrates with vSphere!

What actually happened?
- The user that reported the issue had just created his VM with a duplicate IP
- The other VM with the same IP had SSH on a different port; this is what created the "inaccessible" symptom
- There was an actual fault on the EPG, but it flaps really fast

Scenario 4
"Help me, my TCAM is on fire!"

What is TCAM and Policy CAM?
- In ACI policy, contracts and filters are programmed in Policy CAM
- TCAM (Ternary Content Addressable Memory): specialized and precious local switch HW resources designed for rapid table lookups
- An environment with many-to-many EPG contract relationships, multiple filters, and EPGs concentrated on a few leaf switches can easily hit the Policy CAM limit and exhaust the hardware resources
- If the resources are exhausted, additional policies/contracts won't be programmed on the hardware. As a result, the system will see unexpected behaviours
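
To see why many-to-many contract relationships consume Policy CAM so quickly, a rough back-of-the-envelope estimate helps. The sketch below assumes the commonly quoted rule of thumb that entries grow with provider EPGs x consumer EPGs x filter entries, doubled when the contract is applied in both directions. That formula is an assumption for illustration; real usage depends on the leaf hardware, compression features, and how the contracts are built. The function name and the numbers are hypothetical.

```python
def estimate_policy_cam_entries(providers, consumers, filter_entries,
                                bidirectional=True):
    """Rough rule-of-thumb estimate (an assumption, not an exact hardware count)."""
    entries = providers * consumers * filter_entries
    return entries * 2 if bidirectional else entries


# Illustrative numbers only: 20 provider EPGs, 30 consumer EPGs on one contract.
before = estimate_policy_cam_entries(20, 30, filter_entries=4)
after = estimate_policy_cam_entries(20, 30, filter_entries=6)

print("before:", before)
print("after adding two filter entries:", after)
print("extra entries:", after - before)
```

Under this estimate, every additional filter entry is multiplied by every provider/consumer pair that uses the contract, which is why a small-looking change to a widely shared contract can use up a large amount of hardware resources.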

One day you wake up to this
[Screenshots]

What do we do?
- Clearly there is "too much" config! But where?
- Even if we find it, can we remove it?

This is getting old
[Screenshots: NDI UI walkthrough (select site and snapshot, scroll down, scroll right)]

How did it get to this?
- Previous stable situation
- But I added just two more filters to an existing contract!
Two innocent looking filters

Instead of Conclusion: What else can NDI do?
- Explore connectivity
- Pre-change analysis
- Browse flows
- Offline sites

Wrapping it up
Test Drive? Walk-in Lab: LABDCN-2215, Nexus Dashboard Insights on lab

Fill out your session surveys!
- Attendees who fill out a minimum of four session surveys and the overall event survey will get Cisco Live-branded socks (while supplies last)!
- Attendees will also earn 100 points in the Cisco Live Challenge for every survey completed.
- These points help you get on the leaderboard and increase your chances of winning daily and grand prizes.

Continue your education
- Visit the Cisco Showcase for related demos
- Book your one-on-one Meet the Engineer meeting
- Attend the interactive education with DevNet, Capture the Flag, and Walk-in Labs
- Visit the On-Demand Library for more sessions at www.CiscoL

Thank you

Gamify your Cisco Live experience!
Get points for attending this session!
How:
1. Open the Cisco Events App.
2. Click on Cisco Live Challenge in the side menu.
3. Click on View Your Badges at the top.
4. Click the + at the bottom of the screen and scan the QR code.