Building and Managing a Data Platform for a Delta Lake Exceeding 13 PB with Thousands of Users: The AT&T Story
© 2023 AT&T Intellectual Property. AT&T and globe logo are registered trademarks and service marks of AT&T Intellectual Property and/or AT&T affiliated companies. All other marks are the property of their respective owners.
AT&T Proprietary (Internal Use Only) - Not for use or disclosure outside the AT&T companies except under written agreement.

AT&T Databricks Story
Praveen Vemulapalli - Director Technology
Jegadeesan Pugazhenthi (JP) - Lead Big Data Engineer
June 26, 2023

Agenda
1: Intros
2: Databricks Migration Story
3: Share some fun facts

AT&T Chief Data Office - Enterprise Data Technology / June 27, 2023 / © 2023 AT&T Intellectual Property - AT&T Proprietary (Internal Use Only)

Praveen Vemulapalli
Things I love to do:
- Hiking & camping
- Motorcycle riding
- Hanging out with my family
- Data & AI technology evangelism
- Driving change & evolution
Experience:
- 17+ years of experience as a technology leader, analytics architect and business consultant.
- Led multiple development & design implementations for data warehousing, business intelligence & AI applications.
- Lifelong learner: Entrepreneurship & Innovation from Stanford; IT Business Leadership from NYU.

Jegadeesan Pugazhenthi (JP)
Things I love to do:
- Sports
- Outdoor camping
- Technology (Data/AI/Cloud/Security)
- Data-driven insights
- Visualization
- Troubleshooting
My Experience:
- 20+ years of experience as a technical leader/architect
- Engineered multiple solutions across a broad spectrum of technologies including Cloud, Hadoop, Security, Web applications, High Availability
- BS in Computer Engineering

IF: THEN:

Why Databricks?
Edited by Praveen Vemulapalli

AT&T Technology Leadership Championing the ACC Databricks Adoption

CDO Mission - AT&T North Star for Data & Analytics

AT&T Data & AI Platforms CoE | Charter
Empower all AT&T users to use cloud patterns optimally by creating a Center of Excellence (CoE) for various target-state cloud technologies.
Governance:
- Dedicated RSA Team
- RTU Spend Governance
- Training Credit Governance
- Cloud Compute Consumption Planning
- Contract Credits Governance (if any)
Forums:
- CoE Leadership Forum
- CoE Technical Forum
- ACC DP Tech Community (Amp site/Wiki/Sharepoints)
- Ad hoc BU Training Sessions
- Ad hoc Dev Guidance Sessions
Technology:
- Enterprise Project Guidance
- FMO Cloud Architecture Patterns (updates & timeline views)
- Feature Roadmaps (timeline & deployment views)
- Technical Guidance
- Operational Excellence & Best Practices Frameworks

AT&T Databricks
"Without strategy, execution is aimless. Without execution, strategy is useless." - Morris Chang

Default path: ACC Enterprise Databricks (CDO Databricks ML/AI Platform)
The ACC Enterprise Databricks platform that serves all business units.
- Faster self-service onboarding
- BU chargeback enabled
- A multi-tenant service
- Full-scale security isolation for teams
- End-to-end cost allocation & tagging by teams
- 24/7 platform support
- Access to Resident Solution Architect support
- Databricks CoE & CDO SME support
- Pre-connected to CDO-managed enterprise data & Snowflake environments
- Users can manage their own apps and clusters
- Access to Databricks Cost Optimization App
- Code migration assistance
- Onboarding & access within 3 days SLA*
Summary: CDO-managed platform; users administer apps/compute; chargeback dashboard.

With exception: BU-managed Databricks (BU Databricks Workspaces)
A business-unit-managed workspace built for a specific use case.
- BU-managed platform
- BU manages Azure costs (CDO will enable DBU chargeback)
- BU Champion reviews the use case with the CDO Databricks CoE for new workspace approval & setup
- BU Champion takes full administrative responsibility for managing a secure deployment of the Databricks shared workspace for their chain of command
- Follows the best practices and implementation architecture guidelines set forward by the CDO Databricks CoE & EA Team

*SLA will vary depending on the complexity of the request.
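
The tag-based BU chargeback mentioned above can be illustrated with a small sketch. Everything in it is an assumption for illustration: the record layout, the BU names, the DBU quantities and the flat rate are hypothetical and do not come from the deck or from any Databricks API.

```python
from collections import defaultdict

# Hypothetical usage records: (workspace, business-unit tag, DBUs consumed).
# Names and numbers are illustrative only, not AT&T data.
USAGE = [
    ("shared-ws", "bu-a", 120.0),
    ("shared-ws", "bu-b", 80.0),
    ("shared-ws", "bu-a", 50.0),
    ("shared-ws", None, 10.0),  # usage that was never tagged
]


def chargeback(records, rate_per_dbu):
    """Sum DBUs per business-unit tag and price them at a flat rate."""
    totals = defaultdict(float)
    for _workspace, bu_tag, dbus in records:
        # Untagged usage is kept visible instead of being silently dropped.
        totals[bu_tag or "UNTAGGED"] += dbus
    return {bu: round(dbus * rate_per_dbu, 2) for bu, dbus in totals.items()}


costs = chargeback(USAGE, rate_per_dbu=0.5)  # assumed flat rate per DBU
for bu, cost in sorted(costs.items()):
    print(f"{bu}: {cost:.2f}")
```

Keeping an explicit UNTAGGED bucket is one way to show why consistent tagging by teams matters: any cost in that bucket cannot be charged back to a business unit.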

Databricks Governance Dashboard
- Provides visibility into all Databricks usage in AT&T
- Enables usage alerts & notifications
- Provides opportunity for cost optimization
- Enables data for business unit $ chargeback on shared environments

Databricks Workspace Implementation Strategy Whitepaper
Edited by Praveen Vemulapalli

Live the Learnings!
- Setup
- Socialize
- Collaborate
Trainings dashboard coming soon.
Edited by Praveen Vemulapalli

Where Rubber Meets the Road: Our Migration Journey

Hadoop Cluster Overview
- 50,000+ jobs per day
- 1,000 users
- 100 sandboxes
- Workloads across the spectrum of the Hadoop ecosystem with self-service data dependencies
- 1,050 worker nodes
- 50 edge nodes
- 13 PB of storage used (9 PB of Data Library)
- 22 M CPU hours per month

Cloud Migration: How to?
- Data to Cloud(s)
- Workloads to Cloud(s)
- Cloud Platform
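
Moving the storage footprint quoted above (13 PB used, 9 PB of Data Library) to the cloud is bounded by network bandwidth. The following is a rough sketch of that arithmetic, not a figure from the deck: the link speeds, the 70% utilization factor and the decimal definition of a petabyte are all assumptions.

```python
PB = 10**15  # bytes per petabyte (decimal); an assumption about the slide's units


def transfer_days(size_pb, link_gbps, utilization=0.7):
    """Days needed to copy size_pb petabytes over a link of link_gbps Gbit/s.

    utilization is the assumed fraction of the link available to the sync.
    """
    bits = size_pb * PB * 8
    seconds = bits / (link_gbps * 1e9 * utilization)
    return seconds / 86_400


for size_pb in (9, 13):              # sizes taken from the Hadoop overview slide
    for link_gbps in (10, 40, 100):  # hypothetical link speeds
        days = transfer_days(size_pb, link_gbps)
        print(f"{size_pb} PB over {link_gbps} Gbps: about {days:.0f} days")
```

The result scales inversely with bandwidth, so the sync window shrinks in direct proportion to any added link capacity.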

Data Migration
Historical Data Migration (on-prem to cloud):
- Curated historical data (mostly ORC)
- WANdisco / KM
- ADLS Gen2
- Format conversion (ORC to Delta)
- ADLS Gen2, available for users or applications
Direct Ingestion from Source to the Cloud:
- Data sources (on-prem)
- Raw data ingested through NiFi (Parquet)
- ADLS Gen2
- Parquet data curated and stored in Delta Lake
- ADLS Gen2, available for users or applications

Change Drivers, Future-State Goals, Success To-Date
- Single Version of Truth
- Parallelize, Simplify & Automate
- Move Resources up the Value Chain
- Free Capital for Growth-Oriented Investments
- Enable streaming pipelines & analytics
- Empower citizen data scientists & analytics
- +60 BUs
- 5-year Migration ROI of +300%
Source: AT&T Interviews

Fun Facts

Phases:
1. Analyze & Develop Solution
2. Synchronize Data
3. Migrate Jobs
4. Migrate Users
Accelerate Savings: Retire Footprint Throughout Migration (+2 yr, Nov20-Jan23)

Key Outcomes
- Analyze & Develop Solution: business case/savings targets; migration prioritization; pre-migration dataset & workload rationalization
- Synchronize Data: synchronize 10 PB and convert (Delta) data sets to cloud prior to user/workload migration
- Migrate Jobs: migrate +12K feeds to cloud; migrate data pipelines & data extract jobs
- Migrate Users: map users to use cases via survey; develop user self-migration job aids with durations/timelines; migrate user analytic use cases

Resources Required
- Business case: 2 head count
- Rationalization: 3 head count
- Prioritization: 2 head count
- Data migrate, sync & integration: 6 head count
- Data conversion: 8 head count
- Migrating jobs: 11 head count
- User surveys: 1 head count
- Prep job aids: head count (number not given)
- User migration: 5 head count
- Change champions: 5 head count

Calendar Duration
- 9 mo. (May20-Feb21)
- 16 mo. (Nov20-Mar22)
- 18 mo. (Jul21-Dec22)
- 9 mo. (Mar22-Nov22)

Tools Leveraged
- Workload Analyzer
- Financial Analysis
- WANdisco
- Online User Survey
- Wiki Pages for Job Aids
- Excel user migration tracker
- N/A

Key Lessons Learned
- Prioritize migration based on job data/compute to accelerate benefits
- Size network bandwidth to cloud to reduce the data sync window
- Synchronize data to cloud, then migrate workloads & users, to maximize migration runway, reduce dependencies and parallelize migration
- Optimize data I/O and job run time via appropriate use of z-order
- Horizontally distribute workloads across workspaces & storage accounts to avoid storage bandwidth and per-subscription limits, and tag cloud costs to BUs
- Budget 10% of migration labor for optimization
- Leverage online survey to associate users to use cases at scale
- Obtain leadership support for change champions in business units to federate tracking and marshalling of BU user self-migration
- Rationalize data/workload/users up-front to save effort/$ later
- Leverage cloud for disaster recovery & retire on-prem DR
- Retire legacy equipment as migration occurs
- Pivot sustaining resources to support & optimize on Databricks
Source: AT&T Interviews

THANK YOU
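
One of the lessons above recommends z-order for data I/O. As a conceptual illustration only, the sketch below shows the bit-interleaving (Morton code) idea behind z-order curves: rows that are close in two columns receive nearby sort keys and so tend to land in the same files. It is not Delta Lake's implementation, and the function names are hypothetical.

```python
def interleave_bits(x, y, bits=16):
    """Return the Morton (z-order) code of two non-negative integers."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # bits of x fill even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # bits of y fill odd positions
    return code


def z_sort(rows):
    """Sort (x, y) rows by Morton code so nearby values are stored together."""
    return sorted(rows, key=lambda row: interleave_bits(row[0], row[1]))


rows = [(x, y) for x in range(4) for y in range(4)]
for row in z_sort(rows):
    print(row, interleave_bits(*row))
```

Sorting by a single column clusters only that column; interleaving gives both columns a share of the ordering, which is why z-ordering suits queries that filter on more than one column.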