Toptal Enables Data Streaming Within Multi-Cloud Strategies
Fireside Chat: Toptal Automates Data Analysis with Alteryx

Matt Kroon, VP of Enterprise, Toptal
Christina Taylor, Data and ML Ops Engineer

Confidential | 3

Agenda
1. Toptal Digital and Business Services
2. Toptal Solutions with Databricks
3. Multi-Cloud Streaming Data Challenges
4. Multi-Cloud Streaming Solutions
5. Providing Reliability and Scalability to Machine Learning

Toptal Digital and Business Consulting Services
  Technology & Development
  Design & Branding
  Project Management
  Product Management
  Strategy & Finance

Thousands of Talent With Deep Expertise
Flexible Delivery Engagement Models
Ensure Client Success with Quality and Speed
  Provide Specialized Expertise: Subject matter experts with deep expertise deliver challenging solutions
  Execute a Project: Teams of experts delivering complex initiatives
  Strategic Services: Consultants help clients define their imperatives, strategies, and programs, giving the organization a competitive edge
  Top 3% of talent in the world
  Hand-picked, highly vetted, and curated talent network
  Expert identification of the right talent within 3 to 5 days
  High success rate in placing the right talent for a role

Toptal Solutions with Databricks
  Design/Architecture of one unified data management system
  Automation of data pipelines
  Integration of multiple data sources
  Implementation of Databricks Lakehouse

Data Engineering
Artificial Intelligence and Machine Learning
Data Science

Solve problems and accelerate innovation.
  Develop predictive analytics to make better and faster decisions
  Leverage AutoML, MLflow, and the data lakehouse as the central data management platform
  Design/Architect complex data analysis systems
  Develop custom data science workflows
  Integrate, analyze, and visualize data
  Improve and accelerate decision making through improved data insight

Multi-Cloud Streaming Data Platform
Challenges and Solutions

Streaming with Multi-Cloud Overview and Challenges

Key Observations
  Most enterprises today are multi-cloud
  Modern data landscape is increasingly crowded
  Tools and code both grow exponentially
  Reliability and scalability are still catching up

Design Considerations
  Avoid cloud and vendor lock-in
  Avoid reprocessing historical data
  Develop for both batch and streaming
  Scale to multiple deployments and clouds

Migrate from EDW to Open Source & Format
Cross Cloud/Region
[Architecture diagram. Component labels: BigQuery, Google Kubernetes Engine, Pub/Sub, Cloud Storage (Google Cloud); EC2, RDS, S3 (AWS)]

Bring Reliability and Scalability to Machine Learning

ML Ops Stack

Iterate and Deploy Quality Models
Source control code, data, and configuration with MLflow Recipes

    steps:
      ingest: INGEST_CONFIG
      split:
        # train / validation / test proportions
        split_ratios: [0.75, 0.125, 0.125]
        post_split_filter_method: create_dataset_filter
      transform:
        using: custom
        transformer_method: transformer_fn
      train:
        using: custom
      evaluate:
        validation_criteria:
          - metric: root_mean_squared_error
            threshold: 10
      register:
        allow_non_validated_model: false
      ingest_scoring: INGEST_SCORING_CONFIG
      predict:
        output: PREDICT_OUTPUT_CONFIG
        model_uri: models/model.pkl

ML Ops Stack

IaC for Machine Learning Pipelines
Use the Databricks Labs dbx project to manage ML workflows

    # The anchors *permissions and *prod-ml-cluster-settings are defined
    # elsewhere in the deployment file and are not shown on the slide.
    workflows:
      - name: demo-ml-pipeline
        <<: *permissions
        job_clusters:
          - job_cluster_key: shared-ml-cluster
            new_cluster: *prod-ml-cluster-settings
        tasks:
          - task_key: pretrain
            job_cluster_key: shared-ml-cluster
            notebook_task:
              notebook_path: /Repos/ml/notebook_jobs/pretrain
          # finetune runs only after pretrain completes
          - task_key: finetune
            job_cluster_key: shared-ml-cluster
            notebook_task:
              notebook_path: /Repos/ml/notebook_jobs/finetune
            depends_on:
              - task_key: pretrain

Thank You
Interested in learning more about Toptal and our work with clients in data, analytics, and AI?
Get in touch with us!

  Neil Pickard, Head of Service M, +1 (724) 787-9384
  Blake Harvey, Senior Enterprise Client E, +1 (236) 979-6349
  Matt Kroon, VP of Enterprise S, +1 (402) 676-0788