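Data Honeycomb (数据蜂巢): A Lightweight Data Processing Platform

CONTENTS
  Background
  Core Features
  Design and Architecture
  Applications

Background
  Data sources are numerous and widely distributed.
  There is no unified data egress.
  Resources are constrained, and data processing lacks the necessary ecosystem.
  Production business depends heavily on this data and requires it in real time.

Core Features
  Real-time data collection
  Historical data processing and synchronization
  Real-time single-table processing and synchronization
  Real-time wide-table processing and synchronization

Design and Architecture

Basic concepts
  Batch: historical data processing and synchronization
  Stream: data collection
  Pie: real-time data processing and synchronization
  Job
  Task

[Figure: cluster architecture. Labels: Queen(Master) x2, Wasp(Slave) x3, PieWorker, StreamWorker, BatchWorker, Mysql, Active, Standby, Zookeeper, Manager, Monitor]

Data collection: Stream
[Figure: Stream pipeline for the after-sales data source. Labels: Source, Storage, MysqlSource, StreamSource, DcombStoreSource, RelayLogTask, Extractor, mysql binlog, relay binlog store, Catchup, Main, Store, FileQueue, Consumer, Client, Server, Index, Filter, Transport, SegementIndex, FileIndex]

Data transport
[Figure: transport between Client and Store through a ConnectionPool, with steps numbered 1 to 5]

Single-table processing: DirectPie
[Figure: after-sales and customer-service data flowing through Fetcher, RecordProcessor, CommitProcessor]
  Fetcher: Dcomb; JMQ; Kafka
  CommitProcessor: ES, Mysql, PG, JMQ, Kafka

Single-table processing: DirectPie
[Figure: DirectPie internals. Labels: Fetcher, Partitioner, ProcessorChain x3, Position, PersistenceEngine, CommitRing?]
  Concurrency: serial, per table, per row

Wide-table processing: FlowPie
[Figure: FlowPie node graph. Labels: TickNode, AckNode, SourceNode, ComputeNode_1 and ComputeNode_2 for Table_a, ComputeNode_1 and ComputeNode_2 for Table_b, CommitNode_1, CommitNode_2]

Historical data: Batch
[Figure: after-sales and customer-service data flowing through Fetcher, Storage, RecordProcessor, CommitProcessor; a Job passes through a Splitter into Task-1, Task-2, Task-3, Task-4, then a Sinker]

Applications

Data migration
[Figure: DB migrating to DB-1 and DB-2]
  I. Synchronize historical data
  II. Enable real-time synchronization
  III. Migrate the application
  IV. Disable real-time synchronization

Campus data collection
[Figure: multiple Stream instances feeding Pie for local processing, with output to ES, HBase, Kafka, and Mysql; business systems and Dcomb also appear]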
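
The DirectPie slide lists three concurrency levels for its Partitioner (serial, per table, per row) without showing how records are assigned to a ProcessorChain. The sketch below illustrates one plausible reading: each level picks a different ordering key, and records with the same key always go to the same lane. Everything in it is an assumption made for illustration and not the platform's actual code: the `Record` type, the `lane_for` and `partition` functions, the `lanes` parameter, and the use of CRC32 hashing are all hypothetical.

```python
import zlib
from dataclasses import dataclass, field
from enum import Enum


class Concurrency(Enum):
    SERIAL = "serial"  # a single lane: global order is preserved
    TABLE = "table"    # changes to the same table stay in order
    ROW = "row"        # changes to the same row stay in order


@dataclass
class Record:
    # Hypothetical change record; the field names are illustrative only.
    table: str
    primary_key: str
    payload: dict = field(default_factory=dict)


def lane_for(record: Record, mode: Concurrency, lanes: int) -> int:
    """Return the index of the lane (processor chain) that handles a record."""
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    if mode is Concurrency.SERIAL:
        return 0
    if mode is Concurrency.TABLE:
        key = record.table
    else:
        # ROW mode: combine table and primary key so that equal keys
        # in different tables are treated as different rows.
        key = f"{record.table}\x00{record.primary_key}"
    # CRC32 is stable across processes, unlike Python's built-in hash().
    return zlib.crc32(key.encode("utf-8")) % lanes


def partition(records: list, mode: Concurrency, lanes: int) -> list:
    """Split records into per-lane lists, keeping arrival order in each lane."""
    buckets = [[] for _ in range(lanes)]
    for record in records:
        buckets[lane_for(record, mode, lanes)].append(record)
    return buckets
```

The trade-off in this sketch is the usual one: serial mode keeps a global order but uses one lane, table mode allows different tables to proceed in parallel, and row mode gives the most parallelism while only guaranteeing order among changes to the same row.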