2. Meituan OLAP Engine Selection and Practice (optimized).pdf


Apache Doris as the Unified OLAP Engine at Meituan
Zeng Linxi (曾林西), Senior Technical Expert, Meituan
Doris Summit Asia 2023

About the speaker
Zeng Linxi leads the query engine team at Meituan. Joined Meituan's data group in 2014 and was closely involved in the architectural evolution of Meituan's Hadoop clusters from the hundred-node scale to the hundred-thousand scale. Mainly responsible for bringing offline data warehouse production, ad hoc query, and OLAP analysis engines and services into Meituan's business scenarios and evolving them there.

Contents
1. Characteristics of Meituan's OLAP scenarios
2. Meituan's OLAP engine selection and practice
3. Current challenges and future plans

1. Characteristics of Meituan's OLAP scenarios

Business scenario characteristics and challenges
Latency-sensitive: most data reports must respond within 3 seconds, and some toB scenarios require sub-second responses.
Large data volume: analysis runs over data at the hundred-billion scale, which traditional RDBMS, MPP, and SQL-on-Hadoop solutions cannot handle.
Rich business coverage: transactions, business analysis, users, traffic, advertising, finance, accounting, LBS, and performance monitoring.
User groups: PM and operations, BD and riders, B-end merchants.
Metrics tracked across these scenarios: availability, QPS, and TP99 response time.
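The TP99 metric and the 3-second report target mentioned above can be made concrete with a small calculation. The following is a minimal sketch, assuming the nearest-rank definition of a percentile; the function name, the sample latencies, and the `TARGET_MS` constant are illustrative assumptions, not something shown in the talk.

```python
def tp99(latencies_ms):
    """Nearest-rank 99th percentile: the smallest sample such that
    at least 99% of all samples are less than or equal to it."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # ceil(0.99 * n) computed with integers to avoid float rounding
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]

TARGET_MS = 3000  # the 3-second report target, expressed in milliseconds

# Hypothetical sample: 98 fast queries, one slow query, one very slow outlier.
sample = [200] * 98 + [2500, 9000]
meets_target = tp99(sample) <= TARGET_MS
```

With this definition a single extreme outlier in 100 samples does not move TP99, which is why it is a common target for report latency.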

OLAP scenario example: takeout business analysis
Characteristics: large data volume, changing dimensions, and complex business logic. Users view hundreds of business metrics for all merchants under each organizational node.
Key techniques: MPP, columnar storage, Colocated Join.
Technical difficulty: several large tables at the hundred-million-row scale are joined and aggregated at query time.
Tables in the diagram: merchant fact table (tens of millions of rows), merchant basic information dimension table (tens of millions), merchant extended information dimension table (tens of millions), merchant latest cell (蜂窝) table (millions), delivery mode dimension table, cell organizational structure dimension table.
Join keys shown in the diagram: (date, merchant ID), (date, merchant ID), cell ID, merchant ID.

OLAP scenario example: in-store dining BD productivity analysis
Key techniques: pre-aggregated Bitmap deduplication metrics, automatic rollup of data to monthly granularity, and support for efficient count distinct.
Technical difficulty: the model is complex, with 100 dimensions, 60 SUM metrics, and 20 deduplication metrics.
Characteristics: users view the performance metrics of BDs under each organizational node. Each day adds about a million rows, and a single query has to analyze half a year of data.
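The rollup to monthly granularity described above can be sketched as follows. This is a minimal illustration in plain Python, not Doris code: the row layout, the names `rollup_to_month` and `query_months`, and the sample data are all assumptions. It shows why SUM metrics can simply be added during rollup while deduplication metrics have to be carried as mergeable sets (bitmaps in the talk) rather than as counts.

```python
from collections import defaultdict

# Hypothetical daily rows: (date, org_node, visits, merchant_id)
daily_rows = [
    ("2023-01-03", "east", 2, "m1"),
    ("2023-01-17", "east", 1, "m2"),
    ("2023-01-17", "east", 4, "m1"),
    ("2023-02-01", "east", 3, "m3"),
    ("2023-01-09", "west", 5, "m1"),
]

def rollup_to_month(rows):
    """Pre-aggregate daily rows to (month, org_node) granularity."""
    sums = defaultdict(int)      # SUM metric: additive
    distinct = defaultdict(set)  # dedup metric: keep the members, not a count
    for date, org, visits, merchant in rows:
        key = (date[:7], org)    # "YYYY-MM"
        sums[key] += visits
        distinct[key].add(merchant)
    return {key: (sums[key], distinct[key]) for key in sums}

def query_months(monthly, org, months):
    """Answer a multi-month query from the rolled-up data only."""
    total, merchants = 0, set()
    for month in months:
        month_sum, month_set = monthly.get((month, org), (0, set()))
        total += month_sum
        merchants |= month_set   # union, so repeat merchants count once
    return total, len(merchants)

monthly = rollup_to_month(daily_rows)
result = query_months(monthly, "east", ["2023-01", "2023-02"])
```

A half-year query then touches six pre-aggregated rows per organizational node instead of every daily row.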

OLAP scenario example: B-end merchant reports
Techniques listed: prefix index, automatic rollup of data to monthly granularity.
Technical difficulty: high concurrency with low latency, and four-nines availability.
Characteristics: large total data volume, mostly point queries, and long time spans. Merchants view the business status of their stores (advertising performance, revenue, dish sales, and so on).
The slide also carries the label "high usage cost".
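Why a prefix index suits point-query workloads like this one can be shown with a small sketch. It assumes rows stored sorted by a key prefix and a sparse index holding the first key of each block; the block size, key layout, and function names are illustrative assumptions and do not describe Doris internals.

```python
import bisect

BLOCK_SIZE = 2  # hypothetical number of rows per indexed block

# Rows sorted by the key prefix (merchant_id, month); the last field is revenue.
rows = sorted([
    ("m001", "2023-01", 120),
    ("m001", "2023-02", 90),
    ("m002", "2023-01", 300),
    ("m003", "2023-01", 45),
    ("m003", "2023-02", 60),
])

# Sparse index: the key of the first row in each block.
index_keys = [rows[i][:2] for i in range(0, len(rows), BLOCK_SIZE)]

def point_lookup(merchant_id, month):
    """Binary-search the sparse index, then scan only one block."""
    key = (merchant_id, month)
    block = bisect.bisect_right(index_keys, key) - 1
    if block < 0:
        return None  # key sorts before the first block
    start = block * BLOCK_SIZE
    for row in rows[start:start + BLOCK_SIZE]:
        if row[:2] == key:
            return row[2]
    return None

revenue = point_lookup("m003", "2023-01")
```

Each lookup costs one binary search over the index plus a scan of at most `BLOCK_SIZE` rows, regardless of the total data volume.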

2. Meituan's OLAP engine selection and practice

Evolution of Meituan's OLAP engine selection
Timeline markers on the slide: 2019, 2022, 2023.
MySQL-based stage: fixed dimensions, no support for big data.
Divide first, then unify: solve "latency-sensitive" and "large data volume" first, then address the breadth of scenarios.
Doris-based unified OLAP engine solution.
Milestones, in slide order: Kylin and Druid rollout, platform building, stability improvement, ROLAP research, Doris rollout, technology stack transition, Doris covering Kylin and Druid scenarios, capability alignment, POC validation.

Unified OLAP engine: problems and challenges
Goal: build a unified OLAP engine that can serve all business scenarios and resolve the "three highs".
Choosing a different engine for each scenario, as was done in the past, caused the "three highs":
High usage cost: business teams have to learn and use several engines, and data governance costs rise.
High resource waste: business data has to be stored in multiple engines.
High maintenance cost: duplicated effort, scattered attention, slower iteration, and more difficulty supporting new scenarios.

Unified OLAP engine: architecture design
Compute engine with an MPP architecture (generality).
Built-in storage format and storage engine (high performance).
Materialized views and multiple index types (multiple scenarios).
Diagram components: Interface, Metadata, Planner, Coordinator, and Scheduler on the coordinator side; Worker1, Worker2, and Worker3, each with RPC, Executor, Data Stream, SPI, and Storage. The unified OLAP engine can also read from other storage systems.

Unified OLAP engine: technology selection
Options compared: building in-house, modifying Doris, modifying Presto.
Criteria that appear in the comparison: technology stack fit, code extensibility, project delivery cycle, business migration cost, system control, long-term iteration efficiency, and implementation risk.
The ratings that survive in the extracted text are "long" and "high", each appearing twice; the text does not preserve which option and criterion each rating belongs to.

Unified OLAP engine POC validation: selecting and modifying Doris
Core ideas
Reuse the community's MPP-based flexible query capability, MySQL interface, incremental update, and related capabilities.
Let optimizations such as storage-layer pre-aggregation and inverted indexes coexist ("multi-modal coexistence").
Keep general-purpose compute performance while aligning special-scenario capabilities with Kylin and Druid.

Storage-layer changes: support bulk import of the global dictionary, support bitmap-based inverted indexes, and refactor the read interface to support variable-length types.
Query-layer changes: support the global dictionary build process, materialized view rewriting, and the bitmap deduplication algorithm.

Diagram components: Doris Frontend (In-memory Metadata, Query Planner, Query Coordinator), Doris Backend (Query Executor, Local Storage), Spark Load (batch build process), HDFS, MySQL Tools (client).
Diagram legend: SQL, bulk import, RPC communication, and data transfer for the arrows; modified modules, existing modules, new modules, and external components for the boxes.
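The pairing of a global dictionary with bitmap deduplication can be illustrated with a short sketch. It assumes that the purpose of the dictionary is to map arbitrary values (for example string user IDs) to stable dense integers that can serve as bit positions; the class, the helper, and the sample IDs are hypothetical, and a Python integer stands in for a real compressed bitmap.

```python
class GlobalDictionary:
    """Assigns each distinct value a stable, dense integer id.

    'Global' here means one dictionary shared by every load batch,
    so the same value always maps to the same bit position.
    """

    def __init__(self):
        self._ids = {}

    def encode(self, value):
        if value not in self._ids:
            self._ids[value] = len(self._ids)
        return self._ids[value]

    def encode_batch(self, values):
        return [self.encode(v) for v in values]

def to_bitmap(ids):
    """Build a bitmap (as a Python int) with one bit set per id."""
    bitmap = 0
    for i in ids:
        bitmap |= 1 << i
    return bitmap

dictionary = GlobalDictionary()
day1 = to_bitmap(dictionary.encode_batch(["u-ann", "u-bob"]))
day2 = to_bitmap(dictionary.encode_batch(["u-bob", "u-cat"]))

# Exact distinct users across both days: OR the bitmaps, count set bits.
distinct_users = bin(day1 | day2).count("1")
```

Because both batches share one dictionary, "u-bob" lands on the same bit in both bitmaps and is counted once after the OR.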

Unified OLAP engine POC validation: covering existing scenarios
Capabilities and scenarios by system, as grouped on the slide:
Kylin: single-table aggregation precomputation, multi-table aggregation precomputation, high-cardinality exact deduplication.
Druid: time-series data analysis, multi-dimensional filter queries, flexible complex queries.
Doris: primary-key updates and aggregation, near-real-time analysis, takeout user behavior analysis.
Other systems (SparkSQL, ES, SnappyData, Spark+Alluxio): hotel interactive reports, traffic funnel analysis.
Target: Doris covers these existing scenarios, following the same three core ideas as the POC.

Practice of the Doris-based unified OLAP engine
Software architecture optimization: close cooperation with the open-source community on engine kernel optimization, combined with internal platform and product building to improve ease of use.
  Modeling optimization: intelligent materialization based on historical queries, and table model optimization suggestions.
  SQL optimization: CBO query optimizer.
  Execution layer: vectorized execution, Pipeline execution framework, Colocate Join optimization.
  Storage layer: Rollup, Bitmap orthogonal bucketing, inverted index.
  Toolchain: regression test framework, performance Profile, automatic detection and governance of large queries, cluster health monitoring.
New hardware (research and exploration stage):
  Compute: GPU-accelerated query executor. Most representative GPU database systems today are academic projects. Using GPUs to accelerate queries mainly faces challenges in data transfer (limited PCIe bandwidth), GPU memory management, and GPU implementations of operators.
  IO: multi-tier storage built on NVM and SSD to accelerate access to hot data.

OLAP engine optimization example: Colocate Join
Problem: large-table join queries in takeout business analysis timed out.
Cause: Shuffle Join has a high data redistribution cost, made up of hash computation, serialization and deserialization, and RPC overhead.
Solution: Colocated Join, in which multiple tables are stored in advance with the same distribution on the join column.
Baseline shown on the slide: the Shuffle Join-based implementation.
Result: Join performance improved by 3x on average, which meets the needs of takeout merchant reports.
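The difference between the two join strategies can be sketched in a few lines. This is a simplified model, not Doris code: nodes are dictionary entries, `zlib.crc32` stands in for whatever hash the engine uses, and all names and sample rows are assumptions. The point it illustrates is that when both tables are placed by the same hash of the join key, every matching pair already sits on the same node, so nothing has to be redistributed at query time.

```python
import zlib
from collections import defaultdict

NUM_NODES = 3  # hypothetical cluster size

def node_for(key):
    """Stable hash placement of a join key onto a node."""
    return zlib.crc32(key.encode()) % NUM_NODES

def place_by_key(rows):
    """Colocated layout: rows go to the node chosen by their join key."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[0])].append(row)
    return nodes

def place_round_robin(rows):
    """Layout that ignores the join key, as a shuffle join would find it."""
    nodes = defaultdict(list)
    for i, row in enumerate(rows):
        nodes[i % NUM_NODES].append(row)
    return nodes

def rows_to_move(placement):
    """Rows a shuffle would have to send to another node before joining."""
    return sum(1 for node, rows in placement.items()
               for row in rows if node_for(row[0]) != node)

def local_join(facts, dims):
    """Join node by node with no data exchange between nodes."""
    out = []
    for node in range(NUM_NODES):
        dim_by_key = {d[0]: d[1] for d in dims.get(node, [])}
        for merchant, amount in facts.get(node, []):
            if merchant in dim_by_key:
                out.append((merchant, amount, dim_by_key[merchant]))
    return out

facts = [("m1", 10), ("m2", 20), ("m3", 30), ("m1", 5)]   # (merchant_id, amount)
dims = [("m1", "cellA"), ("m2", "cellB"), ("m3", "cellA")]  # (merchant_id, cell)

colocated_facts, colocated_dims = place_by_key(facts), place_by_key(dims)
joined = sorted(local_join(colocated_facts, colocated_dims))
```

`rows_to_move(colocated_facts)` is zero by construction, while `rows_to_move(place_round_robin(facts))` counts the rows a shuffle would hash, serialize, and send over RPC, which are the three costs the slide names.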

OLAP engine optimization example: Bitmap exact deduplication
The slide shows the SQL for this case. For exact deduplication metrics over data at the ten-billion-row scale with cardinality at the hundred-million scale, performance improved by 45x on average, and single-table queries finish within 10 seconds.
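One way exact deduplication at this cardinality can be parallelized is the Bitmap orthogonal bucketing that the talk lists among its storage-layer optimizations. The sketch below shows the general idea under stated assumptions: IDs are already dense integers, buckets are chosen by `id % NUM_BUCKETS`, and a Python integer stands in for a compressed bitmap. The names and bucket count are hypothetical and this is not a description of the Doris implementation.

```python
NUM_BUCKETS = 4  # hypothetical number of buckets

def bucket_of(user_id):
    """Each id belongs to exactly one bucket, so buckets never overlap."""
    return user_id % NUM_BUCKETS

def build_bucket_bitmaps(user_ids):
    """One bitmap per bucket; duplicate ids set the same bit again."""
    bitmaps = [0] * NUM_BUCKETS
    for uid in user_ids:
        bitmaps[bucket_of(uid)] |= 1 << (uid // NUM_BUCKETS)
    return bitmaps

def popcount(bitmap):
    return bin(bitmap).count("1")

def exact_distinct(bitmaps):
    """Buckets are disjoint, so per-bucket counts can simply be added."""
    return sum(popcount(b) for b in bitmaps)

events = [7, 3, 7, 12, 3, 8, 15, 12]  # user ids with repeats
distinct = exact_distinct(build_bucket_bitmaps(events))
```

Because no ID can appear in two buckets, each bucket can be counted independently and only small integers, not whole bitmaps, need to be combined at the end.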

OLAP engine optimization example: Spark on Doris
Supports reading and writing across clusters, which avoids duplicated computation logic and reduces compute and storage resources.
Complex data processing logic runs on Spark, which greatly reduces the compute pressure on Doris.
The interface exposed is SQL, so existing Doris2Doris data processing jobs can be migrated seamlessly.

OLAP tooling example: large query monitoring and governance
Collect key cluster information through instrumentation.
Aggregate by query instance, table, and database.
Monitor and alert on abnormal queries.
Connect the monitoring and analysis pipeline for large queries.
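The aggregation and alerting steps listed above might look roughly like the following. This is a minimal sketch over in-memory records: the `QueryRecord` fields, the two thresholds, and the database and table names are all hypothetical, chosen only to show aggregation at table and database granularity plus a simple rule for flagging large queries.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class QueryRecord:
    query_id: str
    db: str
    table: str
    scan_rows: int
    latency_ms: int

# Hypothetical thresholds for what counts as a large query.
SCAN_ROWS_LIMIT = 1_000_000_000
LATENCY_LIMIT_MS = 10_000

def find_large_queries(records):
    """Flag queries that exceed either threshold."""
    return [r for r in records
            if r.scan_rows > SCAN_ROWS_LIMIT or r.latency_ms > LATENCY_LIMIT_MS]

def aggregate(records, key):
    """Roll query records up to the granularity chosen by key(record)."""
    stats = defaultdict(lambda: {"queries": 0, "scan_rows": 0, "max_latency_ms": 0})
    for r in records:
        s = stats[key(r)]
        s["queries"] += 1
        s["scan_rows"] += r.scan_rows
        s["max_latency_ms"] = max(s["max_latency_ms"], r.latency_ms)
    return dict(stats)

records = [
    QueryRecord("q1", "waimai", "orders", 2_000_000_000, 4_000),
    QueryRecord("q2", "waimai", "orders", 50_000, 120),
    QueryRecord("q3", "hotel", "bookings", 900_000, 15_000),
]

by_table = aggregate(records, key=lambda r: (r.db, r.table))
by_db = aggregate(records, key=lambda r: r.db)
alerts = [r.query_id for r in find_large_queries(records)]
```

Keeping the raw records alongside the aggregates is what lets an alert on a database or table be traced back to the individual query that caused it.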

OLAP tooling example: visual analysis of query performance

3. Current challenges and future plans

Current problems and challenges
Resource efficiency: the coupled storage-compute architecture is challenged in scenarios that need resources to scale.
  Business peaks and troughs: holiday promotions and differences in how often users run analysis at different times of day produce pronounced peaks and troughs.
  Compute and storage scaling: when compute is insufficient and has to be expanded, storage is expanded along with it, so disk utilization is low.
Read and write stability: as usage moves from offline analysis to supporting online decisions, business requirements for availability keep rising.
  FE scalability: metadata grows with cluster size, the FE is a single-point bottleneck, and failure recovery time has an impact.
  Isolation under mixed workloads: the same analytics system has to support data import and processing as well as high-concurrency, low-latency queries.
Query performance: usage is expanding from mostly fixed report queries toward ad hoc queries.
  Complex scenarios: these include traffic analysis with exact deduplication metrics at the hundred-billion scale, and business analysis that joins several large tables at query time.
  Flexible use: support self-service drag-and-drop analysis of metrics and dimensions.

Future plans
1. Storage-compute separation architecture
2. Elastic scaling
3. CBO query optimizer
4. Pipeline execution framework
5. Intelligent materialized view construction

More community news and best practices
Doris Summit official site: doris-
Doris Summit replay: https:/
Doris official site: doris.apache.org
Apache Doris GitHub:
Doris official platform:
