2-1 Tencent's Next-Generation Multidimensional Analysis Engine: HermesDB
Next Generation of Tencent OLAP Engine
Tencent TEG, LongYue

CONTENTS
  01 Background
  02 Storage: Various Columns and Indexes
  03 Computation: Integrated with Presto
  04 Benchmark and Applications

01 Background

MercsDB

Data:
  1. Thousands of columns, tens of billions of rows
  2. Index on arbitrary column(s)
  3. Real-time write
  4. Both row-oriented and column-oriented
  5. Different indexes

Performance:
  1. Second-level response on queries over 10 billion rows
  2. MPP: both ad-hoc query and real-time query
  3. Real-time write: 100 billion rows/day

MercsDB != ElasticSearch, Presto/Impala, ClickHouse

History

Versions: MercsDB, Hermes 1.0, Hermes 2.0, Hermes 3.0
Dates: 2013.1, 2016.12, 2019.6, 2021.12
Milestones (in slide order): Mass Data; Real-Time; BI: Use Pictures; Inverted Index; Basic OLAP; Log Analysis; Introduce Spark; Full OLAP; Ads; Introduce Presto; Vectorization; Focus on Performance

Current Status

  Clusters: 5k+ nodes
  Query: 10M/day
  Storage: total 100 PB, daily 1 PB
  Peak IO: 100M rows/s

02 Storage: Various Columns and Indexes

Basic Arch

[Architecture diagram. Components: Compute Engine (Worker, Query Execution Engine, DataReader); DataStore (LocalFS, HDFS, OZone, CEPH).]

Distributed vs Local:
  HA vs Low Latency
  Hot Data vs Cold Data
  Replica management
  Auto Disaster Tolerance

Isomeric Arch:
  Low Latency
  Loss Service
  LRUCache
  MMAP

Column-Oriented & Index

Q1: High QPS?
Q2: Second-level response for 10 billion rows?
Q3: Cost-friendly for mass data?

Columns:
  1. Retrieval: low latency
  2. Sorted: mass data
  3. Compressed: cost-friendly
  4. Nested: supports Parquet

Indexes:
  1. SparseIndex
  2. SkipListIndex
  3. InvertedIndex
  4. KeyIndex
  5. LBSIndex

Retrieval Column

Application:
  1. Low latency
  2. Medium data
  3. Simple query

Implementation:
  1. Storage-Time
  2. Dictionary Index

Index size / origin data = 40%

Sorted Column

Application:
  1. Mass data (much bigger than memory)
  2. No other accelerations

Implementation:
  1. Sorted
  2. Index on both data and offset

Improvement: 10x speedup

Compressed Column

Application:
  1. High cardinality
  2. Direct compression: low performance

Implementation:
  1. BitShuffle + LZ4

Compression rate: 50% up

Indexes

  FST KeyIndex
  Inverted Index
  LBS (KDB) Index

03 Computation: Integrated with Presto

MercsDB Dynamic Router

[Architecture diagram. Router: SQL Route; Interface: JDBC, HTTP, gRPC. Computation: Native Engine (Simple Query); Presto Engine (Full OLAP, MPP). Node1, Node2, Node3: each runs a Native Worker and a Presto Executor.]

MercsDB Native Engine

[Architecture diagram. MercsDB Server: Router (SQL Route; Interface: JDBC, HTTP, gRPC); Computation (Native Engine, Presto Engine). MercsDB Worker.]

Worker pipeline:
  1. Filter: terms t1, t2, t3
  2. Intersect: on the index
  3. Statistics: keys k1, k2
  4. Combine and bucket
  5. Aggregate over the column store (columns a, b, c):
       max(a): take the largest term
       sum(b): sum term * doc
       count(c): count all docs

Late Materialization

MercsDB Presto Engine

[Architecture diagram. MercsDB Server: Router (SQL Route; Interface: JDBC, HTTP, gRPC); Computation (Native Engine, Presto Engine). MercsDB Worker. Presto Workers: Stage n.]

Push down into MercsDB:

  SELECT * FROM (SELECT FROM A) JOIN (SELECT agg FROM B WHERE X) ON C = D

MercsAPI

Data transformation: MercsDB Data -> Presto Block
  long Values: result of Mercs API
  bool isNulls

Vectorized

  Java Vector API
  Incubator since JDK 16
  Tencent KONA JDK 17

Vectorization with Vector API

  Opt1: Unroll loop
  Opt2: Unify Vector Species
  Opt3: No boxing and unboxing
        No object creation
        No function call
  Others: Batch vectorization
          Sequential memory access

04 Benchmark and Applications

SSB Benchmark 1

[Chart: MercsDB vs Optimized MercsDB, at 60B rows and 600M rows.]

  Rows   Origin Size   MercsDB (with index)   ClickHouse (with sparse and primary-key index)
  600M   200 GB        89 GB                  60 GB
  6B     2 TB          882 GB                 593 GB

SSB Table: flat_lineorder

SSB Benchmark 2: QPS 1

[Chart: CK vs MercsDB, at 60B rows and 600M rows.]

SSB Benchmark 3: QPS 20

[Chart: CK vs MercsDB, at 60B rows and 600M rows.]

Log Retrieval in WeChat Pay

Background:
  1. Real-time write: 100 billion rows/day
  2. Retrieval both global and specific
  3. Mass data

Solutions:
  1. Write with TubeMQ
  2. Tokenize and index the data
  3. Separation of storage:
       I.  IndexFile: local disk
       II. RowData: HDFS

AB Test for Ads

Background:
  1. Second-level response for mass data
  2. Thousands of columns in a single query
  3. Arbitrary join

Solutions:
  1. Sorted by primary key (ad_id)
  2. Compression on metric columns, 40% of the origin data's storage
  3. Presto/Spark supported with cache

Thanks
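
Sketch: bit shuffling before compression

The Compressed Column slide names BitShuffle + LZ4 as the implementation for high-cardinality columns, where direct compression performs poorly. The sketch below illustrates only the bit-shuffle idea: the bits of a block of integers are transposed so that bit 0 of every value is stored together, then bit 1, and so on, which turns the unused high bits into long runs of zeros. It is not MercsDB code. The function names are hypothetical, the 32-bit width is an assumption, and zlib from the Python standard library stands in for LZ4 so the example has no external dependencies.

```python
import zlib


def bitshuffle(values, width=32):
    """Transpose bits: emit bit plane 0 of all values, then plane 1, and so on.

    Assumes every value is a non-negative integer below 2**width.
    """
    n = len(values)
    stride = (n + 7) // 8  # bytes needed to hold one bit per value
    out = bytearray()
    for bit in range(width):
        plane = 0
        for i, v in enumerate(values):
            plane |= ((v >> bit) & 1) << i
        out += plane.to_bytes(stride, "little")
    return bytes(out)


def bitunshuffle(data, n, width=32):
    """Inverse of bitshuffle for n values."""
    stride = (n + 7) // 8
    values = [0] * n
    for bit in range(width):
        chunk = data[bit * stride:(bit + 1) * stride]
        plane = int.from_bytes(chunk, "little")
        for i in range(n):
            values[i] |= ((plane >> i) & 1) << bit
    return values


# Hypothetical column block: 1000 distinct ids, so a dictionary would not help.
values = [100000 + i for i in range(1000)]

# zlib is a stand-in for LZ4 here (assumption made to stay in the stdlib).
packed = zlib.compress(bitshuffle(values))
restored = bitunshuffle(zlib.decompress(packed), len(values))
```

Comparing `len(packed)` with the size of the same values compressed without shuffling shows how much the transposition helps on a given data set; the result depends on the value distribution.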
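
Sketch: filter, intersect, then aggregate with late materialization

The MercsDB Native Engine slide describes a worker pipeline that filters on terms, intersects the matches on the index, and only then aggregates over the column store (max(a), sum(b), count(c)), labelled Late Materialization. The sketch below is one minimal way to picture that flow, not the engine's actual implementation: the in-memory `columns` and `inverted_index` structures, the function names, and the sample data are all hypothetical. Column values are read only for the document ids that survive the intersection.

```python
from collections import defaultdict

# Hypothetical column store: each column is a list indexed by doc id.
columns = {
    "k1": ["x", "x", "y", "y", "x", "y"],
    "a": [3, 9, 4, 7, 1, 8],
    "b": [10, 20, 30, 40, 50, 60],
    "c": [1, 1, 1, 1, 1, 1],
}

# Hypothetical inverted index: term -> sorted list of matching doc ids.
inverted_index = {
    "t1": [0, 1, 3, 4, 5],
    "t2": [1, 3, 4, 5],
    "t3": [0, 1, 3, 5],
}


def intersect(postings):
    """Intersect posting lists, starting from the shortest one."""
    postings = sorted(postings, key=len)
    result = set(postings[0])
    for p in postings[1:]:
        result &= set(p)
    return sorted(result)


def query(terms, group_col):
    """Compute max(a), sum(b), count(c) per group for docs matching all terms."""
    # Steps 1-2: resolve the filter purely on the index, without touching columns.
    doc_ids = intersect([inverted_index[t] for t in terms])

    # Steps 3-5: materialize column values late, only for the surviving doc ids.
    groups = defaultdict(lambda: {"max_a": None, "sum_b": 0, "count_c": 0})
    for d in doc_ids:
        g = groups[columns[group_col][d]]
        a = columns["a"][d]
        g["max_a"] = a if g["max_a"] is None else max(g["max_a"], a)
        g["sum_b"] += columns["b"][d]
        if columns["c"][d] is not None:
            g["count_c"] += 1
    return dict(groups)


result = query(["t1", "t2", "t3"], "k1")
```

With the sample data, only documents 1, 3 and 5 match all three terms, so the column reads are limited to those three rows.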