ColumnStore Product Testing and Technical Support (40 pages).pdf


2015, MariaDB Corp.
李玉衡

ColumnStore Product Testing and Technical Support
MariaDB ColumnStore Product Training

Content
- Analytics Introduction
- MariaDB Solution for Big Data Analytics
- MariaDB ColumnStore Deep Dive
- Use Cases and Differentiations
- Cassandra Compare
- Sizing and Pricing
- Target Audience
- Message

Type of Analytics
Diagram labels: 1, 2, 3, 4; Traditional OLAP; Big Data Analytics.

MariaDB Solution for Big Data Analytics
High performance data management solution for big data analytics.
- Data sources: Social Media, Transactional, Operational, Sensors, Biometrics, Mobile
- Data Collection: ETL Tools
- Data Processing: MariaDB MaxScale in front of MariaDB ColumnStore (UM and PM nodes, Node 1 to Node N)
- Analytics Insight: Connectors, SPARK Integration etc.
- Descriptive Analytics: What is happening?
- Diagnostic Analytics: Why did it happen?
- Predictive Analytics: What is likely to happen?
- Prescriptive Analytics: What should I do about it?

MariaDB ColumnStore Deep Dive

MariaDB ColumnStore Architecture
- User Module: Processes SQL Requests
- Performance Module: Multi Threaded Distributed Processing Engine
- Diagram: Clients and User Connections reach the User Modules (MariaDB SQL Front End, Distributed Query Engine), which drive Performance Module 1 to Performance Module N on top of Columnar Distributed Data Storage (Local Disks, SAN, EBS, GlusterFS, HDFS).

MariaDB ColumnStore 1.0
- Performance: Columnar storage, multi-threaded and massively parallel distributed execution engine
- High Availability: Built-in redundancy and high availability
- Scale: Linear scalability
- Analytics:
  - In-database analytics with complex and cross-engine JOINs
  - Windowing functions and UDFs
  - Out-of-box BI tools connectivity, analytics integration with R
- Ease of Use:
  - ANSI SQL compatible
  - ACID compliant
  - No indexes, no materialized views
  - No manual partitioning
- Data Ingestion:
  - High speed parallel data load and extract
  - Create Table as Select, Like: locally, cross database joins, or over ODBC
- Security: SSL support, Audit Plugin, Authentication Plugin, Role Based Access
- Deployment: On premise, AWS

Client Access
- ODBC/JDBC
- MariaDB/MySQL Connectors
- BI tools

Query Processing - UM
- Query parsed by mysqld on UM node
- Parsed query handed over to ExeMgr on UM node
- ExeMgr breaks down the query into primitive operations

Query Processing - UM
- SQL operations are translated into thousands of primitives
- Parallel/distributed 2D partitioned data access
- Parallel/distributed joins (inner, outer)
- Parallel/distributed sub-queries (FROM, WHERE, SELECT)
- Diagram: SQL enters the User Module, column primitives go to the Performance Modules, and primitives' intermediate results return to the User Module.

Query Processing - PM
- Primitives processed on PM
- One thread working on a range of rows
  - Typically 1/2 million rows, stored in a few hundred blocks of data
  - Execute all column operations required (restriction and projection)
  - Execute any group by/aggregation against local data
  - Return results to ExeMgr process in User Module
- Each primitive executes in a fraction of a second
- Primitives are run in parallel and fully distributed

Query Processing - UM+PM
1. A request comes in through the front-end interface. MariaDB performs a table operation for all tables needed to fulfill the request and obtains the initial query execution plan from MariaDB Server.
2. The storage engine interface converts the MariaDB table objects to MariaDB ColumnStore objects. These objects are then sent to a User Module.
3. The User Module converts the MariaDB execution plan and optimizes these objects into a MariaDB ColumnStore execution plan. The User Module determines the steps needed to run the query and when they can run.
4. The User Module consults the Extent Map for the locations of the data needed to satisfy the query and performs extent elimination based on the information contained within the Extent Map.
5. The User Module sends commands to one or more Performance Modules to perform block I/O operations.
6. The Performance Module(s) carry out predicate filtering, join processing and initial aggregation of data, and send data back to the User Module for final result set processing.
7. The User Module performs final result set aggregation and composes the final result set for the query.
8. The User Module returns the result set for delivery to the user.

Row-Oriented vs Column-Oriented
- Row-oriented: rows stored sequentially in a file
- Column-oriented: each column is stored in a separate file. Each column for a given row is at the same offset.

Row-oriented layout:
Key  Fname     Lname  State  Zip    Phone           Age  Sales
1    Bugs      Bunny  NJ     11217  (123) 938-3235  34   …
2    Yosemite  Sam    CT     95389  (234) 375-6572  52   …
3    Daffy     Duck   IA     10013  (345) 227-1810  35   …
4    Elmer     Fudd   CT     04578  (456) 882-7323  43   …
5    Witch     Hazel  CT     01970  (567) 744-0991  57   …

Column-oriented layout:
Key:   1, 2, 3, 4, 5
Fname: Bugs, Yosemite, Daffy, Elmer, Witch
Lname: Bunny, Sam, Duck, Fudd, Hazel
State: NJ, CT, IA, CT, CT
Zip:   11217, 95389, 10013, 04578, 01970
Phone: (123) 938-3235, (234) 375-6572, (345) 227-1810, (456) 882-7323, (567) 744-0991
Age:   34, 52, 35, 43, 57
Sales: 250 …

Data Storage
- Vertical partitioning by column
  - Each column in its own column file
  - Only do I/O for columns requested
- Horizontal partitioning by range of rows
  - Logical grouping of 8 million rows of each column file
  - In-memory mapping of extent to physical layer
- Logical layer: Table, Column 1 to Column N, Extent 1 to Extent N (8 MB to 64 MB, 8 million rows)
- Physical layer: Server, DB Root, Segment File 1 to Segment File N (Extent), Blocks (8 KB)

Data Storage - Extents and PMs
- Diagram: Extents 1 to 8 distributed across PM 1 and PM 2, then redistributed across PM 1 to PM 4.
- Extent Map: in-memory metadata holding an extent's min and max value for a column, the extent's physical block offset, and the PM on which the extent resides.

Data Storage - Local Disks
- Each PM node stores data on local disk
- No PM node can access the data on another PM node
- Shared nothing
- No data redundancy

Data Storage - SAN
- Each PM node is attached to a set of volumes on the SAN, called DBRoots
- Upon failure of a PM node, another PM attaches to the failed PM's DBRoots
- Shared nothing during running state
- No data redundancy

Data Storage - GlusterFS
- Distributed file system
- Software based storage system
- GlusterFS runs on every PM node
- Creates a distributed file system with each PM node's local disks and network interface across PM nodes
- Data redundancy across multiple nodes
- Automatic data failover
- Data availability during failover and failback

Data Storage - EBS
- Dynamic scaling to handle variable workloads
- Data layer high availability with Elastic Block Store (EBS)
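The extent elimination in step 4 of the UM+PM flow relies on the per-extent min and max values kept in the Extent Map. A minimal sketch of that idea follows; it is not ColumnStore's implementation, and the `Extent` class, its field names and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    """One extent-map entry for a single column (illustrative names only)."""
    extent_id: int
    pm: str          # Performance Module on which the extent resides
    min_value: int   # smallest column value stored in the extent
    max_value: int   # largest column value stored in the extent


def extents_to_scan(extent_map, low, high):
    """Return the extents whose [min, max] range overlaps [low, high].

    Extents that cannot contain a matching value are skipped without any I/O.
    """
    return [e for e in extent_map
            if e.max_value >= low and e.min_value <= high]


# Hypothetical extent map for an "age" column spread over two PMs.
extent_map = [
    Extent(1, "PM1", 18, 29),
    Extent(2, "PM2", 30, 41),
    Extent(3, "PM1", 42, 57),
    Extent(4, "PM2", 58, 90),
]

# A predicate such as "age BETWEEN 35 AND 45" only needs extents 2 and 3.
selected = extents_to_scan(extent_map, 35, 45)
print([(e.extent_id, e.pm) for e in selected])
```

Because only the overlapping extents are returned, the User Module in this sketch would send block I/O requests to the PMs holding extents 2 and 3 and never touch the other two.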

Data Ingestion
- Bulk data load
  - cpimport: CSV and binary
  - LOAD DATA INFILE: CSV
  - Apache Sqoop integration: integration with cpimport and SQL interface
- Future release
  - Data streaming from MariaDB/MySQL database to MariaDB ColumnStore cluster via Kafka Avro data records
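For the LOAD DATA INFILE route listed above, a loader script typically just assembles a standard SQL statement for a CSV file. The sketch below shows one way to do that; the helper name, the file path and the table name are hypothetical, and the table name is assumed to come from trusted configuration rather than user input.

```python
def build_load_data_sql(csv_path, table, delimiter=","):
    """Build a LOAD DATA INFILE statement for a delimited text file."""

    def quote(text):
        # Escape backslashes and single quotes so the SQL string literal stays valid.
        return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"

    return (
        f"LOAD DATA INFILE {quote(csv_path)} "
        f"INTO TABLE {table} "
        f"FIELDS TERMINATED BY {quote(delimiter)}"
    )


# Hypothetical path and table name.
print(build_load_data_sql("/tmp/flights.csv", "flight_metrics"))
```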

Data Ingestion - cpimport
- Fastest way to load data
  - Load data from CSV file
  - Load data from standard input
  - Load data from binary source file
- Multiple tables can be loaded in parallel by launching multiple jobs
- Read queries continue without being blocked
- Successful cpimport is auto-committed
- In case of errors, the entire load is rolled back

Data Ingestion - LOAD DATA INFILE
- Traditional way of importing data into any MariaDB storage engine table
- Up to 2 times slower than cpimport for large imports
- On either success or error, the operation can be rolled back

High Availability
- HA at UM node
  - When one UM node goes down, another UM node takes over
- HA at PM node
  - SAN/AWS EBS: when a PM node goes down, the data volumes attached to the failed PM node get attached to another PM
  - Local Disks: if a PM node goes down, the data on its disks is not available, though queries continue on the remaining data set
- HA at Data Storage
  - AWS EBS
  - GlusterFS: multiple copies of each data block across storage. If a disk on a PM node fails, another PM node will have access to the copy of the data
  - HDFS: multiple copies of each data block across storage. If a disk on a PM node fails, another PM node will have access to the copy of the data

Multi-UM Default configuration
- Applications connect to a single UM
- Automatic round-robin distribution/scale-out of queries (based on connection id) across all UMs
- The two UMs' schema and non-ColumnStore tables are to be kept in sync with MySQL replication. Set up during post config or use enableMysqlReplication in the mcs console
- Diagram: UM1 and UM2, each with Interface and ExeMgr; connection-id-based round-robin between them.
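The connection-id-based round-robin across UMs can be pictured with a few lines of Python. This is only a sketch: the slides do not say how a connection id is mapped to a UM, so the modulo rule and the `pick_um` helper below are assumptions made for illustration.

```python
def pick_um(connection_id, um_nodes):
    """Map a connection id onto one of the UM nodes in round-robin order.

    Assumption: consecutive connection ids alternate between UMs via modulo.
    """
    return um_nodes[connection_id % len(um_nodes)]


um_nodes = ["UM1", "UM2"]
for connection_id in range(4):
    print(connection_id, pick_um(connection_id, um_nodes))
```

With two UMs, successive connection ids alternate between UM1 and UM2, which is the behavior the diagram label suggests.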

Multi-UM Multi-Front-end configuration
- Applications can connect to multiple UMs
- From each UM, automatic round-robin distribution/scale-out of queries (based on connection id) across all UMs
- The two UMs' schema and non-InfiniDB tables are to be kept in sync with MySQL replication. Set up during post config or use enableMysqlReplication in the calPont console
- Diagram: UM1 and UM2, each with Interface and ExeMgr; connection-id-based round-robin between them.

MariaDB ColumnStore on Hadoop
- Native Sqoop integration
- Runs on existing Apache Hadoop hardware
- SQL access to Apache Hadoop data
- libhdfs integration
- Diagram: on top of the Hadoop Distributed File System, Map Reduce, HBase and Pig/Hive serve batch processing, while MariaDB ColumnStore serves high performance analytics.

MariaDB ColumnStore on AWS
- Automated cluster installation on AWS
- Dynamic scaling to handle variable workloads
- Data layer high availability with Elastic Block Store (EBS)

Use Cases

Use Cases
- SQL PERFORMANCE AT SCALE: Put massive data sets to work with real-time analytics for your growing business
- NEW INSIGHTS: Uncover new insights and business opportunities with advanced big data analytics
- HIGH PERFORMANCE ANALYTICS for HADOOP: Democratizes access to data in Hadoop to a larger user base
- UNIFIED SIMPLICITY: Simplify and reduce operational costs by uniting analytical and transactional workloads

Differentiators
- SCALE
  - Massively parallel architecture designed for big data, scaling to process petabytes of data
  - Read performance scales linearly with data growth
- SPEED
  - Exceptional performance
  - Real-time response to analytics queries
- SECURITY and RELIABILITY
  - Data with encryption for data in motion, role based access and audit features of MariaDB Enterprise
  - Built-in high availability at access and data layers
- SIMPLICITY with POWER
  - Simplified management and maintenance, easy installation and scaling
  - Same interface as MariaDB and MySQL, attaches to a wide range of BI tools

Use Case: Scaling Big Data Analytics
Business Challenge
- An organization is generating a large amount of operational data
- Multiple terabytes of historical data
- With growth in business and in operational data
  - Analytics query performance degrades
  - Impractical to do analytics
MariaDB ColumnStore Solution
- Put past data into MariaDB ColumnStore
- As data grows
  - Perform analytics without performance degradation
  - Linear scalability with data growth
- Diagram steps 1, 2, 3: MariaDB ColumnStore 1.0, add new node(s), add new node(s)
Benefits
- Harvest new value from large historical datasets by deriving new insights
- Support growth in your business, while continuing to deliver high service levels for data analytics
Rows/DataSize Scope
- Rows: 1; 100; 10,000; 1,000,000; 100,000,000; 10,000,000,000; 100,000,000,000
- Data size: 10-100 GB, 100-1000 GB, 1-10 TB, 10-100 TB, ... PB
- Products on the scale: MariaDB Enterprise OLTP, MariaDB Enterprise ColumnStore

Use Case: Discover Insight
Business Challenge
- As planes go through flights, various parts and engines of the planes need to be maintained for safety of the flights
- Proactively project parts replacement, maintenance and airplane retirement
- Too time-consuming to perform analytics with the current toolset
- Most of the data analysts have a SQL background
- The company plans to sell this solution as a service to commercial airlines
MariaDB ColumnStore Solution
- Micro-batch upload of real-time flight performance into MariaDB ColumnStore
- SQL analytics on the real-time data and historically collected flight parameter data
- Familiar SQL interface
- Complex join, aggregation and windowing functions
- High speed real-time performance
- Diagram: real-time in-flight performance data and historical data feed analytics.
Benefits
- Uncover new business opportunity with data exploration and analytics on big data
- Timely maintenance forecast, part replacement, flight retirement

Use Case: Accelerated Analytics with Hadoop
Business Challenge
- Large amount of data in Hadoop
- Hadoop is suitable for batch processing
  - Transforms via Map-Reduce programming
- Real-time analytics on Hadoop
  - Speed cannot meet business requirements with the Hadoop tool set
  - Shortage of Hadoop skills for data scientists/BAs
  - SQL interfaces on Hadoop tools are not mature
MariaDB ColumnStore Solution
- MariaDB ColumnStore OLAP can run on premise, on cloud or on a Hadoop cluster
- Ingest data from Hadoop
- Mature ANSI-SQL compliance
- Stellar performance: 70 to 80 times faster than SQL-on-Hadoop counterparts Hive, HBase and Impala
- Mature interfaces
- Diagram: on top of the Hadoop Distributed File System, Map Reduce, HBase and Pig/Hive serve batch processing, while MariaDB ColumnStore serves high performance analytics.
Benefits
- Familiar SQL interfaces democratize access to big data for a larger user base
- Attach a wide range of BI tools via MariaDB/MySQL connectors

Use Case: Simplifying Big Data Management
Business Challenge
- Complexity of data management increases as data volume grows
  - Tedious to keep up with indexes and partitioning as data grows
  - Scaling-out or scaling-up management
  - Moving operational data to a big data analytics platform in real time
MariaDB ColumnStore Solution
- Liberation from index management
- Automatic partitioning
- Easy to grow
- Micro-batch bulk load for real-time data flow
- Diagram: multiple sources feed cpimport, which loads the UM node and PM nodes.
Benefits
- Improved DBA productivity
- Reduced operational complexity
- Getting most value out of big data while minimizing DBA

Developer Friendliness
Developer Challenge
- Focus on application development rather than tuning queries and/or application as data grows
- Have flexibility to work with varied tools and languages: SQL, BI tools, Python, Java, C++, Go
- Easily deploy and test analytics applications
MariaDB ColumnStore Solution
MariaDB ColumnStore empowers developers with:
- No need to tune queries and applications as data grows
- Mature SQL interfaces
- Python, R, Java and C++ connector
- BI tools access through ODBC/JDBC and MariaDB connectors
- Cloud consumption options for AWS
- Easy installation
Benefits
- Improve developer productivity
- Leverage existing investments
- Minimize Opex

MariaDB ColumnStore OLAP Security
- Built upon MariaDB Server 10.1, a secure open source database
- Keep valuable data secure, while getting the most value out of your data assets
- Reduce risks and costs associated with security breaches
- Control Access
  - Unauthorized access
  - SQL injection attacks
  - DDoS attacks
- Make The Prize Unattractive With Encryption
  - Native mode encryption protects data at rest
  - SSL encryption protects data in motion
- Forensics
  - Audit log for compliance
- Community Improves Security

MariaDB ColumnStore High Availability
Business Challenge
- A financial organization has a mandate to detect fraudulent activities
- 2015 US total credit-card fraud cost $600 billion
- Each fraud incident average cost $1900
- Average 13 frauds per minute
- Any downtime in their fraud detection system is costly: $26,000 per minute!
MariaDB ColumnStore Solution
- MariaDB ColumnStore's distributed MPP architecture has built-in high availability
- Active/standby data access nodes (UM)
- Data redundancy across distributed nodes (PM)
- Diagram: Fraud Detection on top of Columnar Distributed Data Storage.
Benefits
- Keep business running
- Minimize costs associated with downtime

MariaDB ColumnStore: Performance Comparison
Test            InnoDB     ColumnStore  Delta
cpimport        n/a        27.70        n/a
LDI             1,231.07   68.27        1,803%
InsertSelect    1,532.29   94.10        1,628%
DBT3 (disk)     3,881.40   21.07        18,421%
DBT3 (cached)   3,637.49   14.74        24,677%

Performance for 1 GB DBT3 database (in seconds)
Tested on Amazon AWS
- Instance type: m4.2xlarge
- Disk: SSD 200 GB without encryption, internal
- Source data: on a 200 GB EBS volume, attached
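The Delta figures in the performance comparison appear to be the InnoDB time expressed as a percentage of the ColumnStore time, truncated to a whole number. The slides do not state the formula, so the following is an inferred check, with the timings copied from the figures above.

```python
import math

# (test, InnoDB seconds, ColumnStore seconds) taken from the comparison figures.
timings = [
    ("LDI", 1231.07, 68.27),
    ("InsertSelect", 1532.29, 94.10),
    ("DBT3 (disk)", 3881.40, 21.07),
    ("DBT3 (cached)", 3637.49, 14.74),
]


def delta_percent(innodb_seconds, columnstore_seconds):
    """InnoDB time as a percentage of ColumnStore time, truncated (assumed formula)."""
    return math.floor(innodb_seconds / columnstore_seconds * 100)


for name, innodb, columnstore in timings:
    print(f"{name}: {delta_percent(innodb, columnstore):,}%")
```

Read this way, a Delta of 1,803% for LDI means the InnoDB load took roughly 18 times as long as the ColumnStore load.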
