《解锁 SQL 查询性能 深入理解关系数据库的内部机制与优化策略-李小燕.pdf》由会员分享,可在线阅读,更多相关《解锁 SQL 查询性能 深入理解关系数据库的内部机制与优化策略-李小燕.pdf(21页珍藏版)》请在三个皮匠报告上搜索。
1、解锁SQL查询性能TheTradeDesk/Lisa LiLisa LiEver worked in IBM,Intel:storage,distributed systemJoined in TheTradeDesk in 2021,focus on Retail related business.Self introductionProgrammatic advertisingBuy SideSell SidePublisherPublisher(NY Times,Spotify,Coupang)Supply-Side PlatformSupply-Side Platform(Google
2、,Magnite)UserUserAdvertiserAdvertiser(BMW,Pizza Hut,P&G)Third-party Data Providers(Oracle,LiveRamp)Demand-SideDemand-Side PlatformPlatformAgencyAgency(WPP,Publicis)$First-party DataTheTradeDeskAgendaWhy investigate the topicUnderstanding Indexes and execution plansOptimizing with examplesItem-level
3、events for targeting and measurement Item-level events for targeting and measurementTables:Merchant MerchantProduct AdvertiserProductProductIdSourceProductIdMerchantIdProductNameShampoo11912MX986752ConditionerProblemUse SQL server to manage metadataData increased 10 x timesTimeout may hap
4、penUnderstand how query works and OptimizeProductCountBefore1,901,935After24,686,272IndexesWhat are indexes?Contain sorted data of the columns in the index definitionOn-disk structure associated with a table or viewContains keys built from one or more columns in a table or viewWhy use indexes?Speed
5、up the retrieval of dataReduce disk IOsImpactData updates and deletesB-Tree Data StructureSelf-balanced tree data structure that is a generalized form of the Binary Search Tree(BST).Each node contains at most a fixed number of keysAll leaf nodes are at the same level.Time complexity:O(logn),wherenis
6、 the number of keys stored in the treeFrom https:/www.geeksforgeeks.org/introduction-of-b-tree/IndexesClustered Index Sort data rows in the table itself Only one clustered index is allowed in a table A table without a clustered index is called HEAPNon-Clustered IndexA structure separate from the dat
7、a rowsCould be multiple index keysEach key points back to the data rowSingle IndexSingle-Column IndexesComposite IndexOn two or more columns of a tableClustered&non-clustered IndexNon-clusteredClusteredSingle/Composite IndexesSingleCompositeExecution plansThe sequence of steps that the database exec
8、utes a queryExampleTable:Product(ProductId,MerchantId,SourceProductId)Basic MetricsOperationNumber of Rows ReadPredicateObjectOutput ListExecution Plan cont.Index seekThe Query Optimizer uses the b-tree(non-left level)solely for navigation to the leaf level with the actual data.Index scanSQL Server
9、reads all rows in a table,and then returns only those rows that satisfy the search criteria.Key lookupThe key lookup operator occurs when the query optimizer performs an index seek on a non-clustered index,and that index does not have all the columns needed to fulfill the result set so it has to go
10、back to the table to retrieve those additional columns.Performance comparisonIndex seek Index scan key lookupUse Indexes to accelerateWith/without indexesselect*from ecommerce.MerchantProduct where MerchantId=99 and SourceProductId=sku_32 and IsDeleted=0;Whether indexes are used correctlyHow to use
11、indexesSourceProductId varchar(128)not null,Same typeDifferent typesIndexed view Need to scan all the records to aggregateselect mp.MerchantId,ap.AdvertiserId,count_big(*)ProductCountfrom ecommerce.AdvertiserProduct ap inner join ecommerce.MerchantProduct mp ON ap.ProductId=mp.ProductIdWHERE mp.IsDe
12、leted=0group by mp.MerchantId,ap.AdvertiserIdIndexed View cont.Indexed views are materialized views that store data like a table.Used to store some aggregated data.create view ecommerce.vw_AdvertiserMerchantProductCountwith schemabindingasselect mp.MerchantId,ap.AdvertiserId,count_big(*)ProductCount
13、from ecommerce.AdvertiserProduct ap inner join ecommerce.MerchantProduct mp ON ap.ProductId=mp.ProductIdWHERE mp.IsDeleted=0group by mp.MerchantId,ap.AdvertiserIdgocreate unique clustered index cix_vw_AdvertiserMerchantProductCount on ecommerce.vw_AdvertiserMerchantProductCount(MerchantId,AdvertiserId)goSummayCreate necessary indexes to improve query performanceCheck whether an index works as expectedUse indexed view to accelerate some calculationsTheTradeDesk