Advantages and Use Cases for Adding the CXL Interface to DPUs
Pavel (Pasha) Shamis, Sr. Principal Engineer
Kshitij Sudan, Director, Storage & Accelerator Segment Marketing
Arm Inc.

Enabling technologies for disaggregation and heterogeneity

[Figure: timeline placing disaggregation and heterogeneous-compute technologies on a Past/Today/Future axis with maturity stages Concept, Arch Exploration, and In Execution/Deployment. Storage disaggregation (NVMe-oF) is complete; accelerator disaggregation (GPUs, FPGAs, TPUs, Habana) is in execution/deployment; memory disaggregation (DRAM, SCM) is in architecture exploration; storage+memory convergence (SDIO) is at the concept stage. Heterogeneous-compute entries include FPGAs, SmartNICs, video acceleration, TPUs, Gen1 computational storage, multi-tenant GPUs, and CXL Type-1, Type-2, and Type-3 (memory) devices.]

Drivers for the CXL interface on DPUs
- Domain-specific accelerators are gaining prominence in the datacenter
- Line-rate processing of network traffic is critical for certain use cases
  - e.g., compression, de-dupe, encryption, streaming data processing
- Near-storage/near-memory processing technologies are advancing
  - Computational storage, in-memory processing
- All of these use cases need larger DPU memory at low cost
- The CXL interface addresses this need

Emerging datacenter storage & memory architecture

[Figure: Compute Servers 0..N, each with general-purpose compute, a server SoC with DRAM, direct-attached storage (flash), and an initiator DPU (NVMe-oF) carrying a CXL memory expansion used as an OS page cache. Over the datacenter network they reach a warm caching tier (a target DPU with a PCIe switch, NVMe flash, and CXL memory expansion), a storage/cold tier (a NIC SoC with DRAM and PCIe HBAs), and a disaggregated DRAM pool in which a CXL controller bridges a CXL interface to DDR interfaces.]
Sharing DPU-attached memory with the host
- Advantages
  - Lower pin count on the host
  - Lower cost/bit
  - Techniques for page/data management over far memory have been described by large cloud providers (see the sketch after this slide):
    - TMO: Transparent Memory Offloading in Datacenters, J. Weiner et al., ASPLOS 2022
    - Software-Defined Far Memory in Warehouse-Scale Computers, A. Lagar-Cavilla et al., ASPLOS 2019
    - First-generation Memory Disaggregation for Cloud Platforms, H. Li et al., arXiv 2022
- Concerns
  - Stranded-ness of DRAM directly attached to the DPU

[Figure: Compute Server 0, where the initiator DPU's (NVMe-oF) CXL memory expansion is shared with the host server SoC over CXL/PCIe, with a further CXL link out to the memory pool.]
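On Linux, host-visible CXL Type-3 memory such as a DPU-attached expander typically surfaces as a CPU-less NUMA node, so the host can place pages on it with ordinary NUMA APIs. Below is a minimal sketch using libnuma; the node number 1 is an assumption, and on a real system the CXL node would be discovered from the firmware-described topology.

```c
/* Sketch: explicit placement of pages on a CXL-attached memory node.
 * Assumes the expander appears as NUMA node 1 (platform-specific).
 * Build: cc cxl_place.c -lnuma */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    int cxl_node = 1;          /* assumption: CPU-less CXL expander node */
    size_t len = 64 << 20;     /* 64 MiB */

    /* Back this allocation with pages from the CXL node only. */
    void *buf = numa_alloc_onnode(len, cxl_node);
    if (buf == NULL) {
        fprintf(stderr, "allocation on node %d failed\n", cxl_node);
        return 1;
    }

    memset(buf, 0, len);       /* fault the pages in on the far node */
    printf("placed %zu bytes on node %d\n", len, cxl_node);

    numa_free(buf, len);
    return 0;
}
```

The cloud-provider papers cited above instead rely on kernel-driven tiering that demotes cold pages to such a node automatically; the sketch only shows the explicit-placement end of that spectrum.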
Reducing stranded-ness with memory pools
- Expands the DPU's ability to leverage pooled memory
- Resolves the stranded-ness challenge
- Future CXL specifications are discussing the potential to share memory at the memory pool
  - This raises security challenges

[Figure: Compute Server 0 as before, but with both the host and the initiator DPU linked over CXL to the disaggregated DRAM pool, where a CXL controller bridges the CXL interface to DDR interfaces.]
Appliance use case for DPUs with CXL
- Advantages
  - Builds on the DPU-as-target use case
  - Uses large CXL-connected memory capacity as a "conditioning tier"
  - Lower pin count for DPU memory
  - Lower DPU memory cost/bit
- Concerns
  - Increased BOM cost from the CXL interface and the CXL memory expansion card

[Figure: a target DPU with CXL memory expansion, fronting NVMe flash drives through a PCIe switch.]

DPU + NVDIMM Storage Appliance Study
- BlueField
  - 16-core SoC for off-chip in-memory processing
  - Direct access to memory and attached storage
  - Integrated into an InfiniBand adapter
  - Runs a full Linux/IB stack
- NVDIMM-N
  - Installed with a battery pack; fits into a DDR module slot
  - Appears to Linux as a distinct PMEM device type
  - Using Linux DAX, it appears as a file system but behaves as memory when files are mmap'd (see the sketch below)
  - Syncs files to the on-module battery-backed NVM on processor request or on emergency power loss, so data is persistent
- HPE Apollo 70 client
- OpenSHMEM programming model

Grodowitz, Megan, Pavel Shamis, and Steve Poole. "OpenSHMEM I/O Extensions for Fine-Grained Access to Persistent Memory Storage." Smoky Mountains Computational Sciences and Engineering Conference, Springer, Cham, 2020.
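The DAX behavior described above maps onto standard POSIX calls: a file on a DAX-mounted PMEM file system is mmap'd and then accessed with ordinary loads and stores, and msync() (or user-space cache flushes via a library such as libpmem) makes the stores durable on the module. A minimal sketch; the mount point /mnt/pmem0 is a hypothetical path.

```c
/* Sketch: load/store access to an NVDIMM-backed file via Linux DAX.
 * Assumes a file system mounted with -o dax at /mnt/pmem0 (hypothetical). */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const size_t len = 4096;
    int fd = open("/mnt/pmem0/record.dat", O_CREAT | O_RDWR, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, len) != 0) { perror("ftruncate"); return 1; }

    /* With DAX there is no page-cache copy: the mapping goes straight
     * to the NVDIMM media. */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(p, "persistent record");   /* a plain store into the file */

    /* Processor-requested sync: make the stores durable on the module. */
    if (msync(p, len, MS_SYNC) != 0) { perror("msync"); return 1; }

    munmap(p, len);
    close(fd);
    return 0;
}
```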
Edge Sort Total Runtime

[Figure: weak scaling of total runtime for graph edge decomposition, plotting slowdown relative to 2 PEs against the number of PEs for the Fspace and POSIX file-I/O versions of the read and generate/write apps, with ideal (no slowdown) and linear-slowdown reference lines.]

- POSIX file I/O on NFS degrades linearly as the number of processes increases, even though all accesses are parallel and to non-overlapping file regions
- The read app (App1) also writes back after the sort, so its performance degrades worse, as expected
- Fspace file I/O over the same network fabric shows only a small performance degradation as the number of processes increases (the underlying model is sketched below)
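The Fspace results above build on OpenSHMEM's one-sided model, in which each PE reads and writes remote symmetric memory directly rather than funneling data through a file server. The Fspace I/O extensions themselves are defined in the Grodowitz et al. paper cited above; the sketch below shows only the baseline OpenSHMEM pattern they extend.

```c
/* Sketch: baseline one-sided OpenSHMEM communication, the model the
 * Fspace I/O extensions build on. Build with an OpenSHMEM wrapper such
 * as oshcc and launch with oshrun -np <N>. */
#include <shmem.h>
#include <stdio.h>

int main(void)
{
    shmem_init();
    int me   = shmem_my_pe();
    int npes = shmem_n_pes();

    /* Symmetric allocation: the same remotely accessible buffer exists
     * on every PE. */
    long *slot = shmem_malloc(sizeof(long));
    *slot = -1;
    shmem_barrier_all();

    /* Each PE writes its rank directly into its right neighbor's
     * memory, with no intermediary server in the path. */
    long val = me;
    shmem_long_put(slot, &val, 1, (me + 1) % npes);

    shmem_barrier_all();   /* completes all puts before reading */
    printf("PE %d received %ld\n", me, *slot);

    shmem_free(slot);
    shmem_finalize();
    return 0;
}
```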
Key challenges emerging in datacenter memory and storage hierarchies
- Cloud-vendor-specific programmability
- New memory technologies
- Increasing network costs
- Increasing compute heterogeneity
- Minimizing software overheads for emerging storage devices
- New price/performance points and interfaces for storage (SCM, CXL)
- The need to minimize data movement
- Improving efficiency via acceleration

Watch the companion talk in session B-201: Extending DPUs to Enable Software-Defined I/O (SDIO)

Thank You!