Power, Thermal & Interconnect aspects of a revolutionary computing Architecture
Progress toward an Open & Sustainable, Energy Centric Computing Architecture for today's AI & HPC Applications
Allan Cantle, CEO, Nallasway
OCP HPC SubProject Technical Lead
Sustainable Scalable Computational Infrastructure
SERVER

Overview
- OCP High Performance Compute Module, HPCM, from HPC to the Edge
  Detailed overview: OCP Global Summit, 2023: https:/
- Efficient Power Delivery
- Thermal Management Concept including Energy Re-use
- Universal Topology Interconnect
- Summary & Call to Action

Power, Thermal & Interconnect Challenge of Today's AI Racks
- Power Efficiency: copper losses from facility to chip, distance up to 2.5 m
- Complex hybrid liquid + air cooling
- Signal Integrity: from PCB traces to cables
[Diagram labels: 120 kW Rack; 48V Bus Bar; Facility Power to 48V PSU; Power Distribution Unit; Grace-Blackwell Bring Up System; Grace-Blackwell Production System; Grace-Blackwell Rack; Grace-Blackwell NVLink Switch; Fans]

Composable HPC with the OCP Wall Of Compute
A Sustainable, Energy Centric Computing, System Architecture
- Frontier Super Computer Footprint
- Frontier Concept, OCP Wall of Compute
- 50,000 HPCMs, High Performance Compute Modules
- 50,000 CPUs, GPUs & Switches
- 50pJ/bit+1-2pJ/bit

The High Performance Compute Module, HPCM
- AI's insatiable need for Memory Bandwidth AND Capacity at low latency
- Composable, Domain Specific Architecture building block
- Inspired by OAM, OCP Accelerator Module
- Any Processor, Accelerator, Switch 1 kW
- 16x E3.S Memory/Media Modules
- Direct Water Cooling With Energy Re-Use
- 8x Universal Interconnect Topology IO
- Potential for 8 TBytes/s CPO Bandwidth
- Scalable from HPC to the Edge
[Diagram labels: EU Size Brick; HPCM]

Edge Composable - OCP HPCM Wall Of Compute
- 6 x 6 Edge Compute HPCMs in street side Telecoms Cabinet
[Diagram label: HPCM]

Efficient Power Delivery
- Power Delivery Copper Loss Distance
  - Up to 2.5 m: Rack System
  - 0.2 m: HPCM Module
- Lower Cost: Less Copper
- Less Loss: Higher Efficiency
[Diagram labels: In Wall-48V PSU; Facility Power; 48V Connector; Copper Loss Distance]

Wall of Compute - Thermal Management Concept
- 11x more efficient vs single phase [1]
- Conformally Coated Copper HPCM: 7x more power/unit volume (copper) [2]
- Water: 4x to 9x better thermal conductivity than PAO or Fluorinert [3]
- Potentially 300x improvement over conventional Liquid Cooling Techniques, all things being equal
- Up to 8 HPCMs in Parallel
- 65 °C Return, 60 °C+ for Energy Re-Use
- Unproven, hypothetical concept only
[Diagram labels: Facility Water Inlet; Facility Water Outlet; Valve; Heat Exchanger; Ctl; Liquid Ring Vacuum Pump, LRVP (0.2 Bar); Conductively Cooled Media & Active Cables; Emerson Cooled Processor; HPCM]
[1] Source: Peter C. Salmon of ElectronicInnovations.tech
[2] UIUC Conformal Copper Coating Heat Spreader

Water Use Concerns
- Electrically Conductive: Conformal Coating should Isolate
- Bacterial Growth: Eliminate stagnant water areas? Water Temperature may help?
- Corrosion: Material mix Issues?
- Water Treatment: Reduces thermal conductivity; Sustainability concerns

8x Universal Topology Connectors
- Protocol Agnostic: PCIe/CXL, GbE, NVLink, xGMI, InfiniBand, etc.
- x8 Transceiver Lanes: from 32G up to 224 Gbps
- Sidebands: Tunneling over LTPI + Clocks
- Cable Present Pin
- Active Cable Options: Retimers & Optical (10 W)
- 12V Power Pins
- Paddle Card Support for Active Components
- Thermal path to HPCM Cold plate Manifold
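
The connector figures quoted in the slides (8 connectors, x8 lanes, 32G up to 224 Gbps) can be turned into a rough aggregate bandwidth. The sketch below is only back-of-the-envelope arithmetic, not a figure from the presentation: it assumes the quoted rates are per lane and per direction, and it ignores encoding and protocol overhead. The function and constant names are illustrative.

```python
# Rough raw-bandwidth arithmetic for the connector figures quoted in the slides.
# Assumptions (not from the source): rates are per lane and per direction,
# and encoding/protocol overhead is ignored.

CONNECTORS_PER_HPCM = 8   # "8x Universal Topology Connectors"
LANES_PER_CONNECTOR = 8   # "x8 Transceiver Lanes"


def aggregate_gbps(lane_rate_gbps: float) -> float:
    """Raw one-direction bandwidth of all connectors on one HPCM, in Gbit/s."""
    return CONNECTORS_PER_HPCM * LANES_PER_CONNECTOR * lane_rate_gbps


def gbps_to_tbytes_per_s(gbps: float) -> float:
    """Convert Gbit/s to TByte/s (decimal units, 8 bits per byte)."""
    return gbps / 8 / 1000


if __name__ == "__main__":
    for rate in (32, 224):
        total = aggregate_gbps(rate)
        print(f"{rate} Gbps/lane: {total:.0f} Gbps "
              f"= {gbps_to_tbytes_per_s(total):.3f} TBytes/s")
```

Under these assumptions, 64 lanes at 224 Gbps give about 1.79 TBytes/s per direction, which is below the "potential for 8 TBytes/s" figure the HPCM slide quotes for CPO.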
Universal Topology Interconnect
- PCI-SIG MCIO Connector
- Active PCIe Retimer
- Thermal Path to Water Manifold
- Avicena LightBundle™ IC

Universal Interconnect - Passive Copper to CPO
- Passive Optical cable Option for CPO
- 8 TBytes/s Avicena LED LightBundle™ example
- 4 LightBundle™ ICs

Summary
- Bold Ambitions to bring our Open Computing Architectures into the 21st Century
- A Sustainable, Energy Centric Computing, First mindset is core to our Vision
- This requires Innovations through collaboration across all our Industry Silos
- Thermal Management is a major challenge and we have some crazy ideas!
- AI & HPC require System IO Bandwidths to skyrocket & we need to plan for this!
- Revolutionary change requires Open Collaboration, Thanks OCP!

Help OCP bring the HPCM Vision to a Reality
- We need broad industry cross discipline collaboration: Thermal, Power, Electrical, Mechanical, Software, System Management, Test
- Help us convert our naive Thermal Management Concept into reality
- Help us realize a Universal Interconnect that supports an evolving IO Landscape
- Join our OCP HPC Subproject Workgroup
  - Mailing list: https://ocp-all.groups.io/g/OCP-HPC
  - Wiki: https://www.opencompute.org/wiki/HPC
  - Meeting Calendar: https://www.opencompute.org/projects/project-and-ic-meetings-calendar
  - Every other Tuesday, 8am Pacific

Thank you!