Maximizing EDSFF E3 SSD Design
Trent Johnson, FlashCore Hardware Architect
Virtual Conference, September 28-29, 2021
2023 SNIA. All Rights Reserved. 2023 IBM Corporation

Trent Johnson is a Hardware Architect at IBM, with a focus on the IBM FlashCore Module. He joined IBM as part of the Cleversafe acquisition, where he was the System Hardware Architect of exabyte-scale Object Storage. Prior to Cleversafe, he developed system-level manufacturing and test solutions for AMD CPUs and GPUs, for which he was awarded the AMD Corporate Technical Achievement Award. He has 24 years of industry experience, holds 7 US patents, and has presented at the Burn-in and Test Socket Workshop, Flash Memory Summit, and the Conference for Consumer Electronics. He earned BSEE and MSEE degrees in Electrical Engineering from The University of Texas at Austin, with a focus on Manufacturing System Engineering.

Why migrate to EDSFF?

Key Benefits of Enterprise and Datacenter Standard Form Factor (EDSFF)
- Connector
  - Good for high amperage
  - Low cost on card
  - High speed by design: PCIe 5.0/6.0
  - Very high lane counts (up to 16)
- Thermal Architecture
  - Each form factor has realistic and achievable thermal performance targets
  - Density is foremost in the standard
  - Generous device power budgets
- Device Flexibility
  - It's a "Standard" form factor, not an SSD form factor
  - Peripherals of almost any kind may be used
- Relevant Specs: SFF-TA-1002 (Connector), SFF-TA-1008 (Mechanical), SFF-TA-1009 (Electrical), SFF-TA-1023 (Thermal)

IBM FlashSystem
- High-End: FlashSystem 9500
- Mid-Range: FlashSystem 7300
- Entry-Level: FlashSystem 5200
- IBM FlashCore Module
  - NVMe SSD
  - Enterprise QLC storage
  - Compression at speed
  - Encryption at speed
  - RAID assists

Benefits of EDSFF for IBM FlashSystem
- Greater drive power envelope allows for more flexible design
- More efficient use of space allows greater enclosure density
- Fewer lane connections enabled by PCIe 5
- Increased switch bandwidth enabled by PCIe 5
- Simpler connector
- Slots are useful for more than just SSDs
- Increased adoption of E3 is driving volumes from U.2 to E3
[Figure labels: Today, Future]

SSD Design Methodology: FlashCore Module

The Layout of Today's 3rd Gen IBM FlashCore Module
[Figure: top-side and bottom-side board layout, labeled NAND Flash, MRAM, DRAM, Capacitors for Power Loss, Controller/Logic, and U.2 Connector]

FCM Hardware Design Goals
- Migrate to a new industry standard, EDSFF E3
- Maximize the overall terabytes per rack unit
- Minimize cost per terabyte
- High quality and reliability
- Utilize an FPGA for advanced computational storage techniques like inline compression, encryption, RAID assists, and future features
[Figure labels: Today, Future]

Exploring E3.S
- E3.S 2T: targeted to be high performance SSDs, SCM, CXL
  - Power budget up to 40W
- E3.S: targeted to NVMe SSDs
  - Power budget up to 25W
- The FCM's FPGA and the goal of high capacity don't fit well in E3.S
  - Footprint too small
  - Power budget too small
- E3.S 2T might work for FCM
  - Close to U.2 dimensions
  - Good power budget

Exploring E3.L
- E3.L: targeted to be a primary form factor for storage subsystems and server platforms requiring maximum capacity for each U
  - Power budget up to 40W
- E3.L 2T: targeted to FPGAs or accelerators (Computational Storage)
  - Power budget up to 70W
- What if your product is a mix of both use cases?
  - If you don't need 70W of power, you lose 50% of your density by using E3.L 2T vs E3.L
  - E3.L offers better enclosure density than E3.S 2T

FPGA on E1.L?
- Spec: SFF-TA-1007
- E1.L comes in two sizes:
  - 9.5mm: up to 25W
  - 18mm: up to 40W
- 40mm x 40mm FPGAs clearly do not fit on a 38.4mm wide board
- 35mm x 35mm FPGAs (or smaller) are plausible, but signal exits can only go east/west
  - Complex routing
  - Reduced I/O for Flash
[Figure: 40 x 40 mm FPGA outline against a 318.75 mm x 38.4 mm board]

Let's not forget E1.S
- Optimized for 1U
- 5 sizes:
  - 31.5 x 111.49 x 5.9 mm: 12W max
  - 31.5 x 111.49 x 8 mm: 16W max
  - 33.75 x 118.75 x 9.5 mm: 20W max
  - 33.75 x 118.75 x 15 mm: 25W max
  - 33.75 x 118.75 x 25 mm: 25W max
- E1.S use cases: boot media, cache, blade servers, edge servers, AI/ML, high performance
- Rear-plug devices may operate in HT-LF thermal space (50C)
- Not really optimal for high capacity FCM

FlashCore Module E3.L Floor Planning Concept
[Figure: board floor plan, labeled MRAM, NAND Array, edge, FPGA, Caps, and DRAM Array]

EDSFF E3.L Challenge With FPGAs
- 7.5mm Z-height
- The E3 spec is optimal for thin ASICs and flash memory
- The card edge is very close to the center of the stack-up
[Figure: cross-section with NAND packages on both sides of the PCB and an ASIC]

E3 Problem: FPGAs Are Big!
- Not just FPGAs! Also regulators, capacitors, and inductors
[Figure: stack-up cross-sections within the 7.5mm Z-height. Dimension callouts: 1.57mm, 1.5mm, 1.0mm, 0.8mm, 2.16mm, 1.84mm, 4mm, and a 3.8 mm FPGA]

One Potential Solution: Offset The PCB
- FPGAs are big!
- Challenges:
  - Card edge alignment
  - LED alignment
  - Back-side components
  - Thermal design
[Figure: cross-section with the PCB shifted off center and NAND packages on both sides]

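The Z-height squeeze behind the offset-PCB idea can be sketched numerically. The snippet below is only an illustration: it assumes that 7.5mm is the overall device thickness, that the 0.8mm callouts are the top and bottom shell walls, that 1.57mm is the PCB thickness, and that 1.84mm is the PCB offset. Those readings of the figure callouts and all function names are assumptions, not values taken from SFF-TA-1008, and thermal-interface gaps are ignored.

```python
# Hypothetical stack-up helper. The dimension interpretations below are
# assumptions read off the slide callouts, not normative spec values.

DEVICE_Z_MM = 7.5    # E3 1T device thickness (Z-height on the slide)
SHELL_WALL_MM = 0.8  # assumed: thickness of the top and bottom case walls
PCB_MM = 1.57        # assumed: PCB thickness


def component_headroom(pcb_offset_mm: float = 0.0) -> tuple[float, float]:
    """Return (tall_side, short_side) component headroom in mm.

    pcb_offset_mm shifts the PCB away from the centered position, growing
    the headroom on one side and shrinking it on the other.
    """
    free = DEVICE_Z_MM - 2 * SHELL_WALL_MM - PCB_MM
    centered = free / 2
    return centered + pcb_offset_mm, centered - pcb_offset_mm


def fits(component_height_mm: float, headroom_mm: float) -> bool:
    """True if the component is no taller than the available headroom."""
    return component_height_mm <= headroom_mm


if __name__ == "__main__":
    tall, short = component_headroom()
    print(f"centered PCB: {tall:.3f} mm per side")
    print("1.5 mm NAND fits:", fits(1.5, tall))
    print("3.8 mm FPGA fits:", fits(3.8, tall))

    tall, short = component_headroom(pcb_offset_mm=1.84)
    print(f"offset PCB: {tall:.3f} mm / {short:.3f} mm")
    print("3.8 mm FPGA fits on the tall side:", fits(3.8, tall))
```

Under these assumptions a centered board leaves roughly 2.16mm per side, enough for 1.5mm NAND but not for a 3.8mm FPGA. A 1.84mm offset opens about 4mm on one side while leaving only about 0.3mm on the other, which is where the back-side component challenge comes from.
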
Shifting Your PCB While Maintaining the Card Edge
- Methods:
  - Mezzanine soldering
  - Rigid flex
  - Plug/socket connectors
  - Elastomer connections
- Challenges:
  - Signal integrity
  - Mechanical stability
  - Tiny dimensions
  - Reliability
  - Cost

Aligning LEDs
- The LED hole is ideal for SMT LEDs on nominal board height
- Too high, and your LED holes are covered
- Too low, and your LED won't be visible
- Mitigations:
  - Light pipes
  - Custom LED PCB mezzanine
  - Flare the hole
[Figure: LED positions labeled Aligned, Too Low, and Too High; reference SFF-TA-1008]

Back-Side Components
- Oops. Where are my components?
- You may find you run out of space when you adjust your Z height
- As with anything, it's a tradeoff
- Mitigations:
  - Move tall components to the taller side
  - Thin your shell
  - Use a mezzanine to put the PCB in the center again
[Figure: cross-sections showing NAND and DRAM packages on an offset PCB]

Thermal Design Considerations
- FPGAs produce a lot of heat
- Much of the power envelope will go to the FPGA
- Case design is very important
  - Material: copper, aluminum, alloys
  - Consider heat spreaders
  - Thermal interface material
  - Fin design and aerodynamics
- Thermal simulations are a must
- Test using SFF-TA-1023 methodology

Enclosure Design Methodology
Acknowledgement: Brent Yardley

Keeping SSDs Cool
- A typical 2U U.2 storage server: 24 SSDs at 25W max each, 600W of drive power
- A 2U E3.L storage server: 44 SSDs at 40W max each, 1760W of drive power
- SFF-TA-1023 defines thermal design criteria for both enclosures and EDSFF devices
- The spec recommends operation in the blue region
- In a nutshell, characterize your enclosure to perform at or better than the level of your device
[Figure: concept from 2023 EDSFF whitepaper, B. Lynn, P. Kaler, and J. Geldman]

SFF-TA-1023 device test environment setup
- Set up an airflow chamber with various temperature and airflow setpoints
- Build a test box as suggested by the spec
- Run various workloads on different NVMe power states and collect data

SFF-TA-1023 Airflow Impedance (AFI) levels
- 8 AFI levels are defined
- Enclosures are tested with devices producing different AFI levels to determine which levels they can support
- Devices are tested to characterize their AFI level
- AFI is impacted by device shape (heat-sink fins, vents, etc.)
- The device AFI must not exceed the enclosure AFI capability
  - Fan efficiency loss may occur
[Figures: Example Enclosure Capability, Example Device Characteristic]

SFF-TA-1023 MaxTherm Levels
- Minimum airflow required at Thermal Design Power
- 8 MaxTherm Levels
- Devices are tested to determine their MaxTherm Level
- An enclosure supports devices at or below its MaxTherm level
[Figures: Example Enclosure Capability, Example Device Characteristic]

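The two matching rules on the AFI and MaxTherm slides (device AFI must not exceed the enclosure's AFI capability, and an enclosure supports devices at or below its MaxTherm level) amount to a pair of comparisons. The sketch below assumes the eight levels are numbered 1 through 8; the class and function names are hypothetical, and the actual level definitions (the airflow and temperature curves) live in SFF-TA-1023 and are not modeled here.

```python
from dataclasses import dataclass

# The slides state that 8 AFI levels and 8 MaxTherm levels exist.
# Numbering them 1..8 is an assumption made for this illustration.
LEVELS = range(1, 9)


@dataclass(frozen=True)
class ThermalProfile:
    """Characterized levels for a device, or supported levels for an enclosure."""
    afi_level: int
    maxtherm_level: int

    def __post_init__(self) -> None:
        for name in ("afi_level", "maxtherm_level"):
            if getattr(self, name) not in LEVELS:
                raise ValueError(f"{name} must be between 1 and 8")


def compatibility_issues(device: ThermalProfile, enclosure: ThermalProfile) -> list[str]:
    """Return the reasons a device does not fit an enclosure (empty if it fits)."""
    issues = []
    if device.afi_level > enclosure.afi_level:
        issues.append("device AFI exceeds enclosure AFI capability")
    if device.maxtherm_level > enclosure.maxtherm_level:
        issues.append("device MaxTherm level exceeds enclosure MaxTherm level")
    return issues


if __name__ == "__main__":
    enclosure = ThermalProfile(afi_level=4, maxtherm_level=3)
    print(compatibility_issues(ThermalProfile(3, 3), enclosure))
    print(compatibility_issues(ThermalProfile(5, 4), enclosure))
```

Returning a list of reasons rather than a single boolean keeps the AFI and MaxTherm findings separate, since the mitigations differ (for example, a wider gap affects AFI).
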
SFF-TA-1023 DTherm Levels
- Your device may need to throttle in the case of a fan degradation or service action
- Same MaxTherm Level definitions
- Drive can self-manage or be told to downgrade by the host
- Typically uses NVMe Power States
- Quality of Service is impacted
[Figures: Example Enclosure Capability, Example Device Characteristic]

SFF-TA-1023 MinAmbient and MaxAmbient
- Documented approach temperature limits, outside which your device may have problems operating
- MinAmbient is the lowest approach temperature your device supports
  - The thermal level curves have a minimum temperature of 25C
  - Testing may be impractical below 25C
  - Some components may not be qualified for less than room temperature
  - Change MinAmbient if your device has a specific minimum operating temperature
- MaxAmbient is the highest approach temperature your device supports
  - Allowed range is 50C to 65C
  - MaxAmbient defines the point above which throttling to DTherm levels may occur
[Figure label: Probe]

SFF-TA-1023 MaxAmbient vs ASHRAE
- MaxAmbient ranges from 50C to 65C per SFF-TA-1023
- ASHRAE standard max temperatures are 32C, 35C, 40C, 45C
- Many servers are designed to ASHRAE standards
- ASHRAE standards are designed for optimal SSD cooling
- IBM would like to see MaxAmbient allowed below 50C
ASHRAE = American Society of Heating, Refrigerating and Air-Conditioning Engineers

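The ambient limits on the last two slides can be expressed as a small classifier. This is a rough sketch using only the numbers quoted on the slides (25C curve minimum, 50C to 65C allowed MaxAmbient, and the four ASHRAE maximums); the function names and the region labels are made up for illustration and are not terms from SFF-TA-1023.

```python
MAXAMBIENT_ALLOWED_C = (50, 65)   # allowed MaxAmbient range quoted on the slide
ASHRAE_MAX_C = (32, 35, 40, 45)   # ASHRAE maximum temperatures quoted on the slide
DEFAULT_MINAMBIENT_C = 25         # minimum temperature of the thermal level curves


def validate_maxambient(max_ambient_c: float) -> None:
    """Raise ValueError if MaxAmbient is outside the allowed 50C to 65C range."""
    low, high = MAXAMBIENT_ALLOWED_C
    if not low <= max_ambient_c <= high:
        raise ValueError(f"MaxAmbient {max_ambient_c}C is outside {low}C to {high}C")


def operating_region(approach_c: float, max_ambient_c: float,
                     min_ambient_c: float = DEFAULT_MINAMBIENT_C) -> str:
    """Classify an approach temperature against a device's ambient limits."""
    validate_maxambient(max_ambient_c)
    if approach_c < min_ambient_c:
        return "below MinAmbient"
    if approach_c <= max_ambient_c:
        return "within limits"
    return "above MaxAmbient: may throttle to DTherm"


def ashrae_limits_in_allowed_range() -> list[int]:
    """ASHRAE maximums that could be declared as a MaxAmbient value today."""
    low, high = MAXAMBIENT_ALLOWED_C
    return [t for t in ASHRAE_MAX_C if low <= t <= high]


if __name__ == "__main__":
    print(operating_region(35, max_ambient_c=50))
    print(operating_region(55, max_ambient_c=50))
    print(ashrae_limits_in_allowed_range())
```

The last helper returns an empty list, which is the mismatch the slide points at: none of the listed ASHRAE maximums can be expressed as a MaxAmbient value within the 50C to 65C range.
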
EDSFF Gap Design Considerations
- SFF-TA-1023 recommends that E3 devices operate within the blue shaded design space envelope (MaxTherm Level 3 in the left picture)
- The spec defines the SSD-to-SSD gap for E3 as 1.8mm
- Example: at Ta 35C, 2.5 CFM translates to 1700 LFM (8.6 m/s) per equation 5-1
  - Compared to U.2 7mmT (pitch 12.5mm, gap 5.5mm), LFM is much higher because of the smaller gap (5.5mm vs. 1.8mm)
  - Compared to U.2 15mmT (pitch 16.5mm, gap 1.5mm), there is little difference vs. 1.8mm
- Channel velocity varies with the chassis design's SSD-to-SSD pitch but can be calculated easily
- AFI is not easily calculated, but you can decrease your device AFI with a wider gap
[Figure: SFF-TA-1023 R1.0a Figure 5-5: Required Dimensions of E3 1T Test Fixture]
[Figure: SFF-TA-1023 R1.0a Figure 4-3: MaxTherm and DTherm Levels]

Mechanical Enclosure Slot Design
- To meet the thermal envelope requirements, a design is proposed using an approx. 3mm gap between E3 devices
- E3 2T devices in a pair of the same E3 slots will have a slightly larger gap of 4.2mm

E3 Carrier Designs
- E3.S Carrier Assembly
- E3.L Carrier Assembly
- Carrier bezel contains lever and EMI gasket
[Figure labels: E3 carrier bezel; E3 carrier rail; E3.S SSD; spacer, to secure and carry forward LEDs]

E3 2T Carrier Design
- E3.S 2T Carrier Assembly
- E3.L 2T Carrier Assembly
- Carrier bezel contains lever and EMI gasket
[Figure labels: E3 2T carrier bezel; E3 2T carrier rail; E3.S 2T SSD]

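The gap arithmetic from the "EDSFF Gap Design Considerations" and "Mechanical Enclosure Slot Design" slides can be checked with a short calculation. Equation 5-1 of SFF-TA-1023 is not reproduced in the slides, so the sketch below substitutes the generic relation velocity = volumetric flow / channel cross-section and treats the channel as a rectangle of gap times device width. The 76mm E3 width and 69.85mm U.2 width are assumptions brought in for the illustration, as are all function names.

```python
MM2_PER_FT2 = 304.8 ** 2   # square millimetres in one square foot
E3_WIDTH_MM = 76.0         # assumed E3 device width (not stated in the slides)
U2_WIDTH_MM = 69.85        # assumed U.2 device width (not stated in the slides)


def gap_mm(pitch_mm: float, thickness_mm: float) -> float:
    """Device-to-device gap is the slot pitch minus the device thickness."""
    return pitch_mm - thickness_mm


def channel_velocity_lfm(flow_cfm: float, gap: float, width_mm: float) -> float:
    """Mean air velocity (linear feet per minute) through a rectangular gap."""
    area_ft2 = gap * width_mm / MM2_PER_FT2
    return flow_cfm / area_ft2


def lfm_to_m_per_s(lfm: float) -> float:
    """Convert linear feet per minute to metres per second."""
    return lfm * 0.3048 / 60


if __name__ == "__main__":
    e3 = channel_velocity_lfm(2.5, 1.8, E3_WIDTH_MM)
    print(f"E3, 1.8 mm gap: {e3:.0f} LFM ({lfm_to_m_per_s(e3):.1f} m/s)")

    for label, pitch, thickness in (("U.2 7mmT", 12.5, 7.0), ("U.2 15mmT", 16.5, 15.0)):
        g = gap_mm(pitch, thickness)
        v = channel_velocity_lfm(2.5, g, U2_WIDTH_MM)
        print(f"{label}, {g} mm gap: {v:.0f} LFM")

    for g in (3.0, 4.2):  # proposed enclosure gaps from the slot design slide
        v = channel_velocity_lfm(2.5, g, E3_WIDTH_MM)
        print(f"E3, {g} mm gap: {v:.0f} LFM")
```

With these assumptions the 1.8mm E3 case lands close to the 1700 LFM (8.6 m/s) quoted on the slide. The comparison across gaps holds the 2.5 CFM per channel constant, which a real chassis would not necessarily do, so the U.2 and wider-gap figures are indicative only.
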
Enclosure Configurations
In a white paper by B. Lynn, P. Kaler, and J. Geldman, several enclosure configuration options were explored:
"The flexibility of the E3 form factor gives platform architects a wide range of options when it comes to supporting multiple system use cases. The ability to optimize around either density, host bandwidth, system power or device type makes the E3 form factor an ideal choice for platform architects and system designers."

Enclosure Configuration Considerations
- We are looking specifically at 3 different enclosure types
  - 1U Entry and Expansion
  - 2U Midrange
  - 4U High End
- Want to maintain a common design for IBM storage brands
  - E3.S/L is the primary design point
- Flexibility to support E3 and E3 2T in the same enclosure
- Meet the thermal design requirements based on anticipated slot-to-slot pitch

1U Mechanical Concept
- Supporting up to 20 devices is possible; however, a 3mm gap cannot be maintained for all slots
- To support a truly even gap between all devices, a gap spacing of 2.4mm would be needed
- 2T would be possible; however, uniform gap spacing is not maintained
- The gap spacing would likely be lower: 2.4mm, middle: 3mm, and top: 3mm

2U Physical Mockup Concept
- 2U enclosure with 2U half-wide canisters configured in a side-by-side configuration
- Each canister supports single socket w/ up to 12 DIMM slots per socket
- Up to 4 PCIe 5.0 slots per canister
  - All mechanically x16, supporting a mix of x8 and x16 electrical
- Hot plug/replaceable E1.S boot drives
- Hot plug/replaceable power loss protection
- 1+1 common redundant power supplies
- Configuration using up to 32 E3 2x2 slots and 4 E3 2T 2x4 or 1x8 slots
[Figure: front view (24 drives, 2U; 32 E3.S/L or 16 2T) and rear view (x4 E3 2T slots, x4 E3 2T slots, PSU, PSU, Misc IO, HBA slots, E1.S slots, BBU)]

4U Physical Mockup Concept
- 4U enclosure with 2U canisters configured in a top/bottom configuration
- Each canister supports dual socket w/ up to 12 DIMM slots per socket
- Up to 8 PCIe 5.0 slots per canister
  - All mechanically x16, supporting a mix of x8 and x16 electrical
- Hot plug/replaceable E1.S boot drives
- Hot plug/replaceable power loss protection
- 2+2 common redundant power supplies
- Configuration using up to 64 E3 2x2 slots and 8 E3 2T 2x4 or 1x8 slots
[Figure: front view (24 drives, 24 drives, 4U; 32 E3.S/L or 16 E3 2T per row) and rear view (HBA slots, Battery/UltraCap, four PSUs, Misc IO, E3 2T slots, E1.S slots)]

Summary
- SSD and storage server suppliers need an EDSFF roadmap to stay competitive
- Design concepts for SSDs and enclosures were presented
- An FPGA can fit into an E3.L form factor to maximize density
- The 1.8mm device gap on the enclosure is an example, but not ideal for all use cases
  - Not everyone is going for maximum enclosure density
  - AFI can be adjusted by gap

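To make the density point in the summary concrete, the sketch below tabulates the E3 power budgets quoted earlier in the deck and derives device counts and total drive power for an enclosure. It assumes a 2T device occupies two single-width (1T) slots, which matches the "32 E3.S/L or 16 2T" mockup; the data structure and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormFactor:
    name: str
    max_power_w: int   # power budget quoted in the slides
    slots_used: int    # assumed: number of 1T slots the device occupies


E3_FORM_FACTORS = (
    FormFactor("E3.S", 25, 1),
    FormFactor("E3.S 2T", 40, 2),
    FormFactor("E3.L", 40, 1),
    FormFactor("E3.L 2T", 70, 2),
)


def candidates(required_power_w: int) -> list[FormFactor]:
    """Form factors whose power budget covers the device's requirement."""
    return [ff for ff in E3_FORM_FACTORS if ff.max_power_w >= required_power_w]


def devices_per_enclosure(ff: FormFactor, single_width_slots: int) -> int:
    """How many devices of this form factor fit in the given number of 1T slots."""
    return single_width_slots // ff.slots_used


def total_drive_power_w(device_count: int, per_device_w: int) -> int:
    """Worst-case drive power the enclosure must supply and cool."""
    return device_count * per_device_w


if __name__ == "__main__":
    print(total_drive_power_w(24, 25))   # 2U U.2 server from the slides
    print(total_drive_power_w(44, 40))   # 2U E3.L server from the slides
    for ff in candidates(40):
        n = devices_per_enclosure(ff, single_width_slots=32)
        print(ff.name, n, "devices,", total_drive_power_w(n, ff.max_power_w), "W max")
```

For a device that needs 40W, E3.L is the only candidate that keeps all 32 slots populated; both 2T options halve the device count, which is the 50% density loss called out on the E3.L slide.
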