When considering DC-MHS-based platform designs, it is important to consider a scalable and vendor-agnostic approach to discovering, interfacing, and firmware provisioning for the HPM, modular IO, accelerators, sensors, and other peripheral devices. DC-MHS workstreams are currently defining a plug-and-play architecture that can address these, and other related factors, with the DC-MHS composable architecture.

In this presentation, we will explore different methods of enabling the BMC on the DC-SCM card to perform accurate thermal management by providing temperature sensor information for the platform using different HPM cards. We will delve into the pros and cons of each option, starting with a BMC-centric approach that utilizes OpenBMC Entity Manager and an HPM Definition File, and then through to a design that abstracts the sensor interface. The goal of these designs is to allow the BMC to communicate with HPM sensors without prior knowledge of the HPM configuration.

Sensor Discovery and Manageability for Pluggable DC-MHS Designs
Todd Rosedahl, Senior System Architect, Jabil
Adonay Berhe, Product Marketing Manager, AMI
Sustainable Scalable Computation Infrastructure

Goal: Achieving a Plug-n-Play Ecosystem
- Off-the-Shelf Host Processor Module (HPM) Interoperability
  - DC-SCM card should operate seamlessly
  - DC-SCM and HPM should be fully functional
  - Minimal compatibility issues
- DC-SCM and HPM development should be independent
  - Co-design not required

Example Scenario: Three (3) HPM Boards With
- Different silicon
- Different temperature sensors
- Different device types
- Different topologies
- Different thermal requirements
Source: DC-MHS R1 Overview Spec

BMC HPM Sensor Discoverability: Required Data
- Sensor type
- Sensor location
- Supported interfaces
- Topology
  - Behind a mux or not
- Thermal trip limits and actions
  - Shutdowns, fan speed adjustments, throttling

BMC HPM Sensor Discoverability: Approaches
[Diagram labels: ROT, BMC, BIOS, LTPI, SCM CPLD]
1. Complete Standardization
2. Turn-key BMC Solution
3. BMC Discovery
4. HPM Definition File (HDF)

Approach 1: Complete Standardization
- Standardized access methods and interfaces
- Standardized location and configuration
- Every HPM has X temp sensor at Y location to be used for Z purpose
Cons
- Stifles innovation
- Does not scale across multiple architectures
- Prone to incompatibility issues

Approach 2: Turn-key BMC Solution
- BMC firmware developed and ported for each HPM
- BMC identifies the HPM (topology/requirements) and loads the correct image
  - HPM identification via FRU or other mechanisms
  - Firmware orchestration/update via external system management service
Cons
- Requires HPM/BMC image co-design
- Mimics current (traditional) firmware/platform development process and architecture
- Underutilizes composability/modularity

Approach 3: BMC Discovery
- BMC can "walk" certain busses and registers to find sensors
  - E.g., i2cdetect
- Will require modular firmware architecture and run-time (dynamic) reconfigurations/builds
Cons
- Certain devices (e.g., muxes) may not respond or may not be supported
  - Requiring a different interface or configuration
- Device type information and other advanced communication can be limited
- Devices may be missing/broken, but are required

Approach 4: HPM Definition File (HDF)
- BMC reads a file, the HPM Definition File (HDF), that contains HPM sensor location and configuration
  - E.g., Platform Design Documentation (PDD)
- HDF is created and distributed by the HPM manufacturer
- Moves the responsibility of hardware knowledge from BMC to HPM
- Requires industry alignment
  - File access methods
  - Tooling
  - Storage location
  - Data/file format
  - Content syntax, etc.

HPM Definition File (HDF): What's Still Missing?
- BMC needs drivers specific to the sensors
  - E.g., temperature sensor
- Sensor initialization and reading will require protocol and command support
  - What is the exact protocol needed to initialize a sensor and read the temperature?
- Accurate response packet/message parsing
  - Packet header definition
  - Temperature value location
  - Error code location and meaning
  - Temperature units

HPM Definition File (HDF): Possible Solutions
1. External Manager Solution
2. Single BMC Image Solution
3. Driver Abstraction Solution
4. HPM Offload Solution

Solution 1: External Manager (Co-design)
- BMC reads HPM type (LIMITED communication)
- System Manager (SM) downloads correct image
  - Remote firmware update
- FULL communication between HPM and DC-SCM (BMC)
[Diagram labels: BMC, LTPI, GPIOs, HPM, DC-SCM, I2C, SPI, DC-SCI (168 pins), FPGA, UART, SGPIO, CPU, Board Temperature Sensor, SM; steps shown: Discover HPM Type, Load Proper BMC Image]

Solution 2: Single Image BMC - Driver Superset
- BMC reads the "HPM Definition File" from the HPM
- BMC image contains all drivers needed to talk to any HPM
- BMC loads/uses correct drivers based on HPM device types present
- Cons: BMC image bloat.

Example: OpenBMC Single Image Solution (Entity Manager)
Platform Detection
- Initial HPM detection through FRU, M-PESTI (Device Tree), etc.
- Pre-defined device tree for DYNAMICALLY loading the required drivers in U-Boot and the kernel after HPM detection
- Runtime component detection based on published/defined interfaces (e.g., Fru-device, peci-pcie, smbios-mdr)
Platform Configuration
- EM configuration files (JSON) provide feature sets (Expose) and detection/interface definitions (Probe)
- One configuration file per supported device model.
Execution
- Map system components to software resources within the BMC
- Entity Manager configurations are used to execute/enable the features that they describe on the D-Bus (e.g., dbus-sensors)
- All BMC applications (multi-silicon support) must be enabled in the image

Solution 3: Driver Abstraction
- BMC reads the Driver Abstraction File (DAF) from the HPM
- DAF contains the information required to talk to the temperature sensor.
  - The command string, the response definition, etc.
- Cons: This abstraction will be difficult, and new devices may have new requirements that need further abstraction, such as a multi-step error collection procedure.

Solution 4: HPM Offload
- BMC reads the "HPM Definition File" from the HPM via a defined method.
- HDF lists sensor read locations in the HPM and all meta-data needed.
  - Sensor offset, units, error conditions, read frequency, timestamps, trip limits, actions, etc.
- HPM does the actual sensor reading.
- BMC reads the sensor values once/sec from the HPM.
- Cons: Requires FPGA access to the hardware. Offloads the BMC and adds load to the FPGA.

Conclusion
- Plug-n-Play support will enhance the effectiveness and adoption of composable hardware designs at scale
  - Isolate knowledge domains
  - BMC firmware should not need to know about HPM contents
- Interoperability requires re-thinking the way device detection, configuration, and enablement will work
- BMC-centric approach with multi-platform support in a single image or multiple images
  - At the risk of bloat in the single image or across the multiple images
  - Requires relatively smaller changes to achieve
- HPM-centric approach with HPM sensor interface offloaded to the HPM
  - Minimizes BMC processing and BMC image size
  - Enables multi-node support (1 BMC to N HPMs)
  - Requires major changes and industry alignment to realize

Call-to-Action
- Get involved with the DC-MHS OCP Plug-n-Play Sub-Project
  https://www.opencompute.org/wiki/Server/DC-MHS#Get_Involved
- Contact AMI and Jabil if interested in an HPM interop demo for OCP Global 2024
- Explore the OCP marketplace for DC-MHS solutions
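
Illustrative sketch of Solution 4 (HPM Offload)

The HPM Offload idea can be made concrete with a small simulation: the BMC loads an HPM Definition File, then decodes sensor values that the HPM has already collected, without knowing anything about the sensor devices themselves. Everything specific here is an assumption made for illustration, since the slides note that the HDF still needs industry alignment: the JSON layout, the field names (offset, scale, error_value, trip_limits), the action names, and the choice of unsigned 16-bit little-endian values in a flat window are all hypothetical.

```python
import json
import struct

# Hypothetical HDF content. No HDF format has been standardized, so every
# field name below is an assumption made for illustration only.
HDF_TEXT = """
{
  "hpm": "example-hpm",
  "sensors": [
    {"name": "cpu0_temp", "offset": 0, "units": "degC", "scale": 0.1,
     "error_value": 65535,
     "trip_limits": [{"at": 85.0, "action": "throttle"},
                     {"at": 95.0, "action": "shutdown"}]},
    {"name": "inlet_temp", "offset": 2, "units": "degC", "scale": 0.1,
     "error_value": 65535,
     "trip_limits": [{"at": 45.0, "action": "raise_fan_speed"}]}
  ]
}
"""


def load_hdf(text):
    """Parse the HDF text and return its list of sensor descriptions."""
    return json.loads(text)["sensors"]


def read_sensor(window, sensor):
    """Decode one sensor from the value window maintained by the HPM.

    Assumes each value is an unsigned 16-bit little-endian raw count at the
    offset listed in the HDF. Returns None when the raw count equals the
    sensor's declared error value.
    """
    (raw,) = struct.unpack_from("<H", window, sensor["offset"])
    if raw == sensor["error_value"]:
        return None
    return raw * sensor["scale"]


def actions_for(value, sensor):
    """Return the actions whose trip limit the reading has reached."""
    if value is None:
        return ["report_sensor_error"]
    return [t["action"] for t in sensor["trip_limits"] if value >= t["at"]]


def poll_once(window, sensors):
    """One polling pass; the slides suggest the BMC reads about once per second."""
    result = {}
    for sensor in sensors:
        value = read_sensor(window, sensor)
        result[sensor["name"]] = (value, actions_for(value, sensor))
    return result


# Simulated window as an HPM might publish it: a raw count of 875 for the
# first sensor, followed by the error marker for the second.
window = struct.pack("<HH", 875, 65535)
readings = poll_once(window, load_hdf(HDF_TEXT))
```

In this sketch the BMC side holds no sensor-specific driver: adding a sensor only changes the HDF and the HPM-side collection, which is the separation of knowledge domains the conclusion argues for. A real design would also need the defined access method for the HDF and for the value window, which this example does not model.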