xNVMe and io_uring NVMe passthrough
What does it mean for the SPDK NVMe driver?
Simon A.F. Lund (Samsung)
Virtual Conference, September 28-29, 2021
2023 SNIA. Simon A.F. Lund / SSDR / Samsung / GOST. All Rights Reserved.

Agenda
- How (and why) did SPDK start?
- SPDK's Motivation
- Linux Storage Abstractions
- xNVMe Overview
- Performance Comparisons
- Next Steps

How (and why) did SPDK start?
- "We have all of these SAS SSDs in this system, but can't get all of the performance out of them." (Meeting with enterprise storage company)
- The performance problem was only going to get worse!
- NVMe ratified but not yet commercially available
- Including BSD-licensed FreeBSD drivers
- OS support for NVMe ramping quickly
- DPDK already tackling this same problem for network packet processing
- Intel Storage Group merged with division responsible for DPDK
- Timeline: 2013

SPDK's Motivation
- Break the software bottleneck for high-performance storage workloads
- Build an open-source community to innovate and collaborate
- Balance between "develop new" and "optimize existing"
- Broad set of abstractions and implementations

SPDK and NVMe
Performant and efficient NVMe access is priority #1!
- Break the software bottleneck: collaboration with xNVMe and Linux kernel
- Build an open-source community: improve SPDK's ability to leverage Linux NVMe
- Balance between "develop new" and "optimize existing": enable multiple ways of accessing NVMe with SPDK
- Broad set of abstractions and implementations

Outline
Why
- What do you do when the OS storage abstractions fail?
- What do you do when the deployment environments fail?
What
- Device handles via generic and anonymous namespaces (e.g. /dev/ng0n1)
- Device communication via io_uring command (with NVMe passthrough)
- SPDK Integration: xNVMe and bdev_xnvme
Performance Comparison
Next Steps

Why? 1/2: General storage abstractions

Why: storage abstractions
- Generic abstractions
- Supporting a variety of devices in the same fashion
- Long-lived and well-known abstractions of blocks and files
- When/how/why do abstractions fail for NVMe?
[Figure: Linux I/O stack. Userland issues syscalls that "speak file" or "speak block"; in the kernel, the FS abstraction and the block abstraction (/dev/nvme0n1) sit above the NVMe driver, which "speaks NVMe" to the device.]

Why: storage abstractions, "speaking NVMe"
Speaking NVMe
- Read/write using extended LBA formats
- Ext: directives / write_zeroes / copy
- ZNS: mgmt. send/receive, append
- Key-Value: store(k,v) / retrieve(v), list, delete, exists
- New command-sets: Computational Storage
Abstraction failure: must bypass OS abstractions to utilize devices
[Figure: the same I/O stack with an added "speak NVMe" path from userland to the NVMe driver via synchronous ioctl().]

Why: device handles
Everything is a file, with NVMe represented as
- NVMe Controllers as char devices (e.g. /dev/nvme0)
- NVMe Namespaces as block devices (e.g. /dev/nvme0n1)
- Caveat: only for NVM and ZNS Command-Sets
Plug in a device with a command-set other than NVM/ZNS
- Only the controller handle appears (e.g. /dev/nvme0)
- Device does not fit, or match assumptions of, the Linux Block Device model
- No representation of / FS entry to get a handle to the namespace
Abstraction failure: no means to get a handle to the namespace

Why: device communication
Efficiency via io_uring: reducing the cost of crossing the border between userland and kernel
- Shared memory (rings): instead of memory-transfers
- Resource registration: reduce lookup-cost
- Polling (IOPOLL | SQPOLL)
- Batching: one syscall, multiple commands
io_uring command opcodes
- IORING_OP_(READ|WRITE)V
- IORING_OP_(READ|WRITE)
- IORING_OP_(READ|WRITE)_FIXED
Speaking NVMe (extended LBA formats, directives, ZNS, Key-Value, new command-sets) relies on one facility: the NVMe driver ioctl()
- ioctl(): no scale
- ioctl() + thread pool: inefficient scale
- io_uring: efficient scale
Abstraction failure: no kernel facility to "speak NVMe" efficiently

Existing solutions
Move the storage abstraction out of the kernel and into userland
- The SPDK Block Device abstraction (bdev)
- The SPDK NVMe driver
So, when does this fail?

Why? 2/2: Deployment Environments

Why: deployment environments
Deployment of SPDK Apps using the SPDK NVMe driver
- Requirement: detach the Kernel NVMe driver, bind to vfio-pci / uio_generic
HW failure
- Other devices in the same iommu-group: no detachment
- Unsupported IOMMU / PCIe BAR address-space: binding failure
Cloud failure
- Sheer lack of NVMe devices
- Encapsulated storage-device-services
- Restrictive environments

Why: io_uring command for SPDK?
What do you do when the deployment environment fails?
- Fallback: operating system managed (bdev_aio / bdev_uring)
- Enable deployment of SPDK in environments otherwise unavailable
- Enable deployment of SPDK with minimal performance hit
Goals of Linux and SPDK are aligned

Why: goals for Linux
An open-ended representation of NVMe devices for existing and new NVMe Command-Sets with a fast-path for communication
Handles
- Bring up devices regardless of Linux device model match
Communication
- Speak NVMe "natively"
- Scale as efficiently as io_uring
- Scale as efficiently as the SPDK NVMe Driver

What? 1/3: Generic device handles

What: a solution to handles
Handles: NVMe generic char interface, e.g. /dev/ng0n1
- Initial support: Linux 5.13 (June 2021). Brings up handles for namespaces with NVM and ZNS command-sets
- Command-set independence: Linux 6.0. Brings up handles for namespaces with any command-set
Device files are provided regardless of a matching device model, thereby enabling handles for existing and future NVMe command-sets
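
The generic char handles follow the same numbering as the block devices. As a minimal sketch, assuming the /dev/nvme<C>n<N> and /dev/ng<C>n<N> naming shown in the examples above, a helper can derive the generic handle from a block-device path. The function name generic_handle is hypothetical and is not part of xNVMe or the kernel.

```python
import re

# Assumed naming convention (taken from the slide examples):
#   block device:         /dev/nvme<controller>n<namespace>
#   generic char device:  /dev/ng<controller>n<namespace>
_BLOCK_PATH = re.compile(r"^/dev/nvme(\d+)n(\d+)$")


def generic_handle(block_path: str) -> str:
    """Return the generic char-device path matching a namespace block device."""
    match = _BLOCK_PATH.match(block_path)
    if match is None:
        # Controller handles such as /dev/nvme0 have no namespace part.
        raise ValueError(f"not a namespace block device: {block_path!r}")
    controller, namespace = match.groups()
    return f"/dev/ng{controller}n{namespace}"


if __name__ == "__main__":
    print(generic_handle("/dev/nvme0n1"))
```

The reverse direction does not always exist: a namespace with a command-set other than NVM/ZNS gets a generic handle but no block device.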

What? 2/3: Communication via io_uring command (io_uring_cmd)

What: io_uring command
- Generic facility to attach io_uring capabilities to a command provider
- Larger ring-entries embedding commands and their completions
- Command provider (driver, file-system, etc.)
One such command provider is the NVMe driver
- Providing NVMe passthrough commands
- Commands defined equivalent to NVMe driver IOCTLs
- NVMe driver IOCTL extended with iovec support
Note: this was a requirement enabling non-bounce-buffer utilization by the SPDK bdev abstraction

Goals revisited
Handles
- Bring up devices regardless of Linux device model match
Communication
- Speak NVMe "natively"
- Scale as efficiently as io_uring?
- Scale as efficiently as the SPDK NVMe Driver?
For more, see Kanchan Joshi's Linux Plumbers Conference slides: https://lpc.events/event/16/contributions/1382/attachments/1119/2151/LPC2022_uring-passthru.pdf
[Figure: I/O stack with two "speak NVMe" paths: synchronous ioctl() via /dev/nvme0n1 and asynchronous io_uring_cmd via /dev/ng0n1.]
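
To make "larger ring-entries embedding commands" concrete, the sketch below packs an NVMe read into a passthrough command and checks that it fits into a big submission entry. It rests on assumptions that are not stated in the slides: the field layout mirrors the kernel's struct nvme_uring_cmd (72 bytes), a 128-byte submission entry leaves 80 bytes of command space starting at byte 48, and the NVM read opcode is 0x02 with a zero-based block count in CDW12. The function name pack_read is hypothetical.

```python
import struct

# Assumed layout of `struct nvme_uring_cmd`:
# opcode, flags, rsvd1, nsid, cdw2, cdw3, metadata, addr,
# metadata_len, data_len, cdw10..cdw15, timeout_ms, rsvd2
NVME_URING_CMD = struct.Struct("<BBHIIIQQII6III")

SQE128_BYTES = 128      # assumed size of a "big" submission entry
SQE_CMD_OFFSET = 48     # assumed offset of the embedded command area
NVME_CMD_READ = 0x02    # NVM command-set read opcode


def pack_read(nsid: int, slba: int, nlb: int, buf_addr: int, lba_bytes: int) -> bytes:
    """Pack a read of `nlb` logical blocks starting at `slba`."""
    if not 1 <= nlb <= 0x10000:
        raise ValueError("nlb must be in 1..65536")
    return NVME_URING_CMD.pack(
        NVME_CMD_READ, 0, 0,     # opcode, flags, rsvd1
        nsid, 0, 0,              # nsid, cdw2, cdw3
        0, buf_addr,             # metadata pointer, data pointer
        0, nlb * lba_bytes,      # metadata_len, data_len
        slba & 0xFFFFFFFF,       # cdw10: starting LBA, low 32 bits
        slba >> 32,              # cdw11: starting LBA, high 32 bits
        nlb - 1,                 # cdw12: number of blocks, zero-based
        0, 0, 0,                 # cdw13..cdw15
        0, 0,                    # timeout_ms, rsvd2
    )


if __name__ == "__main__":
    cmd = pack_read(nsid=1, slba=0, nlb=8, buf_addr=0x1000, lba_bytes=512)
    print(len(cmd), "bytes of", SQE128_BYTES - SQE_CMD_OFFSET, "available")
```

Submitting such a command still requires a ring set up with the larger entry sizes and a file descriptor for the generic char device; that part is left out here.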

What? 3/3: SPDK Integration via xNVMe (bdev_xnvme)

xNVMe
Core API
- Commands and Buffers
- Queues & Callbacks
Command-Set Helpers
- NVM: read / write / write_zeroes / copy
- ZNS: mgmt. send / receive / append
- KV: store / retrieve / list / exists / delete
Command-Line Tools
- xnvme, lblk, zoned, kvs
[Figure: the xNVMe core API (synchronous and asynchronous; buffers, commands, queues, callbacks; storage device or file) on top of per-platform implementations for Linux, FreeBSD, and Windows. Backends named in the figure: read()/write(), libaio, POSIX aio, IOCP, block IOCTLs, NVMe IOCTLs, thread pools, DevFs/SysFS, io_uring, IORING, io_uring_cmd, libvfn, SPDK driver, Win32 Object Model.]

xNVMe is used for
- I/O interface independence
- Minimal abstraction cost
- Convenient command-line tools
- Rapid experimentation via Python
Further details: SYSTOR22 Presentation and Paper, https:/

SPDK Integration: bdev_xnvme
- With SPDK v22.09 a new bdev module is introduced: bdev_xnvme
- The xNVMe bdev module calls into the core xNVMe API
- A single bdev implementation for libaio, io_uring, and io_uring_cmd
- Device-specific handling (zone mgmt.)
Further details: Krishna K. Reddy, SDC Presentation, https:/
[Figure: SPDK layering from the abstraction layer through the bdev modules (NVMe, Virtio, aio, uring, xNVMe) and SPDK drivers down to Linux (libaio, io_uring, io_uring_cmd) and hardware.]
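
Because one bdev module covers three I/O mechanisms, switching between them is a configuration change. The sketch below builds an SPDK-style JSON configuration for a single xNVMe bdev. It assumes the RPC method is named bdev_xnvme_create and takes name, filename, and io_mechanism parameters; verify both against the JSON-RPC documentation of the SPDK version in use. The helper name bdev_xnvme_config and the bdev name xnvme0 are hypothetical.

```python
import json

# The three mechanisms listed for bdev_xnvme in the slides.
IO_MECHANISMS = ("libaio", "io_uring", "io_uring_cmd")


def bdev_xnvme_config(filename: str, name: str, io_mechanism: str) -> dict:
    """Build a JSON-config dict that creates one xNVMe bdev (assumed schema)."""
    if io_mechanism not in IO_MECHANISMS:
        raise ValueError(f"unknown io_mechanism: {io_mechanism!r}")
    return {
        "subsystems": [
            {
                "subsystem": "bdev",
                "config": [
                    {
                        # Assumed RPC method and parameter names.
                        "method": "bdev_xnvme_create",
                        "params": {
                            "name": name,
                            "filename": filename,
                            "io_mechanism": io_mechanism,
                        },
                    }
                ],
            }
        ]
    }


if __name__ == "__main__":
    config = bdev_xnvme_config("/dev/ng0n1", "xnvme0", "io_uring_cmd")
    print(json.dumps(config, indent=2))
```

With io_uring_cmd the filename would point at the generic char device (/dev/ng0n1), while libaio and io_uring would use the block device (/dev/nvme0n1).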

Comparison: peak IOPS for saturated CPU
- io_uring_cmd vs io_uring
- io_uring_cmd vs SPDK NVMe Driver
- SPDK bdev implementations (aio, uring, xNVMe)

Comparison: system and software
System
- Core i5-12600, SMT enabled, Turbo-Boost disabled
- 4x Samsung 980 Pro 1TB (512 RR 1.0M IOPS / 4K RR 1.0M IOPS)
- 4x Samsung 980 Pro 2TB (512 RR 0.8M IOPS / 4K RR 0.8M IOPS)
- Device roofline 8M IOPS (according to spec sheet)
Software
- Linux 6.5
- fio 3.34
- xNVMe v0.7.1
- SPDK v23.04 + patches for xNVMe submodule updated to v0.7.1
Linux Kernel version 6.5
- Debian Bullseye kernel config with the following changes:
  CONFIG_BLK_CGROUP=N
  CONFIG_BLK_WBT_MQ=N
  CONFIG_HZ=250
  CONFIG_RETPOLINE=N
  CONFIG_PAGE_TABLE_ISOLATION=N
- NVMe driver loaded as: modprobe -r nvme && modprobe nvme poll_queues=1
- /sys/block/device/queue/iostats set to 0
- /sys/block/device/queue/nomerges set to 2
- /sys/block/device/queue/wbt_lat_usec set to 0
Tools
- fio: t/io_uring via one-core-peak.sh
- fio: t/io_uring manual invocation
- bdevperf
Logs of all runs are provided for inspection and reproducibility: https:/ contains scripts, hw-info information, kernel-config etc.

io_uring vs. io_uring_cmd

Millions of 512 byte IOPS via io_uring
#Devices   -n=#Devices   -n2 -c16 -s16   -n2       -n1
           IOPOLL        IOPOLL          NOPOLL    SQPOLL
                                         NOBATCH
1          1.17          1.16            1.16      1.16
2          2.32          2.32            1.33      2.33
3          2.24          3.18            1.35      2.54
4          2.18          4.16            1.36      2.39
5          2.10          4.12            1.38      2.43
6          2.03          3.97            1.39      2.50
7          2.03          3.82            1.39      2.36
8          2.02          3.97            1.39      2.36

Millions of 512 byte IOPS via io_uring_cmd
#Devices   -n=#Devices   -n2 -c16 -s16   -n2       -n1
           IOPOLL        IOPOLL          NOPOLL    SQPOLL
                                         NOBATCH
1          1.16          1.16            1.16      1.16
2          2.32          2.31            1.33      2.30
3          2.23          3.26            1.35      2.54
4          2.18          4.10            1.37      2.52
5          2.09          4.35            1.38      2.42
6          2.03          4.63            1.39      2.49
7          2.02          4.86            1.38      2.51
8          2.02          4.85            1.38      2.39

Eval: goals for Linux
An open-ended representation of NVMe devices for existing and new NVMe Command-Sets with a fast-path for communication
Handles
- Bring up devices regardless of Linux device model match
Communication
- Speak NVMe "natively"
- Scale as efficiently as io_uring
- Scale as efficiently as the SPDK NVMe Driver?
Peak IOPS in millions
- io_uring: 4.16
- io_uring_cmd: 4.86

Comparison: IOPS via SPDK
I/O generator: bdevperf -q 128 -o 512 -w randread -t 10 -m
Two variations
- -m0: using a single core and no thread-sibling
- -m0,1: using a single core and its thread-sibling
Equivalent comparison of SMT effect as is done by t/io_uring

Millions of 512 byte IOPS via the SPDK NVMe Driver
#Devices   -m0     -m0,8
1          1.15    1.15
2          2.31    2.30
3          3.34    3.31
4          4.35    4.34
5          5.22    5.22
6          6.11    6.10
7          7.11    7.10
8          7.24    8.08
With 8 devices, -m0 saturates a single SMT thread.

Why the gap?
Generic facility
- Does more than specialized user-space driver
- Taps into generic kernel-infra
- io_uring_cmd specific I/O path reduction
Un-tapped optimizations
- Management of DMA mapping

Eval: goals for Linux
An open-ended representation of NVMe devices for existing and new NVMe Command-Sets with a fast-path for communication
Handles
- Bring up devices regardless of Linux device model match
Communication
- Speak NVMe "natively"
- Scale as efficiently as io_uring
- Scale as efficiently as the SPDK NVMe Driver?
Peak IOPS in millions
- io_uring: 4.16
- io_uring_cmd: 4.86
- SPDK: 8.08
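
The three peak values can be put in relation to each other with a few lines of arithmetic. The sketch below uses only the peak numbers reported above; the dictionary keys and the helper name ratio are illustrative.

```python
# Peak IOPS in millions, as reported in the comparison above.
PEAK_MIOPS = {
    "io_uring": 4.16,
    "io_uring_cmd": 4.86,
    "spdk": 8.08,
}


def ratio(numerator: str, denominator: str) -> float:
    """Return the peak-IOPS ratio between two interfaces."""
    return PEAK_MIOPS[numerator] / PEAK_MIOPS[denominator]


if __name__ == "__main__":
    # How much io_uring_cmd gains over plain io_uring.
    print(f"io_uring_cmd / io_uring: {ratio('io_uring_cmd', 'io_uring'):.2f}x")
    # How far the SPDK NVMe driver is ahead of io_uring_cmd.
    print(f"spdk / io_uring_cmd:     {ratio('spdk', 'io_uring_cmd'):.2f}x")
    # Share of the SPDK peak that io_uring_cmd reaches.
    print(f"io_uring_cmd / spdk:     {ratio('io_uring_cmd', 'spdk'):.0%}")
```

In these runs io_uring_cmd peaks roughly 17% above io_uring and reaches about 60% of the SPDK NVMe driver's peak, which is the gap the "Why the gap?" discussion addresses.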

Comparison: bdev implementations
Compare the following
- bdev_xnvme vs bdev_uring
- bdev_xnvme vs bdev_aio
- bdev_xnvme with io-mechanisms: libaio / io_uring / io_uring_cmd
Using bdevperf
- Compare single-device qd=1 for a sense of overhead
- Compare single-device qd=128 for a sense of scale
Provide the data to motivate next steps for bdev_xnvme

Comparison: SPDK bdevs using libaio
bdev_xnvme vs bdev_aio, with bdev_xnvme: io_mechanism=libaio
[Figure: bdev_aio vs bdev_xnvme, 1 device and 8 devices.]
- bdev_xnvme at scale with bdev_aio

Comparison: SPDK bdevs using io_uring
bdev_xnvme vs bdev_uring, with bdev_xnvme: io_mechanism=io_uring
[Figure: bdev_uring vs bdev_xnvme, 1 device and 8 devices.]
- bdev_xnvme at scale with bdev_uring
- bdev_xnvme "out-scales" bdev_uring with IOPOLL enabled

Comparison: SPDK bdev using io_uring_cmd, single device
bdev_xnvme vs bdev_uring, with bdev_xnvme: io_mechanism=io_uring_cmd
[Figure: 1 device; panels comparing bdev_xnvme (io_uring_cmd) with bdev_uring, and bdev_xnvme (io_uring_cmd) with bdev_xnvme (io_uring).]

Comparison: SPDK bdev using io_uring_cmd, multiple devices
bdev_xnvme vs bdev_uring, with bdev_xnvme: io_mechanism=io_uring_cmd
[Figure: 8 devices; panels comparing bdev_xnvme (io_uring_cmd) with bdev_uring, both with and without IOPOLL, and bdev_xnvme (io_uring_cmd) with bdev_xnvme (io_uring).]

What are next steps?

Next Steps: io_uring_cmd
Handles / Encapsulation
- I/O access-control matching file-permissions on /dev/ng*n*
- Disable CAP_SYS_ADMIN for identify-commands (ns, ns-cs, ctrlr, ctrlr-cs, etc.)
- Enable non-root access to device information such as maximum-data-transfer-size (MDTS), device properties
Communication
- Investigate potentials for large-block-sizes / hugepages
- Investigate DMA pre-mapping
(Annotation on the final build of this slide: DONE)

Next Steps: bdev_xnvme
Efficiency: match the IOPS rate achieved by the other bdevs
- Exploring opportunities to enable batching
- Performance "policy", e.g. "conserve_cpu" to disable optimizations
- Otherwise: auto-enable io_uring optimizations where applicable and gracefully degrade in case of lacking system support
Functionality
- NVM commands: Write Zeroes, Flush
- ZNS commands: (Zone Management Send/Receive)
- Deployment on Windows (IOCP and IORING)
Broaden SPDK deployment while matching interface efficiency
(Annotation on the final build of this slide: exceed)

Next Steps: xNVMe
Currently supported
- IORING_SETUP_IOPOLL | SQPOLL | SINGLE_ISSUER
- Resource-registration (files)
- Batching: done on behalf of the user via delayed submission
Currently missing
- IORING_SETUP_COOP | DEFER_TASKRUN
- Resource-registration (buffers, rings)
- General optimizations: sqe-reuse, alignment, command-construction

So, what does it mean for SPDK?
The xNVMe bdev shows promise of encapsulating the Linux kernel NVMe interface for the bdev abstraction
- Single bdev to handle libaio, io_uring, and io_uring_cmd
- Single bdev to handle zone-management
A wider range of deployment of SPDK Applications
Closer collaboration and integration of storage eco-systems
What does it mean for the SPDK NVMe driver?

Thanks!
Collaboration
- Reproducing io_uring_cmd vs SPDK NVMe benchmarks
- Linux Kernel io_uring_cmd optimizations
- SPDK bdev_xnvme optimizations and functional expansion
- xNVMe optimization and functional expansion
Link to previous presentation at SPDK Virtual Forum 2022: https://youtu.be/aYALmcP6PDU?si=H-TC_CJWgERzrd8W
Contact
- SPDK Slack Channels: https:/spdk-
- GOST/xNVMe Discord: https://discord.gg/XCbBX9DmKf