Massively scalable storage for stateful containers on Azure
Malcolm Smith, Principal Software Engineer | Vybava Ramadoss, Principal Product Manager
Virtual Conference, September 28-29, 2021
2023 SNIA. All Rights Reserved.

Agenda
- Stateful workloads challenges
- Azure Container Storage overview
- Architecture deep dive and extensibility
- Elastic SAN overview
- Scaling with Azure Elastic SAN

Stateful workloads running on containers
Run large-scale stateful container workloads with scalable, performant, available, and cost-effective storage.
Stateful workloads: PostgresDB, MySQL, Cassandra, Kafka, Spark, Kubeflow, ElasticSearch, mongoDB, Redis

Challenges
- Existing solutions were built for IaaS-centric architectures and retrofitted to containers
- Unable to match the scale-out speed and target of containers (pods)
- Slow pod failover, resulting in degraded availability of stateful containers
- Limited coverage of available storage offerings

Azure Container Storage overview

Why Azure Container Storage
- Reduced Total Cost of Ownership (TCO)
- Kubernetes-native volume orchestration
- Rapid scale out and fast failover
- Unified management experience for the storage of your choice
Industry's first platform-managed container storage offering
[Diagram: Azure Kubernetes cluster with pods and PVs served by Azure Container Storage. Backing storage (RWO): Azure Disk, Ephemeral Disk, Azure Elastic SAN.]

Inside Azure Container Storage
[Diagram: Azure Kubernetes Service (AKS) cluster with pods and PVs. Azure Container Storage storage pools over Ephemeral Disk (NVMe), Azure Disks, and Azure Elastic SAN volumes. Attach paths labeled iSCSI, NVMe-oF, and direct attach to node.]

Use Persistent Volumes with Containers
1. Azure Container Storage: Enable Extension (AKS Cluster) -> Extension Deployed
2. Storage Pool: Pool Name, Pool Capacity, Pool Resource -> Storage Pool Created
3. Persistent Volume Claim (PVC): PVC Name, PVC Capacity -> PVC Deployed
4. Persistent Volume (PV): Prepare Pod (PVC, Mount Path), Deploy Pod -> PV Mounted
Roles shown: Operators, Developers

Example

Storage Pool Definition (the apiVersion value is missing in the source text; a detached "v1beta1" appears on the slide):

    apiVersion:
    namespace: acstor
    spec:
      poolType:
        azureDisk:
      resources:
        requests:
          storage: 1Ti

Storage Classes (the Provisioner and Parameters values are missing in the source text):

    kubectl describe sc acstor-azuredisk
    Provisioner:
    Parameters:

Stateful Sets:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: statefulset-kafka
    ...
    spec:
      ...
      volumeClaimTemplates:
      - metadata:
          name: persistent-storage
        spec:
          storageClassName: acstor-azuredisk
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
Azure Container Storage deep dive

Deep Dive
- Protocols: guest-based attach
- Data services: extend storage provider capabilities
- Volume provisioner: shared provisioning
- Capacity provisioner: handle multiple backends
- Backing storage: provides perf/scale options
- Storage Pool / Resource Provider: externalize state
Other labels on the diagram: NVMe-oF, iSCSI; Replication, Encryption, Snapshots; Exposed as File; Connect with CSI Drivers; Local NVMe, Azure Disk, Azure Elastic SAN, Other; Metadata

Layering on the data services
Volume snapshots
- Aligned with the CSI Snapshot API
- Works with the volume provisioner to restore snapshots
- Instant snapshot create and read using copy-on-write
Replication
- Storage pool configured with 1+ replicas
- Replication engine syncs writes using an n/2+1 quorum
- Round-robin reads across replicas
- Checks for data integrity
[Diagram: Storage Pool with Volume1, "Create volume snapshot", Change1, VolumeSnapshot1]
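The replication bullets above (n/2+1 write quorum, round-robin reads) can be illustrated with a small in-memory sketch. This is not Azure Container Storage code: the `Replica` and `ReplicatedVolume` classes, the `online` flag, and the block-address keys are hypothetical, and resync, rollback of partial writes, and integrity checks are deliberately left out.

```python
from itertools import count


class Replica:
    """Hypothetical in-memory stand-in for one volume replica."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.online = True

    def write(self, lba, data):
        # An offline replica cannot acknowledge the write.
        if not self.online:
            return False
        self.blocks[lba] = data
        return True


class ReplicatedVolume:
    """Sketch of quorum writes and round-robin reads (assumed model)."""

    def __init__(self, replica_count):
        self.replicas = [Replica(f"replica{i + 1}") for i in range(replica_count)]
        # Majority quorum: n/2 + 1 using integer division.
        self.quorum = replica_count // 2 + 1
        self._next_read = count()

    def write(self, lba, data):
        # The write succeeds only if a majority of replicas acknowledge it.
        # Simplification: replicas that did accept a failed write keep the data.
        acks = sum(replica.write(lba, data) for replica in self.replicas)
        return acks >= self.quorum

    def read(self, lba):
        # Rotate the starting replica on every read; skip replicas that are
        # offline or do not hold the block.
        start = next(self._next_read)
        total = len(self.replicas)
        for offset in range(total):
            replica = self.replicas[(start + offset) % total]
            if replica.online and lba in replica.blocks:
                return replica.name, replica.blocks[lba]
        raise KeyError(lba)


volume = ReplicatedVolume(3)
volume.write(7, b"payload")
print([volume.read(7)[0] for _ in range(4)])
```

With three replicas the quorum is two, so the volume keeps accepting writes with one replica offline and rejects them once two are offline.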
[Diagram: Storage Pool with Volume Replica1, Volume Replica2, Volume Replica3; VolumeSnapshot2]

Intelligent placement with capacity provisioner
Using storage pools
- A storage pool aggregates capacity and performance across homogeneous storage in the cluster
- Dynamic placement decisions based on application performance and availability needs
- Enables parameterized capacity requests that map to a storage pool with the matching capability
  E.g., a 100 GB, 20K IOPS volume can be served from a Premium storage pool
- Understands topology, including multi-zone pools
- Provides a consistent experience by abstracting backing storage and adding capabilities as needed
[Diagram: Azure Kubernetes cluster spanning Zone 1, Zone 2, and Zone 3, with a storage pool, a volume, and a pod]

Extensible to support multiple backing storage
[Diagram: three Azure Container Storage stacks, each with a persistent volume, an NVMe-oF target, and compression, encryption, replication, and thin provisioning. Backing media labeled Local SSD/NVMe, Remote SSD/NVMe, Remote SSD/NVMe. Categories labeled Local Disks, Remote Disks, SAN.]
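The parameterized capacity request from the "Intelligent placement with capacity provisioner" slide (a 100 GB, 20K IOPS volume served from a Premium pool) can be sketched as a filter over candidate pools. Everything here is an assumption for illustration: the pool names, the free-capacity numbers, the zone labels, and the first-fit policy are hypothetical and do not describe the actual provisioner.

```python
from dataclasses import dataclass


@dataclass
class StoragePool:
    """Hypothetical view of a pool's remaining capacity and performance."""
    name: str
    zones: frozenset
    free_gb: int
    free_iops: int


@dataclass
class VolumeRequest:
    """Parameterized request: size, performance, and zone of the pod."""
    size_gb: int
    iops: int
    zone: str


def place(request, pools):
    """First-fit placement: pick the first pool that covers zone, GB, and IOPS."""
    for pool in pools:
        if (request.zone in pool.zones
                and pool.free_gb >= request.size_gb
                and pool.free_iops >= request.iops):
            # Reserve the resources so later requests see the reduced headroom.
            pool.free_gb -= request.size_gb
            pool.free_iops -= request.iops
            return pool
    return None


# Illustrative numbers only.
pools = [
    StoragePool("standard", frozenset({"zone1", "zone2"}), free_gb=1000, free_iops=5_000),
    StoragePool("premium", frozenset({"zone1", "zone2", "zone3"}), free_gb=1000, free_iops=50_000),
]
chosen = place(VolumeRequest(size_gb=100, iops=20_000, zone="zone1"), pools)
print(chosen.name if chosen else "no pool can serve this request")
```

The standard pool has enough capacity but not enough IOPS, so the request falls through to the premium pool. A real provisioner would presumably also weigh availability needs, which this sketch ignores.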
Azure Elastic SAN overview

Inside Azure Container Storage
[Diagram, repeated: AKS cluster with pods and PVs. Azure Container Storage storage pools over Ephemeral Disk (NVMe), Azure Disks, and Azure Elastic SAN volumes. Attach paths labeled iSCSI, NVMe-oF, and direct attach to node.]

SAN
Why use SAN for Container Storage?
What is a traditional SAN?
- Pool of capacity and IOPS for a single group, accessible to any workload
- "Group" might be a company, department, or anything
- Workloads that shift within the "group" are interchangeable
- If the whole "group" is idle, we can do intelligent things like deduplication
Isn't that a cloud?
- Clouds support multiple "groups"
- The goal is to ensure consistent and high resource utilization
- Shifting workloads trigger load balancing operations
Dynamic workloads want interchangeable storage
[Diagram: volumes spread across Storage Server, Storage Server, ... Storage Server n]

Introducing Azure Elastic SAN
Industry's first fully managed SAN offering in the cloud
- Deploy, manage, and host workloads on Azure with an end-to-end experience similar to an on-premises SAN
- Bulk provision storage to achieve massive scale (millions of IOPS, double-digit GB/s)
- Simplified provisioning, scaling, and access management, with redundancy built in
- Support for the standard industry protocol (iSCSI) for data access
A brand-new cloud-native SAN solution: rather than manage individual disks for each of your workloads, save time and money with a cloud-native SAN.
[Diagram comparing an on-prem SAN with Azure Elastic SAN. Labels: SAN Appliance, Network Endpoints/Data Protection Groups, Volume; Elastic SAN, Volume Group, Volume; Billing, Provisioning, Access Management, Storage]

MICROSOFT CONFIDENTIAL

Elastic SAN
Provisioning resources and billing
- Billed on provisioned storage for capacity and performance
- Operations include: create Elastic SAN; update provisioned resources in base or capacity-only scale units; delete Elastic SAN
- Two provisioning units: base and capacity-only
[Diagram: Elastic SAN, Volume Group, Volume]
Volume Group
Applying security, encryption, and data protection configurations
- Configurations on a volume group apply to all volumes within the group
- An Elastic SAN can have up to 20 volume groups
- Operations include: create volume group; update network configurations; update encryption settings; delete volume group
[Diagram: Elastic SAN, Volume Group, Volume]

Volume
Data storage, the actual storage unit
- Read/write access over iSCSI
- Dynamic pooling of provisioned performance targets
- Operations include: create volume; update volume size to scale up performance or capacity; delete volume
[Diagram: Elastic SAN, Volume Group, Volume]

Azure Elastic SAN
Enabling container-native scale
- Share provisioned performance, achieve cost efficiency at scale
- Integrate with Azure Container Storage via iSCSI*
- Simplify volume management through grouping
*preview expected later this year
[Diagram: 1 TiB Elastic SAN with 5K IOPS provisioned; volume group whose configuration applies to its volumes; 1+ GiB per volume; a pod's PV connected to a volume over iSCSI]

Azure Elastic SAN deep dive

Inside Azure Storage (Recap)
Front-end layer
- Protocol endpoint
- Authentication/authorization
- Metrics/logging
Partition layer
- Understands and manages our data abstractions
- Massively scalable key/value store
- Key ranges assigned to servers
Stream layer (distributed file system)
- Data persistence and replication (JBOD)
- Append-only file system
Azure Storage multi-protocol support
[Diagram: layer stack (front end layer, protocol layer, service layer, partition layer). Protocols shown: REST API, iSCSI, others. Operations shown: Put, Get, Delete. Annotations: authentication, throttling, logging; storage operations business logic; protocol frontend, versioning.]

Scaling with networked storage
Each network connection to a different front end
- Not limited by IOPS from a single storage server
- MPIO is well supported
Ubiquitous initiator support with iSCSI
- Integrate with Linux and Windows nodes
- Not limited by "local" SCSI bus under a VM
- Can have far more storage devices attached
- Important for a VM hosting many containers
Expansion to other protocols in future
- Particularly for encryption-in-transit
- Talk to me about TLS-PSK
[Diagram: pods on an AKS cluster node connected to storage over iSCSI]

iSCSI target scaling with Elastic SAN
How do we scale containers with SAN?
- Containers may have a short lifetime
- Network connections are relatively fast to establish and terminate
- From a user perspective, authentication counts as part of connection cost
- Administratively simpler to have a fleet of containers and a pool of storage
- Should be able to allocate IOPS and capacity based on demand
[Diagram: pods in an Azure Kubernetes cluster reaching Azure Elastic SAN volumes over iSCSI through Azure Container Storage; labeled "Secure access"]

Making storage interchangeable
Solution: sharding!
- If all storage exists on the same servers, shifting workloads won't shift loads
- A frontend protocol (iSCSI or NVMe) can redirect requests as needed
- Sharding normally creates challenges for global state (think snapshots)
- But here we have a single frontend
[Diagram: Elastic SAN, volume group, metadata; Shard 1, Shard 2, ... Shard N; volumes in Storage tenant 1 and Storage tenant N; labeled "Cross tenant"]
I lied.
Depending on workload, one TCP connection might not be enough.
- Scaling "out": lots of workloads, one connection each
- Scaling "up": many connections for one workload
- Unlike a traditional SAN, each connection is managed by a different physical server
- But the total count remains manageable; it's possible to coordinate
[Diagrams: an Azure Kubernetes cluster with three pods and three Azure Elastic SAN volumes behind Azure Container Storage; and a cluster with one pod and one volume]

Shard creation and deletion
- Shards have defined names, distributed via hash
- Shards can be created on demand
  - Passing all state to describe a shard on every write is expensive
  - Instead, if a shard is being created, it can query state from the volume
- Shard deletion occurs via background scans
  - If a volume is gone, the shard records that it has expired
[Diagram: volume with metadata; Shard 1, Shard 2, Shard N, and Shard N+1; labeled "Query metadata"]
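The shard lifecycle on the "Shard creation and deletion" slide can be sketched as follows: shards get a deterministic name, the name is hashed to choose a server, a shard is created on first write by querying the volume's metadata once, and a background scan marks shards of deleted volumes as expired. The naming scheme, the SHA-256 modulo placement, and the `Volume` and `ShardStore` classes are hypothetical choices for illustration, not the Elastic SAN implementation.

```python
import hashlib


class Volume:
    """Hypothetical volume record that owns the authoritative metadata."""

    def __init__(self, volume_id, size_gib):
        self.volume_id = volume_id
        self.metadata = {"size_gib": size_gib}


class ShardStore:
    """Sketch of on-demand shard creation and scan-based expiry."""

    def __init__(self, server_count, volumes):
        self.server_count = server_count
        self.volumes = volumes      # volume_id -> Volume
        self.shards = {}            # shard name -> shard state

    @staticmethod
    def shard_name(volume_id, index):
        # Defined name (assumes volume IDs contain no "/").
        return f"{volume_id}/shard-{index}"

    def server_for(self, name):
        # Distribute shards by hashing the name (assumed: SHA-256, modulo).
        digest = hashlib.sha256(name.encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.server_count

    def write(self, volume_id, index, data):
        name = self.shard_name(volume_id, index)
        shard = self.shards.get(name)
        if shard is None:
            # Created on demand: query the volume's state once instead of
            # carrying the full shard description on every write.
            meta = dict(self.volumes[volume_id].metadata)
            shard = {"server": self.server_for(name), "meta": meta,
                     "data": [], "expired": False}
            self.shards[name] = shard
        shard["data"].append(data)
        return shard["server"]

    def background_scan(self):
        # Deletion path: a shard whose volume is gone records that it expired.
        for name, shard in self.shards.items():
            volume_id = name.split("/", 1)[0]
            if volume_id not in self.volumes:
                shard["expired"] = True


volumes = {"vol-a": Volume("vol-a", size_gib=10)}
store = ShardStore(server_count=4, volumes=volumes)
store.write("vol-a", 0, b"first")
store.write("vol-a", 0, b"second")
del volumes["vol-a"]
store.background_scan()
print(store.shards["vol-a/shard-0"]["expired"])
```

Because the name alone determines the server, any frontend can locate a shard without a central lookup, which is one plausible reading of "distributed via hash".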
Per Pool scaling up
- Each volume can be created on a different tenant
- Each volume is physically stored across a set of storage servers
- Each connection can be served by a different server
- Result: millions of IOPS, double-digit GBps, single-digit millisecond latencies
[Diagram: Pod, Azure Container Storage, Volume; Storage tenant with three Servers]

Thank You
Please reach out to us at askcontainerstorage

Please take a moment to rate this session. Your feedback is important to us.