张雪萌, 杨岱

Multi-Instance GPU (MIG): Best-Practice Examples for Deep Learning

AGENDA
Introduction to MIG (Multi-Instance GPU)
MIG management
Kubernetes support for MIG
MIG for deep learning
    Training
    Fine-tuning
    Inference with Triton
    Mixed Workloads

Motivation
Why we use Multi-Instance GPU (MIG)
Why? To maximize GPU utilization.
When? When your application cannot fully utilize a single GPU.
How? Use MIG to run multiple workloads in parallel on a single A100 GPU. One GPU can serve a single user running multiple applications, or multiple users.

MULTI-INSTANCE GPU (MIG)
Optimize GPU Utilization, Expand Access to More Users with Guaranteed Quality of Service
Up to 7 GPU Instances in a Single A100: Dedicated SMs, memory, L2 cache, and bandwidth for hardware QoS and isolation.
Simultaneous Workload Execution with Guaranteed Quality of Service: All MIG instances run in parallel with predictable throughput and latency.
Right-Sized GPU Allocation: Differently sized MIG instances based on target workloads.
Diverse Deployment Environments: Sup