Topology-Aware GPU Scheduling for Learning Workloads in Cloud Environments
Event Type: Paper
Time: Tuesday, November 14th, 4pm - 4:30pm
Location: 405-406-407
Description: Recent advances in hardware, such as multi-GPU systems and their availability in the cloud, are enabling deep learning in domains including health care, autonomous vehicles, and the Internet of Things. Multi-GPU systems exhibit complex connectivity among GPUs and between GPUs and CPUs. Workload schedulers must therefore consider both hardware topology and workload communication requirements when allocating CPU and GPU resources, in order to achieve optimal execution time and improved utilization in shared cloud environments.
This paper presents a new topology-aware workload placement strategy for scheduling deep learning jobs on multi-GPU systems. The strategy is evaluated with a prototype on a POWER8 machine with Tesla P100 cards, showing speedups of up to 1.30x compared to state-of-the-art strategies; the proposed algorithm achieves this by allocating GPUs that satisfy workload requirements while preventing interference. Additionally, a large-scale simulation shows that the proposed strategy provides higher resource utilization and performance in cloud systems.
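As a rough illustration of what topology-aware placement involves (a minimal sketch, not the paper's algorithm: the link bandwidths, the 4-GPU layout, and the function name place_job below are invented for illustration), a scheduler can model the machine as a link-bandwidth matrix and pick the free GPU set whose slowest intra-job link is fastest:

    from itertools import combinations

    # Illustrative per-link bandwidths in GB/s (assumed values, not measurements).
    NVLINK, SMP = 80.0, 38.0

    # Hypothetical 4-GPU topology: GPUs (0,1) and (2,3) are NVLink-connected
    # pairs; traffic between the pairs crosses a slower inter-socket (SMP) bus.
    BANDWIDTH = [
        [0.0,    NVLINK, SMP,    SMP],
        [NVLINK, 0.0,    SMP,    SMP],
        [SMP,    SMP,    0.0,    NVLINK],
        [SMP,    SMP,    NVLINK, 0.0],
    ]

    def place_job(num_gpus, busy):
        """Pick the free GPU set whose slowest intra-job link is fastest.

        Skipping GPUs in `busy` is a crude stand-in for interference
        avoidance: the job never shares a device with a running job.
        """
        free = [g for g in range(len(BANDWIDTH)) if g not in busy]
        best, best_score = None, -1.0
        for combo in combinations(free, num_gpus):
            if len(combo) < 2:
                score = float("inf")  # a single GPU has no intra-job traffic
            else:
                score = min(BANDWIDTH[a][b] for a, b in combinations(combo, 2))
            if score > best_score:
                best, best_score = combo, score
        return best  # None if fewer than num_gpus GPUs are free

    # With GPU 0 busy, a 2-GPU job lands on the NVLink pair (2, 3) rather
    # than splitting its traffic across the slower SMP bus.
    print(place_job(2, busy={0}))  # -> (2, 3)

Scoring by the worst-case link reflects that collective operations in multi-GPU training are bottlenecked by the slowest path between any two participating GPUs.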