P05: ooc_cuDNN: A Deep Learning Library Supporting CNNs over GPU Memory Capacity
Session: Poster Reception
Authors
Event Type: ACM Student Research Competition, Poster, Reception
Time: Tuesday, November 14th, 5:15pm - 7pm
Location: Four Seasons Ballroom
Description: GPUs are widely used to accelerate deep learning with
convolutional neural networks (CNNs). However, because GPU
memory capacity is limited, it is difficult to implement
efficient programs that compute large CNNs on a GPU. This
poster describes the design and implementation of the
out-of-core cuDNN (ooc_cuDNN) library, which supports
computing CNNs that exceed GPU memory capacity by using
CPU memory. ooc_cuDNN is an extension of cuDNN, a
high-performance and popular deep learning library.
ooc_cuDNN uses a performance model to decide how to divide
the CNN computation so as to improve performance. In addition,
ooc_cuDNN provides fused functions that combine several
computations to reduce extra communication. With
ooc_cuDNN, we successfully computed a CNN requiring more
than 60 GB of memory on a single GPU with 16 GB of memory.
Compared with an in-core run using cuDNN, the performance
degradation was 13%.
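
The idea of dividing a computation so that each piece fits in GPU memory can be illustrated with a small sketch. This is not ooc_cuDNN's API or its performance model: the description does not say along which dimension the library divides the work, so the sketch assumes a split along the mini-batch dimension, and the names `plan_chunks`, `split_batch`, and `reserve_fraction` are hypothetical. It only computes how many pieces are needed and where the piece boundaries fall; it does not model transfer cost or overlap of communication with computation.

```python
import math

GB = 10**9  # decimal gigabytes, as used in the description


def plan_chunks(total_bytes, device_bytes, reserve_fraction=0.1):
    """Return the smallest number of equal pieces that each fit on the device.

    reserve_fraction is a hypothetical safety margin for workspace and
    other allocations; it is an assumption, not a value from the poster.
    """
    usable = device_bytes * (1.0 - reserve_fraction)
    if usable <= 0:
        raise ValueError("no usable device memory")
    return max(1, math.ceil(total_bytes / usable))


def split_batch(batch_size, num_chunks):
    """Split range(batch_size) into contiguous (start, stop) slices.

    Slice sizes differ by at most one sample.
    """
    num_chunks = min(num_chunks, batch_size)
    base, extra = divmod(batch_size, num_chunks)
    slices = []
    start = 0
    for i in range(num_chunks):
        size = base + (1 if i < extra else 0)
        slices.append((start, start + size))
        start += size
    return slices


# Sizes taken from the description: 60 GB of data, 16 GB of GPU memory.
num_chunks = plan_chunks(60 * GB, 16 * GB)
slices = split_batch(256, num_chunks)  # 256 is an arbitrary example batch size
print(num_chunks, slices)
```

In a real out-of-core scheme each slice would be copied from CPU memory to the GPU, processed, and copied back, ideally with transfers overlapped with computation; choosing the split so that this overlap pays off is the kind of decision the description attributes to the library's performance model.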




