P42: TRIP: An Ultra-Low Latency, TeraOps/s Reconfigurable
Inference Processor for Multi-Layer Perceptrons
Session: Poster Reception
Event Type: ACM Student Research Competition Poster, Reception
Time: Tuesday, November 14th, 5:15pm - 7pm
Location: Four Seasons Ballroom
Description: The Multi-Layer Perceptron (MLP) is one of the most
commonly deployed Deep Neural Networks, representing 61% of the
workload in Google data centers. MLP inference, a memory-bound
problem, typically has hard response-time deadlines and favors
latency over throughput. In our work, we designed a TeraOps/s
Reconfigurable Inference Processor for MLPs (TRIP) on FPGAs that
alleviates the memory bottleneck by storing all application-specific
weights on-chip. It can be deployed in multiple configurations,
including host-independent operation. We have shown that TRIP
achieves 60x better performance than the current state-of-the-art
Google Tensor Processing Unit (TPU) for MLP inference. TRIP was
demonstrated on the cancer patient datasets used in the CANDLE
project of the Exascale Computing Project (ECP).
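
For context, a plain MLP inference pass is a chain of matrix-vector products with an activation between layers. The sketch below is illustrative only (the layer sizes and ReLU activation are assumptions, not TRIP's design); it shows why the workload is memory bound: each weight is read from memory but used only once per inference, which is the off-chip traffic TRIP avoids by holding all weights on-chip.

```python
# Minimal MLP inference sketch (illustrative only, not the TRIP implementation).
# Layer sizes and the ReLU activation are assumed for demonstration.
import numpy as np

layer_sizes = [4096, 2048, 2048, 1024]  # hypothetical MLP shape
rng = np.random.default_rng(0)
weights = [rng.standard_normal((m, n)).astype(np.float32)
           for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(m, dtype=np.float32) for m in layer_sizes[1:]]

def mlp_inference(x):
    # Each layer is a matrix-vector product: every weight is touched exactly
    # once per inference, so fetching weights from off-chip memory dominates
    # latency unless they are kept on-chip.
    for W, b in zip(weights, biases):
        x = np.maximum(W @ x + b, 0.0)  # ReLU activation (assumed)
    return x

y = mlp_inference(rng.standard_normal(layer_sizes[0]).astype(np.float32))
print(y.shape)  # (1024,)
```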




