P69: Portable Methods for Measuring Cache Hierarchy Performance
Session: Poster Reception
Event Type: ACM Student Research Competition, Poster, Reception
Time: Tuesday, November 14th, 5:15pm - 7pm
Location: Four Seasons Ballroom
Description: There has been a recent influx of different processor
architecture designs into the market, with many of them
targeting HPC applications. When estimating application
performance, developers are used to considering the most
common figures of merit, such as peak FLOP/s, memory
bandwidth, core counts, and so on. In this study, we
present a detailed comparison of on-chip memory
bandwidths, including single core and aggregate across a
node, for a set of next-generation CPUs.
We do this in a way that is portable across different architectures and instruction sets. Our study indicates that two processors that look superficially similar in the common figures of merit can have radically different on-chip memory bandwidths, a fact which may be crucial to understanding observed application performance. Our results and methods will be made available on GitHub to aid the community in evaluating cache bandwidths.




