Evaluating Shared-Cache Performance with Microbenchmarks and Reuse Distance Analysis
The emergence of multicore architectures has opened up new opportunities for thread-level parallelism and dramatically increased the theoretical peak performance of current systems. However, achieving a high fraction of peak performance requires careful orchestration of many architecture-sensitive parameters. In particular, the presence of shared caches on multicore architectures makes it necessary to consider issues of parallelism and data locality in concert. This research evaluates the shared-cache performance of several scientific kernels. A synthetic microbenchmark, together with hardware performance counter measurements, is used to estimate cache sharing among multiple threads in parallel applications. A novel reuse-distance-based algorithm is developed to identify correlations between reuse distance patterns and shared-cache utilization.
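For readers unfamiliar with the metric, reuse distance is commonly defined as the number of distinct addresses accessed between two consecutive accesses to the same address. The sketch below illustrates that standard definition only; it is not the algorithm developed in the thesis, and the function names `reuse_distances` and `lru_hits`, as well as the fully associative LRU cache model, are assumptions made for illustration.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return one reuse distance per access; None marks a first-time access.

    Illustrative only: uses the textbook definition (distinct addresses
    touched since the previous access to the same address).
    """
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Addresses after addr in recency order were touched since its last use.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances


def lru_hits(distances, capacity):
    """Count hits in a hypothetical fully associative LRU cache of `capacity` lines.

    Under this assumed model, an access hits when its reuse distance is
    smaller than the capacity; first-time accesses always miss.
    """
    return sum(1 for d in distances if d is not None and d < capacity)


if __name__ == "__main__":
    trace = list("abcabb")
    dists = reuse_distances(trace)
    print(dists)
    print(lru_hits(dists, capacity=2))
```

This version scans the recency list on every access, so it costs time proportional to the number of distinct addresses per access; it is meant to make the definition concrete rather than to scale to long traces.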
Vara, S. (2011). Evaluating shared-cache performance with microbenchmarks and reuse distance analysis (Unpublished thesis). Texas State University-San Marcos, San Marcos, Texas.