Automated Compiler Driven Superpage Allocation and its Applications
Magee, Joshua A.
The translation look-aside buffer (TLB) can represent a significant performance bottleneck in modern microprocessor-based systems. The amount of memory available to a system is continuously increasing due to the abundance and affordability of RAM, yet the size of the TLB has grown very little. The increasing ratio of memory pages to TLB entries has resulted in an increase in TLB misses. Given that TLB misses present a substantial bottleneck to system performance, the need to reduce the pressure placed upon the TLB is well justified. Superpages are one method that aims to extend the reach of the TLB and therefore reduce the number of misses. Superpages are supported at both the hardware and software level on most modern microprocessor-based systems. Previous research has studied the usage, management, and implications of superpages from an architectural and operating system perspective, but there has been no research on superpages from the compiler perspective. This thesis presents a strategy for compiler-driven superpage allocation. Judicious usage of superpages can improve system performance by reducing the number of TLB misses, but indiscriminate superpage allocation can result in page fragmentation and an increased application footprint. A significant advantage afforded by a compiler-driven superpage allocation strategy is the availability of data-reuse information within an application, a luxury that architectural and operating system approaches lack. The compiler strategy employs data-locality analysis to estimate the TLB demands of a program and uses this information to allocate superpages only when beneficial. If the compiler determines that it is prudent to use superpages, then an optimization is performed that replaces all memory allocation calls with calls to a custom malloc implementation. This malloc implementation is superpage-aware and supports both statically and dynamically determined superpage allocation.
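The trade-off described above, between TLB reach and footprint growth, can be illustrated with a minimal sketch. It is not the thesis's analysis: the page sizes, the waste threshold, and the function names `tlb_reach`, `rounding_waste`, and `should_use_superpages` are all assumptions chosen for illustration. TLB reach is taken to be the number of entries multiplied by the page size, and superpages are chosen only when the working set exceeds base-page reach and rounding up to whole superpages wastes little memory.

```python
# Hypothetical sketch of a "superpages only when beneficial" decision.
# All constants and names are illustrative assumptions, not the thesis's method.

BASE_PAGE = 4 * 1024           # assumed 4 KiB base page
SUPERPAGE = 2 * 1024 * 1024    # assumed 2 MiB superpage


def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of memory the TLB can map at once with the given page size."""
    return entries * page_size


def rounding_waste(size: int, page_size: int) -> int:
    """Bytes lost when `size` is rounded up to a whole number of pages."""
    pages = -(-size // page_size)  # integer ceiling division
    return pages * page_size - size


def should_use_superpages(working_set: int, tlb_entries: int,
                          max_waste_fraction: float = 0.25) -> bool:
    """Return True when superpages look profitable under this toy model."""
    # If base pages already cover the working set, superpages add no reach
    # that the program needs.
    if working_set <= tlb_reach(tlb_entries, BASE_PAGE):
        return False
    # Reject superpages when rounding would inflate the footprint too much.
    return rounding_waste(working_set, SUPERPAGE) <= max_waste_fraction * working_set


if __name__ == "__main__":
    for ws in (128 * 1024, 300 * 1024, 8 * 1024 * 1024):
        print(ws, should_use_superpages(ws, tlb_entries=64))
```

In this sketch the working-set estimate is simply passed in; in a compiler it would come from data-locality analysis, which the sketch does not attempt to model.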
In addition to the advantages afforded by the compiler when making judicious use of superpages, superpages also present optimization opportunities to the compiler. Compiler optimizations that attempt to reduce conflict misses, such as array padding, can benefit when used in conjunction with superpages. Because superpages provide predictable, contiguous allocation of memory, they increase the profitability of data-locality optimizations. Not only are superpages beneficial to application performance and compiler optimizations, but they can also help in benchmarking and empirical tuning. To this end, a method of utilizing superpages to measure certain hardware parameters, such as L2 cache associativity, is presented. The effectiveness of the strategy is demonstrated on two different platforms with different TLB configurations.
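Why contiguous allocation helps array padding can be sketched as follows. In a physically indexed cache, the set an address maps to is determined by address bits above the line offset. Within a superpage, the low-order bits of the virtual and physical address agree over a much wider range than within a base page (21 bits for a 2 MiB superpage versus 12 bits for a 4 KiB page), so a compiler can reason about set conflicts from the offsets it assigns. The cache geometry and the helper `cache_set` below are assumptions for illustration only and do not describe the platforms used in the thesis.

```python
# Hypothetical cache geometry, chosen only for illustration.
LINE_SIZE = 64                      # bytes per cache line
CACHE_SIZE = 512 * 1024             # total cache capacity in bytes
WAYS = 8                            # associativity
NUM_SETS = CACHE_SIZE // (LINE_SIZE * WAYS)   # 1024 sets


def cache_set(addr: int) -> int:
    """Set index of a byte address in a physically indexed cache."""
    return (addr // LINE_SIZE) % NUM_SETS


# Two arrays placed 1 MiB apart inside one contiguous region: their first
# elements map to the same set, so element-wise traversal can conflict.
a_base = 0
b_base = 1 * 1024 * 1024
same_set = cache_set(a_base) == cache_set(b_base)

# Padding the second array by one cache line moves it to the next set.
b_padded = b_base + LINE_SIZE
shifted = cache_set(b_padded) != cache_set(a_base)

if __name__ == "__main__":
    print(NUM_SETS, same_set, shifted)
```

With base pages, the bits used for the set index above the page offset depend on how the operating system places each physical page, so the same padding decision would be less predictable.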
Compilers, Buffer storage, Microprocessors
Magee, J. A. (2009). <i>Automated compiler driven superpage allocation and its applications</i> (Unpublished thesis). Texas State University-San Marcos, San Marcos, Texas.