Master of Science (MS)
Performance analysis is an essential step in software optimization, which is critical for embedded systems, desktop applications, and scientific computing. Most modern microprocessors contain hardware performance counters that can help with this analysis. The PAPI library is a widely used performance measurement interface that supports the performance counter hardware found in most major microprocessors. PAPI supports self-monitoring, which lets programs instrument regions of their own code and gather detailed performance values.
A key aspect of self-monitoring is reading hardware performance counters with the minimum possible overhead, because any overhead in the measurements can affect the accuracy of the results. In perf_event, the Linux interface to performance counters, the values are read via the read() system call. This incurs a large overhead from entering and exiting the operating system kernel.
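The read()-based path described above can be sketched roughly as follows. This is a minimal illustration written against the Linux perf_event_open interface, not the code PAPI itself uses; the helper names (open_instruction_counter, read_counter, measure_loop), the choice of the retired-instructions event, and the toy loop are all assumptions made for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: open a user-space retired-instructions counter
 * for the calling thread. Returns a file descriptor, or -1 when the
 * kernel or platform does not allow it. */
static int open_instruction_counter(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.disabled = 1;        /* start stopped, enable explicitly below */
    attr.exclude_kernel = 1;  /* count user-space instructions only */
    attr.exclude_hv = 1;
    /* glibc has no wrapper for perf_event_open, so use syscall(). */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
}

/* Each call here is one full kernel entry and exit: this is the
 * overhead the abstract refers to. Returns 0 on success, -1 on error. */
static int read_counter(int fd, uint64_t *value)
{
    ssize_t n = read(fd, value, sizeof(*value));
    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}

/* Hypothetical example: measure a toy loop with two read() calls.
 * Returns 0 and stores the difference in *delta, or -1 if the counter
 * is unavailable (in which case *delta is left untouched). */
static int measure_loop(uint64_t *delta)
{
    assert(delta != NULL);
    int fd = open_instruction_counter();
    if (fd < 0)
        return -1;

    uint64_t before = 0, after = 0;
    volatile uint64_t sink = 0;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    int rc = read_counter(fd, &before);
    for (uint64_t i = 0; i < 100000; i++)
        sink += i;            /* the instrumented region */
    if (rc == 0)
        rc = read_counter(fd, &after);
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    close(fd);

    if (rc != 0)
        return -1;
    *delta = after - before;
    return 0;
}
```

Calling measure_loop(&delta) returns -1 on systems where the counter cannot be opened, for example when perf_event_paranoid is restrictive or no hardware counters are exposed. The measured value also includes the instructions executed on the user-space side of the two read() calls, which is one reason a cheaper read path matters for short regions.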
In this work, we modify PAPI to use the rdpmc instruction, which allows userspace measurement of counters on x86 systems. This replaces the high-overhead read() system call. We tested the result on 14 modern systems with 4 benchmarks. We find that performance measurement latency improves by at least a factor of three (and often by a factor of six or more) in our test cases.
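As a rough sketch of what a userspace counter read looks like, the example below follows the general pattern documented for the perf_event mmap page: map the event's first page, check that the kernel permits rdpmc, and read the counter inside a sequence-lock retry loop. It is not the thesis's PAPI implementation; the helper names (rdpmc, read_counter_userspace, measure_loop_rdpmc), the chosen event, and the toy loop are assumptions for illustration, and the code falls back to returning -1 on non-x86 builds or when userspace reads are not permitted.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#define COMPILER_BARRIER() __asm__ volatile("" ::: "memory")

#if defined(__x86_64__) || defined(__i386__)
/* Execute rdpmc for the given hardware counter index. */
static inline uint64_t rdpmc(uint32_t counter)
{
    uint32_t lo, hi;
    __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(counter));
    return ((uint64_t)hi << 32) | lo;
}
#endif

/* Read the event's value without entering the kernel. Returns 0 on
 * success, -1 if userspace reads are not available for this event. */
static int read_counter_userspace(
    const volatile struct perf_event_mmap_page *pc, uint64_t *count)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t seq, idx, width;
    int64_t offset, pmc;
    do {
        seq = pc->lock;               /* sequence count guards the page */
        COMPILER_BARRIER();
        idx = pc->index;              /* 0 means "not currently on a PMC" */
        offset = pc->offset;          /* value accumulated by the kernel */
        width = pc->pmc_width;
        if (!pc->cap_user_rdpmc || idx == 0 || width == 0 || width > 64)
            return -1;
        /* Sign-extend the raw counter from its hardware width. */
        pmc = (int64_t)(rdpmc(idx - 1) << (64 - width)) >> (64 - width);
        COMPILER_BARRIER();
    } while (pc->lock != seq);        /* retry if the kernel updated it */
    *count = (uint64_t)(offset + pmc);
    return 0;
#else
    (void)pc;
    (void)count;
    return -1;
#endif
}

/* Hypothetical example: measure a toy loop using userspace reads only.
 * Returns 0 and stores the difference in *delta, or -1 if unavailable
 * (in which case *delta is left untouched). */
static int measure_loop_rdpmc(uint64_t *delta)
{
    assert(delta != NULL);
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;              /* counter starts enabled */

    int fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
    if (fd < 0)
        return -1;
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *page = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (page == MAP_FAILED) {
        close(fd);
        return -1;
    }

    uint64_t before = 0, after = 0;
    volatile uint64_t sink = 0;
    int rc = read_counter_userspace(page, &before);
    for (uint64_t i = 0; i < 100000; i++)
        sink += i;                    /* the instrumented region */
    if (rc == 0)
        rc = read_counter_userspace(page, &after);

    munmap(page, len);
    close(fd);
    if (rc != 0)
        return -1;
    *delta = after - before;
    return 0;
}
```

The setup (perf_event_open and mmap) still uses system calls, but it happens once; each subsequent read is a handful of loads plus one rdpmc instruction. The retry loop is needed because the kernel may reschedule the event onto a different hardware counter between the loads.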
Liu, Yan, "Optimizing PAPI for Low-Overhead Counter Measurement" (2017). Electronic Theses and Dissertations. 2803.