Date of Award
Winter 12-15-2017
Level of Access Assigned by Author
Campus-Only Thesis
Degree Name
Master of Science (MS)
Department
Computer Engineering
Advisor
Vincent Weaver
Second Committee Member
Bruce Segee
Third Committee Member
Yifeng Zhu
Abstract
Performance analysis is an essential step in software optimization, which is critical for embedded systems, desktop applications, and scientific computing. Most modern microprocessors contain hardware performance counters that can help with this analysis. The PAPI library is a widely used performance measurement interface that supports the performance counter hardware found in most major microprocessors. PAPI supports self-monitoring, which lets programs instrument sections of their own code and gather detailed performance values.
A key aspect of self-monitoring is reading hardware performance counters with the minimum possible overhead, because any overhead in the measurements can affect the accuracy of the results. In perf_event, the Linux interface to performance counters, the values are read via the read() system call, which incurs a large overhead from entering and exiting the operating system kernel.
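As a rough illustration of the read() path described above, the following C sketch brackets a region of code with two read() calls on a perf_event file descriptor. It is a minimal sketch under stated assumptions, not PAPI's implementation: the helper names `count_instructions` and `example_work` are hypothetical, and the choice of the instructions-retired event is only for illustration.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical stand-in for the code region being measured. */
static void example_work(void)
{
    volatile unsigned long sum = 0;
    for (unsigned long i = 0; i < 100000; i++)
        sum += i;
}

/* Hypothetical helper: count user-space instructions retired while work()
 * runs, sampling the counter with read() before and after the region.
 * Returns 0 on success, -1 if the counter cannot be opened or read
 * (for example in a VM or under a restrictive perf_event_paranoid setting). */
static int count_instructions(void (*work)(void), uint64_t *delta)
{
    struct perf_event_attr attr;
    uint64_t before = 0, after = 0;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.exclude_kernel = 1;   /* count user-space activity only */
    attr.exclude_hv = 1;

    /* glibc has no perf_event_open wrapper, so use syscall() directly.
     * pid = 0 and cpu = -1: measure the calling thread on any CPU. */
    int fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd < 0)
        return -1;

    /* Each read() is a full system call: a user -> kernel -> user round trip. */
    int ok = read(fd, &before, sizeof(before)) == (ssize_t)sizeof(before);
    if (ok) {
        work();
        ok = read(fd, &after, sizeof(after)) == (ssize_t)sizeof(after);
    }
    close(fd);
    if (!ok)
        return -1;

    *delta = after - before;
    return 0;
}
```

A caller would invoke `count_instructions(example_work, &n)` and use `n` only when the return value is 0. Both read() calls enter the kernel, and that cost is the overhead discussed in this abstract.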
In this work, we modify PAPI to use the rdpmc instruction, which allows userspace measurement of counters on x86 systems. This replaces the use of the high-overhead read() system call. We tested the result across 14 modern systems and 4 benchmarks. We find that the performance measurement latency is improved by at least a factor of three (and often a factor of six or more) in our test cases.
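The userspace read path can be sketched in C along the lines of the access pattern documented in the Linux `perf_event.h` header. This is a minimal sketch, not the code added to PAPI: the helper names `sign_extend`, `read_user`, and `demo_rdpmc` are hypothetical, and the sketch assumes an x86 Linux system where the kernel reports `cap_user_rdpmc` for the event. On other systems the helpers simply return -1.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#define barrier() __asm__ volatile("" ::: "memory")

/* Sign-extend a raw counter that is only `width` bits wide.
 * Relies on arithmetic right shift of signed values (true for GCC/Clang). */
static int64_t sign_extend(uint64_t raw, unsigned width)
{
    if (width == 0 || width >= 64)
        return (int64_t)raw;
    unsigned shift = 64 - width;
    return (int64_t)(raw << shift) >> shift;
}

/* Hypothetical helper: read a counter through its perf_event mmap page
 * without a system call. Returns -1 if user-space rdpmc is unavailable. */
static int read_user(volatile struct perf_event_mmap_page *pc, int64_t *value)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t seq, idx, lo, hi;
    int64_t count;
    do {
        seq = pc->lock;              /* seqlock: retry if the kernel updates */
        barrier();
        if (!pc->cap_user_rdpmc)
            return -1;
        idx = pc->index;             /* 0 means the event is not on a counter */
        count = pc->offset;
        if (idx) {
            __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(idx - 1));
            count += sign_extend(((uint64_t)hi << 32) | lo, pc->pmc_width);
        }
        barrier();
    } while (pc->lock != seq);
    *value = count;
    return 0;
#else
    (void)pc; (void)value;
    return -1;
#endif
}

/* Hypothetical demo: count user-space instructions across a short loop. */
static int demo_rdpmc(int64_t *delta)
{
    struct perf_event_attr attr;
    int64_t before = 0, after = 0;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;

    int fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd < 0)
        return -1;
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *page = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (page == MAP_FAILED) {
        close(fd);
        return -1;
    }

    int rc = read_user(page, &before);
    volatile unsigned long sum = 0;
    for (unsigned long i = 0; i < 100000; i++)
        sum += i;
    if (rc == 0)
        rc = read_user(page, &after);

    munmap(page, len);
    close(fd);
    if (rc == 0)
        *delta = after - before;
    return rc;
}
```

The system calls here occur only during setup (`perf_event_open` and `mmap`) and teardown. Each call to `read_user` stays entirely in userspace, which is why this path avoids the kernel entry and exit cost of read().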
Recommended Citation
Liu, Yan, "Optimizing PAPI for Low-Overhead Counter Measurement" (2017). Electronic Theses and Dissertations. 2803.
https://digitalcommons.library.umaine.edu/etd/2803