How Can We Accurately Capture Function Exit Times for Performance Profiling on Embedded Systems?

Susan Sarandon
2024-12-18 11:35:11

Capturing Function Exit Time with __gnu_mcount_nc

When profiling an embedded platform, an entry hook such as __gnu_mcount_nc that records only the stack frame and the current cycle count on each function entry already yields useful data: a caller/callee graph and a picture of which functions are called most often. Because nothing is recorded when a function returns, however, the log cannot show how much time was spent inside each function body.

GNU Profiling Tool Approach

GNU profiling tools such as gprof sidestep this limitation with program-counter (PC) sampling. Rather than timing function entry and exit, gprof estimates each function's self-time by counting the PC samples that land inside it. That self-time is then distributed among the function's callers in proportion to the function-to-function call counts.
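The arithmetic behind this scheme can be sketched in a few lines. This is a simplified, one-level illustration and not gprof's actual implementation; the type aliases, function names, and the sampling period are assumptions made for the example.

```cpp
#include <map>
#include <string>

using Counts = std::map<std::string, unsigned long>;
using Times  = std::map<std::string, double>;

// Self-time estimate: number of PC samples inside a function times the
// (assumed) sampling period in seconds.
Times self_times(const Counts& pc_samples, double seconds_per_sample) {
    Times out;
    for (const auto& kv : pc_samples) {
        out[kv.first] = static_cast<double>(kv.second) * seconds_per_sample;
    }
    return out;
}

// Split a callee's time among its callers in proportion to call counts.
// This assumes every call costs the same on average.
Times distribute_to_callers(double callee_seconds, const Counts& calls_from_caller) {
    unsigned long total_calls = 0;
    for (const auto& kv : calls_from_caller) {
        total_calls += kv.second;
    }
    Times out;
    if (total_calls == 0) {
        return out;  // no recorded callers, nothing to distribute
    }
    for (const auto& kv : calls_from_caller) {
        out[kv.first] = callee_seconds * static_cast<double>(kv.second) /
                        static_cast<double>(total_calls);
    }
    return out;
}
```

For example, with 300 samples at an assumed 10 ms period, a function has about 3 s of self-time; if one caller made 3 of its 4 recorded calls, that caller is charged about 2.25 s regardless of how long those particular calls actually took.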

Advantages of Stack Sampling

Compared to PC sampling, stack sampling provides several advantages:

  • Accuracy: Stack sampling removes the uncertainty introduced by very short function calls and by library routines that were not compiled with -pg.
  • Efficiency: A stack sample costs more to capture than a PC sample, but far fewer samples are needed for a useful profile.
  • Robustness: Stack sampling is not confused by recursion and works well in multithreaded and multicore environments.
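One way to turn stack samples into numbers is to compute, for each function, the fraction of samples in which it appears anywhere on the stack. The sketch below is a minimal illustration under the assumption that each sample has already been symbolized into a list of function names; the `Stack` alias and `inclusive_fraction` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using Stack = std::vector<std::string>;  // one sample, outermost frame first

// Fraction of samples in which each function appears on the stack.
// A function is counted at most once per sample, so recursive frames
// do not inflate its share.
std::map<std::string, double> inclusive_fraction(const std::vector<Stack>& samples) {
    std::map<std::string, double> fraction;
    if (samples.empty()) {
        return fraction;
    }
    std::map<std::string, std::size_t> hits;
    for (const Stack& s : samples) {
        std::set<std::string> unique_frames(s.begin(), s.end());
        for (const std::string& fn : unique_frames) {
            ++hits[fn];
        }
    }
    for (const auto& kv : hits) {
        fraction[kv.first] = static_cast<double>(kv.second) /
                             static_cast<double>(samples.size());
    }
    return fraction;
}
```

A function that appears on, say, half of the samples is responsible for roughly half of the elapsed time, whether that time is spent in its own body or in something it calls.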

Alternatives to Call-Graphs and Hot-Spots

Call graphs and hot-spot lists provide some insight, but they can leave performance problems hidden. A more effective approach is to examine a handful of raw stack samples chosen at random: any function that shows up on a large share of them is responsible for a large share of the time, and the frames above it on the stack show why it was being called. This gives a deeper understanding of the code's structure and of where optimization is likely to pay off.
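If the samples have been collected into a buffer, choosing which ones to read can be as simple as shuffling their indices. This is a minimal sketch; `pick_random_indices` is a hypothetical helper, and the seed is a parameter only so that a selection can be reproduced.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Return up to k distinct indices into a buffer of `total` stack samples,
// chosen at random, for manual inspection.
std::vector<std::size_t> pick_random_indices(std::size_t total, std::size_t k,
                                             unsigned seed) {
    std::vector<std::size_t> indices(total);
    std::iota(indices.begin(), indices.end(), std::size_t{0});
    std::mt19937 rng(seed);
    std::shuffle(indices.begin(), indices.end(), rng);
    indices.resize(std::min(k, total));
    return indices;
}
```

Reading ten or so of the selected stacks in full, from the outermost frame to the innermost, is usually enough to see which calls recur and whether they are really necessary.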
