Valgrind 3.3 - Advanced Debugging and Profiling for GNU/Linux applications
by J. Seward, N. Nethercote, J. Weidendorfer and the Valgrind Development Team
Paperback (6"x9"), 164 pages
RRP £12.95 ($19.95)
6.5 Acting on Cachegrind's information
So, you've managed to profile your program with Cachegrind. Now what? What's the best way to actually act on the information it provides to speed up your program? Here are some rules of thumb that we have found to be useful.
First of all, the global hit/miss rate numbers are not that useful on their own. If you have multiple programs or multiple runs of a program, comparing the numbers may reveal outliers that are worthy of closer investigation. Otherwise, they're not enough to act on.
The line-by-line source code annotations are much more useful. In our experience, the best place to start is by looking at the ‘Ir’ numbers. They simply measure how many instructions were executed for each line, and don't include any cache information, but they can still be very useful for identifying bottlenecks.
After that, we have found that L2 misses are typically a much bigger source of slow-downs than L1 misses. So it's worth looking for any snippets of code that cause a high proportion of the L2 misses. If you find any, it's still not always easy to work out how to improve things. You need to have a reasonable understanding of how caches work, the principles of locality, and your program's data access patterns. Improving things may require redesigning a data structure, for example.
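As an illustration of how a data access pattern can matter, here is a minimal sketch in C. It is not taken from Valgrind itself; the names `grid`, `fill_grid`, `sum_row_major` and `sum_col_major` and the array dimensions are hypothetical, chosen only so that the array is larger than many L2 caches. Both summing functions do the same arithmetic over the same data, so their ‘Ir’ counts should be similar, but the column-major version strides through memory and would typically be expected to show more data cache misses in the annotated output.

```c
#include <stddef.h>

/* Hypothetical dimensions: 1024 x 1024 ints is 4 MiB with a 4-byte int,
   which is larger than many L2 caches. */
#define ROWS 1024
#define COLS 1024

static int grid[ROWS][COLS];

/* Give every element a known value so the sums can be checked. */
void fill_grid(void)
{
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            grid[r][c] = 1;
}

/* Row-major traversal: consecutive accesses touch adjacent addresses,
   so each cache line that is fetched gets fully used. */
long sum_row_major(void)
{
    long total = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            total += grid[r][c];
    return total;
}

/* Column-major traversal: consecutive accesses are COLS * sizeof(int)
   bytes apart, so each access is likely to land on a different cache line. */
long sum_col_major(void)
{
    long total = 0;
    for (size_t c = 0; c < COLS; c++)
        for (size_t r = 0; r < ROWS; r++)
            total += grid[r][c];
    return total;
}
```

To try it, call `fill_grid()` and then both summing functions from a small `main`, run the program under `valgrind --tool=cachegrind`, and compare the per-line miss counts for the two inner loops with `cg_annotate`. The actual numbers depend on the simulated cache configuration, so treat any difference you see as an indication rather than a measurement.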
In short, Cachegrind can tell you where some of the bottlenecks in your code are, but it can't tell you how to fix them. You have to work that out for yourself. But at least you have the information!
ISBN 0954612051