|Valgrind 3.3 - Advanced Debugging and Profiling for GNU/Linux applications|
by J. Seward, N. Nethercote, J. Weidendorfer and the Valgrind Development Team
Paperback (6"x9"), 164 pages
RRP £12.95 ($19.95)
7.3.3 Avoiding cycles
Informally speaking, a cycle is a group of functions which call each other in a recursive way.
Formally speaking, a cycle is a nonempty set S of functions, such that for every pair of functions F and G in S, it is possible to call from F to G (possibly via intermediate functions) and also from G to F. Furthermore, S must be maximal--that is, be the largest set of functions satisfying this property. For example, if a third function H is called from inside S and calls back into S, then H is also part of the cycle and should be included in S.
Recursion is quite usual in programs, and therefore, cycles sometimes appear in the call graph output of Callgrind. However, the title of this chapter should raise two questions: What is bad about cycles which makes you want to avoid them? And: How can cycles be avoided without changing program code?
Cycles are not bad in themselves, but tend to make performance analysis of your code harder. This is because inclusive costs for calls inside a cycle are meaningless. The definition of inclusive cost, i.e. self cost of a function plus inclusive cost of its callees, needs a topological order among functions. For cycles, this does not hold true: callees of a function in a cycle include the function itself. Therefore, KCachegrind does cycle detection and skips visualization of any inclusive cost for calls inside of cycles. Further, all functions in a cycle are collapsed into artifical functions called like ‘Cycle 1’.
Now, when a program exposes really big cycles (as is true for some GUI code, or in general code using event or callback based programming style), you lose the nice property of pinpointing the bottlenecks by following call chains from ‘main()’, guided via inclusive cost. In addition, KCachegrind loses its ability to show interesting parts of the call graph, as it uses inclusive costs to cut off uninteresting areas.
Despite the meaningless of inclusive costs in cycles, the need for some kind of visualization motivates the possibility of temporarily disabling cycle detection in KCachegrind. This can lead to a misguided visualization. However, often cycles appear because of an unlucky superposition of independent call chains in a way that the profile result will see a cycle. Neglecting calls with very small measured inclusive cost can break these cycles. In such cases, the incorrect handling of cycles (by not detecting them still) gives a meaningful profiling visualization.
It has to be noted that currently,
does not do any cycle detection at all. For program executions with function
recursion, it can print inclusive costs above 100%.
After describing why cycles are bad for profiling, it is worth talking about cycle avoidance. The key insight here is that symbols in the profile data do not have to exactly match the symbols found in the program. Instead, the symbol name could encode additional information from the current execution context such as recursion level of the current function, or even some part of the call chain leading to the function. While encoding of additional information into symbols is quite capable of avoiding cycles, it has to be used carefully to not cause symbol explosion. The latter imposes large memory requirement for Callgrind with possible out-of-memory conditions, and big profile data files.
Another way of avoiding cycles in Callgrind's profile data
output is to leave out selected functions in the call graph. Of course, this
also skips any call information from and to an ignored function, and thus can
break a cycle. Candidates for this typically are dispatcher functions in event
driven code. The option to ignore calls to a function is
--fn-skip=function. Aside from
possibly breaking cycles, this is used in Callgrind to skip
trampoline functions in the PLT sections
for calls to functions in shared libraries. You can see the difference
if you profile with
If a call is ignored, its cost events will be propagated to the
If you have a recursive function, you can distinguish the first
10 recursion levels by specifying
Or for all functions with
--separate-recs=10, but this will
give you much bigger profile data files. In the profile data, you will see
the recursion levels of
func as the different functions with names
func'3 and so on.
If you have call chains “A > B > C” and “A > C > B”
in your program, you usually get a “false” cycle “B <> C”. Use
C will be treated as different functions
depending on the direct caller. Using the apostrophe for appending
this “context” to the function name, you get “A > B'A > C'B”
and “A > C'A > B'C”, and there will be no cycle. Use
--separate-callers=2 to get a 2-caller
dependency for all functions. Note that doing this will increase
the size of profile data files.
|ISBN 0954612051||Valgrind 3.3 - Advanced Debugging and Profiling for GNU/Linux applications||See the print edition|