- publishing free software manuals
Valgrind 3.3 - Advanced Debugging and Profiling for GNU/Linux applications
by J. Seward, N. Nethercote, J. Weidendorfer and the Valgrind Development Team
Paperback (6"x9"), 164 pages
ISBN 0954612051
RRP £12.95 ($19.95)

Get a printed copy>>>

7.3.3 Avoiding cycles

Informally speaking, a cycle is a group of functions which call each other in a recursive way.

Formally speaking, a cycle is a nonempty set S of functions, such that for every pair of functions F and G in S, it is possible to call from F to G (possibly via intermediate functions) and also from G to F. Furthermore, S must be maximal--that is, be the largest set of functions satisfying this property. For example, if a third function H is called from inside S and calls back into S, then H is also part of the cycle and should be included in S.

Recursion is quite usual in programs, and therefore, cycles sometimes appear in the call graph output of Callgrind. However, the title of this chapter should raise two questions: What is bad about cycles which makes you want to avoid them? And: How can cycles be avoided without changing program code?

Cycles are not bad in themselves, but tend to make performance analysis of your code harder. This is because inclusive costs for calls inside a cycle are meaningless. The definition of inclusive cost, i.e. self cost of a function plus inclusive cost of its callees, needs a topological order among functions. For cycles, this does not hold true: callees of a function in a cycle include the function itself. Therefore, KCachegrind does cycle detection and skips visualization of any inclusive cost for calls inside of cycles. Further, all functions in a cycle are collapsed into artifical functions called like ‘Cycle 1’.

Now, when a program exposes really big cycles (as is true for some GUI code, or in general code using event or callback based programming style), you lose the nice property of pinpointing the bottlenecks by following call chains from ‘main()’, guided via inclusive cost. In addition, KCachegrind loses its ability to show interesting parts of the call graph, as it uses inclusive costs to cut off uninteresting areas.

Despite the meaningless of inclusive costs in cycles, the need for some kind of visualization motivates the possibility of temporarily disabling cycle detection in KCachegrind. This can lead to a misguided visualization. However, often cycles appear because of an unlucky superposition of independent call chains in a way that the profile result will see a cycle. Neglecting calls with very small measured inclusive cost can break these cycles. In such cases, the incorrect handling of cycles (by not detecting them still) gives a meaningful profiling visualization.

It has to be noted that currently, callgrind_annotate does not do any cycle detection at all. For program executions with function recursion, it can print inclusive costs above 100%.

After describing why cycles are bad for profiling, it is worth talking about cycle avoidance. The key insight here is that symbols in the profile data do not have to exactly match the symbols found in the program. Instead, the symbol name could encode additional information from the current execution context such as recursion level of the current function, or even some part of the call chain leading to the function. While encoding of additional information into symbols is quite capable of avoiding cycles, it has to be used carefully to not cause symbol explosion. The latter imposes large memory requirement for Callgrind with possible out-of-memory conditions, and big profile data files.

Another way of avoiding cycles in Callgrind's profile data output is to leave out selected functions in the call graph. Of course, this also skips any call information from and to an ignored function, and thus can break a cycle. Candidates for this typically are dispatcher functions in event driven code. The option to ignore calls to a function is --fn-skip=function. Aside from possibly breaking cycles, this is used in Callgrind to skip trampoline functions in the PLT sections for calls to functions in shared libraries. You can see the difference if you profile with --skip-plt=no. If a call is ignored, its cost events will be propagated to the enclosing function.

If you have a recursive function, you can distinguish the first 10 recursion levels by specifying --separate-recs10=function. Or for all functions with --separate-recs=10, but this will give you much bigger profile data files. In the profile data, you will see the recursion levels of func as the different functions with names func, func'2, func'3 and so on.

If you have call chains “A > B > C” and “A > C > B” in your program, you usually get a “false” cycle “B <> C”. Use --separate-callers2=B --separate-callers2=C, and functions B and C will be treated as different functions depending on the direct caller. Using the apostrophe for appending this “context” to the function name, you get “A > B'A > C'B” and “A > C'A > B'C”, and there will be no cycle. Use --separate-callers=2 to get a 2-caller dependency for all functions. Note that doing this will increase the size of profile data files.

ISBN 0954612051Valgrind 3.3 - Advanced Debugging and Profiling for GNU/Linux applicationsSee the print edition