| Valgrind 3.3 - Advanced Debugging and Profiling for GNU/Linux applications by J. Seward, N. Nethercote, J. Weidendorfer and the Valgrind Development Team Paperback (6"x9"), 164 pages ISBN 0954612051 RRP £12.95 ($19.95) |
7.2 Basic Usage
As with Cachegrind, you probably want to compile with debugging info (the -g flag), but with optimization turned on.
To start a profile run for a program, execute:
valgrind --tool=callgrind [callgrind options] your-program [your options]
While the simulation is running, you can observe execution with
callgrind_control -b
This will print out the current backtrace. To annotate the backtrace with event counts, run
callgrind_control -e -b
After program termination, Callgrind generates a profile data file named
‘callgrind.out.<pid>’, where pid is the process ID
of the program being profiled.
The data file records the calls made between the functions executed
in the program, together with event counts of type
Instruction Read Accesses (Ir).
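The profile run and the resulting data file can be sketched as a short shell script. This is a minimal sketch under stated assumptions: ‘./myprog’ is a placeholder for your own program, and ‘latest_profile’ is a helper introduced here for illustration, not part of Valgrind. It picks the most recently modified ‘callgrind.out.*’ file, which is convenient when several runs have left data files in the same directory.

```shell
#!/bin/sh
PROG=./myprog    # placeholder: your program, compiled with -g and optimization

# Illustration only: print the most recently modified callgrind.out.* file
# in directory $1 (default: current directory), or nothing if none exists.
latest_profile() {
    ls -t "${1:-.}"/callgrind.out.* 2>/dev/null | head -n 1
}

if command -v valgrind >/dev/null 2>&1 && [ -x "$PROG" ]; then
    valgrind --tool=callgrind "$PROG"
    echo "profile data: $(latest_profile)"
else
    echo "valgrind or $PROG not available; nothing profiled"
fi
```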
To generate a function-by-function summary from the profile data file, use
callgrind_annotate [options] callgrind.out.<pid>
This summary is similar to the output of a Cachegrind run processed with ‘cg_annotate’: functions are listed in order of their exclusive cost, and exclusive cost is also what is shown for each of them. Two options are important for the additional features of Callgrind:
- --inclusive=yes: Instead of sorting by the exclusive cost of functions, sort by and show inclusive cost.
- --tree=both: Interleave information on the callers and the callees of each function into the top-level list of functions. In these lines, which represent executed calls, the cost gives the number of events spent in the call. The callers are listed, indented, above each function, and the callees below it. The sum of events in calls to a given function (caller lines), as well as the sum of events in calls from the function (callee lines) together with the self cost, gives the total inclusive cost of the function.
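The relation between caller lines, callee lines and self cost can be checked with a little arithmetic. The sketch below uses made-up event counts for a hypothetical function f that is called from main and k and itself calls g and h. None of these numbers come from a real profile, and the sketch assumes that f is not part of a call cycle.

```shell
#!/bin/sh
# Made-up Ir counts for a hypothetical function f (not from a real profile).
self_cost=120      # events spent in f's own code
to_g=300           # callee line: events spent in calls from f to g
to_h=80            # callee line: events spent in calls from f to h
from_main=350      # caller line: events spent in calls from main to f
from_k=150         # caller line: events spent in calls from k to f

# Inclusive cost seen from below: self cost plus all callee lines.
inclusive=$((self_cost + to_g + to_h))
# Inclusive cost seen from above: sum of all caller lines.
from_callers=$((from_main + from_k))

echo "self + callees = $inclusive"
echo "sum of callers = $from_callers"
```

Both sums come out equal for these numbers. To see such lines for a real run, both options can be passed together, as in ‘callgrind_annotate --inclusive=yes --tree=both callgrind.out.<pid>’.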
Use --auto=yes to get annotated source code
for all relevant functions for which the source can be found. In
addition to source annotation as produced by
‘cg_annotate’, you will see the
annotated call sites with call counts. For all other options,
consult the (Cachegrind) documentation for
‘cg_annotate’.
For a better call graph browsing experience, using KCachegrind is highly recommended. If a significant fraction of your code's cost lies in cycles (sets of functions calling each other recursively), you have to use KCachegrind, because ‘callgrind_annotate’ currently does not do any cycle detection, which is needed to get correct results in this case.
If you are additionally interested in measuring the
cache behavior of your
program, use Callgrind with the option
--simulate-cache=yes.
However, expect a further slowdown by a factor of approximately 2.
If the program section you want to profile is somewhere in the
middle of the run, it is beneficial to
fast forward to this section without any
profiling, and then switch on profiling. This is achieved by using
the command line option
--instr-atstart=no
and running, in a shell,
‘callgrind_control -i on’ just before the
interesting code section is executed. To exactly specify
the code position where profiling should start, use the client request
‘CALLGRIND_START_INSTRUMENTATION’.
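The fast-forward workflow might look as follows when driven from a shell. This is a sketch only: ‘./myprog’ is a placeholder, and the fixed ‘sleep 5’ is a stand-in for whatever tells you that the interesting section is about to begin.

```shell
#!/bin/sh
PROG=./myprog                                   # placeholder program
CALLGRIND_OPTS="--tool=callgrind --instr-atstart=no"

if command -v valgrind >/dev/null 2>&1 && [ -x "$PROG" ]; then
    # Start with instrumentation off so the uninteresting part runs faster.
    # $CALLGRIND_OPTS is deliberately unquoted so it splits into two options.
    valgrind $CALLGRIND_OPTS "$PROG" &
    sleep 5                  # stand-in: wait until the interesting part nears
    callgrind_control -i on  # switch instrumentation on
    wait                     # let the program run to completion
else
    echo "valgrind or $PROG not available; nothing profiled"
fi
```

A timed delay is only a rough way to hit the right moment. When the start position has to be exact, the client request mentioned above is the better choice.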
If you want to be able to see assembly code level annotation, specify
--dump-instr=yes. This will produce
profile data at instruction granularity. Note that the resulting profile
data
can only be viewed with KCachegrind. For assembly annotation, it is also
useful to see more detail of the control flow inside functions,
i.e. (conditional) jumps. This information is collected by additionally specifying
--collect-jumps=yes.
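A run that collects instruction-level data together with jump information combines both options. As before, ‘./myprog’ is a placeholder, and the sketch does no more than assemble and run the command line described above.

```shell
#!/bin/sh
PROG=./myprog                                   # placeholder program
CALLGRIND_OPTS="--tool=callgrind --dump-instr=yes --collect-jumps=yes"

if command -v valgrind >/dev/null 2>&1 && [ -x "$PROG" ]; then
    # Unquoted on purpose: the variable expands to three separate options.
    valgrind $CALLGRIND_OPTS "$PROG"
else
    echo "valgrind or $PROG not available; nothing profiled"
fi
```

The resulting data file can then be opened in KCachegrind to view the annotated assembly code.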