Profiling GNU/Linux applications
This document explains the steps needed to profile applications under GNU/Linux. Utilities such as valgrind report how a program uses memory and give detailed information about leaks, but they do not provide sufficient information about CPU usage by the program's functions: who called them and how much time they took when they ran. This document explains how to profile with gprof, the GNU Profiler.
To produce the gmon.out file, the application must exit normally. If it terminates on an error, the file is not created.
To use gprof, install oprofile in the file system applications.
gprof needs the application to generate profiling data to work with. To enable this, set the following flags in the application package:
Profiling and debugging flags:
CFLAGS = -pg
gprof generates better profiling data for statically linked libraries, so keep this in mind for your application:
LDFLAGS = -static-libgcc -Wl,-Bstatic -lc
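As a rough sketch of what these flags amount to outside a package build, the hypothetical helper below compiles and links a single C file by hand. The function name, the CC and LDFLAGS environment variables and the -g flag are assumptions made for illustration, not part of the setup described above. The one firm point is that gcc expects -pg at both the compile step and the link step.

```shell
# Hypothetical helper: build one C source file with gprof instrumentation.
# Usage: build_profiled app.c app
build_profiled() {
    src=$1
    out=$2
    obj="${src%.c}.o"
    # Compile step: -pg adds the profiling hooks; -g (an assumption here)
    # keeps debug symbols, which gprof can use for line-level reports.
    "${CC:-gcc}" -pg -g -c "$src" -o "$obj" || return 1
    # Link step: -pg again, followed by any extra linker flags, for example
    # LDFLAGS="-static-libgcc -Wl,-Bstatic -lc". LDFLAGS is left unquoted
    # on purpose so the shell splits it into separate arguments.
    "${CC:-gcc}" -pg "$obj" -o "$out" ${LDFLAGS:-}
}
```

Calling build_profiled app.c app would leave app.o and an instrumented app in the current directory, assuming gcc is installed.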
With this, the profiling build setup is ready. Now run your application as you normally would. When it finishes, it creates a gmon.out file containing the profiling information:
$ ls
gmon.out app app.c app.o Makefile
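Because gmon.out is only written when the program exits normally, it can help to check for the file right after each run. The wrapper below is a minimal sketch of that check. The name run_profiled is hypothetical, and it assumes the program writes gmon.out into the current working directory, which is the default behaviour.

```shell
# Hypothetical wrapper: run a profiled program and verify gmon.out appeared.
# Usage: run_profiled ./app [arguments...]
run_profiled() {
    # Remove stale data so an old gmon.out cannot be mistaken for a new one.
    rm -f gmon.out
    "$@"
    status=$?
    if [ -f gmon.out ]; then
        echo "gmon.out written (exit status $status)"
        return 0
    fi
    echo "no gmon.out (exit status $status): did the program exit normally?" >&2
    return 1
}
```

A program that is killed by a signal or leaves through _exit() skips the exit handlers that write the file, so this wrapper would report the missing gmon.out in those cases.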
When the application finishes, there should be a *.out file (usually gmon.out) for gprof to work with. Run:
gprof $EXECUTABLE_FILE gmon.out
You should now see all of the application's profile information in the terminal. For more details, visit the GNU Profiler website.
Alternatively, save the gprof output to a file so it can be analyzed later:
gprof $EXECUTABLE_FILE gmon.out > app_gprof_data.txt
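Once the report is saved, a small filter can pull out the most expensive functions without reading the whole file. The following is a sketch only: top_functions is a hypothetical name, and the awk script assumes the usual gprof flat profile layout, where the table starts after the "time seconds seconds" header line and ends at the first blank line.

```shell
# Hypothetical helper: list "% time" and name for the first rows of the
# flat profile in a saved gprof report.
# Usage: top_functions app_gprof_data.txt [rows]
top_functions() {
    awk -v limit="${2:-5}" '
        /^Call graph/ { exit }                 # the flat profile has ended
        in_table && NF == 0 { exit }           # a blank line closes the table
        in_table && count < limit { printf "%s %s\n", $1, $NF; count++ }
        /^ *time +seconds +seconds/ { in_table = 1 }   # header row of the table
    ' "$1"
}
```

The name is taken from the last column, so this works for plain C symbols. Demangled C++ names that contain spaces would be cut short and need a more careful parser.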
The profile information can also be shown as a graph and saved as a *.png file. This requires some extra packages plus a Python script.
sudo apt-get install python graphviz xdot
Next, you need gprof2dot to convert the profiling output to a dot graph. The script can be downloaded, or checked out from the git repo:
git clone https://code.google.com/p/jrfonseca.gprof2dot/ gprof2dot
Enter the script directory and run the script on the saved gprof output:
gprof2dot.py app_gprof_data.txt > app_call_graph.dot
To visualize the data, open app_call_graph.dot with xdot, which was installed above.
To convert the .dot file to an image:
dot -Tpng app_call_graph.dot -o app_call_graph.png
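The intermediate files are optional, since gprof2dot reads gprof output from standard input when no file is given. The sketch below chains the three tools in one pipeline. The function name and the GPROF, GPROF2DOT and DOT override variables are hypothetical conveniences, for example for pointing GPROF2DOT at ./gprof2dot.py when the script is not on the PATH.

```shell
# Hypothetical helper: turn a binary plus its gmon.out into a PNG call graph.
# Usage: profile_to_png ./app app_call_graph.png [gmon.out]
profile_to_png() {
    binary=$1
    png=$2
    gmon=${3:-gmon.out}
    # gprof prints the text report, gprof2dot converts it to dot syntax,
    # and dot renders the graph to the requested PNG file.
    "${GPROF:-gprof}" "$binary" "$gmon" \
        | "${GPROF2DOT:-gprof2dot.py}" \
        | "${DOT:-dot}" -Tpng -o "$png"
}
```

Keeping the text report and the .dot file, as in the separate steps above, is still useful when the same data will be inspected more than once.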
Example 1: video stabilization
A video stabilization algorithm was ported to the RidgeRun SDK and run on the ARM processor. gprof was used to identify the most time-consuming routines so they could be moved to the DSP. The following results come from running gprof before any optimizations were performed.
- Normal command execution
$APP -f 1 coastguard_352x288.yuv
- Capture profile data
gprof $APP gmon.out > $APP.gprof.txt
with first part of output being:
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 32.58      1.01     1.01 45467136     0.00     0.00  interpolateBiLin
 32.09      2.00     0.99      300     3.30     3.30  boxblur_vert_C
 11.67      2.36     0.36      300     1.20     1.20  boxblur_hori_C
  8.43      2.62     0.26      300     0.87     4.22  transformYUV
  8.43      2.88     0.26   600990     0.00     0.00  compareSubImg_thr
  1.46      2.92     0.05      300     0.15     0.15  lowPassTransforms
  1.30      2.96     0.04                             memcpy
  0.97      2.99     0.03                             __write_nocancel
- Create call graph
gprof2dot.py $APP.gprof.txt > $APP.call_graph.dot
with first part of call graph data file being:
Call graph (explanation follows)

granularity: each sample hit covers 2 byte(s) for 0.32% of 3.09 seconds

index % time    self  children    called     name
                                                 <spontaneous>
        95.0    0.00    2.93                 filter_video
                0.00    1.62     300/300         motionDetection
                0.26    1.01     300/300         transformYUV
                0.05    0.00     300/300         lowPassTransforms
                0.00    0.00     300/300         transformPrepare
                0.00    0.00     300/300         transformFinish
- Visualize call graph
- Create PNG image file of call graph
dot -Tpng $APP.call_graph.dot -o $APP.call_graph.png
You can find more information about profiling in the following links:
The GNU Profiler: http://www.cs.utah.edu/dept/old/texinfo/as/gprof_toc.html
Gprof call-graph visualization: http://redmine.epfl.ch/projects/python_cookbook/wiki/Gprof_call-graph_visualization