Some notes on profiling with GNU gprof. I have a CMake project whose performance I wanted to improve, so I started by configuring a build with profiling enabled:


	$ cmake -DCMAKE_BUILD_TYPE=Debug \
			-DCMAKE_CXX_FLAGS=-pg \
			-DCMAKE_EXE_LINKER_FLAGS=-pg \
			-DCMAKE_SHARED_LINKER_FLAGS=-pg .. 


It's not sufficient to simply compile with '-pg'; the same flag must also be passed at link time, which is why the linker-flag variables are set as well.

Running the compiled program (let's call it foo) to completion writes a 'gmon.out' file to the current working directory.
We can then generate a report that begins with a "flat profile" (by default the call graph follows it) by running:


	$ gprof foo gmon.out > foo.flat


which will show the time spent executing different functions in the program:


	$ less foo.flat


	Flat profile:

	Each sample counts as 0.01 seconds.
	%   cumulative   self              self     total           
	time   seconds   seconds    calls   s/call   s/call  name    
	2.80      0.84     0.84 50497108     0.00     0.00  foo::client::read_message()::{lambda()#1}::operator()() const
	2.35      1.54     0.70 52187797     0.00     0.00  boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info&, boost::system::error_code const&)
	2.06      2.15     0.62 50497109     0.00     0.00  boost::asio::detail::executor_op<foo::client::read_message()::{lambda()#1}, std::allocator<void>, boost::asio::detail::scheduler_operation>::do_complete(void*, std::allocator<void>*, boost::system::error_code const&, unsigned long)
	1.91      2.72     0.57 52183554     0.00     0.00  boost::asio::detail::std_fenced_block::~std_fenced_block()
	1.41      3.14     0.42 52743910     0.00     0.00  boost::asio::detail::scheduler_operation::complete(void*, boost::system::error_c
	...
				Call graph (explanation follows)

	granularity: each sample hit covers 2 byte(s) for 0.03% of 29.79 seconds

	index % time    self  children    called     name
													<spontaneous>
	[1]     93.5    0.00   27.85                 main [1]
					0.00   27.64       1/1           foo::client::client(config const&) [2]
					0.00    0.02       1/1           foo::client::~client() [919]
	...
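Because the flat profile is plain text, it is easy to post-process. As a rough sketch (not part of the original workflow), the awk filter below keeps only the rows whose self time reaches a chosen threshold. The file names sample.flat and hot.txt and the 0.60 threshold are made up for illustration; the rows are copied from the output above. It assumes every data row has all six numeric columns, which gprof does not guarantee: the calls columns are blank for functions whose call counts are unknown.

```shell
# Two data rows plus the header lines, copied from the flat profile above.
cat > sample.flat <<'EOF'
Each sample counts as 0.01 seconds.
%   cumulative   self              self     total
time   seconds   seconds    calls   s/call   s/call  name
2.80      0.84     0.84 50497108     0.00     0.00  foo::client::read_message()::{lambda()#1}::operator()() const
1.91      2.72     0.57 52183554     0.00     0.00  boost::asio::detail::std_fenced_block::~std_fenced_block()
EOF

# Keep rows whose "self seconds" (column 3) is at least min.
awk -v min=0.60 '
  # Data rows start with a decimal number; header and text lines do not.
  $1 ~ /^[0-9]+\.[0-9]+$/ && $3 + 0 >= min + 0 {
      self = $3; calls = $4
      # Blank the six numeric columns so only the function name remains.
      for (i = 1; i <= 6; i++) $i = ""
      sub(/^ +/, "")
      printf "%s s  %s calls  %s\n", self, calls, $0
  }
' sample.flat > hot.txt

cat hot.txt
```

With the sample rows, only the read_message lambda passes the threshold. The function name is taken as "everything after the sixth column" because C++ names often contain spaces.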


We can visualize the call graph by passing the saved gprof output to gprof2dot and rendering the result with Graphviz's dot:


	$ gprof2dot foo.flat | dot -Tpng -o foo.png


Two handy gprof flags: '-P' suppresses the flat profile and '-Q' suppresses the call graph.