Saturday, January 26, 2013

Apport-Valgrind: a case of apples and oranges?


Who doesn't love a good fruit salad?

As noted in a previous post, Ubuntu 13.04 (Raring) has the new apport-valgrind binary package [1].

But apport is a crash reporting system, whereas valgrind is a memory leak detector (among other things). So you may wonder: what's the connection? Is this a case of apples and oranges?




Let's try to answer this, starting with a look at valgrind.

Valgrind

Valgrind is a workhorse for examining C programs at run time. It has many capabilities. The one we are interested in here is its 'memcheck' tool, which is used to find memory leaks [2].

For example, you can run valgrind --tool=memcheck /usr/bin/foo [3] and a report is generated:
  • The report shows memory leaks created while running /usr/bin/foo. This includes leaks in foo's code and leaks in any code executed by foo, for example called functions that exist in external shared libraries. 
  • For each leak, the log contains a stack trace that shows the sequence of function calls that created the memory leak. You can follow the stack trace by hand and find the code errors that lead to the memory leak, except for one issue: raw stack traces are like swiss cheese with lots of missing bits.

Raw stack traces are like swiss cheese

The problem with a raw stack trace is that it is full of holes.


What's missing? 
  • In a raw valgrind stack trace, one does not see function names or source code file names
  • Instead, one sees question marks in place of function names, and executable file names (frequently shared object library files) in place of source file names
Fortunately, valgrind can display the function names and source code file names if the debug symbols are present (more on this below). 

Let's take a look at some real-life valgrind output and compare how it looks before and after debug symbols are present.

Here's a line from a raw valgrind memory leak stack trace:

==2874==    by 0x51E727D: ??? (in /usr/lib/x86_64-linux-gnu/libgtk-3.so.0.600.0)

And here's how the same line looks with debug symbols for libgtk-3.0:

==3643==    by 0x51E727D: gtk_menu_item_class_intern_init (gtkmenuitem.c:427)

In the raw example: 
  • The function name is unknown: ???
  • The shared object file name in which the function lives is listed: /usr/lib/x86_64-linux-gnu/libgtk-3.so.0.600.0
In the debug symbols example:
  • The function name is shown:  gtk_menu_item_class_intern_init
  • The source code file name is shown: gtkmenuitem.c (It even knows the line number: 427)
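
The difference between the two kinds of line can also be checked mechanically. Here is a minimal Python sketch that splits a valgrind stack frame line into its parts and reports whether the function name was resolved. The regular expression and the parse_frame helper are my own illustration, based only on the two sample lines above, so real valgrind output may contain variations they do not cover.

```python
import re

# Matches one frame of a valgrind stack trace, for example:
#   ==2874==    by 0x51E727D: ??? (in /usr/lib/x86_64-linux-gnu/libgtk-3.so.0.600.0)
#   ==3643==    by 0x51E727D: gtk_menu_item_class_intern_init (gtkmenuitem.c:427)
FRAME_RE = re.compile(
    r"^==(?P<pid>\d+)==\s+(?:at|by) (?P<addr>0x[0-9A-Fa-f]+): "
    r"(?P<func>.+?) \((?P<location>.*)\)$"
)


def parse_frame(line):
    """Return a dict describing one stack frame, or None if the line is not a frame."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    frame = match.groupdict()
    location = frame.pop("location")
    frame["object"] = frame["source"] = frame["line"] = None
    if location.startswith("in "):
        # No line information: valgrind only knows the object file.
        frame["object"] = location[3:]
    else:
        # Debug symbols present: the location looks like "file.c:123".
        source, sep, lineno = location.rpartition(":")
        if sep and lineno.isdigit():
            frame["source"] = source
            frame["line"] = int(lineno)
        else:
            frame["source"] = location
    # "???" is what valgrind prints when it cannot name the function.
    frame["symbolized"] = frame["func"] != "???"
    return frame
```

Feeding it the raw line gives a frame with symbolized set to False and only the object file filled in. The line with debug symbols yields the function name, gtkmenuitem.c and 427.
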
Clearly, the second stack trace is much more useful for a human being who wants to find and fix a memory leak. The question therefore is: how can we easily obtain the debug symbols that will make our valgrind stack traces useful? That's where apport comes in.

The apport connection

Apport is a collection of packages and tools that provide automatic crash reporting. Among them is the apport-retrace package. This little gem, through its apport-retrace script, provides the debug symbol capabilities that valgrind needs.

It has the ability to find, download and extract available debug packages for all dependencies of a packaged [4] executable. (This was used for purposes unrelated to finding memory leaks with valgrind, but as of Raring, it is used for valgrind purposes too.)

Let's unpack this a bit. Suppose the executable is nm-applet. Apport-retrace could:
  • Find the tree of packages that nm-applet needs to run (that is, the package that owns nm-applet and that package's direct and indirect dependencies)
  • Find the available debug symbol packages for them and make their debug symbol files available in a temporary directory (in /tmp, unless you specify otherwise) called the 'sandbox'
  • The debug symbols packages are not installed, but extracted into the sandbox directory
  • The sandbox directory is deleted after use (again, unless you specified your own sandbox directory) 
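
The first two steps above amount to walking a dependency graph and then mapping each package to a debug symbol package. Here is a rough Python sketch of that idea. It is not apport's code: the dependency map is toy data made up for illustration, the function names are hypothetical, and the "-dbgsym" suffix is only the naming convention seen in footnote [1] (some debug packages are named differently).

```python
from collections import deque

# Toy dependency map, made up for illustration. A real tool would ask the
# package manager for this information.
DEPENDS = {
    "app": ["libui", "libnet"],
    "libui": ["libcore"],
    "libnet": ["libcore"],
    "libcore": [],
}


def dependency_closure(package, depends):
    """Return the package plus all of its direct and indirect dependencies."""
    seen = {package}
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for dep in depends.get(current, []):
            if dep not in seen:  # visit each package only once
                seen.add(dep)
                queue.append(dep)
    return seen


def debug_package_candidates(packages, suffix="-dbgsym"):
    """Guess debug symbol package names (assumes the '-dbgsym' convention)."""
    return sorted(name + suffix for name in packages)


packages = dependency_closure("app", DEPENDS)
candidates = debug_package_candidates(packages)
```

A breadth-first walk with a "seen" set keeps shared dependencies (libcore here) from being processed twice, which matters when the tree is as large as a desktop applet's.
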
This automatic discovery of the complete set of available debug symbol files related to an executable would be very useful for the valgrind use case, as explained above. And as of Raring, we have it, as explained below. (Thanks to Martin Pitt for his invaluable assistance with this work.)

But first, a nice thing to note: extracting (rather than installing) the debug packages has a smaller impact on the system. Notably: the extract directory (the sandbox) can simply be deleted after use to regain the disk space. No debug packages are actually installed, so no package removal needs to be done either. If you intend to valgrind the same executable many times, you can reuse a persistent sandbox, or simply install the debug symbol packages.

Pointing valgrind at the sandbox

We have seen that if debug symbol files are available, valgrind uses them to make much more useful stack traces. And we have seen that apport can download and extract into a temporary sandbox all available debug symbols for the packaged executable for which we want to find memory leaks. 

This leaves one more piece of the puzzle: telling valgrind to also look in the sandbox directory for debug symbols. (Valgrind always looks in the normal system directories for debug symbols.)

In Ubuntu 13.04 (Raring), the valgrind package now has this ability thanks to a patch from Alex Chiang. You simply add the information to the valgrind command line using the --extra-debuginfo-path=DIR argument.
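
As a small illustration, here is a Python sketch that assembles such a command line. The valgrind_command helper is hypothetical, and both the sandbox path and the usr/lib/debug layout inside it are assumptions for the example. Only --tool and --extra-debuginfo-path are taken from the text above.

```python
import shlex


def valgrind_command(executable, debug_dir):
    """Build a valgrind argument list that also searches debug_dir for symbols."""
    return [
        "valgrind",
        "--tool=memcheck",  # the default tool, spelled out for clarity
        f"--extra-debuginfo-path={debug_dir}",
        executable,
    ]


# Hypothetical sandbox location, for illustration only.
cmd = valgrind_command("/usr/bin/foo", "/tmp/sandbox/usr/lib/debug")
print(shlex.join(cmd))
```

Keeping the command as a list (rather than one long string) means it can be handed to subprocess directly without worrying about shell quoting.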

With these two functions in place, the table is set for apport-valgrind, as described next.

Apport and valgrind: two peas in a pod

In Ubuntu 13.04 (Raring), the apport-valgrind package is introduced. The apport code that creates the sandbox was moved from apport-retrace into a new apport python module: apport/sandboxutils.py, which makes it available for new code, such as apport-valgrind.

So here's what apport-valgrind does:
  • It uses apport/sandboxutils to obtain and extract all available debug symbol packages into the sandbox directory for your (packaged) executable
  • It calls valgrind and tells it to also look in the sandbox directory (with the new --extra-debuginfo-path argument)
  • When valgrind is done, the sandbox directory is deleted (there are options for persistent sandboxes -- useful when you want to use valgrind repeatedly)
  • The valgrind log file is created in the current directory: ./valgrind.log
So with a single command like this: apport-valgrind nm-applet, you can generate a valgrind memory check log of stack traces with all available debug symbols used, with no new debug packages installed on the system and all debug files (the sandbox) automatically deleted after execution.
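
To make the sequence concrete, here is a simplified Python sketch of that lifecycle: create a temporary sandbox, point valgrind at it, write a log, and throw the sandbox away. This is not apport-valgrind's actual code. The populate_sandbox function is a hypothetical stand-in for the real download-and-extract step, the usr/lib/debug layout is an assumption, and the runner parameter exists only so the flow can be tried without valgrind installed.

```python
import os
import subprocess
import tempfile


def populate_sandbox(sandbox_dir, executable):
    """Hypothetical stand-in for extracting debug symbol packages.

    It only creates an empty debug directory; the real work of finding and
    unpacking debug packages for the executable is not shown here.
    """
    debug_dir = os.path.join(sandbox_dir, "usr", "lib", "debug")
    os.makedirs(debug_dir)
    return debug_dir


def run_with_sandbox(executable, log_file="valgrind.log", runner=subprocess.call):
    """Run valgrind on executable with a throwaway sandbox of debug symbols."""
    with tempfile.TemporaryDirectory(prefix="sandbox-") as sandbox_dir:
        debug_dir = populate_sandbox(sandbox_dir, executable)
        command = [
            "valgrind",
            f"--extra-debuginfo-path={debug_dir}",
            f"--log-file={log_file}",
            executable,
        ]
        exit_code = runner(command)
    # Leaving the "with" block deletes the sandbox directory and its contents.
    return exit_code, sandbox_dir
```

Calling run_with_sandbox("nm-applet") would run valgrind for real. Passing a runner that just records the command is a handy way to inspect what would be executed.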

So, apport and valgrind, a case of two peas in a pod :)



All of these strained metaphors about food make me hungry: time for lunch.

Footnotes

[1] Install it with sudo apt-get install apport-valgrind. You will also want to install valgrind and valgrind-dbgsym: sudo apt-get install valgrind valgrind-dbgsym

[2] A memory leak occurs in C when you allocate memory on the heap and fail to deallocate ('free') the memory. Such memory is not available for use by any process until the code's main process terminates. If the process does not terminate (perhaps it is a daemon, or a piece of code that is persistent, such as a part of the desktop GUI) the memory is essentially lost forever.

[3] One picks the valgrind tool with --tool=TOOLNAME. The memcheck tool is the default, so there is no actual need to specify it.

[4] If the executable is not in a Debian package, apport does not know how to find and get all appropriate debug packages. There are other approaches to automate this. See future posts here.
