Use the backtrace functions provided by libc. This allows us to get a
backtrace that is independent of the current memory map of the
process, which in turn can be used for debugging and tracing purposes.
The backtrace is not 100% accurate due to various optimizations, most
notably "-fomit-frame-pointer" and LTO. As a result, the reported
source line might not correspond to the real line. However, it
should be able to pinpoint at least the function where the backtrace
was called.
The implementation is selected at compile time based on the available
libraries. Libunwind has higher priority if both methods are available,
to keep compatibility with the current behavior.
backtrace() is not marked as signal safe; the backtrace manual page
explains in more detail why that is the case [0].
Load "libgcc" or its equivalent in advance within "fatal_signal_init",
which should ensure that subsequent calls to backtrace* do not call
malloc and are signal safe.
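The preloading idea can be sketched as follows. This is a minimal illustration for glibc, not the code from this patch: preload_backtrace_support() and crash_handler() are hypothetical names, and the sketch assumes <execinfo.h> is available.

```c
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

#define MAX_FRAMES 32

/* Hypothetical helper: call once during startup, outside of any signal
 * handler.  The first backtrace() call may load the unwinder library
 * dynamically, and that load may call malloc(). */
static int
preload_backtrace_support(void)
{
    void *frames[1];

    return backtrace(frames, 1);
}

/* Hypothetical handler: by the time it runs the unwinder is already
 * loaded, so backtrace() and backtrace_symbols_fd() should not need to
 * allocate memory.  The symbolized frames go straight to stderr. */
static void
crash_handler(int sig_nr)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    (void) sig_nr;
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
}
```

In this sketch the preload call would be made first, and only then would crash_handler() be installed with signal() or sigaction().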
The typical backtrace will look similar to the one below:
/lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe]
/lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7]
/lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b]
/lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d]
ovsdb-server(+0xa661) [0x562cfce2e661]
ovsdb-server(+0x7e39) [0x562cfce2be39]
/lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a]
/lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b]
ovsdb-server(+0x8c35) [0x562cfce2cc35]
backtrace.h elaborates on how to effectively get the line information
associated with the addresses presented in the backtrace.
[0]
backtrace() and backtrace_symbols_fd() don't call malloc() explicitly,
but they are part of libgcc, which gets loaded dynamically when first
used. Dynamic loading usually triggers a call to malloc(3). If you
need certain calls to these two functions to not allocate memory (in
signal handlers, for example), you need to make sure libgcc is loaded
beforehand.
Reported-at: https://bugzilla.redhat.com/2177760
Signed-off-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The libunwind unw_word_t type is defined as uint32_t on 32-bit
systems and uint64_t on 64-bit systems. The patch fixes the
compile error by using PRIxPTR to print this value.
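A minimal sketch of the portable formatting follows. It is not the patched code: word_t is a stand-in for unw_word_t under the assumption that the type is pointer-sized, and format_frame() is a hypothetical helper.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for libunwind's unw_word_t (assumption: pointer-sized, so it
 * is 32 bits wide on 32-bit systems and 64 bits wide on 64-bit ones). */
typedef uintptr_t word_t;

/* Hypothetical helper: formats one frame as "0x<ip> <name+0x<offset>>".
 * PRIxPTR expands to the right length modifier for uintptr_t on either
 * word size, which a hard-coded "%lx" or "%llx" does not. */
static int
format_frame(char *buf, size_t size, word_t ip, const char *name,
             word_t offset)
{
    return snprintf(buf, size, "0x%016" PRIxPTR " <%s+0x%" PRIxPTR ">",
                    ip, name, offset);
}
```

For example, format_frame(buf, sizeof buf, 0x482752, "fatal_signal_handler", 0x52) fills buf with "0x0000000000482752 <fatal_signal_handler+0x52>".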
Fixes: e2ed6fbeb18c ("fatal-signal: Catch SIGSEGV and print backtrace.")
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The patch catches the SIGSEGV signal and prints the backtrace
using libunwind at the monitor daemon. This makes debugging easier
when there is no debug symbol package or gdb installed on production
systems.
The patch works even when ovs-vswitchd is compiled without debug symbols
(no -g option), because the object files still have function symbols.
For example:
|daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace:
|daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52>
|daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40>
|daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d>
|daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108>
|daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c>
|daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa>
|daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d>
|daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca>
|daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d>
|daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \
(Segmentation fault), core dumped, restarting
However, if the object files' symbols are stripped, then we can only
get the _init function plus an offset value. This is still useful when
trying to see if two bugs have the same root cause. Example:
|daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace:
|daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a>
|daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40>
|daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d>
|daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280>
|daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324>
|daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371>
|daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0>
|daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261>
|daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \
(Segmentation fault), core dumped, restarting
Most C library functions, for example printf() or fflush(), are not
async-signal-safe, meaning that it is not safe to call them from a
signal handler. To be async-signal-safe, the handler only
collects the stack info using libunwind, which is signal-safe, and
issues 'write' to the pipe; the monitor thread reads from the pipe and
prints to ovs-vswitchd.log.
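The handler-to-pipe handoff can be sketched as below. This is an illustration of the pattern only, not the patch itself: struct frame_dump, dump_setup(), dump_handler() and read_dump() are hypothetical names, and the libunwind stack walk is left as a placeholder comment.

```c
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <unistd.h>

#define MAX_FRAMES 16

/* Hypothetical fixed-size record passed through the pipe.  It is well
 * below PIPE_BUF, so a single write() is atomic. */
struct frame_dump {
    int sig_nr;
    int n_frames;
    uintptr_t ip[MAX_FRAMES];
};

static int dump_pipe[2] = { -1, -1 };

/* Signal handler: fills the record and calls only write(), which is
 * async-signal-safe.  No printf(), no malloc(). */
static void
dump_handler(int sig_nr)
{
    int saved_errno = errno;
    struct frame_dump dump = { 0 };
    ssize_t ignored;

    dump.sig_nr = sig_nr;
    /* Placeholder: a real handler would walk the stack here (for example
     * with libunwind) and fill dump.ip[] and dump.n_frames. */
    ignored = write(dump_pipe[1], &dump, sizeof dump);
    (void) ignored;
    errno = saved_errno;
}

/* Creates the pipe and installs the handler; returns 0 on success. */
static int
dump_setup(int sig_nr)
{
    if (pipe(dump_pipe) < 0) {
        return -1;
    }
    return signal(sig_nr, dump_handler) == SIG_ERR ? -1 : 0;
}

/* Reader side, run outside signal context, where formatting and logging
 * are safe.  Returns 1 if a complete record was read. */
static int
read_dump(struct frame_dump *dump)
{
    return read(dump_pipe[0], dump, sizeof *dump) == (ssize_t) sizeof *dump;
}
```

The reader is free to use any logging function because it never runs inside the handler; only the fixed-size record crosses the boundary.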
Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A new function vlog_insert_module() is introduced to avoid using
list_insert() from the vlog.h header.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
log_backtrace() and log_backtrace_msg() log the backtrace into
the log file. They may be most useful when debugging unit tests.
"backtrace.h" documents the usage. They are not called directly
in the code, but are available as a handy tool when needed.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This variant was Linux-specific, GCC-specific, only worked on
architectures with frame pointers (possibly only on i386?), and isn't used
with glibc anyway. Remove it.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Replaced all instances of Nicira Networks(, Inc) with Nicira, Inc.
Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The backtrace_capture() implementation only worked properly with GNU C on
systems that have a simple stack frame with a frame pointer. Notably,
the x86-64 ABI by default has no frame pointer, so this failed on x86-64.
However, glibc has a function named backtrace() that does what we want.
This commit tests for this function and uses it when it is present, fixing
x86-64 backtraces.
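A capture routine built on glibc's backtrace() might look roughly like the sketch below. The names struct bt_sketch and bt_capture() are hypothetical, and HAVE_BACKTRACE is defined by hand here as an assumption; in a real build it would come from the configure check.

```c
#define HAVE_BACKTRACE 1   /* Assumption: normally set by configure. */

#include <stdint.h>
#ifdef HAVE_BACKTRACE
#include <execinfo.h>
#endif

#define BT_MAX_FRAMES 31

/* Hypothetical container for captured return addresses. */
struct bt_sketch {
    int n_frames;
    uintptr_t frames[BT_MAX_FRAMES];
};

/* Captures the current call stack into 'bt'.  With backtrace() available
 * no frame pointer is required; without it the sketch records nothing. */
static void
bt_capture(struct bt_sketch *bt)
{
#ifdef HAVE_BACKTRACE
    void *frames[BT_MAX_FRAMES];
    int i;

    bt->n_frames = backtrace(frames, BT_MAX_FRAMES);
    for (i = 0; i < bt->n_frames; i++) {
        bt->frames[i] = (uintptr_t) frames[i];
    }
#else
    bt->n_frames = 0;
#endif
}
```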
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined. A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.
The portable implementation of stack_low(), which before this commit is
used on x86-64, provokes a warning from GCC that cannot be disabled. We
already have an i386-specific implementation that does not warn; this
commit adds a corresponding implementation for x86-64 to avoid the warning
there too.
Without this change GCC warns "use of assignment suppression and length
modifier together in scanf format", which doesn't actually point out any
real problem (and why would it? Google turns up nothing interesting).