2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-23 02:17:42 +00:00
ovs/lib/backtrace.c

168 lines
3.9 KiB
C
Raw Normal View History

/*
* Copyright (c) 2008, 2009, 2010, 2011, 2013 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at:
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <config.h>
fatal-signal: Catch SIGSEGV and print backtrace. The patch catches the SIGSEGV signal and prints the backtrace using libunwind at the monitor daemon. This makes debugging easier when there is no debug symbol package or gdb installed on production systems. The patch works when the ovs-vswitchd compiles even without debug symbol (no -g option), because the object files still have function symbols. For example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52> |daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108> |daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c> |daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa> |daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d> |daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca> |daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d> |daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \ (Segmentation fault), core dumped, restarting However, if the object files' symbols are stripped, then we can only get init function plus offset value. This is still useful when trying to see if two bugs have the same root cause, Example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a> |daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280> |daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324> |daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371> |daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0> |daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261> |daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \ (Segmentation fault), core dumped, restarting Most C library functions are not async-signal-safe, meaning that it is not safe to call them from a signal handler, for example printf() or fflush(). To be async-signal-safe, the handler only collects the stack info using libunwind, which is signal-safe, and issues 'write' to the pipe, where the monitor thread reads and prints to ovs-vswitchd.log. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-27 10:22:55 -07:00
#include <errno.h>
#include <fcntl.h>
#include <inttypes.h>
fatal-signal: Catch SIGSEGV and print backtrace. The patch catches the SIGSEGV signal and prints the backtrace using libunwind at the monitor daemon. This makes debugging easier when there is no debug symbol package or gdb installed on production systems. The patch works when the ovs-vswitchd compiles even without debug symbol (no -g option), because the object files still have function symbols. For example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52> |daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108> |daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c> |daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa> |daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d> |daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca> |daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d> |daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \ (Segmentation fault), core dumped, restarting However, if the object files' symbols are stripped, then we can only get init function plus offset value. This is still useful when trying to see if two bugs have the same root cause, Example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a> |daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280> |daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324> |daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371> |daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0> |daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261> |daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \ (Segmentation fault), core dumped, restarting Most C library functions are not async-signal-safe, meaning that it is not safe to call them from a signal handler, for example printf() or fflush(). To be async-signal-safe, the handler only collects the stack info using libunwind, which is signal-safe, and issues 'write' to the pipe, where the monitor thread reads and prints to ovs-vswitchd.log. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-27 10:22:55 -07:00
#include <string.h>
#include <unistd.h>
#include "backtrace.h"
#include "openvswitch/vlog.h"
fatal-signal: Catch SIGSEGV and print backtrace. The patch catches the SIGSEGV signal and prints the backtrace using libunwind at the monitor daemon. This makes debugging easier when there is no debug symbol package or gdb installed on production systems. The patch works when the ovs-vswitchd compiles even without debug symbol (no -g option), because the object files still have function symbols. For example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52> |daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108> |daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c> |daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa> |daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d> |daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca> |daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d> |daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \ (Segmentation fault), core dumped, restarting However, if the object files' symbols are stripped, then we can only get init function plus offset value. This is still useful when trying to see if two bugs have the same root cause, Example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a> |daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280> |daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324> |daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371> |daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0> |daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261> |daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \ (Segmentation fault), core dumped, restarting Most C library functions are not async-signal-safe, meaning that it is not safe to call them from a signal handler, for example printf() or fflush(). To be async-signal-safe, the handler only collects the stack info using libunwind, which is signal-safe, and issues 'write' to the pipe, where the monitor thread reads and prints to ovs-vswitchd.log. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-27 10:22:55 -07:00
#include "util.h"
VLOG_DEFINE_THIS_MODULE(backtrace);
#ifdef HAVE_BACKTRACE
#include <execinfo.h>
void
backtrace_capture(struct backtrace *b)
{
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
b->n_frames = backtrace(b->frames, BACKTRACE_MAX_FRAMES);
}
void
backtrace_format(struct ds *ds, const struct backtrace *bt,
const char *delimiter)
{
if (bt->n_frames) {
char **symbols = backtrace_symbols(bt->frames, bt->n_frames);
if (!symbols) {
return;
}
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
for (int i = 0; i < bt->n_frames - 1; i++) {
ds_put_format(ds, "%s%s", symbols[i], delimiter);
}
ds_put_format(ds, "%s", symbols[bt->n_frames - 1]);
free(symbols);
}
}
#else
void
backtrace_capture(struct backtrace *backtrace)
{
backtrace->n_frames = 0;
}
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
void
backtrace_format(struct ds *ds, const struct backtrace *bt OVS_UNUSED,
const char *delimiter OVS_UNUSED)
{
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
ds_put_cstr(ds, "backtrace() is not supported!\n");
}
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
#endif
void
log_backtrace_at(const char *msg, const char *where)
{
struct backtrace b;
struct ds ds = DS_EMPTY_INITIALIZER;
backtrace_capture(&b);
if (msg) {
ds_put_format(&ds, "%s ", msg);
}
ds_put_cstr(&ds, where);
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
ds_put_cstr(&ds, " backtrace:\n");
backtrace_format(&ds, &b, "\n");
VLOG_ERR("%s", ds_cstr_ro(&ds));
ds_destroy(&ds);
}
fatal-signal: Catch SIGSEGV and print backtrace. The patch catches the SIGSEGV signal and prints the backtrace using libunwind at the monitor daemon. This makes debugging easier when there is no debug symbol package or gdb installed on production systems. The patch works when the ovs-vswitchd compiles even without debug symbol (no -g option), because the object files still have function symbols. For example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52> |daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108> |daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c> |daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa> |daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d> |daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca> |daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d> |daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \ (Segmentation fault), core dumped, restarting However, if the object files' symbols are stripped, then we can only get init function plus offset value. This is still useful when trying to see if two bugs have the same root cause, Example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a> |daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280> |daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324> |daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371> |daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0> |daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261> |daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \ (Segmentation fault), core dumped, restarting Most C library functions are not async-signal-safe, meaning that it is not safe to call them from a signal handler, for example printf() or fflush(). To be async-signal-safe, the handler only collects the stack info using libunwind, which is signal-safe, and issues 'write' to the pipe, where the monitor thread reads and prints to ovs-vswitchd.log. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-27 10:22:55 -07:00
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
#if defined(HAVE_UNWIND) || defined(HAVE_BACKTRACE)
static bool
read_received_backtrace(int fd, void *dest, size_t len)
{
VLOG_DBG("%s fd %d", __func__, fd);
fcntl(fd, F_SETFL, O_NONBLOCK);
memset(dest, 0, len);
int byte_read = read(fd, dest, len);
if (byte_read < 0) {
VLOG_ERR("Read fd %d failed: %s", fd, ovs_strerror(errno));
}
return byte_read > 0;;
}
#else
static bool
read_received_backtrace(int fd OVS_UNUSED, void *dest OVS_UNUSED,
size_t len OVS_UNUSED)
{
return false;
}
#endif
fatal-signal: Catch SIGSEGV and print backtrace. The patch catches the SIGSEGV signal and prints the backtrace using libunwind at the monitor daemon. This makes debugging easier when there is no debug symbol package or gdb installed on production systems. The patch works when the ovs-vswitchd compiles even without debug symbol (no -g option), because the object files still have function symbols. For example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52> |daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108> |daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c> |daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa> |daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d> |daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca> |daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d> |daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \ (Segmentation fault), core dumped, restarting However, if the object files' symbols are stripped, then we can only get init function plus offset value. This is still useful when trying to see if two bugs have the same root cause, Example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a> |daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280> |daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324> |daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371> |daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0> |daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261> |daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \ (Segmentation fault), core dumped, restarting Most C library functions are not async-signal-safe, meaning that it is not safe to call them from a signal handler, for example printf() or fflush(). To be async-signal-safe, the handler only collects the stack info using libunwind, which is signal-safe, and issues 'write' to the pipe, where the monitor thread reads and prints to ovs-vswitchd.log. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-27 10:22:55 -07:00
#ifdef HAVE_UNWIND
void
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
log_received_backtrace(int fd)
{
fatal-signal: Catch SIGSEGV and print backtrace. The patch catches the SIGSEGV signal and prints the backtrace using libunwind at the monitor daemon. This makes debugging easier when there is no debug symbol package or gdb installed on production systems. The patch works when the ovs-vswitchd compiles even without debug symbol (no -g option), because the object files still have function symbols. For example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52> |daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108> |daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c> |daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa> |daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d> |daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca> |daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d> |daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \ (Segmentation fault), core dumped, restarting However, if the object files' symbols are stripped, then we can only get init function plus offset value. This is still useful when trying to see if two bugs have the same root cause, Example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a> |daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280> |daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324> |daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371> |daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0> |daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261> |daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \ (Segmentation fault), core dumped, restarting Most C library functions are not async-signal-safe, meaning that it is not safe to call them from a signal handler, for example printf() or fflush(). To be async-signal-safe, the handler only collects the stack info using libunwind, which is signal-safe, and issues 'write' to the pipe, where the monitor thread reads and prints to ovs-vswitchd.log. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-27 10:22:55 -07:00
struct unw_backtrace backtrace[UNW_MAX_DEPTH];
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
if (read_received_backtrace(fd, backtrace, UNW_MAX_BUF)) {
struct ds ds = DS_EMPTY_INITIALIZER;
ds_put_cstr(&ds, BACKTRACE_DUMP_MSG);
fatal-signal: Catch SIGSEGV and print backtrace. The patch catches the SIGSEGV signal and prints the backtrace using libunwind at the monitor daemon. This makes debugging easier when there is no debug symbol package or gdb installed on production systems. The patch works when the ovs-vswitchd compiles even without debug symbol (no -g option), because the object files still have function symbols. For example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52> |daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108> |daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c> |daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa> |daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d> |daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca> |daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d> |daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \ (Segmentation fault), core dumped, restarting However, if the object files' symbols are stripped, then we can only get init function plus offset value. This is still useful when trying to see if two bugs have the same root cause, Example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a> |daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280> |daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324> |daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371> |daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0> |daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261> |daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \ (Segmentation fault), core dumped, restarting Most C library functions are not async-signal-safe, meaning that it is not safe to call them from a signal handler, for example printf() or fflush(). To be async-signal-safe, the handler only collects the stack info using libunwind, which is signal-safe, and issues 'write' to the pipe, where the monitor thread reads and prints to ovs-vswitchd.log. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-27 10:22:55 -07:00
for (int i = 0; i < UNW_MAX_DEPTH; i++) {
if (backtrace[i].func[0] == 0) {
break;
}
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
ds_put_format(&ds, "0x%016"PRIxPTR" <%s+0x%"PRIxPTR">\n",
backtrace[i].ip,
backtrace[i].func,
backtrace[i].offset);
fatal-signal: Catch SIGSEGV and print backtrace. The patch catches the SIGSEGV signal and prints the backtrace using libunwind at the monitor daemon. This makes debugging easier when there is no debug symbol package or gdb installed on production systems. The patch works when the ovs-vswitchd compiles even without debug symbol (no -g option), because the object files still have function symbols. For example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52> |daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108> |daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c> |daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa> |daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d> |daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca> |daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d> |daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \ (Segmentation fault), core dumped, restarting However, if the object files' symbols are stripped, then we can only get init function plus offset value. This is still useful when trying to see if two bugs have the same root cause, Example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a> |daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280> |daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324> |daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371> |daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0> |daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261> |daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \ (Segmentation fault), core dumped, restarting Most C library functions are not async-signal-safe, meaning that it is not safe to call them from a signal handler, for example printf() or fflush(). To be async-signal-safe, the handler only collects the stack info using libunwind, which is signal-safe, and issues 'write' to the pipe, where the monitor thread reads and prints to ovs-vswitchd.log. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-27 10:22:55 -07:00
}
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
VLOG_WARN("%s", ds_cstr_ro(&ds));
ds_destroy(&ds);
fatal-signal: Catch SIGSEGV and print backtrace. The patch catches the SIGSEGV signal and prints the backtrace using libunwind at the monitor daemon. This makes debugging easier when there is no debug symbol package or gdb installed on production systems. The patch works when the ovs-vswitchd compiles even without debug symbol (no -g option), because the object files still have function symbols. For example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52> |daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108> |daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c> |daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa> |daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d> |daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca> |daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d> |daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \ (Segmentation fault), core dumped, restarting However, if the object files' symbols are stripped, then we can only get init function plus offset value. This is still useful when trying to see if two bugs have the same root cause, Example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a> |daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280> |daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324> |daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371> |daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0> |daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261> |daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \ (Segmentation fault), core dumped, restarting Most C library functions are not async-signal-safe, meaning that it is not safe to call them from a signal handler, for example printf() or fflush(). To be async-signal-safe, the handler only collects the stack info using libunwind, which is signal-safe, and issues 'write' to the pipe, where the monitor thread reads and prints to ovs-vswitchd.log. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-27 10:22:55 -07:00
}
}
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
#elif HAVE_BACKTRACE
fatal-signal: Catch SIGSEGV and print backtrace. The patch catches the SIGSEGV signal and prints the backtrace using libunwind at the monitor daemon. This makes debugging easier when there is no debug symbol package or gdb installed on production systems. The patch works when the ovs-vswitchd compiles even without debug symbol (no -g option), because the object files still have function symbols. For example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52> |daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108> |daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c> |daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa> |daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d> |daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca> |daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d> |daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \ (Segmentation fault), core dumped, restarting However, if the object files' symbols are stripped, then we can only get init function plus offset value. This is still useful when trying to see if two bugs have the same root cause, Example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a> |daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280> |daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324> |daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371> |daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0> |daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261> |daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \ (Segmentation fault), core dumped, restarting Most C library functions are not async-signal-safe, meaning that it is not safe to call them from a signal handler, for example printf() or fflush(). To be async-signal-safe, the handler only collects the stack info using libunwind, which is signal-safe, and issues 'write' to the pipe, where the monitor thread reads and prints to ovs-vswitchd.log. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-27 10:22:55 -07:00
void
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
log_received_backtrace(int fd)
{
struct backtrace bt;
if (read_received_backtrace(fd, &bt, sizeof bt)) {
struct ds ds = DS_EMPTY_INITIALIZER;
bt.n_frames = MIN(bt.n_frames, BACKTRACE_MAX_FRAMES);
ds_put_cstr(&ds, BACKTRACE_DUMP_MSG);
backtrace_format(&ds, &bt, "\n");
VLOG_WARN("%s", ds_cstr_ro(&ds));
ds_destroy(&ds);
}
fatal-signal: Catch SIGSEGV and print backtrace. The patch catches the SIGSEGV signal and prints the backtrace using libunwind at the monitor daemon. This makes debugging easier when there is no debug symbol package or gdb installed on production systems. The patch works when the ovs-vswitchd compiles even without debug symbol (no -g option), because the object files still have function symbols. For example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52> |daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108> |daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c> |daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa> |daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d> |daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca> |daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d> |daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \ (Segmentation fault), core dumped, restarting However, if the object files' symbols are stripped, then we can only get init function plus offset value. This is still useful when trying to see if two bugs have the same root cause, Example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a> |daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280> |daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324> |daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371> |daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0> |daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261> |daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \ (Segmentation fault), core dumped, restarting Most C library functions are not async-signal-safe, meaning that it is not safe to call them from a signal handler, for example printf() or fflush(). To be async-signal-safe, the handler only collects the stack info using libunwind, which is signal-safe, and issues 'write' to the pipe, where the monitor thread reads and prints to ovs-vswitchd.log. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-27 10:22:55 -07:00
}
backtrace: Extend the backtrace functionality. Use the backtrace functions that is provided by libc, this allows us to get backtrace that is independent of the current memory map of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The implementation is determined during compilation based on available libraries. Libunwind has higher priority if both methods are available to keep the compatibility with current behavior. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" or equivalent in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-30 09:34:12 +02:00
#else
void
log_received_backtrace(int daemonize_fd OVS_UNUSED)
{
VLOG_WARN("Backtrace using libunwind or backtrace() is not supported.");
}
#endif