Coverity reports that daemon_set_new_user() may receive a large
unsigned value from get_sysconf_buffer_size(), due to sysconf()
returning -1 and being cast to size_t.
Although this would likely lead to an allocation failure and abort,
it's better to handle the error in place.
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Open vSwitch generally tries to let the underlying operating system
managed the low level details of hardware, for example DMA mapping,
bus arbitration, etc. However, when using DPDK, the underlying
operating system yields control of many of these details to userspace
for management.
In the case of some DPDK port drivers, configuring rte_flow or even
allocating resources may require access to iopl/ioperm calls, which
are guarded by the CAP_SYS_RAWIO privilege on linux systems. These
calls are dangerous, and can allow a process to completely compromise
a system. However, they are needed in the case of some userspace
driver code which manages the hardware (for example, the mlx
implementation of backend support for rte_flow).
Here, we create an opt-in flag passed to the command line to allow
this access. We need to do this before ever accessing the database,
because we want to drop all privileges asap, and cannot wait for
a connection to the database to be established and functional before
dropping. There may be distribution specific ways to do capability
management as well (using for example, systemd), but they are not
as universal to the vswitchd as a flag.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Gaetan Rivet <gaetanr@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
When segmentation fault occurred in ovn-northd, monitor will try to
restart the ovn-northd daemon process every 10s.
Assume the following scenarios: There is a segmentation fault and
the ovn-northd daemon process does not restart properly every time.
New fds are created each time the ovn-northd daemon process is
restarted by the monitor process, but old fds(fd[0]) owned by
the monitor process was not closed properly. One pipe leak for
each restart of the ovn-northd daemon process. After a long time
file descriptors were exhausted.
Fixes: e2ed6fbeb18c ("fatal-signal: Catch SIGSEGV and print backtrace.")
Signed-off-by: Fengqi Li <lifengqi@inspur.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Until now, the monitor process held its log file open while it waited for
the child to exit, and then it reopened it after the child exited. Holding
the log file open this way prevented log rotation from freeing disk space.
This commit avoids the problem by closing the log file before waiting, then
reopening it afterward.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reported-by: Antonin Bas <abas@vmware.com>
VMware-BZ: #2743409
Signed-off-by: William Tu <u9012063@gmail.com>
19 bytes in 1 blocks are definitely lost in loss record 24 of 121
at 0x4839748: malloc (vg_replace_malloc.c:306)
by 0x483BD63: realloc (vg_replace_malloc.c:834)
by 0x521C26: xrealloc (util.c:149)
by 0x478F91: ds_reserve (dynamic-string.c:63)
by 0x47928B: ds_put_format_valist (dynamic-string.c:161)
by 0x47920A: ds_put_format (dynamic-string.c:142)
by 0x506DE5: process_status_msg (process.c:0)
by 0x52A6D0: fork_and_wait_for_startup (daemon-unix.c:284)
by 0x52A54D: daemonize_start (daemon-unix.c:453)
by 0x40EB3E: main (ovs-vswitchd.c:91)
Fixes: b925336a36e6 ("daemon: restart child process if it died before signaling its readiness")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Roi Dayan <roid@nvidia.com>
Currently the backtrace logging is only available when monitor
daemon is running. This patch enables backtrace logging when
no monitor daemon exists. At signal handling context, it detects
whether monitor daemon exists. If not, write directly the backtrace
to the vlog fd. Note that using VLOG_* macro doesn't work due to
it's buffer I/O, so this patch directly issue write() syscall to
the file descriptor.
For some system we stop using monitor daemon and use systemd to
monitor ovs-vswitchd, thus need this patch. Example of
ovs-vswitchd.log (note that there is no timestamp printed):
2020-03-23T14:42:12.949Z|00049|memory|INFO|175332 kB peak resident
2020-03-23T14:42:12.949Z|00050|memory|INFO|handlers:2 ports:3 reva
SIGSEGV detected, backtrace:
0x0000000000486969 <fatal_signal_handler+0x49>
0x00007f7f5e57f4b0 <killpg+0x40>
0x000000000047daa8 <pmd_thread_main+0x238>
0x0000000000504edd <ovsthread_wrapper+0x7d>
0x00007f7f5f0476ba <start_thread+0xca>
0x00007f7f5e65141d <clone+0x6d>
0x0000000000000000 <+0x0>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
The patch catches the SIGSEGV signal and prints the backtrace
using libunwind at the monitor daemon. This makes debugging easier
when there is no debug symbol package or gdb installed on production
systems.
The patch works when the ovs-vswitchd compiles even without debug symbol
(no -g option), because the object files still have function symbols.
For example:
|daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace:
|daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52>
|daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40>
|daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d>
|daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108>
|daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c>
|daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa>
|daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d>
|daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca>
|daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d>
|daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \
(Segmentation fault), core dumped, restarting
However, if the object files' symbols are stripped, then we can only
get init function plus offset value. This is still useful when trying
to see if two bugs have the same root cause, Example:
|daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace:
|daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a>
|daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40>
|daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d>
|daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280>
|daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324>
|daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371>
|daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0>
|daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261>
|daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \
(Segmentation fault), core dumped, restarting
Most C library functions are not async-signal-safe, meaning that
it is not safe to call them from a signal handler, for example
printf() or fflush(). To be async-signal-safe, the handler only
collects the stack info using libunwind, which is signal-safe, and
issues 'write' to the pipe, where the monitor thread reads and
prints to ovs-vswitchd.log.
Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Linux has an idea of process name that is visible in /proc/$pid/comm. This
is "ovs-vswitchd" for a freshly started ovs-vswitchd process. When the
monitor code restarted a crash child, it changed it to the empty string.
This confused the daemon_is_running check in ovs-lib.in, which checks
comm. This commit fixes the problem by setting the program name as comm
in newly restarted children.
VMware-BZ: #2191724
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Gurucharan Shetty <guru@ovn.org>
Internal ports may be moved to another network namespace
and when that happens, the vswitch stops receiving netlink
notifications.
This patch enables the vswitch to listen to all network
namespaces that have a nsid assigned into the network
namespace where the socket has been opened.
It requires kernel 4.2 or newer.
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Rename the remaining variables that were shadowing another definition.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
These options have existed for a while, but were not expressed in the
help information. Inform the user that these options exist, and give
some basic help.
Reported-by: Saravanan KR <skramaja@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Markos Chandras <mchandras@suse.de>
Daemons generally should close the standard fds because they don't want to
hold open an SSH session, etc. that is attached to a tty. But --monitor
without --detach does not daemonize, so do not close fds in that case.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
Shadowing is when a variable with a given name in an inner scope hides a
different variable with the same name in a surrounding scope. This is
generally undesirable because it can confuse programmers. This commit
eliminates most of it.
Found with -Wshadow=local in GCC 7. The repo is not really ready to enable
this option by default because of a few cases that are harder to fix, and
harmless, such as nested use of CMAP_FOR_EACH.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
ovs daemon process will reopen file after every log rotate.
However, it doesn't happen to monitor process. That is to say,
fd of log file in monitor proces always point to oldest disk file,
which is deleted after log rotate. Once daemon process restarts
from a crash, it inherits parent's fds, including the deleted log file.
This commit reopens log file in monitor process everytime it
wakes up from waitpid.
Signed-off-by: Huanle Han <hanxueluo@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
From the manpages of getgrnam_r (getpwnam_r is similar):
"If no matching group record was found, these functions return 0 and
store NULL in *result."
The code checked only against errors, but non existing users didn't set
e != 0 therefore the code could try to set arbitrary uid/gid values.
Fixes: e91b927d lib/daemon: support --user option for all OVS daemon
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
vlog log file can be created when parsing --log-file option, before
switching user, in case the --user option is also specified. While this
does not directly cause errors for the running daemons, it can
leave the log files on the disk as created under the "root" user.
This patch fix the log file ownership to the user specified with --user.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ansis Atteka <aatteka@nicira.com>
A global variable 'switch_user' was used to make sure
we switch process's current user only once. This logic is now
simplified by testing for uid directly; if switch process has
taken place, the current uid will be not be zero.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ansis Atteka <aatteka@nicira.com>
The function daemon_become_new_user_linux was conditionally defined but
then used in code unconditionally. If HAVE_LIBCAPNG is not defined, the
function would never be called, but it still must exist.
Adjust the #if guard around the function to be around the body of the
function instead of outside of its definition to ensure the function is
always defined, even if empty.
This issue was introduced in e91b927d8966bfcb9768225392324dde4fd7d7f6.
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
OVS daemons can now support --user option to run as a non-root
user with less privileges.
See the manpage patch for more descriptions.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The coding style guidelines include the following:
- Pick a unique name prefix (ending with an underscore) for each
module, and apply that prefix to all of that module's externally
visible names. Names of macro parameters, struct and union members,
and parameters in function prototypes are not considered externally
visible for this purpose.
This patch adds the new prefix to the externally visible names. This
makes it a bit more obvious what code is coming from common command
line handling code.
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
A new function vlog_insert_module() is introduced to avoid using
list_insert() from the vlog.h header.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The child process (the one being monitored) could die before it was able
to call fork_notify_startup() function. If such situation arises, then
parent process (the one monitoring child process) would also terminate
with a fatal log message:
...|EMER|fork child died before signaling startup (killed (...))
This patch changes that behavior by always restarting child process
if it was able to start up at least once in the past. However, if
child was not able to start up even once, then the monitor process
would still terminate, because that would most likely indicate a
persistent programming or system error.
To reproduce use following script:
while : ; do kill -SIGSEGV `cat /var/run/openvswitch/ovs-vswitchd.pid`; done
Signed-Off-By: Ansis Atteka <aatteka@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
VMware-BZ: 1273550
We have some common code between daemon-unix.c and
daemon-windows.c. Move them to daemon.c
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
In the unit tests, we check for some logs stored in stderr. In case
of windows, unit tests fail because the child writes additional information
into stderr because it does not have it closed. This commit
closes standard file descriptors for windows too.
Because the functions related to closing file descriptors is common
for both windows and unix, add it to the common daemonization file
daemon.c
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
An upcoming commit re-introduces daemon.c to have
common functions across daemon-unix.c and daemon-windows.c
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>