mirror of https://github.com/openvswitch/ovs synced 2025-08-29 05:18:13 +00:00

53 Commits

Author SHA1 Message Date
Gurucharan Shetty
7ffd3f6972 worker: Prevent worker from being responsible for pidfile deletion.
Currently we are creating the worker process after creation of the pidfile.
This means that the responsibility of deleting the pidfile after process
termination rests with the worker process.

When we restart openvswitch using the startup scripts, we SIGTERM the main
process and once it is cleaned up, we start ovs-vswitchd again. This results
in a race condition. The new ovs-vswitchd will create a pidfile because it is
unlocked. But, if the old worker process exits after the start of new
ovs-vswitchd, it will simply delete the pidfile underneath the new ovs-vswitchd.
This will eventually result in multiple ovs-vswitchd daemons.

This patch gives the responsibility of deleting the pidfile to the main
process.

Bug #16669.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
2013-04-29 15:09:48 -07:00
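The ownership rule described in this commit can be illustrated with a small sketch. It is written in Python for brevity and is not the code from the patch: the `OwnedPidfile` class, the `worker_leaves_pidfile_alone()` helper, and the PID comparison used to decide ownership are all assumptions made for illustration.

```python
import os
import tempfile


class OwnedPidfile:
    """Hypothetical pidfile that only the creating (main) process may delete."""

    def __init__(self, path):
        self.path = path
        self.owner_pid = os.getpid()  # Remember which process created the file.
        with open(path, "w") as f:
            f.write(f"{self.owner_pid}\n")

    def remove(self):
        """Unlink the pidfile if called by its creator; return True if removed."""
        if os.getpid() != self.owner_pid:
            return False  # A forked worker must leave the file alone.
        os.unlink(self.path)
        return True


def worker_leaves_pidfile_alone():
    """Return (worker_refused, file_survived_worker, main_removed)."""
    with tempfile.TemporaryDirectory() as rundir:
        pidfile = OwnedPidfile(os.path.join(rundir, "demo.pid"))
        pid = os.fork()
        if pid == 0:
            # Child: stands in for the worker process.
            status = 1
            try:
                status = 0 if not pidfile.remove() else 1
            finally:
                os._exit(status)
        _, wstatus = os.waitpid(pid, 0)
        survived = os.path.exists(pidfile.path)
        return os.WEXITSTATUS(wstatus) == 0, survived, pidfile.remove()
```

With this rule a late-exiting worker cannot unlink a pidfile that a newly started daemon has since created at the same path.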
Ben Pfaff
cb22974d77 Replace most uses of assert by ovs_assert.
This is a straight search-and-replace, except that I also removed #include
<assert.h> from each file where there were no assert calls left.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-01-16 16:03:37 -08:00
Ben Pfaff
066f329e29 daemon: Start monitor process, not daemon process, in new session.
To keep control+C and other signals in the initiating session from killing
the monitor process, we need to put the monitor process into its own
session.  However, until this point, we've only done that for the daemon
processes that the monitor started, which means that control+C would kill
the monitor but not the daemons that it launched.

I don't know of a benefit to putting the monitor and daemon processes in
different sessions, as opposed to one new session for both of them, so
this change does the latter.

daemonize_post_detach() is called from one additional context where we'd
want to be in a new session, the worker_start() function, but that function
is documented as being called after daemonize_start(), in which case we
will (after this commit) already have called setsid(), so no additional
change is required there.

Bug #14280.
Reported-by: Gordon Good <ggood@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-12-13 14:01:23 -08:00
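The effect of calling setsid() in a freshly forked process can be sketched as follows. This is a Python illustration of the POSIX behavior only, not the library's C code; the function name `start_in_new_session()` is hypothetical.

```python
import os


def start_in_new_session():
    """Fork a child that calls setsid(); return (child_pid, child_session_id)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: stands in for the monitor process.
        status = 1
        try:
            os.close(read_fd)
            # Become leader of a new session with no controlling terminal, so
            # terminal-generated signals from the old session do not reach us.
            os.setsid()
            os.write(write_fd, str(os.getsid(0)).encode())
            status = 0
        finally:
            os._exit(status)
    os.close(write_fd)
    with os.fdopen(read_fd) as f:
        sid = int(f.read())
    os.waitpid(pid, 0)
    return pid, sid
```

Any process forked after the setsid() call starts out in the same new session, which is why one call covers both the monitor and the daemons it launches.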
Ethan Jackson
2388a783e2 daemon: Avoid the link() syscall.
make_pidfile() depends on the link() system call to atomically
create pidfiles when multiple daemons are started concurrently.
However, this system call isn't available on ESX so an alternative
strategy is necessary.  Fortunately, the approach this patch takes
is cleaner than the original code.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2012-11-19 13:16:19 -08:00
Ed Maste
d86a6c099f lib: Move addition of program_name to proctitle_set
Signed-off-by: Ed Maste <emaste@adaranet.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-10-11 14:02:00 -07:00
Ben Pfaff
e8087a87a3 daemon: Factor out code into new function daemonize_post_detach().
This code will have another user in an upcoming commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-07-18 10:30:49 -07:00
Ben Pfaff
8aee05ccf4 daemon: Factor out code into new function fork_and_wait_for_startup().
This function will be useful in an upcoming commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-07-18 10:30:49 -07:00
Ben Pfaff
781dee0835 util: Introduce "subprogram_name" to identify subprocesses and threads.
This will be more useful later when we introduce "worker" subprocesses.
I don't have any current plans to introduce threading, but I can't
think of a disadvantage to wording this in a general manner.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-07-18 10:30:47 -07:00
Ben Pfaff
e6c5e53903 daemon: Add comment.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-05-14 14:21:18 -07:00
Raju Subramanian
e0edde6fee Global replace of Nicira Networks.
Replaced all instances of Nicira Networks(, Inc) with Nicira, Inc.

Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-05-02 17:08:02 -07:00
Ben Pfaff
7d0c5973d5 daemon: New function daemon_save_fd() to preserve fds across detach.
This eliminates a kluge that was duplicated in three different daemons.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-02-02 16:26:53 -08:00
Ben Pfaff
2c8fcc9cd6 daemon: Better log when fork child dies early from signals.
On one machine, "/etc/init.d/openvswitch-switch start" failed to start
with:

   ovs-vswitchd: fork child failed to signal startup (Success)
   Starting ovs-vswitchd ... failed!

"strace" revealed that the fork child was actually segfaulting, but the
message output didn't indicate that in any way.  This commit fixes the
log message (but not the segfault itself).

Reported-by: Michael Hu <mhu@nicira.com>
Bug #8457.
2011-11-28 12:33:34 -08:00
Ben Pfaff
c1a543a8d6 vlog: Add a new log level "off".
Until now, "emer" has effectively been "off" because no messages were ever
logged at "emer" level.  Justin points out that it is useful to use "emer"
for messages that indicate a fatal error.  This commit makes that change
and adds a new "off" level to really turn off all logging to a facility.
2011-08-01 13:23:19 -07:00
Ben Pfaff
d3824212ad daemon: Disable logging to console after detaching.
When we detach, we replace stderr by /dev/null, so there's no point in
logging to the console after that.  Just turn it off.
2011-06-16 12:28:06 -07:00
Ben Pfaff
0b3769425f daemon: Reduce log level of "pid file is stale" message.
This message will appear repeatedly when ovs-vswitchd is running, if there
is any stale pidfile in /var/run/openvswitch, because ovs-vswitchd reads
all of the pidfiles in that directory periodically to update statistics.
2011-04-19 09:32:18 -07:00
Ben Pfaff
aacea8ba43 daemon: Avoid races on pidfile creation.
Until now, if two copies of one OVS daemon started up at the same time,
then due to races in pidfile creation it was possible for both of them to
start successfully, instead of just one.  This was made worse when a
previous copy of the daemon had died abruptly, leaving a stale pidfile.

This commit implements a new pidfile creation and removal protocol that I
believe closes these races.  Now, a pidfile is asserted with "link" instead
of "rename", which prevents the race on creation, and a stale pidfile may
only be deleted by a process after it has taken a lock on it.

This may solve mysterious problems seen occasionally on vswitch restart.
I'm still puzzled by these problems, however, because I don't see anything
in our test cases that would actually cause two copies of a daemon to
start at the same time, which as far as I can see is a necessary
precondition for the problem.
2011-04-04 10:59:19 -07:00
Ben Pfaff
00c0858987 daemon: Integrate checking for an existing pidfile into daemonize_start().
Until now, it has been the responsibility of an individual daemon to call
die_if_already_running() at an appropriate time.  A long time ago, this
had to happen *before* daemonizing, because once the process daemonized
itself there was no way to report failure to the process that originally
started the daemon.  With the introduction of daemonize_start(), this is
now possible, but we haven't been taking advantage of it.

Therefore, this commit integrates the die_if_already_running() call into
daemonize_start() and deletes the calls to it from individual daemons.
2011-04-04 10:58:55 -07:00
Ben Pfaff
af9a144207 daemon: Tolerate EINTR in fork_and_wait_for_startup().
It seems possible that a signal coming in at the wrong time could confuse
this code.  It's always best to loop on EINTR.
2011-04-04 10:58:55 -07:00
Ben Pfaff
279c9e0308 Log anything that could prevent a daemon from starting.
If a daemon doesn't start, we need to know why.  Being able to
consistently consult the log to find out is helpful.
2011-04-04 10:58:55 -07:00
Ben Pfaff
18e124a20b daemon: Avoid redundant code in already_running().
This function substantially duplicated read_pidfile(), so reuse that
code instead.
2011-03-29 10:09:47 -07:00
Ben Pfaff
2159de8391 daemon: Write "already running" message to log also.
Otherwise it's hard to diagnose later if the daemon failed to start because
it thinks that it is already running.
2011-03-29 10:09:23 -07:00
Justin Pettit
4c1b8fc2e5 daemon: Fix leak of string in make_pidfile().
Coverity #10724
2011-02-22 09:36:57 -08:00
Ben Pfaff
a7ff9bd763 ovs-vswitchd: Complete daemonization only after initial configuration.
Otherwise when we add support for saving and restoring configuration
of internal devices around kernel module unload and reload, there's
no easy way for the "restore" code to tell when all the interfaces
should be set up and ready for configuration.
2011-02-07 12:50:19 -08:00
Ben Pfaff
e7668254f2 daemon: Suppress valgrind warnings from read_pidfile().
The version of valgrind I have in my test VMs doesn't know what F_GETLK
does, so it complains that l_pid is uninitialized even though fcntl sets
it.  Initializing it ourselves before calling the function avoids a series
of false-positive warnings about use of uninitialized data.
2011-02-03 14:56:33 -08:00
Ben Pfaff
b43c6fe279 Make installation directories overridable at runtime.
This makes it possible to run tests that need access to installation
directories, such as the rundir, without having access to the actual
installation directories (/var/run is generally not world-writable), by
setting environment variables.  This is not a good way to do things in
general--usually it would be better to choose the correct directories
at configure time--so for now this is undocumented.
2010-11-29 16:29:11 -08:00
Ben Pfaff
d98e600755 vlog: Make client supply semicolon for VLOG_DEFINE_THIS_MODULE.
It's kind of odd for VLOG_DEFINE_THIS_MODULE to supply its own semicolon,
so this commit switches to the more common form.
2010-10-29 09:48:47 -07:00
Ben Pfaff
2bf9d87ae3 daemon: Don't call a normal exit from the monitor a "crash".
When the monitored child is killed with SIGTERM, the monitoring process
currently logs a message like "1 crashes: pid 12345 died, killed by
signal 15 (Terminated), exiting".  This counts the SIGTERM as a crash, even
though it's intentional.

This commit changes the log message to omit the "%d crashes" part on normal
termination.
2010-10-27 09:29:08 -07:00
Ethan Jackson
309eaa2bc4 lib: Remove warnings in daemon.c
On some platforms compilation of daemon.c results in implicit
declaration of function fstat and stat warnings.
2010-10-14 22:59:11 +00:00
Ben Pfaff
e4bd5e2a6c daemon: Fix behavior of read_pidfile() for our own pidfile.
Opening a file descriptor and then closing it always discards any locks
held on the underlying file, even if the file is still open as another file
descriptor.  This meant that calling read_pidfile() on the process's own
pidfile would discard the lock and make other OVS processes think that the
process had died.  This commit fixes the problem.
2010-09-23 11:45:34 -07:00
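The locking pitfall described in this commit is easy to reproduce. The sketch below uses Python's `fcntl.lockf()`, which takes POSIX `fcntl()` record locks, and is an illustration rather than the library's code; the helper names are hypothetical.

```python
import fcntl
import os
import tempfile


def other_process_can_lock(path):
    """Return True if a separate process can take a write lock on path."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            with open(path, "r+") as f:
                fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            status = 0
        except OSError:
            status = 1  # Lock is held by someone else.
        finally:
            os._exit(status)
    _, wstatus = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wstatus) == 0


def demo_lock_lost_on_close():
    """Return (locked_before, locked_after) around a naive re-open and close."""
    with tempfile.NamedTemporaryFile() as pidfile:
        fcntl.lockf(pidfile, fcntl.LOCK_EX)  # Lock our "own pidfile".
        locked_before = not other_process_can_lock(pidfile.name)
        # Open the same file through a second descriptor and close it again,
        # as a naive pidfile reader would.
        with open(pidfile.name) as f:
            f.read()
        locked_after = not other_process_can_lock(pidfile.name)
        return locked_before, locked_after
```

The second value turns false because closing any descriptor for the file drops every record lock the process holds on it, even though the original descriptor is still open.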
Ben Pfaff
cbbdf81cf8 daemon: Report number of crashes on monitor process command line.
2010-09-23 11:45:34 -07:00
Joe Perches
d295e8e97a treewide: Remove trailing whitespace
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2010-08-30 13:23:08 -07:00
Ben Pfaff
d4db8309c5 daemon: Improve comments.
Elsewhere we put the name of command-line options that control global
variables in the comment, so do so here as well.

Also fix a comment typo.
2010-08-25 14:55:47 -07:00
Ben Pfaff
df5d2ed998 daemon: Make sure that vlog is initialized when a process daemonizes.
If a process daemonizes itself, then it should be possible to control that
process's log levels with "ovs-appctl vlog/set" and related commands.  The
vlog_init() function registers those commands.  But vlog_init() doesn't
normally get called until the first log message is issued.  This can take a
while, especially for ovs-controller, where I first noticed the problem.

This commit fixes the problem by calling vlog_init() from
daemonize_start(), which always gets called as a process daemonizes.
2010-08-12 09:47:33 -07:00
Ben Pfaff
5136ce492c vlog: Introduce VLOG_DEFINE_THIS_MODULE for declaring vlog module in use.
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined.  A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.
2010-07-21 15:47:09 -07:00
Ben Pfaff
3762274e63 Add some missing "#include"s.
These are required to build on FreeBSD 8.0.
2010-05-26 15:27:01 -07:00
Ben Pfaff
a9633ada75 daemon: Throttle max respawning rate.
If a monitored daemon dies quickly at startup, the system can waste a lot
of CPU time continually restarting it.  This commit prevents a given
daemon from restarting more than once every 10 seconds.
2010-05-13 09:45:21 -07:00
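The throttling rule reduces to a small calculation. The sketch below is a Python illustration and not the library's code: the function name and the constant name are hypothetical, and only the 10-second figure comes from the commit message.

```python
MIN_RESPAWN_INTERVAL = 10.0  # Seconds; the figure quoted in the commit message.


def respawn_delay(last_start, now, min_interval=MIN_RESPAWN_INTERVAL):
    """Seconds to wait before restarting a daemon last started at last_start."""
    return max(0.0, last_start + min_interval - now)
```

A monitor loop would sleep for `respawn_delay(last_start, time.monotonic())` seconds before forking the daemon again, so a daemon that dies 3 seconds after starting waits another 7 seconds, while one that ran for minutes restarts immediately.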
Ben Pfaff
7c2dd4c648 daemon: Allow monitored daemon to dump core no more than once.
If the monitored daemon dumps core frequently, then this can quickly
exhaust the host's disk space.  This commit limits core dumps to at most
one per monitored session (typically, once per boot).
2010-05-13 09:45:21 -07:00
Ben Pfaff
daf03c53ee util: New functions get_cwd(), abs_file_name().
These will be used further in an upcoming commit.
2010-03-18 11:23:50 -07:00
Ben Pfaff
b2d06cb8cb daemon: Fix memory leak in --monitor implementation.
This leaked a small amount of memory each time a daemon process was
created.  It is only important if a daemon is otherwise very buggy.

Found with valgrind.
2010-02-02 15:21:09 -08:00
Ben Pfaff
40f0707cd9 daemon: Make --monitor process change its process title.
When --monitor is used, administrators sometimes become confused about the
presence of two copies of each process.  This commit attempts to clarify
the situation by making the monitoring process change its process name, as
seen in /proc/$pid/cmdline and in "ps", to clearly indicate what is going
on.

CC: Dan Wendlandt <dan@nicira.com>
2010-01-26 10:52:46 -08:00
Ben Pfaff
ff8decf1a3 daemon: Add support for process monitoring and restart.
2010-01-15 15:29:54 -08:00
Ben Pfaff
7943cd51e7 daemon: Refactor code.
This commit should not change behavior, but it paves the way for
implementing --monitor in the following commit.
2010-01-15 15:29:52 -08:00
Ben Pfaff
55368fb836 daemon: Close standard file descriptors when daemonizing.
Before SSH terminates, it waits for the PTYs that it creates for use as
stdin, stdout, and stderr to be closed.  When any of the Open vSwitch
daemons were started in the background over an SSH session, they held
those file descriptors open and thus the SSH session hung.  This commit
fixes the problem by closing those file descriptors, allowing SSH to
terminate.
2010-01-12 16:03:35 -08:00
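One common way to release the inherited terminal descriptors is to point them at /dev/null instead of leaving them closed. The Python sketch below illustrates that technique; whether this commit closes the descriptors outright or redirects them is not stated in the message, so treat the approach and both function names as assumptions.

```python
import os


def redirect_std_fds_to_devnull():
    """Point stdin, stdout, and stderr at /dev/null, releasing the terminal."""
    null_fd = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(null_fd, fd)  # Implicitly closes whatever fd referred to before.
    if null_fd > 2:
        os.close(null_fd)


def std_fds_are_devnull_in_child():
    """Run the redirect in a forked child and check the result there."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            redirect_std_fds_to_devnull()
            null_dev = os.stat(os.devnull).st_rdev
            if all(os.fstat(fd).st_rdev == null_dev for fd in (0, 1, 2)):
                status = 0
        finally:
            os._exit(status)
    _, wstatus = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wstatus) == 0
```

Keeping descriptors 0 through 2 occupied also prevents a later open() from accidentally reusing them for an unrelated file.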
Ben Pfaff
b8781ff08d daemon: Don't ignore failed write to pipe.
If the write to the pipe fails here then the parent will think that the
child failed to start up, so the child should oblige it by bailing out.
2010-01-04 09:47:01 -08:00
Ben Pfaff
95440284bd daemon: Allow daemon child process to report success or failure to parent.
There are conflicting pressures in startup of a daemon process:

    * The parent process should exit with an error code if the daemon
      cannot start up successfully.

    * Some startup actions must be performed in the child process, not in
      the parent.  The most obvious of these are file locking, since
      child processes do not inherit locks, and anything that requires
      knowing the child process's PID (e.g. unixctl sockets).

Until now, this conflict has usually been handled by giving up part of the
first property, i.e. in some cases the parent process would exit
successfully and the child immediately afterward exit with a failure code.

This commit introduces a better approach, by allowing daemons to perform
startup work in the child and only then signal the parent that they have
successfully started.  If the child instead exits without signaling
success, the parent passes this exit code along to its own parent.

This commit also modifies the daemons that can usefully take advantage of
this new feature to do so.
2009-12-18 13:37:44 -08:00
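The handshake described in this commit can be sketched with a pipe between parent and child. This is a Python illustration under simplifying assumptions, not the library's code: `spawn_and_wait()` is a hypothetical name, the `startup` callback stands in for a daemon's startup work, and the child exits right after signaling so that the example terminates (a real daemon would keep running).

```python
import os


def spawn_and_wait(startup):
    """Fork, run startup() in the child, and return (started, child_exit_code)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: do the startup work here, then tell the parent it succeeded.
        status = 1
        try:
            os.close(read_fd)
            if startup():
                os.write(write_fd, b"\0")
                status = 0
        finally:
            os._exit(status)
    os.close(write_fd)
    try:
        # One byte means success; end-of-file means the child died first.
        started = os.read(read_fd, 1) != b""
    finally:
        os.close(read_fd)
    # For the demo we always reap the child. A real parent would exit with
    # success as soon as it saw the byte and wait only in the failure case.
    _, wstatus = os.waitpid(pid, 0)
    return started, os.WEXITSTATUS(wstatus)
```

In the failure case the parent can pass the child's exit code along to its own parent, which is the behavior the commit message describes.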
Justin Pettit
18b9283b98 Clean-up compiler warnings about ignoring return values
Some systems complain when certain functions' return values are not
checked.  This commit fixes those warnings.

Creating ignore() function suggested by Ben Pfaff.
2009-12-15 00:14:32 -08:00
Ben Pfaff
eb077b264f ovsdb-server: Maintain the database lock with --detach.
Before this commit, "ovsdb-server --detach" would detach after it opened
the database file, which meant that the child process did not hold the
file lock on the database file (because a forked child process does not
inherit its parent's locks).  This commit fixes the problem by making
ovsdb-server open the database only after it has detached.  This fix, in
turn, required that daemonize() not chdir to /, because this would break
databases whose names are given relative to the current directory, and so
this commit also changes ovsdb-server to do so later.
2009-11-16 15:20:01 -08:00
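That a forked child does not hold its parent's record locks can be checked directly. The sketch below is a Python illustration using `fcntl.lockf()` (POSIX `fcntl()` locks), not ovsdb-server code; the function name is hypothetical and the temporary file stands in for the database.

```python
import fcntl
import os
import tempfile


def child_owns_parent_lock():
    """Return True if a forked child can lock a file its parent has locked."""
    with tempfile.NamedTemporaryFile() as db:
        fcntl.lockf(db, fcntl.LOCK_EX)  # Parent locks the file before forking.
        pid = os.fork()
        if pid == 0:
            # Child: it inherited the open descriptor but not the lock. If it
            # owned the lock, re-requesting it would succeed; instead the
            # request conflicts with the lock the parent still holds.
            status = 1
            try:
                fcntl.lockf(db, fcntl.LOCK_EX | fcntl.LOCK_NB)
                status = 0
            except OSError:
                status = 1
            finally:
                os._exit(status)
        _, wstatus = os.waitpid(pid, 0)
        return os.WEXITSTATUS(wstatus) == 0
```

When the parent exits after detaching, its lock disappears with it, so the surviving child is left running with no lock at all unless it opens and locks the file itself.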
Ben Pfaff
ac718c9dbd Implement library for lockfiles and use it in cfg code.
This is useful because the upcoming configuration database also needs a
lockfile implementation.

Also adds tests.
2009-10-29 15:20:56 -07:00
Ben Pfaff
03fbffbda4 Make sure that time advances in a daemon between calls to time_refresh().
Open vSwitch uses an interval timer signal to tell it that its cached idea
of the current time has expired.  However, this didn't work in a daemon
detached from the foreground session (invoked with --detach) because a
child created with fork() does not inherit the parent's interval timer and
we did not re-set it after calling fork().

This commit fixes the problem by setting the interval timer back up after
calling fork() from daemonize().

This fix is based on code inspection (which was then verified to be correct
through testing).  It may not fix any actual problems in practice, because
time_refresh() is called every time through the poll loop, and the poll
loop typically runs more quickly than the periodic timer fires (1 ms or so
average in ovs-vswitchd, vs. 100 ms timer interval).
2009-10-15 10:43:36 -07:00
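The timer behavior behind this fix can be observed with a short experiment. This is a Python sketch of the POSIX semantics, not the library's code; the function name is hypothetical, and the 100-second interval is chosen only so that the timer never fires during the demonstration.

```python
import os
import signal


def timer_armed_after_fork(interval=100.0):
    """Arm ITIMER_REAL, fork, and return (parent_armed, child_armed)."""
    # Save the previous timer setting so it can be restored afterwards.
    old_delay, old_interval = signal.setitimer(
        signal.ITIMER_REAL, interval, interval)
    try:
        pid = os.fork()
        if pid == 0:
            # Child: report through the exit code whether a timer is pending.
            status = 1
            try:
                delay, _ = signal.getitimer(signal.ITIMER_REAL)
                status = 1 if delay > 0.0 else 0
            finally:
                os._exit(status)
        parent_armed = signal.getitimer(signal.ITIMER_REAL)[0] > 0.0
        _, wstatus = os.waitpid(pid, 0)
        child_armed = os.WEXITSTATUS(wstatus) != 0
    finally:
        signal.setitimer(signal.ITIMER_REAL, old_delay, old_interval)
    return parent_armed, child_armed
```

The parent keeps its timer while the child starts with none, so a daemon that relies on the timer has to call setitimer() again in the child after fork().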
Justin Pettit
e7bd7d78b1 daemon: Remove short options from daemon library
The daemon library provides a few short options, but this makes those
option letters unavailable to programs that wish to use the library.
Since the daemon options are generally going to be called from a script
(which doesn't care how much typing is involved), we'll only provide
long options.
2009-08-06 18:04:36 -07:00