When a OVS daemon is configured to run as a Windows service,
when the service is stopped by calling service_stop(), the
windows services manager does not give enough time to do
everything in the atexit handler. So call the exit handler
directly from service_stop().
Also add a test case for Windows services which checks for
the termination of the service by looking at pidfile cleaned
by the exit handler.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com
Acked-by: Ben Pfaff <blp@nicira.com>
Between fork() and execvp() calls in the process_start()
function both child and parent processes share the same
file descriptors. This means that, if a child process
received a signal during this time interval, then it could
potentially write data to a shared file descriptor.
One such example is fatal signal handler, where, if
child process received SIGTERM signal, then it would
write data into pipe. Then a read event would occur
on the other end of the pipe where parent process is
listening and this would make parent process to incorrectly
believe that it was the one who received SIGTERM.
Also, since parent process never reads data from this
pipe, then this bug would make parent process to consume
100% CPU by immediately waking up from the event loop.
This patch will help to avoid this problem by blocking
signals until child closes all its file descriptors.
Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Reported-by: Suganya Ramachandran <suganyar@vmware.com>
Issue: 1255110
Fixed sparse non static symbol warning.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Acked-by: Gurucharan Shetty <gshetty@nicira.com>
Windows does not have a SIGPIPE. We ignore SIGPIPE for
Linux. To compile on Windows, carve out a new function
to ignore SIGPIPE on Linux.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Ctrl+C signals are a special case for Windows and can
be handled by registering a handle through
SetConsoleCtrlHandler() routine. This is only useful
when we run it directly on console and not as services in
the background.
Once we get a Ctrl+C signal, we call the cleanup functions
and then exit.
One thing to know here is that MinGW terminal handles
Ctrl+C signal differently (and looks a little buggy. I see
it exiting the handler midway with some sort of timeout).
So this implementation is only useful when run on Windows
terminal. Since we only use MinGW for compilation and
eventually to run unit tests, it should be okay. (The unit
tests would ideally use windows services and not expect
Ctrl+C)
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Windows does not have a SIGHUP or SIGALRM. It does have
a SIGINT and SIGTERM. The documentation at msdn says that
SIGINT is not supported for win32 applications because
WIN32 operating systems generate a new thread to specifically
handle Ctrl+C.
This commit handles SIGTERM for Windows. The documentation also
states that nothing generates SIGTERM in Windows, but one can
use raise(SIGTERM) to manage it. The idea for handling SIGTERM
for Windows is to just have a place holder if there is need to
raise() a signal for some other purpose.
We use SIGALRM in timeval.c if we wake up from a sleep after
'deadline'. For Windows, print an error message and then
use SIGTERM.
There is an atexit() function for Windows, so we can call cleanup
functions during exit.
An upcoming commit separately handles Ctrl+C so that we can call
clean up functions for that use case.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This allows other libraries to use util.h that has already
defined NOT_REACHED.
Signed-off-by: Harold Lim <haroldl@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
We've seen a number of deadlocks in the tree since thread safety was
introduced. So far, all of these are self-deadlocks, that is, a single
thread acquiring a lock and then attempting to re-acquire the same lock
recursively. When this has happened, the process simply hung, and it was
somewhat difficult to find the cause.
POSIX "error-checking" mutexes check for this specific problem (and
others). This commit switches from other types of mutexes to
error-checking mutexes everywhere that we can, that is, everywhere that
we're not using recursive mutexes. This ought to help find problems more
quickly in the future.
There might be performance advantages to other kinds of mutexes in some
cases. However, the existing mutex type choices were just guesses, so I'd
rather go for easy detection of errors until we know that other mutex
types actually perform better in specific cases. Also, I did a quick
microbenchmark of glibc mutex types on my host and found that the
error checking mutexes weren't any slower than the other types, at least
when the mutex is uncontended.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
This commit adds annotations for thread safety check. And the
check can be conducted by using -Wthread-safety flag in clang.
Co-authored-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This is a straight search-and-replace, except that I also removed #include
<assert.h> from each file where there were no assert calls left.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
On FreeBSD sig_atomic_t is long, which causes the comparison in
fatal_signal_run to be true when no signal has been reported.
Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.
Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Android appears to lack SIG_ATOMIC_MAX which is only
used in fatal-signal.c.
Observed when compiling using the Android NDK r6b (Android API level 13).
Patch based on a suggestion by Ben Pfaff
In each of the cases converted here, an shash was used simply to maintain
a set of strings, with the shash_nodes' 'data' values set to NULL. This
commit converts them to use sset instead.
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined. A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.
The fatal-signal library notices and records fatal signals (e.g. SIGTERM)
and terminates the process on the next trip through poll_block(). But
some special utilities do not always invoke poll_block() promptly, e.g.
"ovs-ofctl monitor" does not call poll_block() as long as OpenFlow messages
are available. But these special cases seem like they are all likely to
call into functions that themselves block (those with "_block" in their
names). So make a new rule that such functions should always call
fatal_signal_run(), either directly or through poll_block(). This commit
implements and documents that rule.
Bug #2625.
Not calling fatal_signal_init() means that the signal handlers don't get
registered, so the process won't clean up on fatal signals. Furthermore,
signal_fds[0] is then 0, which means that fatal-signal_wait() waits on
stdin, so if you are testing a program interactively and accidentally type
something on stdin then that program's CPU usage jumps to 100%.
Since poll_block() calls fatal_signal_wait() this seems like the most
reliable solution.
Until now, fatal_signal_fork() has simply disabled all the fatal signal
callback hooks. This worked fine, because a daemon process forked only
once and the parent didn't do much before it exited.
But upcoming commits will introduce a --monitor option, which requires
processes to fork multiple times. Sometimes the parent process will fork,
then run for a while, then fork again. It's not good to disable the
hooks in the child process in such a case, because that prevents e.g.
pidfiles from being removed at the child's exit.
So this commit changes the semantics of fatal_signal_fork() to just
clearing out hooks. After hooks are cleared, new hooks can be added and
will be executed on process termination in the usual way.
This commit also introduces a cancellation callback function so that a
canceled hook can free resources.
Rather than running signal hooks directly from the actual signal
handler, simply record the fact that the signal occured and run
the hook next time around the poll loop. This allows significantly
more freedom as to what can actually be done in the signal hooks.