POSIX says that multithreaded programs must not use sigprocmask() but must
use pthread_sigmask() instead. This commit makes that replacement.
The actual use of signals in Open vSwitch is still not thread safe
following this commit, but this change is a necessary prerequisite for
fixing the other problems.
Signed-off-by: Ben Pfaff <blp@nicira.com>
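As an illustration of the replacement described above, here is a minimal sketch of per-thread signal masking with pthread_sigmask(). The helper names set_sigalrm_blocked() and sigalrm_is_blocked() are hypothetical and are not taken from the Open vSwitch tree.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: blocks (blocked != 0) or unblocks (blocked == 0)
 * SIGALRM for the calling thread only.  Returns 0 on success, otherwise a
 * positive errno value, which is the pthread_sigmask() convention (it does
 * not set errno, unlike sigprocmask()). */
static int
set_sigalrm_blocked(int blocked)
{
    sigset_t mask;

    sigemptyset(&mask);
    sigaddset(&mask, SIGALRM);
    return pthread_sigmask(blocked ? SIG_BLOCK : SIG_UNBLOCK, &mask, NULL);
}

/* Hypothetical helper: returns 1 if SIGALRM is blocked in the calling
 * thread, 0 otherwise.  Passing a null 'set' only queries the mask. */
static int
sigalrm_is_blocked(void)
{
    sigset_t current;
    int error = pthread_sigmask(SIG_BLOCK, NULL, &current);

    assert(!error);
    return sigismember(&current, SIGALRM);
}
```

The call shape matches sigprocmask(), so the replacement is mostly mechanical; the main difference to watch for is the error-reporting convention.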
Other code in the tree uses HAVE_BACKTRACE and then blindly includes
<execinfo.h> if it is present, so this doesn't make anything worse.
Once we do that, HAVE_EXECINFO_H has no further users, so this commit also
removes the check for <execinfo.h>.
Reported-by: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
backtrace() is really useful, but it is not signal safe everywhere. We
need to reassess whether it is reasonable to use it anywhere, but
immediately we need to disable it on x86-64 (with glibc) because it is
causing segfaults in testing.
Bug #15497.
Reported-by: Ram Jothikumar <rjothikumar@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Durations longer than 4294967 seconds would unnecessarily overflow in the
multiplication here.
Found by Coverity.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
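The arithmetic behind the 4294967-second limit can be checked with a small sketch. This is not the code that the commit changed; the function names are hypothetical and only illustrate why widening before the multiplication avoids the wraparound.

```c
#include <stdint.h>

/* Hypothetical: converts seconds to milliseconds in 32-bit arithmetic.
 * The product wraps modulo 2^32 once 'secs' exceeds 4294967, because
 * 4294968 * 1000 is larger than UINT32_MAX (4294967295). */
static uint32_t
secs_to_msecs_u32(uint32_t secs)
{
    return secs * 1000u;
}

/* Hypothetical: same conversion, widened to 64 bits before multiplying,
 * so no 32-bit input can overflow. */
static long long
secs_to_msecs(uint32_t secs)
{
    return (long long) secs * 1000;
}
```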
This is a straight search-and-replace, except that I also removed #include
<assert.h> from each file where there were no assert calls left.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Occasionally, backtrace() will deadlock in the signal handler
because it does some non-signal-safe initialization. Specifically,
it opens a shared object. As a workaround, this patch forces
backtrace() to run outside of a signal handler, so that future
calls will perform as expected.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
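A minimal sketch of the workaround described above, assuming a system that provides <execinfo.h> (glibc, for example). The function name backtrace_warm_up() is hypothetical; the idea is only to make the first backtrace() call from ordinary context, such as during start-up, so that any one-time initialization happens there rather than inside a signal handler.

```c
#include <execinfo.h>

/* Hypothetical helper: calls backtrace() once from normal (non-signal)
 * context.  Returns the number of frames captured, which is at most 1
 * here because the buffer holds a single entry. */
static int
backtrace_warm_up(void)
{
    void *frames[1];

    return backtrace(frames, 1);
}
```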
Often when debugging Open vSwitch, one will see in the logs that
CPU usage has been high for some period of time, but it's totally
unclear why. In an attempt to remedy the situation, this patch
logs backtraces taken at regular intervals as a poor man's
profiling alternative.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
With this patch, `ovs-appctl backtrace` will return a list of unique
backtraces, each with a count of how many times it has been recorded.
This work had previously been done by ovs-parse-backtrace. However,
in future patches poll-loop will begin logging backtraces as a
matter of course. At this point, coalescing the backtraces will
help keep these log messages brief.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
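The coalescing step can be sketched as follows. The struct layout, MAX_FRAMES, and coalesce_traces() are assumptions made for illustration and do not reflect the actual timeval data structures; frames are stored as integers so the example is easy to exercise.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FRAMES 8            /* Hypothetical per-trace frame limit. */

struct trace {
    size_t n_frames;
    uintptr_t frames[MAX_FRAMES];
};

struct trace_count {
    struct trace trace;
    unsigned int count;         /* Times this exact trace was recorded. */
};

/* Folds the 'n' entries of 'traces' into 'out', which must have room for
 * 'n' elements.  Two traces are equal when they have the same depth and
 * identical frames.  Returns the number of unique traces written. */
static size_t
coalesce_traces(const struct trace *traces, size_t n,
                struct trace_count *out)
{
    size_t n_unique = 0;

    for (size_t i = 0; i < n; i++) {
        size_t j;

        /* Linear search is fine for the handful of samples involved. */
        for (j = 0; j < n_unique; j++) {
            const struct trace *u = &out[j].trace;

            if (u->n_frames == traces[i].n_frames
                && !memcmp(u->frames, traces[i].frames,
                           u->n_frames * sizeof u->frames[0])) {
                break;
            }
        }
        if (j == n_unique) {
            out[n_unique].trace = traces[i];
            out[n_unique].count = 0;
            n_unique++;
        }
        out[j].count++;
    }
    return n_unique;
}
```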
With this patch, timeval will take a backtrace with each SIGALRM,
allowing it to retrieve a profiling snapshot instantly. This will
be useful in future patches when backtraces are logged.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
log_poll_interval() is a little bit too aggressive, and is
therefore less useful than it could be. This patch removes the
mean interval calculation, and simply logs if the poll loop took
longer than 1 second instead.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Commit 00a16895 (timeval: Don't require signals for time_alarm().)
incorrectly disabled signals when CACHE_TIME was disabled. In
fact, the reverse was correct. As a result of this bug, OVS would
wake once every 100ms unnecessarily. It shouldn't have affected
correctness otherwise.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Often, it can be quite difficult to debug performance issues in
Open vSwitch. Typically one needs to run something like gprof, but
that requires rebuilding and installing on the affected system
which is often problematic. This patch adds a lightweight
profiling solution which can be used in these situations. The
ovs-appctl backtrace command prints out backtraces taken at 100
millisecond intervals over a 5 second period of time. It is
currently only supported on systems which have the execinfo library
and enable time caching.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
The ESX userspace looks quite a bit like Linux, but has some key
differences which need to be specially handled in the build. To
distinguish between ESX and systems which use the Linux datapath
module, this patch adds two new macros "ESX" and "LINUX_DATAPATH".
It uses these macros to disable building code on ESX which only
applies to a true Linux environment. In addition, it adds a new
route-table-stub implementation which is required for the build to
complete successfully on ESX.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
The timer_create() system call is not supported in ESX and returns
an error when called. Aborting when this system call fails seems a
bit extreme. So instead, this patch simply falls back to disabling
the cached time optimization.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Before this patch, time_alarm() used the SIGALRM handler to notify
the poll loop that it should exit the program. Instead, this patch
simply implements time_alarm() directly in the poll loop. This
significantly simplifies the code, while removing a call to
timer_create() which is not currently supported on ESX.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
64-bit Linux appears to avoid syscalls for clock_gettime(), so we can get
higher resolution timing and avoid having a timer firing off SIGALRM
without introducing extra overhead.
Signed-off-by: Leo Alterman <lalterman@nicira.com>
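A sketch of reading the clock directly on every call instead of relying on a SIGALRM-refreshed cache. The name monotonic_msec() is hypothetical; whether clock_gettime() actually avoids a system call depends on the platform, as the message above notes for 64-bit Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical helper: returns the current monotonic time in
 * milliseconds, or -1 if the clock cannot be read. */
static long long
monotonic_msec(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts)) {
        return -1;
    }
    return (long long) ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}
```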
Replaced all instances of Nicira Networks(, Inc) with Nicira, Inc.
Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
I've had a few complaints that ovs-vswitchd logs its coverage counters
at WARN level, but this is mainly wrong: ovs-vswitchd only logs coverage
counters at WARN level when the "coverage/log" command is used through
ovs-appctl. This was even documented.
The reason to log at such a high level was to make it fairly certain that
these messages specifically requested by the admin would not be filtered
out before making it to the log. But it's even better if the admin just
gets the coverage counters as a reply to the ovs-appctl command. So that
is what this commit does.
This commit also improves the documentation of the ovs-appctl command.
Signed-off-by: Ben Pfaff <blp@nicira.com>
I'd always assumed that the exponentially weighted moving average code
here was sufficient rate-limiting, but I actually encountered a
pathological case some time ago that forced this rusage information to
print once a second or so, which seems too often.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Although we try to avoid it, some unit tests are necessarily
timing-sensitive. The new "time/stop" command that this commit adds should
help with that, by preventing time from advancing from the viewpoint of
the OVS "timeval" functions except when "time/warp" explicitly advances
the current time. This should allow the unit tests that need it to become
reproducible regardless of the speed at which the tests run.
This commit adds one use of "time/stop" to the unit test suite, in the one
timing-sensitive test of which I am currently aware.
Bug #9782.
Reported-by: Tim Chen <tchen@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The unixctl library had used the vde2 management protocol since the
early days of Open vSwitch. As Open vSwitch has matured, several
Python daemons have been added to the code base which would benefit
from a unixctl implementation. Instead of implementing the old
unixctl protocol in Python, this patch changes unixctl to use JSON
RPC for which we already have an implementation in both Python and
C. Future patches will need to implement a unixctl library in
Python on top of JSON RPC.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
An upcoming commit has a new use for the time at which OVS started up, so
this moves this functionality to a common location.
Signed-off-by: Ben Pfaff <blp@nicira.com>
This is a necessary prerequisite for allowing time to be "fast forwarded"
in unit tests, to keep tests that depend on the passage of time from
running in real time. Without this change, a code sequence like this:
    poll_timer_wait(1000);
    ...fast forward time 5 seconds...
    poll_block();
would still sleep for a second, because the poll_loop module would still
have a relative timeout of 1000 ms.
Signed-off-by: Ben Pfaff <blp@nicira.com>
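The difference between a relative timeout and an absolute deadline can be sketched with a simulated clock. All names below (fake_now_ms, the sketch_* functions) are hypothetical stand-ins and are not the real poll-loop or timeval interfaces.

```c
#include <limits.h>

static long long fake_now_ms;                     /* Simulated clock. */
static long long wakeup_deadline_ms = LLONG_MAX;  /* LLONG_MAX: no timer. */

/* Stand-in for "fast forwarding" time in a unit test. */
static void
sketch_time_warp(long long msec)
{
    fake_now_ms += msec;
}

/* Records the wakeup as an absolute deadline, computed once at
 * registration time, keeping the earliest deadline requested. */
static void
sketch_poll_timer_wait(long long msec)
{
    long long when = fake_now_ms + msec;

    if (when < wakeup_deadline_ms) {
        wakeup_deadline_ms = when;
    }
}

/* Returns how long a blocking poll would sleep, in milliseconds: -1 if
 * no timer is registered, 0 if the deadline has already passed. */
static long long
sketch_poll_timeout(void)
{
    long long remaining;

    if (wakeup_deadline_ms == LLONG_MAX) {
        return -1;
    }
    remaining = wakeup_deadline_ms - fake_now_ms;
    return remaining > 0 ? remaining : 0;
}
```

Because the deadline is absolute, warping the clock forward by 5 seconds after registering a 1000 ms timer leaves nothing to sleep for, which is the behavior the code sequence above requires.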
For a long time, the poll-loop module has had the ability to log the reason
for wakeups, which is valuable for debugging excessive use of CPU time.
But I have to ask users to turn up the log level for the module, which
wastes their time and mine. This commit improves the situation by
automatically logging the reason for a wakeup whenever a process's
estimated CPU usage rises above 50%. (ovs-vswitchd often uses less than
1% CPU; more than 5% CPU is uncommon.)
When poll interval-based logging was introduced a long time ago, we were
actively interested in looking at almost every long poll interval. But
these days, with OVS working rather well, with pretty good latency, most
of the messages are red herrings that bother some administrators and
provoke false reports. So this commit suppresses all but the most
egregious long poll intervals that may in fact be worth looking at.
NIC-366.
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined. A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.
Since the timeval module now initializes itself on-demand, there is no
longer any need to initialize it explicitly, or to provide an interface to
do so.
I don't see a reason that set_up_monotonic() should be separate from
time_init(). Doing all the time initialization in one place seems
reasonable, so this commit makes that change.
In glibc, "timer_t" is a "void *" that appears to point into malloc()'d
memory. By throwing it away entirely, we leak it, which makes valgrind
complain. We really don't ever care to use the timer object again, but
we can't destroy it without stopping the periodic timer. So make it
static to avoid a warning from Valgrind.
Most of the timekeeping needs of OVS are simply to measure intervals,
which means that it is sensitive to changes in the clock. This commit
replaces the existing clocks with monotonic timers. An additional set
of wall clock timers are added and used in locations that need absolute
time.
Bug #1858
This code triggers when a trip through the process's main loop takes much
longer than expected. The code for calculating the expected time rounds
down to a maximum of 10000 ms to avoid overflow. But there is no reason
that the correct time should not be displayed in the log message, and
furthermore displaying the correct time may help tracking down the
underlying issue, since it lets the administrator find out exactly when
the trip through the main loop started. So this commit displays the exact
time without rounding down.
Without removing SA_RESTART from the SIGALRM handler, the fcntl call will
never return, even after the signal handler is invoked and returns.
We haven't seen a problem in practice, at least not recently, but that's
probably just luck combined with not holding the configuration file lock
for very long.
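A minimal sketch of installing a SIGALRM handler without SA_RESTART, so that a blocking call interrupted by the signal fails with EINTR instead of being restarted. The handler and function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t alarm_fired;

/* Hypothetical handler: only records that the signal arrived. */
static void
sigalrm_handler(int sig)
{
    (void) sig;
    alarm_fired = 1;
}

/* Installs the handler with sa_flags == 0.  Leaving out SA_RESTART is
 * the point: an interrupted blocking call, such as a wait for a file
 * lock via fcntl(), then returns -1 with errno set to EINTR.
 * Returns 0 on success, -1 on error as sigaction() does. */
static int
install_sigalrm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = sigalrm_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGALRM, &sa, NULL);
}
```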
Open vSwitch uses an interval timer signal to tell it that its cached idea
of the current time has expired. However, this didn't work in a daemon
detached from the foreground session (invoked with --detach) because a
child created with fork() does not inherit the parent's interval timer and
we did not re-set it after calling fork().
This commit fixes the problem by setting the interval timer back up after
calling fork() from daemonize().
This fix is based on code inspection (which was then verified to be correct
through testing). It may not fix any actual problems in practice, because
time_refresh() is called every time through the poll loop, and the poll
loop typically runs more quickly than the periodic timer fires (1 ms or so
average in ovs-vswitchd, vs. 100 ms timer interval).
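A sketch of re-arming the interval timer in the child after fork(). The function names are hypothetical, and the 100 ms period is taken from the interval mentioned above. The caller is assumed to have a SIGALRM handler installed already, since the default action for SIGALRM terminates the process.

```c
#define _XOPEN_SOURCE 700
#include <stddef.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define TIMER_INTERVAL_MS 100   /* Period quoted in the message above. */

/* Hypothetical helper: arms a periodic ITIMER_REAL timer that delivers
 * SIGALRM every TIMER_INTERVAL_MS.  Returns 0 on success, -1 on error. */
static int
set_up_interval_timer(void)
{
    struct itimerval itimer;

    itimer.it_interval.tv_sec = 0;
    itimer.it_interval.tv_usec = TIMER_INTERVAL_MS * 1000;
    itimer.it_value = itimer.it_interval;
    return setitimer(ITIMER_REAL, &itimer, NULL);
}

/* Hypothetical helper: forks, then re-arms the timer in the child,
 * because interval timers are not inherited across fork().  Returns
 * what fork() returned. */
static pid_t
fork_and_rearm_timer(void)
{
    pid_t pid = fork();

    if (pid == 0) {
        set_up_interval_timer();
    }
    return pid;
}
```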
By default, many OVS processes keep track of how long each trip through
their poll loop takes. If a trip takes an unusually long time (measured
as some distance from the mean), the process logs the coverage
statistics it has been keeping. It was doing this at level WARN.
On Xen systems, syslog messages written at level INFO and higher are
written to /var/log/messages synchronously. As a result, dire messages
reporting that a trip through the loop took a few dozen milliseconds
could themselves take up to 6(!) seconds to write. During that time the
process would do no other processing, which could be quite serious in
the case of a process such as ovs-vswitchd.
This problem was somewhat masked because the time used by this logging
was not used in the calculations for determining how long it was taking
to get through the loop.
This commit lowers the default log level for those coverage messages to
INFO. On Xen systems, it raises the default level at which messages are
written to syslog to WARN.
Diagnosed and fixed with the help of Ian Campbell.
Previously, there was no way to induce coverage information to be
displayed; it would only print when the system noticed unusual delays
between polling intervals. Now, production of coverage logs can be
forced with the "coverage/log" command in ovs-appctl. Coverage counters
may be reset with "coverage/clear".