mirror of https://github.com/openvswitch/ovs synced 2025-08-22 18:07:40 +00:00
ovs/ovsdb/monitor.c

/*
 * Copyright (c) 2015, 2017 Nicira, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at:
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
#include <config.h>
#include <errno.h>
#include "bitmap.h"
#include "column.h"
#include "openvswitch/dynamic-string.h"
#include "openvswitch/json.h"
#include "jsonrpc.h"
#include "ovsdb-error.h"
#include "ovsdb-parser.h"
#include "ovsdb.h"
#include "row.h"
#include "condition.h"
#include "simap.h"
#include "hash.h"
#include "table.h"
#include "timeval.h"
#include "transaction.h"
#include "jsonrpc-server.h"
#include "monitor.h"
#include "util.h"
#include "openvswitch/vlog.h"
VLOG_DEFINE_THIS_MODULE(ovsdb_monitor);
static struct hmap ovsdb_monitors = HMAP_INITIALIZER(&ovsdb_monitors);
/* Keeps the state of a session's conditions. */
struct ovsdb_monitor_session_condition {
    bool conditional;        /* True iff at least one table's condition is
                              * not trivially true. */
    struct shash tables;     /* Contains
                              *   "struct ovsdb_monitor_table_condition *"s. */
};

/* Monitored table session's conditions */
struct ovsdb_monitor_table_condition {
    const struct ovsdb_table *table;
    struct ovsdb_monitor_table *mt;
    struct ovsdb_condition old_condition;
    struct ovsdb_condition new_condition;
    /* Condition composed from difference between clauses in old and new.
     * Note: Empty diff condition doesn't mean that old == new. */
    struct ovsdb_condition diff_condition;
};

/* Backend monitor.
 *
 * ovsdb_monitor keeps track of the ovsdb changes.
 */

/* A collection of tables being monitored. */
struct ovsdb_monitor {
    struct ovs_list list_node;  /* In struct ovsdb's "monitors" list. */
    struct shash tables;        /* Holds "struct ovsdb_monitor_table"s. */
    struct ovs_list jsonrpc_monitors;  /* Contains "jsonrpc_monitor_node"s. */
    struct ovsdb *db;

    /* Contains "ovsdb_monitor_change_set". Each change set contains changes
     * from some start point up to the latest committed transaction. There can
     * be different change sets for the same struct ovsdb_monitor because there
     * are different clients pending on changes starting from different points.
     * The different change sets are maintained as a list. */
    struct ovs_list change_sets;

    /* The new change set that is to be populated for future transactions. */
    struct ovsdb_monitor_change_set *new_change_set;

    /* The change set that starts from the first transaction of the DB, which
     * is used for populating the initial data for new clients. */
    struct ovsdb_monitor_change_set *init_change_set;

    struct hmap_node hmap_node; /* Elements within ovsdb_monitors. */
    struct hmap json_cache;     /* Contains "ovsdb_monitor_json_cache_node"s.*/
};

/* A json object of updates for the ovsdb_monitor_change_set and the given
 * monitor version. */
struct ovsdb_monitor_json_cache_node {
    struct hmap_node hmap_node;   /* Elements in json cache. */
    enum ovsdb_monitor_version version;
    struct uuid change_set_uuid;
    struct json *json;            /* Null, or a clone of the json. */
};

struct jsonrpc_monitor_node {
    struct ovs_list node;
    struct ovsdb_jsonrpc_monitor *jsonrpc_monitor;
};

/* A particular column being monitored. */
struct ovsdb_monitor_column {
    const struct ovsdb_column *column;
    enum ovsdb_monitor_selection select;
    bool monitored;
};

/* A row that has changed in a monitored table. */
struct ovsdb_monitor_row {
    struct hmap_node hmap_node; /* In ovsdb_monitor_change_set_for_table. */
    struct uuid uuid;           /* UUID of row that changed. */
    struct ovsdb_datum *old;    /* Old data, NULL for an inserted row. */
    struct ovsdb_datum *new;    /* New data, NULL for a deleted row. */
};

/* Contains a set of changes that are not yet flushed to all the jsonrpc
 * connections.
 *
 * 'n_refs' represents the number of jsonrpc connections that depend on this
 * change set (have not received updates). Generating the update for the last
 * jsonrpc connection will also destroy the whole "struct
 * ovsdb_monitor_change_set" object.
 */
struct ovsdb_monitor_change_set {
    /* Element in change_sets of ovsdb_monitor. */
    struct ovs_list list_node;

    /* Internally generated uuid that identifies this data structure. */
    struct uuid uuid;

    /* Contains struct ovsdb_monitor_change_set_for_table. */
    struct ovs_list change_set_for_tables;

    int n_refs;

    /* The previous txn id before this change set's start point. */
    struct uuid prev_txn;
};
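
/* Illustration only, not part of monitor.c: a minimal sketch of the 'n_refs'
 * lifetime rule described in the comment on struct ovsdb_monitor_change_set,
 * assuming each jsonrpc connection takes one reference and drops it once its
 * update has been generated.  All demo_* names are hypothetical. */

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ovsdb_monitor_change_set. */
struct demo_change_set {
    int n_refs;                 /* Connections still waiting on this set. */
};

static struct demo_change_set *
demo_change_set_create(void)
{
    struct demo_change_set *cs = calloc(1, sizeof *cs);

    if (!cs) {
        abort();
    }
    return cs;
}

/* A connection starts depending on 'cs'. */
static void
demo_change_set_ref(struct demo_change_set *cs)
{
    cs->n_refs++;
}

/* A connection has received its update.  Frees 'cs' and returns true when
 * this was the last connection depending on it. */
static bool
demo_change_set_unref(struct demo_change_set *cs)
{
    if (--cs->n_refs == 0) {
        free(cs);
        return true;
    }
    return false;
}
```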

/* Contains 'struct ovsdb_monitor_row's for rows in a specific table
 * of struct ovsdb_monitor_change_set. It can also be searched from
 * member 'change_sets' of struct ovsdb_monitor_table. */
struct ovsdb_monitor_change_set_for_table {
    /* Element in ovsdb_monitor_tables' change_sets list. */
    struct ovs_list list_in_mt;

    /* Element in ovsdb_monitor_change_sets' change_set_for_tables list. */
    struct ovs_list list_in_change_set;

    struct ovsdb_monitor_table *mt;
    struct ovsdb_monitor_change_set *mcs;

    /* Contains struct ovsdb_monitor_row. */
    struct hmap rows;

    /* Save the mt->n_columns that is used when creating the changes.
     * It can be different from the current mt->n_columns because
     * mt->n_columns can be increased when there are condition changes
     * from any of the clients sharing the dbmon. */
    size_t n_columns;
};
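
/* Illustration only, not part of monitor.c: a minimal sketch of why the
 * change set keeps its own 'n_columns'.  It assumes row data is an array with
 * one separately allocated cell per column, sized when the row is recorded.
 * Freeing with the saved count stays correct even after the table's
 * 'n_columns' has grown.  All demo_* names are hypothetical. */

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct ovsdb_monitor_table. */
struct demo_table {
    size_t n_columns;           /* May grow after changes were recorded. */
};

/* Hypothetical stand-in for a change set holding a single row. */
struct demo_changes {
    size_t n_columns;           /* Snapshot taken when the row was cloned. */
    int **cells;                /* 'n_columns' separately allocated cells. */
};

static struct demo_changes *
demo_changes_create(const struct demo_table *mt)
{
    struct demo_changes *changes = malloc(sizeof *changes);

    if (!changes) {
        abort();
    }
    changes->n_columns = mt->n_columns;
    changes->cells = calloc(changes->n_columns, sizeof *changes->cells);
    if (!changes->cells && changes->n_columns) {
        abort();
    }
    for (size_t i = 0; i < changes->n_columns; i++) {
        changes->cells[i] = malloc(sizeof **changes->cells);
        if (!changes->cells[i]) {
            abort();
        }
        *changes->cells[i] = 0;
    }
    return changes;
}

/* Frees 'changes' using the saved column count, never the table's current
 * one.  Returns the number of cells freed. */
static size_t
demo_changes_destroy(struct demo_changes *changes)
{
    size_t n = changes->n_columns;

    for (size_t i = 0; i < n; i++) {
        free(changes->cells[i]);
    }
    free(changes->cells);
    free(changes);
    return n;
}
```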

/* A particular table being monitored. */
struct ovsdb_monitor_table {
    const struct ovsdb_table *table;

    /* This is the union (bitwise-OR) of the 'select' values in all of the
     * members of 'columns' below. */
    enum ovsdb_monitor_selection select;

    /* Columns being monitored. */
    struct ovsdb_monitor_column *columns;
    size_t n_columns;
    size_t n_monitored_columns;
    size_t allocated_columns;

    /* Columns in ovsdb_monitor_row have different indexes than in
     * ovsdb_row. This field maps column->index to the index in the
     * ovsdb_monitor_row. It is used for condition evaluation. */
    unsigned int *columns_index_map;

    /* Contains 'ovsdb_monitor_change_set_for_table'. */
    struct ovs_list change_sets;
};

enum ovsdb_monitor_row_type {
    OVSDB_ROW,
    OVSDB_MONITOR_ROW
};

typedef struct json *
(*compose_row_update_cb_func)
    (const struct ovsdb_monitor_table *mt,
     const struct ovsdb_monitor_session_condition *condition,
     enum ovsdb_monitor_row_type row_type,
     const void *,
     bool initial, unsigned long int *changed,
     size_t n_columns);

static void ovsdb_monitor_destroy(struct ovsdb_monitor *);
static struct ovsdb_monitor_change_set * ovsdb_monitor_add_change_set(
        struct ovsdb_monitor *, bool init_only, const struct uuid *prev_txn);
static struct ovsdb_monitor_change_set * ovsdb_monitor_find_change_set(
        const struct ovsdb_monitor *, const struct uuid *prev_txn);
static void ovsdb_monitor_change_set_destroy(
        struct ovsdb_monitor_change_set *);
static void ovsdb_monitor_track_new_change_set(struct ovsdb_monitor *);

static uint32_t
json_cache_hash(enum ovsdb_monitor_version version,
                struct ovsdb_monitor_change_set *change_set)
{
    return hash_uint64_basis(version, uuid_hash(&change_set->uuid));
}

static struct ovsdb_monitor_json_cache_node *
ovsdb_monitor_json_cache_search(const struct ovsdb_monitor *dbmon,
                                enum ovsdb_monitor_version version,
                                struct ovsdb_monitor_change_set *change_set)
{
    struct ovsdb_monitor_json_cache_node *node;
    uint32_t hash = json_cache_hash(version, change_set);

    HMAP_FOR_EACH_WITH_HASH(node, hmap_node, hash, &dbmon->json_cache) {
        if (uuid_equals(&node->change_set_uuid, &change_set->uuid) &&
            node->version == version) {
            return node;
        }
    }

    return NULL;
}

static void
ovsdb_monitor_json_cache_insert(struct ovsdb_monitor *dbmon,
                                enum ovsdb_monitor_version version,
                                struct ovsdb_monitor_change_set *change_set,
                                struct json *json)
{
    struct ovsdb_monitor_json_cache_node *node;
    uint32_t hash = json_cache_hash(version, change_set);

    node = xmalloc(sizeof *node);
    node->version = version;
    node->change_set_uuid = change_set->uuid;
    node->json = json ? json_clone(json) : NULL;

    hmap_insert(&dbmon->json_cache, &node->hmap_node, hash);
}
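
/* Illustration only, not part of monitor.c: a minimal sketch of a cache keyed
 * by the pair (version, change set uuid), as the search and insert functions
 * above are.  It swaps the hmap for a small fixed array with linear search to
 * stay self-contained.  It also shows that an entry whose json is NULL is
 * still a cache hit, distinct from a miss.  All demo_* names are
 * hypothetical. */

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CACHE_SIZE 8

struct demo_uuid {
    uint32_t parts[4];
};

struct demo_cache_entry {
    bool used;
    int version;                      /* First half of the key. */
    struct demo_uuid change_set_uuid; /* Second half of the key. */
    const char *json;                 /* Cached value, may be NULL. */
};

struct demo_cache {
    struct demo_cache_entry entries[DEMO_CACHE_SIZE];
};

/* Returns the entry matching both key halves, or NULL on a miss. */
static struct demo_cache_entry *
demo_cache_search(struct demo_cache *cache, int version,
                  const struct demo_uuid *uuid)
{
    for (size_t i = 0; i < DEMO_CACHE_SIZE; i++) {
        struct demo_cache_entry *e = &cache->entries[i];

        if (e->used && e->version == version
            && !memcmp(&e->change_set_uuid, uuid, sizeof *uuid)) {
            return e;
        }
    }
    return NULL;
}

/* Stores 'json' (possibly NULL) under the key.  Returns false if full. */
static bool
demo_cache_insert(struct demo_cache *cache, int version,
                  const struct demo_uuid *uuid, const char *json)
{
    for (size_t i = 0; i < DEMO_CACHE_SIZE; i++) {
        struct demo_cache_entry *e = &cache->entries[i];

        if (!e->used) {
            e->used = true;
            e->version = version;
            e->change_set_uuid = *uuid;
            e->json = json;
            return true;
        }
    }
    return false;
}
```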

static void
ovsdb_monitor_json_cache_flush(struct ovsdb_monitor *dbmon)
{
    struct ovsdb_monitor_json_cache_node *node;

    HMAP_FOR_EACH_POP(node, hmap_node, &dbmon->json_cache) {
        json_destroy(node->json);
        free(node);
    }
}

/* Free all versions of json cache for a given change_set. */
static void
ovsdb_monitor_json_cache_destroy(struct ovsdb_monitor *dbmon,
                                 struct ovsdb_monitor_change_set *change_set)
{
    enum ovsdb_monitor_version v;

    for (v = OVSDB_MONITOR_V1; v < OVSDB_MONITOR_VERSION_MAX; v++) {
        struct ovsdb_monitor_json_cache_node *node
            = ovsdb_monitor_json_cache_search(dbmon, v, change_set);

        if (node) {
            hmap_remove(&dbmon->json_cache, &node->hmap_node);
            json_destroy(node->json);
            free(node);
        }
    }
}

static int
compare_ovsdb_monitor_column(const void *a_, const void *b_)
{
    const struct ovsdb_monitor_column *a = a_;
    const struct ovsdb_monitor_column *b = b_;

    /* Put all monitored columns at the beginning. */
    if (a->monitored != b->monitored) {
        return a->monitored ? -1 : 1;
    }

    return a->column < b->column ? -1 : a->column > b->column;
}
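
/* Illustration only, not part of monitor.c: a minimal sketch of how a
 * comparator of this shape behaves under qsort().  Monitored columns end up
 * as a prefix of the array, and ties are broken by column address.  The
 * sketch assumes all column pointers refer to elements of one array so that
 * the pointer comparison is well defined.  All demo_* names are
 * hypothetical. */

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ovsdb_monitor_column. */
struct demo_column {
    const int *column;          /* Points into a single schema array. */
    bool monitored;
};

static int
demo_compare_column(const void *a_, const void *b_)
{
    const struct demo_column *a = a_;
    const struct demo_column *b = b_;

    /* Monitored columns sort before condition-only columns. */
    if (a->monitored != b->monitored) {
        return a->monitored ? -1 : 1;
    }

    return a->column < b->column ? -1 : a->column > b->column;
}

/* Sorts 'cols' and returns the length of the monitored prefix. */
static size_t
demo_sort_columns(struct demo_column *cols, size_t n)
{
    size_t n_monitored = 0;

    qsort(cols, n, sizeof *cols, demo_compare_column);
    while (n_monitored < n && cols[n_monitored].monitored) {
        n_monitored++;
    }
    return n_monitored;
}
```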

/* Finds and returns the ovsdb_monitor_row in 'changes->rows' for the
 * given 'uuid', or NULL if there is no such row. */
static struct ovsdb_monitor_row *
ovsdb_monitor_changes_row_find(
    const struct ovsdb_monitor_change_set_for_table *changes,
    const struct uuid *uuid)
{
    struct ovsdb_monitor_row *row;

    HMAP_FOR_EACH_WITH_HASH (row, hmap_node, uuid_hash(uuid),
                             &changes->rows) {
        if (uuid_equals(uuid, &row->uuid)) {
            return row;
        }
    }
    return NULL;
}

/* Allocates an array of 'n_columns' ovsdb_datums and initializes them as
 * copies of the data in 'row' drawn from the columns represented by
 * mt->columns[]. Returns the array.
 *
 * If 'row' is NULL, returns NULL. */
static struct ovsdb_datum *
clone_monitor_row_data(const struct ovsdb_monitor_table *mt,
const struct ovsdb_row *row,
size_t n_columns)
{
struct ovsdb_datum *data;
size_t i;
if (!row) {
return NULL;
}
data = xmalloc(n_columns * sizeof *data);
for (i = 0; i < n_columns; i++) {
const struct ovsdb_column *c = mt->columns[i].column;
const struct ovsdb_datum *src = &row->fields[c->index];
struct ovsdb_datum *dst = &data[i];
ovsdb_datum_clone(dst, src);
}
return data;
}
/* Replaces the 'n_columns' ovsdb_datums in data[] with copies of the data
* in 'row' drawn from the columns represented by mt->columns[]. */
static void
update_monitor_row_data(const struct ovsdb_monitor_table *mt,
const struct ovsdb_row *row,
struct ovsdb_datum *data,
size_t n_columns)
{
size_t i;
for (i = 0; i < n_columns; i++) {
const struct ovsdb_column *c = mt->columns[i].column;
const struct ovsdb_datum *src = &row->fields[c->index];
struct ovsdb_datum *dst = &data[i];
const struct ovsdb_type *type = &c->type;
if (!ovsdb_datum_equals(src, dst, type)) {
ovsdb_datum_destroy(dst, type);
ovsdb_datum_clone(dst, src);
}
}
}
/* Frees all of the n_columns ovsdb_datums in data[], using the types taken
* from mt->columns[], plus 'data' itself. */
static void
free_monitor_row_data(const struct ovsdb_monitor_table *mt,
struct ovsdb_datum *data,
size_t n_columns)
{
if (data) {
size_t i;
for (i = 0; i < n_columns; i++) {
const struct ovsdb_column *c = mt->columns[i].column;
ovsdb_datum_destroy(&data[i], &c->type);
}
free(data);
}
}
/* Frees 'row', which must have been created from 'mt'. */
static void
ovsdb_monitor_row_destroy(const struct ovsdb_monitor_table *mt,
struct ovsdb_monitor_row *row,
size_t n_columns)
{
if (row) {
free_monitor_row_data(mt, row->old, n_columns);
free_monitor_row_data(mt, row->new, n_columns);
free(row);
}
}
static void
ovsdb_monitor_columns_sort(struct ovsdb_monitor *dbmon)
{
int i;
struct shash_node *node;
SHASH_FOR_EACH (node, &dbmon->tables) {
struct ovsdb_monitor_table *mt = node->data;
if (mt->n_columns == 0) {
continue;
}
qsort(mt->columns, mt->n_columns, sizeof *mt->columns,
compare_ovsdb_monitor_column);
for (i = 0; i < mt->n_columns; i++) {
/* Rebuild the index map, since sorting reordered the columns. */
mt->columns_index_map[mt->columns[i].column->index] = i;
}
}
}
void
ovsdb_monitor_add_jsonrpc_monitor(struct ovsdb_monitor *dbmon,
struct ovsdb_jsonrpc_monitor *jsonrpc_monitor)
{
struct jsonrpc_monitor_node *jm;
jm = xzalloc(sizeof *jm);
jm->jsonrpc_monitor = jsonrpc_monitor;
ovs_list_push_back(&dbmon->jsonrpc_monitors, &jm->node);
}
struct ovsdb_monitor *
ovsdb_monitor_create(struct ovsdb *db,
struct ovsdb_jsonrpc_monitor *jsonrpc_monitor)
{
struct ovsdb_monitor *dbmon;
dbmon = xzalloc(sizeof *dbmon);
ovs_list_push_back(&db->monitors, &dbmon->list_node);
ovs_list_init(&dbmon->jsonrpc_monitors);
dbmon->db = db;
ovs_list_init(&dbmon->change_sets);
shash_init(&dbmon->tables);
hmap_node_nullify(&dbmon->hmap_node);
hmap_init(&dbmon->json_cache);
ovsdb_monitor_add_jsonrpc_monitor(dbmon, jsonrpc_monitor);
return dbmon;
}
void
ovsdb_monitor_add_table(struct ovsdb_monitor *m,
const struct ovsdb_table *table)
{
struct ovsdb_monitor_table *mt;
int i;
size_t n_columns = shash_count(&table->schema->columns);
mt = xzalloc(sizeof *mt);
mt->table = table;
shash_add(&m->tables, table->schema->name, mt);
ovs_list_init(&mt->change_sets);
mt->columns_index_map =
xmalloc(sizeof *mt->columns_index_map * n_columns);
for (i = 0; i < n_columns; i++) {
mt->columns_index_map[i] = -1;
}
}
const char *
ovsdb_monitor_add_column(struct ovsdb_monitor *dbmon,
const struct ovsdb_table *table,
const struct ovsdb_column *column,
enum ovsdb_monitor_selection select,
bool monitored)
{
ovsdb: monitor: Destroy initial change set when new columns added. Initial change set is preserved for as long as the monitor itself. However, if a new client has a condition on a column that is not one of the monitored columns, this column will be added to the monitor via ovsdb_monitor_condition_bind(). This new column, however, doesn't exist in the initial change set. That will cause ovsdb-server to malfunction or crash trying to access non-existent column during condition evaluation: ERROR: AddressSanitizer: heap-buffer-overflow READ of size 4 at 0x606000006780 thread T0 0 ovsdb_clause_evaluate ovsdb/condition.c:328:26 1 ovsdb_condition_match_any_clause ovsdb/condition.c:441:13 2 ovsdb_condition_empty_or_match_any ovsdb/condition.h:84:13 3 ovsdb_monitor_row_update_type_condition ovsdb/monitor.c:892:28 4 ovsdb_monitor_compose_row_update2 ovsdb/monitor.c:1058:12 5 ovsdb_monitor_compose_update ovsdb/monitor.c:1172:24 6 ovsdb_monitor_get_update ovsdb/monitor.c:1276:24 7 ovsdb_jsonrpc_monitor_create ovsdb/jsonrpc-server.c:1505:12 8 ovsdb_jsonrpc_session_got_request ovsdb/jsonrpc-server.c:1030:21 9 ovsdb_jsonrpc_session_run ovsdb/jsonrpc-server.c:572:17 10 ovsdb_jsonrpc_session_run_all ovsdb/jsonrpc-server.c:602:21 11 ovsdb_jsonrpc_server_run ovsdb/jsonrpc-server.c:417:9 12 main_loop ovsdb/ovsdb-server.c:222:9 13 main ovsdb/ovsdb-server.c:500:5 14 __libc_start_call_main 15 __libc_start_main@GLIBC_2.2.5 16 _start (ovsdb/ovsdb-server+0x473034) Located 0 bytes after 64-byte region [0x606000006740,0x606000006780) allocated by thread T0 here: 0 malloc (ovsdb/ovsdb-server+0x50dc82) 1 xmalloc__ lib/util.c:140:15 2 xmalloc lib/util.c:175:12 3 clone_monitor_row_data ovsdb/monitor.c:336:12 4 ovsdb_monitor_changes_update ovsdb/monitor.c:1384:23 5 ovsdb_monitor_get_initial ovsdb/monitor.c:1535:21 6 ovsdb_jsonrpc_monitor_create ovsdb/jsonrpc-server.c:1502:9 7 ovsdb_jsonrpc_session_got_request ovsdb/jsonrpc-server.c:1030:21 8 ovsdb_jsonrpc_session_run ovsdb/jsonrpc-server.c:572:17 
9 ovsdb_jsonrpc_session_run_all ovsdb/jsonrpc-server.c:602:21 10 ovsdb_jsonrpc_server_run ovsdb/jsonrpc-server.c:417:9 11 main_loop ovsdb/ovsdb-server.c:222:9 12 main ovsdb/ovsdb-server.c:500:5 13 __libc_start_call_main 14 __libc_start_main@GLIBC_2.2.5 15 _start (ovsdb/ovsdb-server+0x473034) Fix that by destroying the initial change set every time new columns are added to the monitor. This will trigger re-generation of the change set and it will contain all the necessary columns afterwards. Fixes: 07c27226ee96 ("ovsdb: Monitor: Keep and maintain the initial change set.") Reported-by: Han Zhou <hzhou@ovn.org> Acked-by: Han Zhou <hzhou@ovn.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-06-07 15:08:32 +02:00
struct ovsdb_monitor_change_set *mcs;
struct ovsdb_monitor_table *mt;
struct ovsdb_monitor_column *c;
mt = shash_find_data(&dbmon->tables, table->schema->name);
/* Check for column duplication. Return duplicated column name. */
if (mt->columns_index_map[column->index] != -1) {
return column->name;
}
mcs = dbmon->init_change_set;
if (mcs) {
/* A new column is going to be added to the monitor. Existing
* initial change set doesn't have it, so can no longer be used.
* Initial change set is never used by more than one session at
* the same time, so it's safe to destroy it here. */
ovs_assert(mcs->n_refs == 1);
ovsdb_monitor_json_cache_destroy(dbmon, mcs);
ovsdb_monitor_change_set_destroy(mcs);
dbmon->init_change_set = NULL;
}
if (mt->n_columns >= mt->allocated_columns) {
mt->columns = x2nrealloc(mt->columns, &mt->allocated_columns,
sizeof *mt->columns);
}
mt->select |= select;
mt->columns_index_map[column->index] = mt->n_columns;
c = &mt->columns[mt->n_columns++];
c->column = column;
c->select = select;
c->monitored = monitored;
if (monitored) {
mt->n_monitored_columns++;
}
return NULL;
}
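The invalidation step in ovsdb_monitor_add_column() can be illustrated in isolation. This is a minimal sketch under stated assumptions, not the OVS implementation: `sketch_monitor` and both functions are hypothetical, and an `int` array stands in for the initial change set. The point is only that a cached snapshot built for N columns is dropped before the column count grows, so the next request rebuilds it at the new width.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ovsdb_monitor. */
struct sketch_monitor {
    size_t n_columns;       /* Columns currently tracked. */
    int *init_snapshot;     /* Cached initial data, or NULL if not built. */
    size_t snapshot_width;  /* Column count the cache was built with. */
};

/* Returns the cached initial snapshot, building it on demand. */
static int *
sketch_get_initial(struct sketch_monitor *m)
{
    if (!m->init_snapshot) {
        m->snapshot_width = m->n_columns;
        /* One extra element avoids a zero-sized allocation. */
        m->init_snapshot = calloc(m->snapshot_width + 1,
                                  sizeof *m->init_snapshot);
        assert(m->init_snapshot);
    }
    return m->init_snapshot;
}

/* Adds a column.  Any cached snapshot lacks the new column, so it is
 * discarded here and regenerated by the next sketch_get_initial(). */
static void
sketch_add_column(struct sketch_monitor *m)
{
    free(m->init_snapshot);
    m->init_snapshot = NULL;
    m->n_columns++;
}
```

Discarding the cache eagerly keeps the invariant simple: a snapshot that exists always has as many columns as the monitor does. The real code also asserts that the initial change set has exactly one reference before destroying it, which this sketch does not model.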
static void
ovsdb_monitor_condition_add_columns(struct ovsdb_monitor *dbmon,
const struct ovsdb_table *table,
struct ovsdb_condition *condition)
{
size_t n_columns;
size_t i;
const struct ovsdb_column **columns =
ovsdb_condition_get_columns(condition, &n_columns);
for (i = 0; i < n_columns; i++) {
ovsdb_monitor_add_column(dbmon, table, columns[i],
OJMS_NONE, false);
}
free(columns);
}
/* Binds this session's condition to 'dbmon' and adds every column that the
* condition references to the monitor. */
void
ovsdb_monitor_condition_bind(struct ovsdb_monitor *dbmon,
struct ovsdb_monitor_session_condition *cond)
{
struct shash_node *node;
SHASH_FOR_EACH(node, &cond->tables) {
struct ovsdb_monitor_table_condition *mtc = node->data;
struct ovsdb_monitor_table *mt =
shash_find_data(&dbmon->tables, mtc->table->schema->name);
mtc->mt = mt;
ovsdb_monitor_condition_add_columns(dbmon, mtc->table,
&mtc->new_condition);
}
}
bool
ovsdb_monitor_table_exists(struct ovsdb_monitor *m,
const struct ovsdb_table *table)
{
return shash_find_data(&m->tables, table->schema->name);
}
static struct ovsdb_monitor_change_set *
ovsdb_monitor_add_change_set(struct ovsdb_monitor *dbmon,
bool init_only, const struct uuid *prev_txn)
{
struct ovsdb_monitor_change_set *change_set = xzalloc(sizeof *change_set);
change_set->uuid = uuid_random();
ovs_list_push_back(&(dbmon->change_sets), &change_set->list_node);
ovs_list_init(&change_set->change_set_for_tables);
change_set->n_refs = 1;
change_set->prev_txn = prev_txn ? *prev_txn : UUID_ZERO;
struct shash_node *node;
SHASH_FOR_EACH (node, &dbmon->tables) {
struct ovsdb_monitor_table *mt = node->data;
if (!init_only || (mt->select & OJMS_INITIAL)) {
struct ovsdb_monitor_change_set_for_table *mcst =
xzalloc(sizeof *mcst);
mcst->mt = mt;
mcst->n_columns = mt->n_columns;
mcst->mcs = change_set;
hmap_init(&mcst->rows);
ovs_list_push_back(&mt->change_sets, &mcst->list_in_mt);
ovs_list_push_back(&change_set->change_set_for_tables,
&mcst->list_in_change_set);
}
}
return change_set;
}
static struct ovsdb_monitor_change_set *
ovsdb_monitor_find_change_set(const struct ovsdb_monitor *dbmon,
const struct uuid *prev_txn)
{
struct ovsdb_monitor_change_set *cs;
LIST_FOR_EACH (cs, list_node, &dbmon->change_sets) {
if (uuid_equals(&cs->prev_txn, prev_txn)) {
/* Check n_columns for each table in dbmon, in case it is changed
* after the change set is populated. */
bool n_col_is_equal = true;
struct ovsdb_monitor_change_set_for_table *mcst;
LIST_FOR_EACH (mcst, list_in_change_set,
&cs->change_set_for_tables) {
struct ovsdb_monitor_table *mt = mcst->mt;
if (mt->n_columns != mcst->n_columns) {
n_col_is_equal = false;
break;
}
}
if (n_col_is_equal) {
return cs;
}
}
}
return NULL;
}
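/* Illustrative sketch only, not part of the OVS sources.  It restates the
* lookup rule used by ovsdb_monitor_find_change_set(): a change set is reused
* only if its starting transaction matches and every per-table snapshot still
* has the same column count as its table.  All 'sketch_*' names are
* hypothetical, and an unsigned int stands in for 'struct uuid'. */

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a monitored table. */
struct sketch_table {
    size_t n_columns;               /* Current number of monitored columns. */
};

/* Hypothetical stand-in for a per-table part of a change set. */
struct sketch_snapshot {
    const struct sketch_table *table;
    size_t n_columns;               /* Column count when it was populated. */
};

struct sketch_change_set {
    unsigned int prev_txn;          /* Stand-in for the transaction UUID. */
    const struct sketch_snapshot *snapshots;
    size_t n_snapshots;
};

/* Returns the first change set that starts at 'prev_txn' and whose snapshots
 * all still match their tables' column counts, or NULL if there is none. */
static const struct sketch_change_set *
sketch_find_change_set(const struct sketch_change_set *sets, size_t n_sets,
                       unsigned int prev_txn)
{
    for (size_t i = 0; i < n_sets; i++) {
        const struct sketch_change_set *cs = &sets[i];
        if (cs->prev_txn != prev_txn) {
            continue;
        }
        bool usable = true;
        for (size_t j = 0; j < cs->n_snapshots; j++) {
            const struct sketch_snapshot *s = &cs->snapshots[j];
            if (s->table->n_columns != s->n_columns) {
                usable = false;     /* Columns were added since then. */
                break;
            }
        }
        if (usable) {
            return cs;
        }
    }
    return NULL;
}
```

/* In this sketch a stale set is skipped rather than repaired, so the caller
* is expected to build a fresh change set when NULL is returned. */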
static void
ovsdb_monitor_untrack_change_set(struct ovsdb_monitor *dbmon,
struct ovsdb_monitor_change_set *mcs)
{
ovs_assert(mcs);
if (--mcs->n_refs == 0) {
if (mcs == dbmon->init_change_set) {
/* The initial change set should exist as long as the
* monitor doesn't change. */
mcs->n_refs++;
return;
} else if (mcs == dbmon->new_change_set) {
dbmon->new_change_set = NULL;
}
ovsdb_monitor_json_cache_destroy(dbmon, mcs);
ovsdb_monitor_change_set_destroy(mcs);
}
}
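/* Illustrative sketch only, not part of the OVS sources.  It models the
* reference-count rule in ovsdb_monitor_untrack_change_set(): when the count
* reaches zero, the initial change set is pinned by restoring one reference,
* and any other set is released.  All 'sketch_*' names are hypothetical, and
* destruction is reduced to a boolean result. */

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_set {
    int n_refs;
};

struct sketch_monitor {
    struct sketch_set *init_set;    /* Kept alive with the monitor. */
    struct sketch_set *new_set;     /* Collects not yet sent changes. */
};

/* Drops one reference to 'set'.  Returns true if the caller should now
 * destroy 'set', false if it must stay alive. */
static bool
sketch_untrack(struct sketch_monitor *m, struct sketch_set *set)
{
    if (--set->n_refs != 0) {
        return false;               /* Still referenced elsewhere. */
    }
    if (set == m->init_set) {
        set->n_refs++;              /* Pin the initial set. */
        return false;
    }
    if (set == m->new_set) {
        m->new_set = NULL;          /* Avoid a dangling pointer. */
    }
    return true;
}
```

/* Under these assumptions the initial set never drops below one reference,
* so later monitor requests can reuse it instead of rebuilding it. */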
static void
ovsdb_monitor_track_new_change_set(struct ovsdb_monitor *dbmon)
{
struct ovsdb_monitor_change_set *change_set = dbmon->new_change_set;
if (change_set) {
change_set->n_refs++;
} else {
change_set = ovsdb_monitor_add_change_set(dbmon, false,
ovsdb_monitor_get_last_txnid(dbmon));
dbmon->new_change_set = change_set;
}
}
static void
ovsdb_monitor_change_set_destroy(struct ovsdb_monitor_change_set *mcs)
{
ovs_list_remove(&mcs->list_node);
struct ovsdb_monitor_change_set_for_table *mcst;
LIST_FOR_EACH_SAFE (mcst, list_in_change_set,
&mcs->change_set_for_tables) {
ovs_list_remove(&mcst->list_in_change_set);
ovs_list_remove(&mcst->list_in_mt);
struct ovsdb_monitor_row *row;
HMAP_FOR_EACH_SAFE (row, hmap_node, &mcst->rows) {
hmap_remove(&mcst->rows, &row->hmap_node);
ovsdb_monitor_row_destroy(mcst->mt, row, mcst->n_columns);
}
hmap_destroy(&mcst->rows);
free(mcst);
}
free(mcs);
}
static enum ovsdb_monitor_selection
ovsdb_monitor_row_update_type(bool initial, const bool old, const bool new)
{
return initial ? OJMS_INITIAL
: !old ? OJMS_INSERT
: !new ? OJMS_DELETE
: OJMS_MODIFY;
}
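/* Illustrative sketch only, not part of the OVS sources.  It spells out the
* decision order of the chained conditional above: 'initial' wins first, a
* missing old row means insert, a missing new row means delete, and anything
* else is a modification.  The 'SKETCH_*' flags are hypothetical stand-ins
* for the OJMS_* values, which are defined elsewhere and may differ. */

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the OJMS_* selection flags. */
enum sketch_selection {
    SKETCH_INITIAL = 1 << 0,
    SKETCH_INSERT = 1 << 1,
    SKETCH_DELETE = 1 << 2,
    SKETCH_MODIFY = 1 << 3
};

/* Classifies a row update from whether this is the initial dump and whether
 * the old and new versions of the row exist. */
static enum sketch_selection
sketch_row_update_type(bool initial, bool has_old, bool has_new)
{
    if (initial) {
        return SKETCH_INITIAL;
    } else if (!has_old) {
        return SKETCH_INSERT;
    } else if (!has_new) {
        return SKETCH_DELETE;
    }
    return SKETCH_MODIFY;
}
```

/* Note that 'initial' takes precedence, so in this sketch a row seen during
* the initial dump is never reported as an insert. */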
/* Enables conditional monitoring mode only if at least one table has a
* condition that is not trivially true. */
static inline void
ovsdb_monitor_session_condition_set_mode(
struct ovsdb_monitor_session_condition *cond)
{
struct shash_node *node;
SHASH_FOR_EACH (node, &cond->tables) {
struct ovsdb_monitor_table_condition *mtc = node->data;
if (!ovsdb_condition_is_true(&mtc->new_condition)) {
cond->conditional = true;
return;
}
}
cond->conditional = false;
}
/* Returns a newly allocated, empty holder for a session's condition state. */
struct ovsdb_monitor_session_condition *
ovsdb_monitor_session_condition_create(void)
{
struct ovsdb_monitor_session_condition *condition =
xzalloc(sizeof *condition);
condition->conditional = false;
shash_init(&condition->tables);
return condition;
}
void
ovsdb_monitor_session_condition_destroy(
struct ovsdb_monitor_session_condition *condition)
{
struct shash_node *node;
if (!condition) {
return;
}
SHASH_FOR_EACH_SAFE (node, &condition->tables) {
struct ovsdb_monitor_table_condition *mtc = node->data;
ovsdb_condition_destroy(&mtc->new_condition);
ovsdb_condition_destroy(&mtc->old_condition);
ovsdb_condition_destroy(&mtc->diff_condition);
shash_delete(&condition->tables, node);
free(mtc);
}
shash_destroy(&condition->tables);
free(condition);
}

struct ovsdb_error *
ovsdb_monitor_table_condition_create(
        struct ovsdb_monitor_session_condition *condition,
        const struct ovsdb_table *table,
        const struct json *json_cnd)
{
    struct ovsdb_monitor_table_condition *mtc;
    struct ovsdb_error *error;

    mtc = xzalloc(sizeof *mtc);
    mtc->table = table;
    ovsdb_condition_init(&mtc->old_condition);
    ovsdb_condition_init(&mtc->new_condition);
    ovsdb_condition_init(&mtc->diff_condition);

    if (json_cnd) {
        error = ovsdb_condition_from_json(table->schema,
                                          json_cnd,
                                          NULL,
                                          &mtc->old_condition);
        if (error) {
            free(mtc);
            return error;
        }
    }

    shash_add(&condition->tables, table->schema->name, mtc);
    /* On session startup old == new condition, diff is empty. */
    ovsdb_condition_clone(&mtc->new_condition, &mtc->old_condition);
    ovsdb_monitor_session_condition_set_mode(condition);

    return NULL;
}

static bool
ovsdb_monitor_get_table_conditions(
        const struct ovsdb_monitor_table *mt,
        const struct ovsdb_monitor_session_condition *condition,
        struct ovsdb_condition **old_condition,
        struct ovsdb_condition **new_condition,
        struct ovsdb_condition **diff_condition)
{
    if (!condition) {
        return false;
    }

    struct ovsdb_monitor_table_condition *mtc =
        shash_find_data(&condition->tables, mt->table->schema->name);

    if (!mtc) {
        return false;
    }

    *old_condition = &mtc->old_condition;
    *new_condition = &mtc->new_condition;
    *diff_condition = &mtc->diff_condition;

    return true;
}

struct ovsdb_error *
ovsdb_monitor_table_condition_update(
        struct ovsdb_monitor *dbmon,
        struct ovsdb_monitor_session_condition *condition,
        const struct ovsdb_table *table,
        const struct json *cond_json)
{
    if (!condition) {
        return NULL;
    }

    struct ovsdb_monitor_table_condition *mtc =
        shash_find_data(&condition->tables, table->schema->name);
    struct ovsdb_error *error;
    struct ovsdb_condition cond = OVSDB_CONDITION_INITIALIZER(&cond);

    error = ovsdb_condition_from_json(table->schema, cond_json,
                                      NULL, &cond);
    if (error) {
        return error;
    }
    ovsdb_condition_destroy(&mtc->new_condition);
    ovsdb_condition_clone(&mtc->new_condition, &cond);
    ovsdb_condition_destroy(&cond);
    ovsdb_condition_diff(&mtc->diff_condition,
                         &mtc->old_condition, &mtc->new_condition);
    ovsdb_monitor_condition_add_columns(dbmon,
                                        table,
                                        &mtc->new_condition);

    return NULL;
}

static void
ovsdb_monitor_table_condition_updated(struct ovsdb_monitor_table *mt,
                    struct ovsdb_monitor_session_condition *condition)
{
    struct ovsdb_monitor_table_condition *mtc =
        shash_find_data(&condition->tables, mt->table->schema->name);

    if (mtc) {
        /* If the old and new conditions differ, the new condition becomes
         * the old one and the diff is cleared. */
        if (ovsdb_condition_cmp_3way(&mtc->old_condition,
                                     &mtc->new_condition)) {
            ovsdb_condition_destroy(&mtc->old_condition);
            ovsdb_condition_clone(&mtc->old_condition, &mtc->new_condition);
            ovsdb_condition_destroy(&mtc->diff_condition);
            ovsdb_condition_init(&mtc->diff_condition);
            ovsdb_monitor_session_condition_set_mode(condition);
        }
    }
}
ovsdb: generate update notifications for monitor_cond session Hold session's conditions in ovsdb_monitor_session_condition. Pass it to ovsdb_monitor for generating "update2" notifications. Add functions that can generate "update2" notification for a "monitor_cond" session. JSON cache is enabled only for session's with true condition only. "monitor_cond" and "monitor_cond_change" are RFC 7047 extensions described by ovsdb-server(1) manpage. Performance evaluation: OVN is the main candidate for conditional monitoring usage. It is clear that conditional monitoring reduces computation on the ovn-controller (client) side due to the reduced size of flow tables and update messages. Performance evaluation shows up to 75% computation reduction. However, performance evaluation shows also a reduction in computation on the SB ovsdb-server side proportional to the degree that each logical network is spread over physical hosts in the DC. Evaluation shows that in a realistic scenarios there is a computation reduction also in the server side. Evaluation on simulated environment of 50 hosts and 1000 logical ports shows the following results (cycles #): LN spread over # hosts| master | patch | change ------------------------------------------------------------- 1 | 24597200127 | 24339235374 | 1.0% 6 | 23788521572 | 19145229352 | 19.5% 12 | 23886405758 | 17913143176 | 25.0% 18 | 25812686279 | 23675094540 | 8.2% 24 | 28414671499 | 24770202308 | 12.8% 30 | 31487218890 | 28397543436 | 9.8% 36 | 36116993930 | 34105388739 | 5.5% 42 | 37898342465 | 38647139083 | -1.9% 48 | 41637996229 | 41846616306 | -0.5% 50 | 41679995357 | 43455565977 | -4.2% Signed-off-by: Liran Schour <lirans@il.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-07-18 11:45:52 +03:00
static enum ovsdb_monitor_selection
ovsdb_monitor_row_update_type_condition(
const struct ovsdb_monitor_table *mt,
const struct ovsdb_monitor_session_condition *condition,
bool initial,
enum ovsdb_monitor_row_type row_type,
ovsdb: generate update notifications for monitor_cond session Hold session's conditions in ovsdb_monitor_session_condition. Pass it to ovsdb_monitor for generating "update2" notifications. Add functions that can generate "update2" notification for a "monitor_cond" session. JSON cache is enabled only for session's with true condition only. "monitor_cond" and "monitor_cond_change" are RFC 7047 extensions described by ovsdb-server(1) manpage. Performance evaluation: OVN is the main candidate for conditional monitoring usage. It is clear that conditional monitoring reduces computation on the ovn-controller (client) side due to the reduced size of flow tables and update messages. Performance evaluation shows up to 75% computation reduction. However, performance evaluation shows also a reduction in computation on the SB ovsdb-server side proportional to the degree that each logical network is spread over physical hosts in the DC. Evaluation shows that in a realistic scenarios there is a computation reduction also in the server side. Evaluation on simulated environment of 50 hosts and 1000 logical ports shows the following results (cycles #): LN spread over # hosts| master | patch | change ------------------------------------------------------------- 1 | 24597200127 | 24339235374 | 1.0% 6 | 23788521572 | 19145229352 | 19.5% 12 | 23886405758 | 17913143176 | 25.0% 18 | 25812686279 | 23675094540 | 8.2% 24 | 28414671499 | 24770202308 | 12.8% 30 | 31487218890 | 28397543436 | 9.8% 36 | 36116993930 | 34105388739 | 5.5% 42 | 37898342465 | 38647139083 | -1.9% 48 | 41637996229 | 41846616306 | -0.5% 50 | 41679995357 | 43455565977 | -4.2% Signed-off-by: Liran Schour <lirans@il.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-07-18 11:45:52 +03:00
const struct ovsdb_datum *old,
const struct ovsdb_datum *new)
{
ovsdb: condition: Process condition changes incrementally. In most cases, after the condition change request, the new condition is the same as old one plus minus a few clauses. Today, ovsdb-server will evaluate every database row against all the old clauses and then against all the new clauses in order to tell if an update should be generated. For example, every time a new port is added, ovn-controller adds two new clauses to conditions for a Port_Binding table. And this condition may grow significantly in size making addition of every new port heavier on the server side. The difference between conditions is not larger and, likely, significantly smaller than old and new conditions combined. And if the row doesn't match clauses that are different between old and new conditions, that row should not be part of the update. It either matches both old and new, or it doesn't match either of them. If the row matches some clauses in the difference, then we need to perform a full match against old and new in order to tell if it should be added/removed/modified. This is necessary because different clauses may select same rows. Let's generate the condition difference and use it to avoid evaluation of all the clauses for rows not affected by the condition change. Testing shows 70% reduction in total CPU time in ovn-heater's 120-node density-light test with conditional monitoring. Average CPU usage during the test phase went down from frequent 100% spikes to just 6-8%. Note: This will not help with new connections, or re-connections, or new monitor requests after database conversion. ovsdb-server will still evaluate every database row against every clause in the condition in these cases. So, it's still important to not have too many clauses in conditions for large tables. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-05-26 19:18:43 +02:00
struct ovsdb_condition *old_condition, *new_condition, *diff_condition;
enum ovsdb_monitor_selection type =
ovsdb_monitor_row_update_type(initial, old, new);
if (ovsdb_monitor_get_table_conditions(mt,
condition,
&old_condition,
&new_condition,
&diff_condition)) {
unsigned int *index_map = row_type == OVSDB_MONITOR_ROW
? mt->columns_index_map : NULL;
bool old_cond = false, new_cond = false;
if (old && old == new
&& !ovsdb_condition_empty_or_match_any(old, diff_condition,
index_map)) {
/* Condition changed, but not the data. And the row is not
* affected by the condition change. It either matches or
* doesn't match both old and new conditions at the same time.
* In any case, this row should not be part of the update. */
type = OJMS_NONE;
} else {
/* The row changed or the condition change affects this row.
* Need to fully check old and new conditions. */
if (old) {
old_cond = ovsdb_condition_empty_or_match_any(
old, old_condition, index_map);
}
if (new) {
new_cond = ovsdb_condition_empty_or_match_any(
new, new_condition, index_map);
}
if (!old_cond && !new_cond) {
type = OJMS_NONE;
}
}
switch (type) {
case OJMS_INITIAL:
case OJMS_INSERT:
if (!new_cond) {
type = OJMS_NONE;
}
break;
case OJMS_MODIFY:
type = !old_cond ? OJMS_INSERT : !new_cond
? OJMS_DELETE : OJMS_MODIFY;
break;
case OJMS_DELETE:
if (!old_cond) {
type = OJMS_NONE;
}
break;
case OJMS_NONE:
break;
}
}
return type;
}
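/* A minimal, standalone sketch of the slow-path decision made above: once the
 * old and new row versions have been matched against the old and new
 * conditions, the raw update type is refined into what the client should
 * actually see.  The names 'enum sel', 'SEL_*' and 'refine_update_type' are
 * hypothetical stand-ins for illustration, not part of monitor.c, and the
 * enum values are arbitrary. */

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for enum ovsdb_monitor_selection. */
enum sel {
    SEL_NONE = 0,
    SEL_INITIAL = 1 << 0,
    SEL_INSERT = 1 << 1,
    SEL_DELETE = 1 << 2,
    SEL_MODIFY = 1 << 3
};

/* Refines 'type' given whether the old row version matched the old condition
 * ('old_cond') and whether the new row version matched the new condition
 * ('new_cond'). */
static enum sel
refine_update_type(enum sel type, bool old_cond, bool new_cond)
{
    switch (type) {
    case SEL_INITIAL:
    case SEL_INSERT:
        /* A row that appears is only reported if it matches now. */
        return new_cond ? type : SEL_NONE;
    case SEL_MODIFY:
        if (!old_cond && !new_cond) {
            return SEL_NONE;        /* Never visible to this client. */
        }
        /* A row entering the condition looks like an insert to the client,
         * a row leaving it looks like a delete. */
        return !old_cond ? SEL_INSERT : !new_cond ? SEL_DELETE : SEL_MODIFY;
    case SEL_DELETE:
        /* A row that disappears is only reported if it matched before. */
        return old_cond ? type : SEL_NONE;
    case SEL_NONE:
        break;
    }
    assert(type == SEL_NONE);       /* Single update types only. */
    return SEL_NONE;
}
```

/* For example, a modified row that did not match the old condition but does
 * match the new one is reported to the client as an insert. */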
static bool
ovsdb_monitor_row_skip_update(const struct ovsdb_monitor_table *mt,
enum ovsdb_monitor_row_type row_type,
const struct ovsdb_datum *old,
const struct ovsdb_datum *new,
enum ovsdb_monitor_selection type,
monitor: Fix crash when monitor condition adds new columns. The OVSDB conditional monitor implementation allows many clients to share same copy of monitored data if the clients are sharing same tables and columns being monitored, while they can have different monitor conditions. In monitor conditions they can have different columns which can be different from the columns being monitored. So the struct ovsdb_monitor_table maintains the union of the all the columns being used in any conditions. The problem of the current implementation is that for each change set generated, it doesn't maintain any metadata for the number of columns for the data that has already populated in it. Instead, it always rely on the n_columns field of the struct ovsdb_monitor_table to manipulate the data. However, the n_columns in struct ovsdb_monitor_table can increase (e.g. when a client changes its condition which involves more columns). So it can result in that the existing rows in a change set with N columns being later processed as if it had more than N columns, typically, when the row is freed. This causes the ovsdb-server crashing (see an example of the backtrace). The patch fixes the problem by maintaining n_columns for each change set, and added a test case which fails without the fix. 
(gdb) bt at lib/ovsdb-data.c:1031 out>, mt=<optimized out>) at ovsdb/monitor.c:320 mt=0x1e7b940) at ovsdb/monitor.c:333 out>, transaction=<optimized out>) at ovsdb/monitor.c:527 initial=<optimized out>, cond_updated=cond_updated@entry=false, unflushed_=unflushed_@entry=0x20dae70, condition=<optimized out>, version=<optimized out>) at ovsdb/monitor.c:1156 (m=m@entry=0x20dae40, initial=initial@entry=false) at ovsdb/jsonrpc-server.c:1655 at ovsdb/jsonrpc-server.c:1729 ovsdb/jsonrpc-server.c:551 ovsdb/jsonrpc-server.c:586 ovsdb/jsonrpc-server.c:401 exiting=0x7ffdb947f76f, run_process=0x0, remotes=0x7ffdb947f7c0, unixctl=0x1e7a560, all_dbs=0x7ffdb947f800, jsonrpc=<optimized out>, config=0x7ffdb947f820) at ovsdb/ovsdb-server.c:209 Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 18:19:21 -08:00
unsigned long int *changed,
size_t n_columns)
{
if (!(mt->select & type)) {
return true;
}
if (type == OJMS_MODIFY) {
size_t i, n_changes;
n_changes = 0;
memset(changed, 0, bitmap_n_bytes(n_columns));
for (i = 0; i < n_columns; i++) {
const struct ovsdb_column *c = mt->columns[i].column;
size_t index = row_type == OVSDB_ROW ? c->index : i;
if (!ovsdb_datum_equals(&old[index], &new[index], &c->type)) {
bitmap_set1(changed, i);
n_changes++;
}
}
if (!n_changes) {
/* No actual changes: presumably a row changed and then
* changed back later. */
return true;
}
}
return false;
}
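/* A minimal, standalone sketch of the "changed columns" bookkeeping used
 * above: clear a scratch bitmap, set one bit per column whose old and new
 * values differ, and report how many bits were set so that a no-op
 * modification can be skipped.  Plain 'int' columns stand in for
 * struct ovsdb_datum, and 'BITS_PER_WORD', 'words_for' and
 * 'mark_changed_columns' are hypothetical names, not the lib/bitmap.h API. */

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Number of 'unsigned long' words needed to hold 'n_bits' bits. */
static size_t
words_for(size_t n_bits)
{
    return (n_bits + BITS_PER_WORD - 1) / BITS_PER_WORD;
}

/* Sets bit 'i' in 'changed' for every column 'i' in which 'old_vals' and
 * 'new_vals' differ.  'changed' must hold at least words_for(n_columns)
 * words.  Returns the number of differing columns. */
static size_t
mark_changed_columns(const int *old_vals, const int *new_vals,
                     size_t n_columns, unsigned long *changed)
{
    size_t n_changes = 0;

    assert(changed != NULL);
    memset(changed, 0, words_for(n_columns) * sizeof *changed);
    for (size_t i = 0; i < n_columns; i++) {
        if (old_vals[i] != new_vals[i]) {
            changed[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
            n_changes++;
        }
    }
    return n_changes;
}
```

/* A return value of zero corresponds to the "row changed and then changed
 * back" case, in which no update needs to be sent. */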
/* Returns JSON for a <row-update> (as described in RFC 7047) for 'row' within
* 'mt', or NULL if no row update should be sent.
*
* The caller should specify 'initial' as true if the returned JSON is going to
* be used as part of the initial reply to a "monitor" request, false if it is
* going to be used as part of an "update" notification.
*
* 'changed' must be a scratch buffer for internal use that is at least
* bitmap_n_bytes(n_columns) bytes long. */
static struct json *
ovsdb_monitor_compose_row_update(
const struct ovsdb_monitor_table *mt,
const struct ovsdb_monitor_session_condition *condition OVS_UNUSED,
enum ovsdb_monitor_row_type row_type OVS_UNUSED,
const void *_row,
bool initial, unsigned long int *changed,
size_t n_columns OVS_UNUSED)
{
const struct ovsdb_monitor_row *row = _row;
enum ovsdb_monitor_selection type;
struct json *old_json, *new_json;
struct json *row_json;
size_t i;
ovs_assert(row_type == OVSDB_MONITOR_ROW);
type = ovsdb_monitor_row_update_type(initial, row->old, row->new);
if (ovsdb_monitor_row_skip_update(mt, row_type, row->old,
row->new, type, changed,
mt->n_columns)) {
return NULL;
}
row_json = json_object_create();
old_json = new_json = NULL;
if (type & (OJMS_DELETE | OJMS_MODIFY)) {
old_json = json_object_create();
json_object_put(row_json, "old", old_json);
}
if (type & (OJMS_INITIAL | OJMS_INSERT | OJMS_MODIFY)) {
new_json = json_object_create();
json_object_put(row_json, "new", new_json);
}
for (i = 0; i < mt->n_monitored_columns; i++) {
const struct ovsdb_monitor_column *c = &mt->columns[i];
if (!c->monitored || !(type & c->select)) {
/* We don't care about this type of change for this
* particular column (but we will care about it for some
* other column). */
continue;
}
if ((type == OJMS_MODIFY && bitmap_is_set(changed, i))
|| type == OJMS_DELETE) {
json_object_put(old_json, c->column->name,
ovsdb_datum_to_json(&row->old[i],
&c->column->type));
}
if (type & (OJMS_INITIAL | OJMS_INSERT | OJMS_MODIFY)) {
json_object_put(new_json, c->column->name,
ovsdb_datum_to_json(&row->new[i],
&c->column->type));
}
}
return row_json;
}
/* Returns JSON for a <row-update2> (as described in ovsdb-server(1) manpage)
* for 'row' within 'mt', or NULL if no row update should be sent.
*
* The caller should specify 'initial' as true if the returned JSON is
* going to be used as part of the initial reply to a "monitor_cond" request,
* false if it is going to be used as part of an "update2" notification.
*
* 'changed' must be a scratch buffer for internal use that is at least
* bitmap_n_bytes(n_columns) bytes long. */
static struct json *
ovsdb_monitor_compose_row_update2(
const struct ovsdb_monitor_table *mt,
const struct ovsdb_monitor_session_condition *condition,
enum ovsdb_monitor_row_type row_type,
const void *_row,
    bool initial, unsigned long int *changed,
    size_t n_columns)
{
    enum ovsdb_monitor_selection type;
    struct json *row_update2, *diff_json;
    const struct ovsdb_datum *old, *new;
    size_t i;

    if (row_type == OVSDB_MONITOR_ROW) {
        old = ((const struct ovsdb_monitor_row *)_row)->old;
        new = ((const struct ovsdb_monitor_row *)_row)->new;
    } else {
        old = new = ((const struct ovsdb_row *)_row)->fields;
    }
    type = ovsdb_monitor_row_update_type_condition(mt, condition, initial,
                                                   row_type, old, new);
    if (ovsdb_monitor_row_skip_update(mt, row_type, old, new, type, changed,
                                      n_columns)) {
        return NULL;
    }

    row_update2 = json_object_create();
    if (type == OJMS_DELETE) {
        json_object_put(row_update2, "delete", json_null_create());
    } else {
        diff_json = json_object_create();
        const char *op;

        for (i = 0; i < mt->n_monitored_columns; i++) {
            const struct ovsdb_monitor_column *c = &mt->columns[i];
            size_t index = row_type == OVSDB_ROW ? c->column->index : i;

            if (!c->monitored || !(type & c->select)) {
                /* We don't care about this type of change for this
                 * particular column (but we will care about it for some
                 * other column). */
                continue;
            }

            if (type == OJMS_MODIFY) {
                struct ovsdb_datum diff;

                if (!bitmap_is_set(changed, i)) {
                    continue;
                }

                ovsdb_datum_diff(&diff, &old[index], &new[index],
                                 &c->column->type);
                json_object_put(diff_json, c->column->name,
                                ovsdb_datum_to_json(&diff, &c->column->type));
                ovsdb_datum_destroy(&diff, &c->column->type);
            } else {
                if (!ovsdb_datum_is_default(&new[index], &c->column->type)) {
                    json_object_put(diff_json, c->column->name,
                                    ovsdb_datum_to_json(&new[index],
                                                        &c->column->type));
                }
            }
        }

        op = type == OJMS_INITIAL ? "initial"
                                  : type == OJMS_MODIFY ? "modify" : "insert";
        json_object_put(row_update2, op, diff_json);
    }

    return row_update2;
}
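
/* Illustrative sketch only (not part of the original source): the shape of
 * the JSON object built by ovsdb_monitor_compose_row_update2() above, as read
 * from its code.  The column names "name" and "ports" are hypothetical.
 *
 *     type == OJMS_DELETE:   {"delete": null}
 *     type == OJMS_MODIFY:   {"modify": {"ports": <diff of old and new>}}
 *     type == OJMS_INITIAL:  {"initial": {"name": <new value>}}
 *     otherwise:             {"insert": {"name": <new value>}}
 *
 * "modify" carries only the columns whose bit is set in 'changed'.
 * "initial" and "insert" leave out columns that still hold the default value
 * for their type.  The function returns NULL instead of an object when
 * ovsdb_monitor_row_skip_update() reports that there is nothing to send. */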

static size_t
ovsdb_monitor_max_columns(struct ovsdb_monitor *dbmon)
{
    struct shash_node *node;
    size_t max_columns = 0;

    SHASH_FOR_EACH (node, &dbmon->tables) {
        struct ovsdb_monitor_table *mt = node->data;

        max_columns = MAX(max_columns, mt->n_columns);
    }

    return max_columns;
}

static void
ovsdb_monitor_add_json_row(struct json **json, const char *table_name,
                           struct json **table_json, struct json *row_json,
                           const struct uuid *row_uuid)
{
    char uuid[UUID_LEN + 1];

    /* Create JSON object for transaction overall. */
    if (!*json) {
        *json = json_object_create();
    }

    /* Create JSON object for transaction on this table. */
    if (!*table_json) {
        *table_json = json_object_create();
        json_object_put(*json, table_name, *table_json);
    }

    /* Add JSON row to JSON table. */
    snprintf(uuid, sizeof uuid, UUID_FMT, UUID_ARGS(row_uuid));
    json_object_put(*table_json, uuid, row_json);
}
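
/* Illustrative sketch only (not part of the original source): after one or
 * more calls to ovsdb_monitor_add_json_row() above, '*json' has the nested
 * shape shown below.  The table name "Port" is hypothetical.
 *
 *     {"Port": {"<row-uuid>": <row_json>, ...}, ...}
 *
 * Both '*json' and '*table_json' are created lazily on first use, so a caller
 * that starts with a NULL '*json' and adds no rows still has NULL
 * afterwards. */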

/* Constructs and returns JSON for a <table-updates> object (as described in
 * RFC 7047) for all the outstanding changes within 'dbmon' that are tracked
 * by the change set 'mcs'. */
static struct json *
ovsdb_monitor_compose_update(
    struct ovsdb_monitor *dbmon,
    bool initial, struct ovsdb_monitor_change_set *mcs,
    const struct ovsdb_monitor_session_condition *condition,
    compose_row_update_cb_func row_update)
{
    struct json *json;
    size_t max_columns = ovsdb_monitor_max_columns(dbmon);
    unsigned long int *changed = xmalloc(bitmap_n_bytes(max_columns));

    json = NULL;
    struct ovsdb_monitor_change_set_for_table *mcst;
    LIST_FOR_EACH (mcst, list_in_change_set, &mcs->change_set_for_tables) {
        struct ovsdb_monitor_row *row;
        struct json *table_json = NULL;
        struct ovsdb_monitor_table *mt = mcst->mt;

        HMAP_FOR_EACH_SAFE (row, hmap_node, &mcst->rows) {
            struct json *row_json;

            row_json = (*row_update)(mt, condition, OVSDB_MONITOR_ROW, row,
                                     initial, changed, mcst->n_columns);
            if (row_json) {
                ovsdb_monitor_add_json_row(&json, mt->table->schema->name,
                                           &table_json, row_json,
                                           &row->uuid);
            }
        }
    }
    free(changed);

    return json;
}

static struct json *
ovsdb_monitor_compose_cond_change_update(
    struct ovsdb_monitor *dbmon,
    struct ovsdb_monitor_session_condition *condition)
{
    struct shash_node *node;
    struct json *json = NULL;
    size_t max_columns = ovsdb_monitor_max_columns(dbmon);
    unsigned long int *changed = xmalloc(bitmap_n_bytes(max_columns));

    SHASH_FOR_EACH (node, &dbmon->tables) {
        struct ovsdb_condition *old_condition, *new_condition, *diff_condition;
        struct ovsdb_monitor_table *mt = node->data;
        struct json *table_json = NULL;
        struct ovsdb_row *row;

        if (!ovsdb_monitor_get_table_conditions(mt,
                                                condition,
                                                &old_condition,
                                                &new_condition,
                                                &diff_condition) ||
            !ovsdb_condition_cmp_3way(old_condition, new_condition)) {
            /* Nothing to update on this table */
            continue;
        }

        /* Iterate over all rows in table */
        HMAP_FOR_EACH (row, hmap_node, &mt->table->rows) {
            struct json *row_json;

            row_json = ovsdb_monitor_compose_row_update2(mt, condition,
                                                         OVSDB_ROW, row,
                                                         false, changed,
                                                         mt->n_columns);
            if (row_json) {
                ovsdb_monitor_add_json_row(&json, mt->table->schema->name,
                                           &table_json, row_json,
                                           ovsdb_row_get_uuid(row));
            }
        }
        ovsdb_monitor_table_condition_updated(mt, condition);
    }
    free(changed);

    return json;
}

/* Returns JSON for a <table-updates> object (as described in RFC 7047)
 * for all the outstanding changes in dbmon that are tracked by the change set
 * *p_mcs.
 *
 * If cond_updated is true all rows in the db that match conditions will be
 * sent.
 *
 * The caller should specify 'initial' as true if the returned JSON is going to
 * be used as part of the initial reply to a "monitor" request, false if it is
 * going to be used as part of an "update" notification. */
struct json *
ovsdb_monitor_get_update(
    struct ovsdb_monitor *dbmon,
    bool initial, bool cond_updated,
    struct ovsdb_monitor_session_condition *condition,
    enum ovsdb_monitor_version version,
    struct ovsdb_monitor_change_set **p_mcs)
{
    struct ovsdb_monitor_json_cache_node *cache_node = NULL;
    struct json *json;
    struct ovsdb_monitor_change_set *mcs = *p_mcs;

    ovs_assert(cond_updated ? mcs == dbmon->new_change_set : true);

    /* Return a clone of cached json if one exists. Otherwise,
     * generate a new one and add it to the cache. */
    if (!condition || (!condition->conditional && !cond_updated)) {
        cache_node = ovsdb_monitor_json_cache_search(dbmon, version,
                                                     mcs);
    }
    if (cache_node) {
        json = cache_node->json ? json_clone(cache_node->json) : NULL;
    } else {
        if (version == OVSDB_MONITOR_V1) {
            json =
                ovsdb_monitor_compose_update(dbmon, initial, mcs,
condition,
ovsdb_monitor_compose_row_update);
} else {
ovs_assert(version == OVSDB_MONITOR_V2 ||
version == OVSDB_MONITOR_V3);
if (!cond_updated) {
json = ovsdb_monitor_compose_update(dbmon, initial, mcs,
condition,
ovsdb_monitor_compose_row_update2);
if (!condition || !condition->conditional) {
if (json) {
struct json *json_serialized;
/* Pre-serializing the object to avoid doing this
* for every client. */
json_serialized = json_serialized_object_create(json);
json_destroy(json);
json = json_serialized;
}
ovsdb_monitor_json_cache_insert(dbmon, version, mcs,
json);
}
} else {
/* Compose the update on the whole db because the condition was
 * updated.  The session must already be flushed (its change list is
 * empty). */
json =
ovsdb_monitor_compose_cond_change_update(dbmon, condition);
}
}
}
/* Maintain tracking change set. */
ovsdb_monitor_untrack_change_set(dbmon, mcs);
ovsdb_monitor_track_new_change_set(dbmon);
*p_mcs = dbmon->new_change_set;
return json;
}
bool
ovsdb_monitor_needs_flush(struct ovsdb_monitor *dbmon,
struct ovsdb_monitor_change_set *change_set)
{
ovs_assert(change_set);
return (change_set != dbmon->new_change_set);
}
void
ovsdb_monitor_table_add_select(struct ovsdb_monitor *dbmon,
const struct ovsdb_table *table,
enum ovsdb_monitor_selection select)
{
struct ovsdb_monitor_table *mt;
mt = shash_find_data(&dbmon->tables, table->schema->name);
ovs_assert(mt);
mt->select |= select;
}
/*
 * If a row's change type (insert, delete or modify) matches that of
 * the monitor, the change should be sent to the monitor's clients as an
 * update.  Of course, the monitor should also update its internal state
 * with this change.
 *
 * When a change type does not require a client side update, the monitor
 * may still need to keep track of certain changes in order to generate
 * correct future updates.  For example, the monitor's internal state should
 * be updated whenever a new row is inserted, in order to generate the
 * correct initial state, regardless of whether the insert change type is
 * being monitored.
 *
 * On the other hand, if a transaction only contains changes to columns
 * that are not monitored, this transaction can be safely ignored by the
 * monitor.
 *
 * Thus, the order of the declaration is important:
 * 'OVSDB_CHANGES_REQUIRE_EXTERNAL_UPDATE' always implies
 * 'OVSDB_CHANGES_REQUIRE_INTERNAL_UPDATE', but not vice versa. */
enum ovsdb_monitor_changes_efficacy {
OVSDB_CHANGES_NO_EFFECT, /* Monitor does not care about this
change. */
OVSDB_CHANGES_REQUIRE_INTERNAL_UPDATE, /* Monitor internal updates. */
OVSDB_CHANGES_REQUIRE_EXTERNAL_UPDATE, /* Client needs to be updated. */
};
struct ovsdb_monitor_aux {
const struct ovsdb_monitor *monitor;
struct ovsdb_monitor_table *mt;
enum ovsdb_monitor_changes_efficacy efficacy;
};
static void
ovsdb_monitor_init_aux(struct ovsdb_monitor_aux *aux,
const struct ovsdb_monitor *m)
{
aux->monitor = m;
aux->mt = NULL;
aux->efficacy = OVSDB_CHANGES_NO_EFFECT;
}
static void
ovsdb_monitor_changes_update(const struct ovsdb_row *old,
const struct ovsdb_row *new,
const struct ovsdb_monitor_table *mt,
struct ovsdb_monitor_change_set_for_table *mcst)
{
ovs_assert(new || old);
const struct uuid *uuid = ovsdb_row_get_uuid(new ? new : old);
struct ovsdb_monitor_row *change = NULL;
ovs_assert(uuid);
change = ovsdb_monitor_changes_row_find(mcst, uuid);
if (!change) {
change = xzalloc(sizeof *change);
hmap_insert(&mcst->rows, &change->hmap_node, uuid_hash(uuid));
change->uuid = *uuid;
change->old = clone_monitor_row_data(mt, old, mcst->n_columns);
change->new = clone_monitor_row_data(mt, new, mcst->n_columns);
} else {
if (new) {
if (!change->new) {
/* Reinsert the row that was just deleted.
 *
 * This path won't be hit without replication.  Whenever the OVSDB
 * server inserts a new row, it always generates a new UUID
 * that is different from that of the row just deleted.
 *
 * With replication, this path can be hit in a corner
 * case when two OVSDB servers are set up to replicate
 * each other.  Not that this is a useful setup, but it can
 * happen in practice.
 *
 * An example of how this path can be hit is documented below.
 * The details are not as important to the correctness of the
 * logic, but are added here to convince ourselves that this path
 * can be hit.
 *
 * Imagine two OVSDB servers that replicate from each
 * other.  For each replication session, there is a
 * corresponding monitor at the other end of the replication
 * JSONRPC connection.
 *
 * The events that can lead to a back-to-back deletion and
 * insertion of the same row for the monitor of
 * the first server are:
 *
 * 1. A row is inserted in the first OVSDB server.
 * 2. The row is then replicated to the remote OVSDB server.
 * 3. The row is now deleted by the local OVSDB server.  This
 *    deletion operation is replicated to the local monitor
 *    of the OVSDB server.
 * 4. The monitor now receives the same row, as an insertion,
 *    from the replication server.  Because of
 *    replication, the row carries the same UUID as the row
 *    just deleted.
 */
change->new = clone_monitor_row_data(mt, new, mcst->n_columns);
} else {
update_monitor_row_data(mt, new, change->new, mcst->n_columns);
}
} else {
free_monitor_row_data(mt, change->new, mcst->n_columns);
change->new = NULL;
if (!change->old) {
/* This row was added then deleted. Forget about it. */
hmap_remove(&mcst->rows, &change->hmap_node);
free(change);
}
}
}
}
static bool
ovsdb_monitor_columns_changed(const struct ovsdb_monitor_table *mt,
const unsigned long int *changed)
{
size_t i;
for (i = 0; i < mt->n_columns; i++) {
size_t column_index = mt->columns[i].column->index;
if (bitmap_is_set(changed, column_index)) {
return true;
}
}
return false;
}
/* Returns the efficacy of a row's change to a monitor table.
 *
 * Please see the block comment above the 'ovsdb_monitor_changes_efficacy'
 * definition for more information. */
static enum ovsdb_monitor_changes_efficacy
ovsdb_monitor_changes_classify(enum ovsdb_monitor_selection type,
const struct ovsdb_monitor_table *mt,
const unsigned long int *changed)
{
if (type == OJMS_MODIFY &&
!ovsdb_monitor_columns_changed(mt, changed)) {
return OVSDB_CHANGES_NO_EFFECT;
}
if (type == OJMS_MODIFY) {
/* A condition might turn a modify operation into an insert or a
 * delete. */
type |= OJMS_INSERT | OJMS_DELETE;
}
return (mt->select & type)
? OVSDB_CHANGES_REQUIRE_EXTERNAL_UPDATE
: OVSDB_CHANGES_REQUIRE_INTERNAL_UPDATE;
}
static bool
ovsdb_monitor_change_cb(const struct ovsdb_row *old,
const struct ovsdb_row *new,
const unsigned long int *changed,
void *aux_)
{
struct ovsdb_monitor_aux *aux = aux_;
const struct ovsdb_monitor *m = aux->monitor;
struct ovsdb_table *table = new ? new->table : old->table;
struct ovsdb_monitor_table *mt;
struct ovsdb_monitor_change_set_for_table *mcst;
if (!aux->mt || table != aux->mt->table) {
aux->mt = shash_find_data(&m->tables, table->schema->name);
if (!aux->mt) {
/* We don't care about rows in this table at all. Tell the caller
* to skip it. */
return false;
}
}
mt = aux->mt;
enum ovsdb_monitor_selection type =
ovsdb_monitor_row_update_type(false, old, new);
enum ovsdb_monitor_changes_efficacy efficacy =
ovsdb_monitor_changes_classify(type, mt, changed);
if (efficacy > OVSDB_CHANGES_NO_EFFECT) {
LIST_FOR_EACH (mcst, list_in_mt, &mt->change_sets) {
ovsdb_monitor_changes_update(old, new, mt, mcst);
}
}
if (aux->efficacy < efficacy) {
aux->efficacy = efficacy;
}
return true;
}
void
ovsdb_monitor_get_initial(struct ovsdb_monitor *dbmon,
struct ovsdb_monitor_change_set **p_mcs)
{
if (!dbmon->init_change_set) {
struct ovsdb_monitor_change_set *change_set =
ovsdb_monitor_add_change_set(dbmon, true, NULL);
dbmon->init_change_set = change_set;
struct ovsdb_monitor_change_set_for_table *mcst;
LIST_FOR_EACH (mcst, list_in_change_set,
&change_set->change_set_for_tables) {
if (mcst->mt->select & OJMS_INITIAL) {
struct ovsdb_row *row;
HMAP_FOR_EACH (row, hmap_node, &mcst->mt->table->rows) {
ovsdb_monitor_changes_update(NULL, row, mcst->mt, mcst);
}
}
}
} else {
dbmon->init_change_set->n_refs++;
}
*p_mcs = dbmon->init_change_set;
}
static bool
ovsdb_monitor_history_change_cb(const struct ovsdb_row *old,
const struct ovsdb_row *new,
const unsigned long int *changed,
void *aux)
{
struct ovsdb_monitor_change_set *change_set = aux;
struct ovsdb_table *table = new ? new->table : old->table;
struct ovsdb_monitor_change_set_for_table *mcst;
enum ovsdb_monitor_selection type =
ovsdb_monitor_row_update_type(false, old, new);
LIST_FOR_EACH (mcst, list_in_change_set,
&change_set->change_set_for_tables) {
if (mcst->mt->table == table) {
enum ovsdb_monitor_changes_efficacy efficacy =
ovsdb_monitor_changes_classify(type, mcst->mt, changed);
if (efficacy > OVSDB_CHANGES_NO_EFFECT) {
ovsdb_monitor_changes_update(old, new, mcst->mt, mcst);
}
return true;
}
}
return false;
}
void
ovsdb_monitor_get_changes_after(const struct uuid *txn_uuid,
struct ovsdb_monitor *dbmon,
struct ovsdb_monitor_change_set **p_mcs)
{
ovs_assert(*p_mcs == NULL);
ovs_assert(!uuid_is_zero(txn_uuid));
struct ovsdb_monitor_change_set *change_set =
ovsdb_monitor_find_change_set(dbmon, txn_uuid);
if (change_set) {
change_set->n_refs++;
*p_mcs = change_set;
return;
}
struct ovsdb_txn_history_node *h_node;
bool found = false;
LIST_FOR_EACH (h_node, node, &dbmon->db->txn_history) {
struct ovsdb_txn *txn = h_node->txn;
if (!found) {
/* Find the txn with last_id in the history. */
if (uuid_equals(ovsdb_txn_get_txnid(txn), txn_uuid)) {
found = true;
change_set = ovsdb_monitor_add_change_set(dbmon, false,
txn_uuid);
}
} else {
/* Already found.  Add the changes in each follow-up transaction to
 * the new change_set. */
ovsdb_txn_for_each_change(txn, ovsdb_monitor_history_change_cb,
change_set);
}
}
*p_mcs = change_set;
}
void
ovsdb_monitor_remove_jsonrpc_monitor(struct ovsdb_monitor *dbmon,
struct ovsdb_jsonrpc_monitor *jsonrpc_monitor,
struct ovsdb_monitor_change_set *change_set)
{
struct jsonrpc_monitor_node *jm;
if (ovs_list_is_empty(&dbmon->jsonrpc_monitors)) {
ovsdb_monitor_destroy(dbmon);
return;
}
/* Find and remove the jsonrpc monitor from the list. */
LIST_FOR_EACH(jm, node, &dbmon->jsonrpc_monitors) {
if (jm->jsonrpc_monitor == jsonrpc_monitor) {
/* Release the tracked changes. */
if (change_set) {
ovsdb_monitor_untrack_change_set(dbmon, change_set);
}
ovs_list_remove(&jm->node);
free(jm);
/* Destroy ovsdb monitor if this is the last user. */
if (ovs_list_is_empty(&dbmon->jsonrpc_monitors)) {
ovsdb_monitor_destroy(dbmon);
}
return;
}
}
/* Should never reach here. jsonrpc_monitor should be on the list. */
OVS_NOT_REACHED();
}
static bool
ovsdb_monitor_table_equal(const struct ovsdb_monitor_table *a,
const struct ovsdb_monitor_table *b)
{
size_t i;
ovs_assert(b->n_columns == b->n_monitored_columns);
if ((a->table != b->table) ||
(a->select != b->select) ||
(a->n_monitored_columns != b->n_monitored_columns)) {
return false;
}
/* Compare only the monitored columns, which must already be sorted. */
for (i = 0; i < a->n_monitored_columns; i++) {
if ((a->columns[i].column != b->columns[i].column) ||
(a->columns[i].select != b->columns[i].select)) {
return false;
}
}
return true;
}
static bool
ovsdb_monitor_equal(const struct ovsdb_monitor *a,
const struct ovsdb_monitor *b)
{
struct shash_node *node;
if (shash_count(&a->tables) != shash_count(&b->tables)) {
return false;
}
SHASH_FOR_EACH(node, &a->tables) {
const struct ovsdb_monitor_table *mta = node->data;
const struct ovsdb_monitor_table *mtb;
mtb = shash_find_data(&b->tables, node->name);
if (!mtb) {
return false;
}
if (!ovsdb_monitor_table_equal(mta, mtb)) {
return false;
}
}
return true;
}
static size_t
ovsdb_monitor_hash(const struct ovsdb_monitor *dbmon, size_t basis)
{
const struct shash_node **nodes;
size_t i, j, n;
nodes = shash_sort(&dbmon->tables);
n = shash_count(&dbmon->tables);
for (i = 0; i < n; i++) {
struct ovsdb_monitor_table *mt = nodes[i]->data;
ovs_assert(mt);
basis = hash_pointer(mt->table, basis);
basis = hash_3words(mt->select, mt->n_columns, basis);
for (j = 0; j < mt->n_columns; j++) {
basis = hash_pointer(mt->columns[j].column, basis);
basis = hash_2words(mt->columns[j].select, basis);
}
}
free(nodes);
return basis;
}
struct ovsdb_monitor *
ovsdb_monitor_add(struct ovsdb_monitor *new_dbmon)
{
struct ovsdb_monitor *dbmon;
size_t hash;
/* 'new_dbmon' should be associated with only one jsonrpc
 * connection. */
ovs_assert(ovs_list_is_singleton(&new_dbmon->jsonrpc_monitors));
ovsdb_monitor_columns_sort(new_dbmon);
hash = ovsdb_monitor_hash(new_dbmon, 0);
HMAP_FOR_EACH_WITH_HASH(dbmon, hmap_node, hash, &ovsdb_monitors) {
if (ovsdb_monitor_equal(dbmon, new_dbmon)) {
return dbmon;
}
}
hmap_insert(&ovsdb_monitors, &new_dbmon->hmap_node, hash);
return new_dbmon;
}
static void
ovsdb_monitor_destroy(struct ovsdb_monitor *dbmon)
{
struct shash_node *node;
ovs_list_remove(&dbmon->list_node);
if (!hmap_node_is_null(&dbmon->hmap_node)) {
hmap_remove(&ovsdb_monitors, &dbmon->hmap_node);
}
ovsdb_monitor_json_cache_flush(dbmon);
hmap_destroy(&dbmon->json_cache);
struct ovsdb_monitor_change_set *cs;
LIST_FOR_EACH_SAFE (cs, list_node, &dbmon->change_sets) {
ovsdb_monitor_change_set_destroy(cs);
}
SHASH_FOR_EACH (node, &dbmon->tables) {
struct ovsdb_monitor_table *mt = node->data;
ovs_assert(ovs_list_is_empty(&mt->change_sets));
free(mt->columns);
free(mt->columns_index_map);
free(mt);
}
shash_destroy(&dbmon->tables);
free(dbmon);
}
static void
ovsdb_monitor_commit(struct ovsdb_monitor *m, const struct ovsdb_txn *txn)
{
struct ovsdb_monitor_aux aux;
ovsdb_monitor_init_aux(&aux, m);
ovsdb_txn_for_each_change(txn, ovsdb_monitor_change_cb, &aux);
if (aux.efficacy > OVSDB_CHANGES_NO_EFFECT) {
/* The transaction has an impact on the monitor.
 * Reset new_change_set, so that a new change set will be
 * created for future tracking. */
m->new_change_set = NULL;
if (aux.efficacy == OVSDB_CHANGES_REQUIRE_EXTERNAL_UPDATE) {
ovsdb_monitor_json_cache_flush(m);
}
}
}
void
ovsdb_monitors_commit(struct ovsdb *db, const struct ovsdb_txn *txn)
{
struct ovsdb_monitor *m;
LIST_FOR_EACH (m, list_node, &db->monitors) {
ovsdb_monitor_commit(m, txn);
}
}
void
ovsdb_monitors_remove(struct ovsdb *db)
{
struct ovsdb_monitor *m;
LIST_FOR_EACH_SAFE (m, list_node, &db->monitors) {
struct jsonrpc_monitor_node *jm;
/* Delete all front-end monitors. Removing the last front-end monitor
* will also destroy the corresponding ovsdb_monitor. */
LIST_FOR_EACH_SAFE (jm, node, &m->jsonrpc_monitors) {
ovsdb_jsonrpc_monitor_destroy(jm->jsonrpc_monitor, false);
}
}
}
/* Adds some memory usage statistics for monitors into 'usage', for use with
 * memory_report(). */
void
ovsdb_monitor_get_memory_usage(struct simap *usage)
{
struct ovsdb_monitor *dbmon;
simap_put(usage, "monitors", hmap_count(&ovsdb_monitors));
HMAP_FOR_EACH(dbmon, hmap_node, &ovsdb_monitors) {
simap_increase(usage, "json-caches", hmap_count(&dbmon->json_cache));
}
}
void
ovsdb_monitor_prereplace_db(struct ovsdb *db)
{
struct ovsdb_monitor *m;
LIST_FOR_EACH_SAFE (m, list_node, &db->monitors) {
struct jsonrpc_monitor_node *jm;
/* Delete all front-end monitors. Removing the last front-end monitor
* will also destroy the corresponding ovsdb_monitor. */
LIST_FOR_EACH_SAFE (jm, node, &m->jsonrpc_monitors) {
ovsdb_jsonrpc_monitor_destroy(jm->jsonrpc_monitor, true);
}
}
}
const struct uuid *
ovsdb_monitor_get_last_txnid(struct ovsdb_monitor *dbmon)
{
static struct uuid dummy = { .parts = { 0, 0, 0, 0 } };
if (dbmon->db->n_txn_history) {
struct ovsdb_txn_history_node *thn = CONTAINER_OF(
ovs_list_back(&dbmon->db->txn_history),
struct ovsdb_txn_history_node, node);
return ovsdb_txn_get_txnid(thn->txn);
}
return &dummy;
}