mirror of https://gitlab.com/apparmor/apparmor synced 2025-08-22 01:57:43 +00:00
John Johansen 22855508e8 Add Differential State Compression to the DFA
Differential state compression encodes a state's transitions as the
difference between the state and its default state (the state it is
relative to).

This reduces the number of transitions that need to be stored in the
transition table, hence reducing the size of the dfa.  There is a
trade off in that a single input character may have to traverse more
than one state.  This is somewhat offset by reduced table sizes providing
better locality and caching properties.
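
The lookup cost described above can be illustrated with a toy table. This is a minimal sketch, not AppArmor's actual table format: `struct state`, `NO_TRANS`, `next_state` and `match` are hypothetical names, and the three-state DFA is made up for illustration. A differentially encoded state stores only the transitions that differ from its base state, so a lookup that misses must fall back to the base and try again.

```c
#include <assert.h>
#include <stddef.h>

#define NO_TRANS (-1)  /* hypothetical marker: transition not stored here */
#define NSTATES  3
#define NSYMS    4

/* Toy layout for illustration only (an assumption, not the real format). */
struct state {
    int diff_encoded;  /* 1 if only differences from `base` are stored */
    int base;          /* state this one is encoded relative to */
    int trans[NSYMS];  /* stored transitions, NO_TRANS where omitted */
};

/*
 * State 0 stores all of its transitions.
 * State 1 differs from state 0 only on symbol 2.
 * State 2 differs from state 1 only on symbol 3.
 * 6 transitions are stored instead of 12.
 */
static const struct state dfa[NSTATES] = {
    { 0, 0, { 0, 1, 2, 0 } },
    { 1, 0, { NO_TRANS, NO_TRANS, 1, NO_TRANS } },
    { 1, 1, { NO_TRANS, NO_TRANS, NO_TRANS, 2 } },
};

/* Resolve one input symbol, counting every state that is visited. */
static int next_state(int state, int sym, int *traversals)
{
    assert(state >= 0 && state < NSTATES && sym >= 0 && sym < NSYMS);
    for (;;) {
        (*traversals)++;
        if (dfa[state].trans[sym] != NO_TRANS)
            return dfa[state].trans[sym];
        /* Only a differentially encoded state may omit a transition. */
        assert(dfa[state].diff_encoded);
        state = dfa[state].base;
    }
}

/* Run the toy DFA from state 0 over `len` symbols. */
static int match(const int *input, size_t len, int *traversals)
{
    int state = 0;
    size_t i;

    *traversals = 0;
    for (i = 0; i < len; i++)
        state = next_state(state, input[i], traversals);
    return state;
}
```

In this toy, matching the symbols 2 then 0 visits four states for two input characters, because the second symbol has to fall back from state 2 through state 1 to state 0.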

With careful encoding we can still make constant match time guarantees.
This patch guarantees that a state that is differentially encoded will do at
most 3m state traversals to match an input of length m (as opposed to a
non-differentially compressed dfa doing exactly m state traversals).
In practice the actual number of extra traversals is less than this because
we selectively choose which states are differentially encoded.

In addition to reducing the size of the dfa by reducing the number of
transitions that have to be stored, differential encoding reduces the
number of transitions that need to be considered by comb compression,
which can result in tighter packing, due to a reduction in sparseness, and
also reduces the time spent in comb compression, which currently uses an
O(n^2) algorithm.

Differential encoding will always result in a DFA that is smaller than or
equal in size to the non-differentially encoded DFA, and will usually
improve compilation times, with the performance improvements increasing as
the DFA gets larger.

E.g., given an example DFA that has 8991 states after minimization:
* If only comb compression (current default) is used

 52057 transitions are packed into a table of 69591 entries, achieving an
 efficiency of about 75% (an average of about 7.74 table entries per state),
 with a resulting compressed dfa16 size of 404238 bytes and a run time for
 the dfa compilation of
   real 0m9.037s
   user 0m8.893s
   sys  0m0.036s

* If differential encoding + comb compression is used, 8292 of the 8991
  states are differentially encoded, with 31557 transitions removed, resulting in

  20500 transitions packed into a table of 20675 entries, achieving an
  efficiency of about 99.2% (an average of about 2.3 table entries per state),
  with a resulting compressed dfa16 size of 207874 bytes (about 48.6%
  reduction) and a run time for the dfa compilation of
   real 0m5.416s (about 40% faster)
   user 0m5.280s
   sys  0m0.040s

Repeating with a larger DFA that has 17033 states after minimization.
* If only comb compression (current default) is used

 102992 transitions are packed into a table of 137987 entries, achieving
 an efficiency of about 75% (an average of about 8.10 entries per state),
 with a resultant compressed dfa16 size of 790410 bytes and a run time for dfa
 compilation of
  real  0m28.153s
  user  0m27.634s
  sys   0m0.120s

* with differential encoding
 39374 transitions are packed into a table of 39594 entries, achieving an
 efficiency of about 99.4% (an average of about 2.32 entries per state),
 with a resultant compressed dfa16 size of 396838 bytes (about 50% reduction)
 and a run time for dfa compilation of
  real  0m11.804s (about 58% faster)
  user  0m11.657s
  sys   0m0.084s
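
The percentages quoted in the two examples can be re-derived from the raw counts. The following is only a checking aid; the helper names (`packing_efficiency`, `entries_per_state`, `reduction`) are made up here and are not part of the patch.

```c
#include <assert.h>

/* Fraction of table entries that hold a real transition. */
static double packing_efficiency(double transitions, double table_entries)
{
    assert(table_entries > 0.0);
    return transitions / table_entries;
}

/* Average number of table entries consumed per DFA state. */
static double entries_per_state(double table_entries, double states)
{
    assert(states > 0.0);
    return table_entries / states;
}

/* Relative saving when going from `before` to `after` (bytes or seconds). */
static double reduction(double before, double after)
{
    assert(before > 0.0);
    return 1.0 - after / before;
}
```

For the 8991-state DFA, `packing_efficiency(52057, 69591)` is about 0.748 and `packing_efficiency(20500, 20675)` is about 0.992, while `reduction(404238, 207874)` is about 0.486 and `reduction(9.037, 5.416)` is about 0.40, matching the figures above.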

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2014-01-09 16:55:55 -08:00


/*
* (C) 2006, 2007 Andreas Gruenbacher <agruen@suse.de>
* Copyright (c) 2003-2008 Novell, Inc. (All rights reserved)
* Copyright 2009-2012 Canonical Ltd.
*
* The libapparmor library is licensed under the terms of the GNU
* Lesser General Public License, version 2.1. Please see the file
* COPYING.LGPL.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef APPARMOR_RE_H
#define APPARMOR_RE_H
typedef int dfaflags_t;
#define DFA_CONTROL_EQUIV (1 << 0)
#define DFA_CONTROL_TREE_NORMAL (1 << 1)
#define DFA_CONTROL_TREE_SIMPLE (1 << 2)
#define DFA_CONTROL_TREE_LEFT (1 << 3)
#define DFA_CONTROL_MINIMIZE (1 << 4)
#define DFA_CONTROL_MINIMIZE_HASH_TRANS (1 << 5)
#define DFA_CONTROL_FILTER_DENY (1 << 6)
#define DFA_CONTROL_REMOVE_UNREACHABLE (1 << 7)
#define DFA_CONTROL_TRANS_HIGH (1 << 8)
#define DFA_CONTROL_DIFF_ENCODE (1 << 9)
#define DFA_DUMP_DIFF_PROGRESS (1 << 10)
#define DFA_DUMP_DIFF_ENCODE (1 << 11)
#define DFA_DUMP_DIFF_STATS (1 << 12)
#define DFA_DUMP_MIN_PARTS (1 << 13)
#define DFA_DUMP_UNIQ_PERMS (1 << 14)
#define DFA_DUMP_MIN_UNIQ_PERMS (1 << 15)
#define DFA_DUMP_TREE_STATS (1 << 16)
#define DFA_DUMP_TREE (1 << 17)
#define DFA_DUMP_SIMPLE_TREE (1 << 18)
#define DFA_DUMP_PROGRESS (1 << 19)
#define DFA_DUMP_STATS (1 << 20)
#define DFA_DUMP_STATES (1 << 21)
#define DFA_DUMP_GRAPH (1 << 22)
#define DFA_DUMP_TRANS_PROGRESS (1 << 23)
#define DFA_DUMP_TRANS_STATS (1 << 24)
#define DFA_DUMP_TRANS_TABLE (1 << 25)
#define DFA_DUMP_EQUIV (1 << 26)
#define DFA_DUMP_EQUIV_STATS (1 << 27)
#define DFA_DUMP_MINIMIZE (1 << 28)
#define DFA_DUMP_UNREACHABLE (1 << 29)
#define DFA_DUMP_RULE_EXPR (1 << 30)
#define DFA_DUMP_NODE_TO_DFA (1 << 31)
#endif /* APPARMOR_RE_H */
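
Since every value above is a distinct bit, a `dfaflags_t` is presumably meant to be combined with bitwise OR and tested with bitwise AND. The sketch below shows that pattern in isolation. It copies three definitions from the header so that it compiles on its own; `set_flag`, `clear_flag` and `has_flag` are hypothetical helpers, not functions from the AppArmor sources.

```c
#include <assert.h>

/* Copied from the header above so the sketch is self-contained. */
typedef int dfaflags_t;
#define DFA_CONTROL_MINIMIZE    (1 << 4)
#define DFA_CONTROL_DIFF_ENCODE (1 << 9)
#define DFA_DUMP_DIFF_STATS     (1 << 12)

/* Hypothetical helper: return `flags` with `bit` switched on. */
static dfaflags_t set_flag(dfaflags_t flags, dfaflags_t bit)
{
    return flags | bit;
}

/* Hypothetical helper: return `flags` with `bit` switched off. */
static dfaflags_t clear_flag(dfaflags_t flags, dfaflags_t bit)
{
    return flags & ~bit;
}

/* Hypothetical helper: non-zero if the single flag `bit` is set. */
static int has_flag(dfaflags_t flags, dfaflags_t bit)
{
    /* Expect exactly one bit to be queried at a time. */
    assert(bit != 0 && (bit & (bit - 1)) == 0);
    return (flags & bit) != 0;
}
```

A caller could, for example, start from `0`, apply `set_flag` with `DFA_CONTROL_MINIMIZE` and `DFA_CONTROL_DIFF_ENCODE`, and later check `has_flag(flags, DFA_CONTROL_DIFF_ENCODE)` before running the differential encoding pass.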