doc/guides/prog_guide/lpm6_lib.rst

   1 ..  BSD LICENSE
   2     Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
   3     All rights reserved.
   4
   5     Redistribution and use in source and binary forms, with or without
   6     modification, are permitted provided that the following conditions
   7     are met:
   8
   9     * Redistributions of source code must retain the above copyright
  10     notice, this list of conditions and the following disclaimer.
  11     * Redistributions in binary form must reproduce the above copyright
  12     notice, this list of conditions and the following disclaimer in
  13     the documentation and/or other materials provided with the
  14     distribution.
  15     * Neither the name of Intel Corporation nor the names of its
  16     contributors may be used to endorse or promote products derived
  17     from this software without specific prior written permission.
  18
  19     THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
  20     "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
  21     LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
  22     A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
  23     OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
  24     SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
  25     LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  26     DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  27     THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  28     (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
  29     OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  30
  31 LPM6 Library
  32 ============
  33
  34 The LPM6 (LPM for IPv6) library component implements the Longest Prefix Match (LPM) table search method for 128-bit keys
  35 that is typically used to find the best match route in IPv6 forwarding applications.
  36
  37 LPM6 API Overview
  38 -----------------
  39
  40 The main configuration parameters for the LPM6 library are:
  41
  42 *   Maximum number of rules: This defines the size of the table that holds the rules,
  43     and therefore the maximum number of rules that can be added.
  44
  45 *   Number of tbl8s: A tbl8 is a node of the trie that the LPM6 algorithm is based on.
  46
  47 This parameter is related to the number of rules you can have,
  48 but there is no way to accurately predict the number needed to hold a specific number of rules,
  49 since it strongly depends on the depth and IP address of every rule.
  50 One tbl8 consumes 1 kb of memory. As a recommendation, 65536 tbl8s should be sufficient to store
  51 several thousand IPv6 rules, but the number can vary depending on the case.
  52
  53 An LPM prefix is represented by a pair of parameters (128-bit key, depth), with depth in the range of 1 to 128.
  54 An LPM rule is represented by an LPM prefix and some user data associated with the prefix.
  55 The prefix serves as the unique identifier for the LPM rule.
  56 In this implementation, the user data is 1-byte long and is called "next hop",
  57 which corresponds to its main use of storing the ID of the next hop in a routing table entry.
  58
  59 The main methods exported for the LPM component are:
  60
  61 *   Add LPM rule: The LPM rule is provided as input.
  62     If there is no rule with the same prefix present in the table, then the new rule is added to the LPM table.
  63     If a rule with the same prefix is already present in the table, the next hop of the rule is updated.
  64     An error is returned when there is no available space left.
  65
  66 *   Delete LPM rule: The prefix of the LPM rule is provided as input.
  67     If a rule with the specified prefix is present in the LPM table, then it is removed.
  68
  69 *   Lookup LPM key: The 128-bit key is provided as input.
  70     The algorithm selects the rule that represents the best match for the given key and returns the next hop of that rule.
  71     In the case that there are multiple rules present in the LPM table that have the same 128-bit value,
  72     the algorithm picks the rule with the highest depth as the best match rule,
  73     which means the rule has the highest number of most significant bits matching between the input key and the rule key.
  74
  75 Implementation Details
  76 ~~~~~~~~~~~~~~~~~~~~~~
  77
  78 This is a modification of the algorithm used for IPv4 (see :ref:`lpm4_details`).
  79 In this case, instead of using two levels, one with a tbl24 and a second with a tbl8, 14 levels are used.
  80
  81 The implementation can be seen as a multi-bit trie where the *stride*
  82 or number of bits inspected on each level varies from level to level.
  83 Specifically, 24 bits are inspected on the root node, and the remaining 104 bits are inspected in groups of 8 bits.
  84 This effectively means that the trie has 14 levels at the most, depending on the rules that are added to the table.
  85
  86 The algorithm allows the lookup operation to be performed with a number of memory accesses
  87 that directly depends on the length of the rule and
  88 whether there are other rules with bigger depths and the same key in the data structure.
  89 It can vary from 1 to 14 memory accesses, with 5 being the average value for the lengths
  90 that are most commonly used in IPv6.
  91
  92 The main data structure is built using the following elements:
  93
  94 *   A table with 224 entries
  95
  96 *   A number of tables, configurable by the user through the API, with 28 entries
  97
  98 The first table, called tbl24, is indexed using the first 24 bits of the IP address be looked up,
  99 while the rest of the tables, called tbl8s,
 100 are indexed using the rest of the bytes of the IP address, in chunks of 8 bits.
 101 This means that depending on the outcome of trying to match the IP address of an incoming packet to the rule stored in the tbl24
 102 or the subsequent tbl8s we might need to continue the lookup process in deeper levels of the tree.
 103
 104 Similar to the limitation presented in the algorithm for IPv4,
 105 to store every possible IPv6 rule, we would need a table with 2^128 entries.
 106 This is not feasible due to resource restrictions.
 107
 108 By splitting the process in different tables/levels and limiting the number of tbl8s,
 109 we can greatly reduce memory consumption while maintaining a very good lookup speed (one memory access per level).
 110
 111
 112 .. figure:: img/tbl24_tbl8_tbl8.*
 113
 114    Table split into different levels
 115
 116
 117 An entry in a table contains the following fields:
 118
 119 *   next hop / index to the tbl8
 120
 121 *   depth of the rule (length)
 122
 123 *   valid flag
 124
 125 *   valid group flag
 126
 127 *   external entry flag
 128
 129 The first field can either contain a number indicating the tbl8 in which the lookup process should continue
 130 or the next hop itself if the longest prefix match has already been found.
 131 The depth or length of the rule is the number of bits of the rule that is stored in a specific entry.
 132 The flags are used to determine whether the entry/table is valid or not
 133 and whether the search process have finished or not respectively.
 134
 135 Both types of tables share the same structure.
 136
 137 The other main data structure is a table containing the main information about the rules (IP, next hop and depth).
 138 This is a higher level table, used for different things:
 139
 140 *   Check whether a rule already exists or not, prior to addition or deletion,
 141     without having to actually perform a lookup.
 142
 143 When deleting, to check whether there is a rule containing the one that is to be deleted.
 144 This is important, since the main data structure will have to be updated accordingly.
 145
 146 Addition
 147 ~~~~~~~~
 148
 149 When adding a rule, there are different possibilities.
 150 If the rule's depth is exactly 24 bits, then:
 151
 152 *   Use the rule (IP address) as an index to the tbl24.
 153
 154 *   If the entry is invalid (i.e. it doesn't already contain a rule) then set its next hop to its value,
 155     the valid flag to 1 (meaning this entry is in use),
 156     and the external entry flag to 0 (meaning the lookup process ends at this point,
 157     since this is the longest prefix that matches).
 158
 159 If the rule's depth is bigger than 24 bits but a multiple of 8, then:
 160
 161 *   Use the first 24 bits of the rule as an index to the tbl24.
 162
 163 *   If the entry is invalid (i.e. it doesn't already contain a rule) then look for a free tbl8,
 164     set the index to the tbl8 to this value,
 165     the valid flag to 1 (meaning this entry is in use),
 166     and the external entry flag to 1
 167     (meaning the lookup process must continue since the rule hasn't been explored completely).
 168
 169 *   Use the following 8 bits of the rule as an index to the next tbl8.
 170
 171 *   Repeat the process until the tbl8 at the right level (depending on the depth) has been reached
 172     and fill it with the next hop, setting the next entry flag to 0.
 173
 174 If the rule's depth is any other value, prefix expansion must be performed.
 175 This means the rule is copied to all the entries (as long as they are not in use) which would also cause a match.
 176
 177 As a simple example, let's assume the depth is 20 bits.
 178 This means that there are 2^(24-20) = 16 different combinations of the first 24 bits of an IP address that would cause a match.
 179 Hence, in this case, we copy the exact same entry to every position indexed by one of these combinations.
 180
 181 By doing this we ensure that during the lookup process, if a rule matching the IP address exists,
 182 it is found in, at the most, 14 memory accesses,
 183 depending on how many times we need to move to the next table.
 184 Prefix expansion is one of the keys of this algorithm, since it improves the speed dramatically by adding redundancy.
 185
 186 Prefix expansion can be performed at any level.
 187 So, for example, is the depth is 34 bits, it will be performed in the third level (second tbl8-based level).
 188
 189 Lookup
 190 ~~~~~~
 191
 192 The lookup process is much simpler and quicker. In this case:
 193
 194 *   Use the first 24 bits of the IP address as an index to the tbl24.
 195     If the entry is not in use, then it means we don't have a rule matching this IP.
 196     If it is valid and the external entry flag is set to 0, then the next hop is returned.
 197
 198 *   If it is valid and the external entry flag is set to 1, then we use the tbl8 index to find out the tbl8 to be checked,
 199     and the next 8 bits of the IP address as an index to this table.
 200     Similarly, if the entry is not in use, then we don't have a rule matching this IP address.
 201     If it is valid then check the external entry flag for a new tbl8 to be inspected.
 202
 203 *   Repeat the process until either we find an invalid entry (lookup miss) or a valid entry with the external entry flag set to 0.
 204     Return the next hop in the latter case.
 205
 206 Limitations in the Number of Rules
 207 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 208
 209 There are different things that limit the number of rules that can be added.
 210 The first one is the maximum number of rules, which is a parameter passed through the API.
 211 Once this number is reached, it is not possible to add any more rules to the routing table unless one or more are removed.
 212
 213 The second limitation is in the number of tbl8s available.
 214 If we exhaust tbl8s, we won't be able to add any more rules.
 215 How to know how many of them are necessary for a specific routing table is hard to determine in advance.
 216
 217 In this algorithm, the maximum number of tbl8s a single rule can consume is 13,
 218 which is the number of levels minus one, since the first three bytes are resolved in the tbl24. However:
 219
 220 *   Typically, on IPv6, routes are not longer than 48 bits, which means rules usually take up to 3 tbl8s.
 221
 222 As explained in the LPM for IPv4 algorithm, it is possible and very likely that several rules will share one or more tbl8s,
 223 depending on what their first bytes are.
 224 If they share the same first 24 bits, for instance, the tbl8 at the second level will be shared.
 225 This might happen again in deeper levels, so, effectively,
 226 two 48 bit-long rules may use the same three tbl8s if the only difference is in their last byte.
 227
 228 The number of tbl8s is a parameter exposed to the user through the API in this version of the algorithm,
 229 due to its impact in memory consumption and the number or rules that can be added to the LPM table.
 230 One tbl8 consumes 1 kilobyte of memory.
 231
 232 Use Case: IPv6 Forwarding
 233 -------------------------
 234
 235 The LPM algorithm is used to implement the Classless Inter-Domain Routing (CIDR) strategy used by routers implementing IP forwarding.