docs/overview/whatisvpp/what-is-vector-packet-processing.rst

   1 :orphan:
   2
   3 .. _what-is-vector-packet-processing:
   4
   5 =================================
   6 What is vector packet processing?
   7 =================================
   8
   9 FD.io VPP is developed using vector packet processing concepts, as opposed to
  10 scalar packet processing, these concepts are explained in the following sections.
  11
  12 Vector packet processing is a common approach among high performance `Userspace
  13 <https://en.wikipedia.org/wiki/User_space>`_ packet processing applications such
  14 as developed with FD.io VPP and `DPDK
  15 <https://en.wikipedia.org/wiki/Data_Plane_Development_Kit>`_. The scalar based
  16 approach tends to be favoured by Operating System `Kernel
  17 <https://en.wikipedia.org/wiki/Kernel_(operating_system)>`_ Network Stacks and
  18 Userspace stacks that don't have strict performance requirements.
  19
  20 **Scalar Packet Processing**
  21
  22 A scalar packet processing network stack typically processes one packet at a
  23 time: an interrupt handling function takes a single packet from a Network
  24 Interface, and processes it through a set of functions: fooA calls fooB calls
  25 fooC and so on.
  26
  27 .. code-block:: none
  28
  29    +---> fooA(packet1) +---> fooB(packet1) +---> fooC(packet1)
  30    +---> fooA(packet2) +---> fooB(packet2) +---> fooC(packet2)
  31    ...
  32    +---> fooA(packet3) +---> fooB(packet3) +---> fooC(packet3)
  33
  34
  35 Scalar packet processing is simple, but inefficient in these ways:
  36
  37 * When the code path length exceeds the size of the Microprocessor's instruction
  38   cache (I-cache), `thrashing
  39   <https://en.wikipedia.org/wiki/Thrashing_(computer_science)>`_ occurs as the
  40   Microprocessor is continually loading new instructions. In this model, each
  41   packet incurs an identical set of I-cache misses.
  42 * The associated deep call stack will also add load-store-unit pressure as
  43   stack-locals fall out of the Microprocessor's Layer 1 Data Cache (D-cache).
  44
  45 **Vector Packet Processing**
  46
  47 In contrast, a vector packet processing network stack processes multiple packets
  48 at a time, called 'vectors of packets' or simply a 'vector'. An interrupt
  49 handling function takes the vector of packets from a Network Interface, and
  50 processes the vector through a set of functions: fooA calls fooB calls fooC and
  51 so on.
  52
  53 .. code-block:: none
  54
  55    +---> fooA([packet1, +---> fooB([packet1, +---> fooC([packet1, +--->
  56                packet2,             packet2,             packet2,
  57                ...                  ...                  ...
  58                packet256])          packet256])          packet256])
  59
  60 This approach fixes:
  61
  62 * The I-cache thrashing problem described above, by amortizing the cost of
  63   I-cache loads across multiple packets.
  64
  65 * The inefficiencies associated with the deep call stack by receiving vectors
  66   of up to 256 packets at a time from the Network Interface, and processes them
  67   using a directed graph of node. The graph scheduler invokes one node dispatch
  68   function at a time, restricting stack depth to a few stack frames.
  69
  70 The further optimizations that this approaches enables are pipelining and
  71 prefetching to minimize read latency on table data and parallelize packet loads
  72 needed to process packets.
  73