+
+Stateful traffic profiles
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+There are several important details which distinguish ASTF profiles
+from stateless profiles.
+
+General considerations
+~~~~~~~~~~~~~~~~~~~~~~
+
+Protocols
+_________
+
+ASTF profiles are limited to either UDP or TCP protocol.
+
+Programs
+________
+
+Each template in the profile defines two "programs", one for the client side
+and one for the server side. Each program specifies when that side has to wait
+until enough data is received (counted in packets for UDP and in bytes for TCP)
+and when to send additional data. Together, the two programs
+define a single transaction. Due to packet loss, a transaction may take longer,
+use more packets (retransmissions) or never finish in its entirety.
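+
+ASTF programs are defined in Python. As a minimal sketch, assuming the
+upstream TRex ASTF Python API (trex.astf.api) and an illustrative TCP
+request-response exchange (payload sizes are arbitrary), a client/server
+program pair could look like this:
+
+.. code-block:: python
+
+    # Minimal sketch of a TCP request-response transaction,
+    # assuming the upstream TRex ASTF Python API.
+    from trex.astf.api import ASTFProgram
+
+    # Client side program: send a request, wait for the full response.
+    prog_c = ASTFProgram()
+    prog_c.send("request")      # 7 bytes of L7 data
+    prog_c.recv(8)              # wait until 8 bytes are received
+
+    # Server side program: wait for the full request, send the response.
+    prog_s = ASTFProgram()
+    prog_s.recv(7)              # wait until 7 bytes are received
+    prog_s.send("response")     # 8 bytes of L7 data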
+
+Instances
+_________
+
+A client instance is created according to the TPS parameter for the trial,
+and sends the first packet of the transaction (in some cases more packets).
+A server instance is created when the first packet arrives on the server side;
+each instance has a different address or port.
+When a program reaches its end, the instance is deleted.
+
+This creates possible issues with server instances. If the server instance
+does not read all the data the client has sent, late data packets
+can cause a second copy of the server instance to be created,
+which breaks assumptions on how many packets a transaction should have.
+
+The need for server instances to read all the data reduces the overall
+bandwidth TRex is able to create in ASTF mode.
+
+Note that client instances are not created in response to incoming packets,
+so it is safe to end the client program without reading all server data
+(unless the definition of transaction success requires that).
+
+Sequencing
+__________
+
+ASTF profiles offer two modes for choosing source and destination IP addresses
+for client programs: sequential and pseudorandom.
+In current tests we are using sequential addressing only (if the destination
+address varies at all).
+
+For choosing client source UDP/TCP port, there is only one mode.
+We have not investigated whether it results in sequential or pseudorandom order.
+
+For client destination UDP/TCP port, we use a constant value,
+as typical TRex usage pattern binds the server instances (of the same program)
+to a single port. (If profile defines multiple server programs, different
+programs use different ports.)
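+
+As a hedged sketch of how these choices map to profile code (assuming the
+TRex ASTF Python API; the address ranges, the port number and the placeholder
+programs are illustrative only, not the values used in current tests):
+
+.. code-block:: python
+
+    from trex.astf.api import (
+        ASTFAssociationRule, ASTFIPGen, ASTFIPGenDist, ASTFIPGenGlobal,
+        ASTFProfile, ASTFProgram, ASTFTCPClientTemplate,
+        ASTFTCPServerTemplate, ASTFTemplate,
+    )
+
+    # Minimal placeholder programs (see the Programs subsection).
+    prog_c = ASTFProgram()
+    prog_c.send("request")
+    prog_c.recv(8)
+    prog_s = ASTFProgram()
+    prog_s.recv(7)
+    prog_s.send("response")
+
+    # Sequential client and server address ranges.
+    ip_gen_c = ASTFIPGenDist(
+        ip_range=["192.168.0.0", "192.168.3.255"], distribution="seq")
+    ip_gen_s = ASTFIPGenDist(
+        ip_range=["20.0.0.0", "20.0.3.255"], distribution="seq")
+    ip_gen = ASTFIPGen(
+        glob=ASTFIPGenGlobal(ip_offset="0.0.0.1"),
+        dist_client=ip_gen_c,
+        dist_server=ip_gen_s,
+    )
+
+    # Constant destination port: the client targets a single port
+    # and the server program is bound to that same port.
+    template = ASTFTemplate(
+        client_template=ASTFTCPClientTemplate(
+            program=prog_c, ip_gen=ip_gen, port=8080),
+        server_template=ASTFTCPServerTemplate(
+            program=prog_s, assoc=ASTFAssociationRule(port=8080)),
+    )
+    profile = ASTFProfile(default_ip_gen=ip_gen, templates=template)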
+
+Transaction overlap
+___________________
+
+If a transaction takes longer to finish than the period implied by TPS,
+TRex will have multiple client or server instances active at a time.
+
+During calibration testing we have found this increases CPU utilization,
+and for high TPS it can lead to TRex's Rx or Tx buffers becoming full.
+This generally leads to duration stretching, and/or packet loss on TRex.
+
+Currently used transactions were chosen to be short, so the risk of bad behavior
+is decreased. But in MRR tests, where the load is computed based on NIC ability,
+not TRex ability, anomalous behavior is still possible.
+
+Delays
+______
+
+TRex supports adding constant delays to ASTF programs.
+This can be useful, for example if we want to separate connection establishment
+from data transfer.
+
+But as TRex tracks delayed instances as active, this still results
+in higher CPU utilization and reduced performance
+(as with other overlapping transactions). So the current tests do not use any delays.
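+
+For completeness, a delay would be a single extra command in a program
+(a sketch, assuming the delay() command of the TRex ASTF API, which takes
+microseconds; again, current tests do not use it):
+
+.. code-block:: python
+
+    from trex.astf.api import ASTFProgram
+
+    # Sketch only: wait 500 ms between sending the request
+    # and waiting for the response.
+    prog_c = ASTFProgram()
+    prog_c.send("request")
+    prog_c.delay(500000)    # constant delay, in microseconds
+    prog_c.recv(8)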
+
+Keepalives
+__________
+
+Both the UDP and TCP protocol implementations in TRex programs support
+a keepalive duration. That means there is a configurable keepalive period,
+and TRex sends keepalive packets automatically (outside the program)
+while the program is active (started, not ended yet)
+but not sending any packets.
+
+For TCP this is generally not a big deal, as the other side usually
+retransmits faster. But for UDP it means a packet loss may leave
+the receiving program running.
+
+In order to avoid keepalive packets, the keepalive value is set to a high number.
+Here, "high number" means that even at maximum scale and minimum TPS,
+there are still no keepalive packets sent within the corresponding
+(computed) trial duration. This number is kept the same also for
+smaller scale traffic profiles, to simplify maintenance.
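+
+In profile code, this corresponds to a per-program setting. A sketch, assuming
+the set_keepalive_msg() command of the TRex ASTF API (the value below is
+illustrative, not the one used in current profiles):
+
+.. code-block:: python
+
+    from trex.astf.api import ASTFProgram
+
+    # Sketch: set the keepalive timeout (in milliseconds) high enough
+    # that no keepalive packets are sent within the computed trial duration.
+    prog_c = ASTFProgram(stream=False)    # UDP program
+    prog_c.set_keepalive_msg(2000000)     # roughly half an hour
+    prog_c.send_msg("request")
+    prog_c.recv_msg(1)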
+
+Transaction success
+___________________
+
+The transaction is considered successful at Layer-7 (L7) level
+when both program instances close. At this point, various L7 counters
+(unofficial name) are updated on TRex.
+
+We found that a proper close and L7 counter update can be CPU intensive,
+whereas the lower-level counters (ipackets, opackets), called L2 counters,
+can keep up with higher loads.
+
+For some tests, we do not need to confirm the whole transaction was successful.
+CPS (connections per second) tests are a typical example.
+We care only about NAT44ed creating a session (which needs one packet in the
+inside-to-outside direction per session) and being able to use it
+(which needs one packet in the outside-to-inside direction).
+
+Similarly in PPS (packets per second, combining session creation
+with data transfer) tests, we care about the NAT44ed ability to forward packets;
+we do not care whether applications (TRex) can fully process them at that rate.
+
+Therefore each type of test has its own formula (usually just one counter
+already provided by TRex) to count "successful enough" transactions
+and attempted transactions. Currently, all tests relying on L7 counters
+use size-limited profiles, so they know what the count of attempted
+transactions should be, but due to duration stretching
+TRex might have been unable to send that many packets.
+For search purposes, unattempted transactions are treated the same
+as attempted but failed transactions.
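+
+As an illustration only, the per-test counting could be sketched as the
+following hypothetical helper (the counter names match the TRex counters
+mentioned in this document, but the flat "stats" layout is a simplification,
+not the real TRex stats structure):
+
+.. code-block:: python
+
+    # Hypothetical sketch of per-test transaction counting.
+    def count_transactions(test_type, stats):
+        """Return (attempted, successful) transaction counts."""
+        if test_type == "udp_cps":
+            attempted = stats["client_opackets"]
+            successful = stats["client_ipackets"]
+        elif test_type == "tcp_cps":
+            attempted = stats["tcps_connattempt"]
+            successful = stats["tcps_connects"]
+        else:  # PPS tests: each packet counts as one "transaction".
+            attempted = stats["client_opackets"] + stats["server_opackets"]
+            successful = stats["client_ipackets"] + stats["server_ipackets"]
+        return attempted, successful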
+
+Sometimes even the number of transactions as tracked by the search algorithm
+does not match the transactions as defined by the ASTF programs.
+See PPS profiles below.
+
+UDP CPS
+~~~~~~~
+
+This profile uses a minimalistic transaction to verify that a NAT44ed session
+has been created and that it allows outside-to-inside traffic.
+
+Client instance sends one packet and ends.
+Server instance sends one packet upon creation and ends.
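+
+A sketch of the two programs, assuming the TRex ASTF API (the payload content
+and size are illustrative):
+
+.. code-block:: python
+
+    from trex.astf.api import ASTFProgram
+
+    # Client: send a single UDP packet and end.
+    prog_c = ASTFProgram(stream=False)
+    prog_c.send_msg("request")
+
+    # Server: send a single UDP packet upon creation and end.
+    prog_s = ASTFProgram(stream=False)
+    prog_s.send_msg("response")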
+
+In principle, the packet size is configurable,
+but currently used tests apply only one value (64 byte frames).
+
+Transaction counts as attempted when opackets counter increases on client side.
+Transaction counts as successful when ipackets counter increases on client side.
+
+TCP CPS
+~~~~~~~
+
+This profile uses a minimalistic transaction to verify that a NAT44ed session
+has been created and that it allows outside-to-inside traffic.
+
+Client initiates TCP connection. Client waits until connection is confirmed
+(by reading zero data bytes). Client ends.
+Server accepts the connection. Server waits for indirect confirmation
+from client (by waiting for client to initiate close). Server ends.
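+
+A sketch of the two programs, assuming the TRex ASTF API; here recv(0)
+stands for reading zero data bytes and wait_for_peer_close() for waiting
+until the client initiates the close:
+
+.. code-block:: python
+
+    from trex.astf.api import ASTFProgram
+
+    # Client: establish the connection, confirm it, then close.
+    prog_c = ASTFProgram()
+    prog_c.recv(0)    # connection confirmed, no L7 data expected
+
+    # Server: accept the connection, wait for the client to close.
+    prog_s = ASTFProgram()
+    prog_s.wait_for_peer_close()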
+
+Without packet loss, the whole transaction takes 7 packets to finish
+(4 and 3 per direction, respectively).
+From NAT44ed point of view, only the first two are needed to verify
+the session got created.
+
+Packet size is not configurable, but currently used tests report
+frame size as 64 bytes.
+
+Transaction counts as attempted when tcps_connattempt counter increases
+on client side.
+Transaction counts as successful when tcps_connects counter increases
+on client side.
+
+UDP PPS
+~~~~~~~
+
+This profile uses a small transaction of "request-response" type,
+with several packets simulating data payload.
+
+Client sends 33 packets and closes immediately.
+Server reads all 33 packets (needed to avoid late packets creating new
+server instances), then sends 33 packets and closes.
+The value 33 was chosen ad-hoc (1 "protocol" packet and 32 "data" packets).
+It is possible that other values would still be safe from the point of view
+of avoiding overlapping transactions.
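+
+A sketch of the two programs, assuming the TRex ASTF API (a plain Python
+loop emitting 33 send commands is used here for brevity; the payload size
+is illustrative):
+
+.. code-block:: python
+
+    from trex.astf.api import ASTFProgram
+
+    PACKETS = 33          # 1 "protocol" packet plus 32 "data" packets
+    PAYLOAD = "x" * 18    # illustrative payload for a 64 byte frame
+
+    # Client: send 33 packets and close immediately.
+    prog_c = ASTFProgram(stream=False)
+    for _ in range(PACKETS):
+        prog_c.send_msg(PAYLOAD)
+
+    # Server: read all 33 packets (avoiding late packets creating
+    # a second server instance), then send 33 packets and close.
+    prog_s = ASTFProgram(stream=False)
+    prog_s.recv_msg(PACKETS)
+    for _ in range(PACKETS):
+        prog_s.send_msg(PAYLOAD)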
+
+..
+ TODO: 32 was chosen as it is a batch size DPDK driver puts on the PCIe bus
+ at a time. May want to verify this with TRex ASTF devs and see if better
+ UDP transaction sizes can be found to yield higher performance out of TRex.
+
+In principle, the packet size is configurable,
+but currently used tests apply only one value (64 byte frames)
+for both "protocol" and "data" packets.
+
+As this is a PPS test, we do not track the big 66-packet transaction.
+Similarly to stateless tests, we treat each packet as a "transaction"
+for search algorithm purposes. Therefore a "transaction" is attempted
+when the opackets counter on the client or server side is increased.
+A transaction is successful if the ipackets counter on the client or server side
+is increased.
+
+If one of the 33 client packets is lost, the server instance will get stuck
+in the reading phase. This probably decreases TRex performance,
+but it leads to more stable results.
+
+TCP PPS
+~~~~~~~
+
+This profile uses a small transaction of "request-response" type,
+with some data size to be transferred both ways.
+
+The client connects, sends 11111 bytes of data, receives 11111 bytes of data,
+and closes.
+The server accepts the connection, reads 11111 bytes of data, sends 11111 bytes
+of data, and closes.
+The server read is needed to avoid a premature close and a second server instance.
+The client read is not strictly needed, but the acks help TRex to close the server
+quickly, thus saving CPU and improving performance.
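+
+A sketch of the two programs, assuming the TRex ASTF API (the payload
+content is illustrative, only its length matters):
+
+.. code-block:: python
+
+    from trex.astf.api import ASTFProgram
+
+    DATA_LEN = 11111         # bytes transferred in each direction
+    DATA = "d" * DATA_LEN
+
+    # Client: connect (implicit), send data, read the response, close.
+    prog_c = ASTFProgram()
+    prog_c.send(DATA)
+    prog_c.recv(DATA_LEN)
+
+    # Server: read the request (avoids a premature close), respond, close.
+    prog_s = ASTFProgram()
+    prog_s.recv(DATA_LEN)
+    prog_s.send(DATA)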
+
+The value of 11111 bytes was chosen ad-hoc. It leads to 22 packets
+(11 in each direction) being exchanged if no loss occurs.
+In principle, the size of the data packets is configurable by setting
+the maximum segment size. Currently that is not applied, so the TRex default
+value (1460 bytes) is used, while the test name still (wrongly) mentions
+a 64 byte frame size.
+
+Exactly as in UDP_PPS, ipackets and opackets counters are used for counting
+"transactions" (in fact packets).
+
+If packet loss occurs, there is a large transaction overlap, even if most
+ASTF programs finish eventually. This leads to big duration stretching
+and a somewhat uneven rate of packets sent. This makes it hard to interpret
+MRR results, but NDR and PDR results tend to be stable enough.
+
+Ip4base tests
+^^^^^^^^^^^^^
+
+Contrary to stateless traffic profiles, we do not have a simple limit
+that would guarantee TRex is able to send traffic at the specified load.
+For that reason, we have added tests where "nat44ed" is replaced by "ip4base".
+Instead of NAT44ed processing, the tests set minimalistic IPv4 routes,
+so that packets are forwarded in both inside-to-outside and outside-to-inside
+directions.
+
+The packets arrive at the server end of TRex with a different source
+address&port than in the NAT44ed tests (with ip4base, no translation to outside
+values is done), but those are not specified in the stateful traffic profiles.
+The server end uses the received address&port as the destination
+for outside-to-inside traffic. Therefore the same stateful traffic profile
+works for both NAT44ed and ip4base tests (of the same scale).
+
+The NAT44ed results are displayed together with corresponding ip4base results.
+If they are similar, TRex is probably the bottleneck.
+If NAT44ed result is visibly smaller, it describes the real VPP performance.