When computing tcp/udp checksums across large amounts of data -
e.g. when NIC h/w checksum offload is not available - it's worth
providing arch-dependent code; if only to compile the code w/ -O3.
Fix calculation when data is fully unaligned / on an odd byte
boundary.
Add a buffer alignment test vector.
Change-Id: I7644e2276ac6cbc3f575bf61746a6ffedbbb6150
Signed-off-by: Dave Barach <dave@barachs.net>