Optimize xxx_zero_byte_mask NEON function 56/15756/2
author Lijian Zhang <Lijian.Zhang@arm.com>
Wed, 31 Oct 2018 05:35:20 +0000 (13:35 +0800)
committer Damjan Marion <dmarion@me.com>
Wed, 7 Nov 2018 12:03:34 +0000 (12:03 +0000)
commit f5942d5612d99c5ea1189cb9f8de6b6097b0456e
tree d02d1de927e7c16f985ce8f784bb968973c67e30
parent c3baf62702b7b9d339f10da48a55039e7ddc6bc9

Optimize the zero byte mask NEON functions below to use fewer intrinsics,
and make their outputs consistent with the corresponding functions in
vector_sse42.h

always_inline u32 u64x2_zero_byte_mask (u64x2 input)
always_inline u32 u32x4_zero_byte_mask (u32x4 input)
always_inline u32 u16x8_zero_byte_mask (u16x8 input)
always_inline u32 u8x16_zero_byte_mask (u8x16 input)
always_inline u32 i64x2_zero_byte_mask (i64x2 input)
always_inline u32 i32x4_zero_byte_mask (i32x4 input)
always_inline u32 i16x8_zero_byte_mask (i16x8 input)
always_inline u32 i8x16_zero_byte_mask (i8x16 input)
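
For reference, the output this kind of function is expected to produce
can be modeled in plain scalar C. The sketch below is not code from this
change: ref_u8x16_zero_byte_mask is a hypothetical helper, and it assumes
the u8x16 semantics are "bit i of the result is set when byte i of the
16-byte input is zero". It does not model the wider-lane variants.

```c
#include <stdint.h>

/* Hypothetical scalar model, not VPP code.
   Assumption: bit i of the returned mask is set when byte i of the
   16-byte input is zero, so only the low 16 bits are ever used. */
static uint32_t
ref_u8x16_zero_byte_mask (const uint8_t in[16])
{
  uint32_t mask = 0;

  for (int i = 0; i < 16; i++)
    {
      if (in[i] == 0)
	mask |= (uint32_t) 1 << i;
    }
  return mask;
}
```

A model like this can serve as a test oracle: feed the same 16 bytes to
the NEON and SSE implementations and check that both agree with it.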

Change-Id: I7f485915baeb37fa2dd484699b8769e0136f6574
Signed-off-by: Lijian Zhang <Lijian.Zhang@arm.com>
Reviewed-by: Sirshak Das <Sirshak.Das@arm.com>
src/vppinfra/vector_neon.h