Description
On line 186, there is a call to _mm_loadu_si128:
Line 186 in 3dc5939
| let chunk = _mm_loadu_si128(ptr as *const __m128i); |
That intrinsic places no alignment requirement on the pointer. It could be replaced with _mm_load_si128, which requires the pointer to be aligned to a 16-byte boundary.
On line 139, the ptr is aligned to the VECTOR_ALIGN mask, which aligns it to size_of::<__m128i>(), or to 16 bytes.
Lines 120 to 121 in 3dc5939
| const VECTOR_SIZE: usize = core::mem::size_of::<__m128i>(); |
| const VECTOR_ALIGN: usize = VECTOR_SIZE - 1; |
https://github.com/BurntSushi/bstr/blob/3dc5939f30daa1a8a6e5cc346bb77841f19ea415/src/ascii.rs#L139C9-L139C12
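To make the mask arithmetic concrete, here is a minimal sketch that works on plain addresses instead of pointers. The helper names `bytes_to_next_boundary` and `is_vector_aligned` are hypothetical, the rounding formula is an assumption about how a mask like VECTOR_ALIGN is typically used (it is not copied from line 139), and VECTOR_SIZE is hard-coded to 16 so the sketch does not depend on x86 intrinsics.

```rust
// Assumption: 16 is the value of size_of::<__m128i>() on x86_64.
const VECTOR_SIZE: usize = 16;
// Low-bit mask: for any address, `addr & VECTOR_ALIGN` is its offset
// within the current 16-byte block.
const VECTOR_ALIGN: usize = VECTOR_SIZE - 1;

/// Hypothetical helper: how many bytes to skip so that `addr + skip`
/// lands on the next 16-byte boundary. The result is in 1..=16, so an
/// already aligned address moves forward by one full vector.
fn bytes_to_next_boundary(addr: usize) -> usize {
    VECTOR_SIZE - (addr & VECTOR_ALIGN)
}

/// Hypothetical helper: true when `addr` is a multiple of 16.
fn is_vector_aligned(addr: usize) -> bool {
    addr & VECTOR_ALIGN == 0
}
```

For example, the address 0x1003 has offset 3 within its block, so 13 bytes are skipped and the result, 0x1010, has all four low bits clear.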
The ptr is always advanced by VECTOR_LOOP_SIZE in a loop, which is a multiple of 16 bytes:
Lines 120 to 122 in 3dc5939
| const VECTOR_SIZE: usize = core::mem::size_of::<__m128i>(); |
| const VECTOR_ALIGN: usize = VECTOR_SIZE - 1; |
| const VECTOR_LOOP_SIZE: usize = 4 * VECTOR_SIZE; |
https://github.com/BurntSushi/bstr/blob/3dc5939f30daa1a8a6e5cc346bb77841f19ea415/src/ascii.rs#L180C36-L180C52
It is then advanced further by VECTOR_SIZE, which is 16 bytes:
Line 120 in 3dc5939
| const VECTOR_SIZE: usize = core::mem::size_of::<__m128i>(); |
Line 191 in 3dc5939
| ptr = ptr.add(VECTOR_SIZE); |
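The invariant can be checked with a small model of the two loops that records every address at which a 16-byte load would be issued. This is only a sketch: `load_addresses` is a hypothetical function, the loop conditions are simplified assumptions rather than the ones in ascii.rs, and addresses are plain integers so nothing is dereferenced.

```rust
// Assumption: 16 is the value of size_of::<__m128i>() on x86_64.
const VECTOR_SIZE: usize = 16;
const VECTOR_LOOP_SIZE: usize = 4 * VECTOR_SIZE;

/// Hypothetical model of the pointer walk: starting from an address that
/// is already 16-byte aligned, return each address a vector load would use.
fn load_addresses(aligned_start: usize, end: usize) -> Vec<usize> {
    let mut addrs = Vec::new();
    let mut ptr = aligned_start;

    // Unrolled loop: four vector loads, then advance by 64 bytes.
    while ptr + VECTOR_LOOP_SIZE <= end {
        for i in 0..4 {
            addrs.push(ptr + i * VECTOR_SIZE);
        }
        ptr += VECTOR_LOOP_SIZE;
    }

    // Follow-up loop: one vector load, then advance by 16 bytes.
    while ptr + VECTOR_SIZE <= end {
        addrs.push(ptr);
        ptr += VECTOR_SIZE;
    }

    addrs
}
```

Because every step adds a multiple of 16 to an address that started as a multiple of 16, every recorded address is itself a multiple of 16, which is the property an aligned load needs.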
So in the loop at L186, the pointer is always aligned to 16 bytes:
Line 186 in 3dc5939
| let chunk = _mm_loadu_si128(ptr as *const __m128i); |
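As an illustration of what the aligned intrinsic needs, the sketch below loads from a buffer whose alignment is guaranteed by its type. It is not a patch for ascii.rs: `Aligned16` and `high_bit_mask` are hypothetical names, the `repr(align(16))` wrapper stands in for the pointer arithmetic described above, and the non-x86_64 branch is a scalar fallback added only so the example compiles on other targets.

```rust
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::{__m128i, _mm_load_si128, _mm_movemask_epi8};

/// Hypothetical wrapper: forces the 16 bytes onto a 16-byte boundary.
#[repr(align(16))]
struct Aligned16([u8; 16]);

/// Returns a bitmask with bit i set when byte i has its high bit set,
/// i.e. when byte i is not ASCII.
#[cfg(target_arch = "x86_64")]
fn high_bit_mask(block: &Aligned16) -> i32 {
    let ptr = block.0.as_ptr();
    debug_assert_eq!(ptr as usize % 16, 0);
    // SAFETY: SSE2 is part of the x86_64 baseline, `ptr` points to 16
    // readable bytes, and `repr(align(16))` guarantees the 16-byte
    // alignment that `_mm_load_si128` requires.
    unsafe {
        let chunk = _mm_load_si128(ptr as *const __m128i);
        _mm_movemask_epi8(chunk)
    }
}

/// Scalar fallback with the same result, for targets without SSE2.
#[cfg(not(target_arch = "x86_64"))]
fn high_bit_mask(block: &Aligned16) -> i32 {
    let mut mask = 0i32;
    for (i, &b) in block.0.iter().enumerate() {
        mask |= ((b >> 7) as i32) << i;
    }
    mask
}
```

Passing a misaligned pointer to _mm_load_si128 is undefined behavior, so a debug assertion like the one above is a cheap way to document the invariant at the call site.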
The aligned version of the function was pointed out by @anforowicz during an unsafe code audit: https://chromium-review.googlesource.com/c/chromium/src/+/5925797/comment/f08dc00c_1b24061c/