ascii.rs optimization since ptr is aligned

@anforowicz

On line 186, there is a call to _mm_loadu_si128:

Line 186 in 3dc5939

let chunk = _mm_loadu_si128(ptr as *const __m128i);

That call does not require the pointer to be aligned at all. It could be replaced with _mm_load_si128, which requires the pointer to be aligned to a 16-byte boundary.

On line 139, ptr is aligned using the VECTOR_ALIGN mask, which rounds it to a multiple of size_of::<__m128i>(), i.e. 16 bytes.

bstr/src/ascii.rs

Lines 120 to 121 in 3dc5939

const VECTOR_SIZE: usize = core::mem::size_of::<__m128i>();
const VECTOR_ALIGN: usize = VECTOR_SIZE - 1;

https://github.com/BurntSushi/bstr/blob/3dc5939f30daa1a8a6e5cc346bb77841f19ea415/src/ascii.rs#L139C9-L139C12
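To make the mask arithmetic concrete, here is a minimal sketch of how a VECTOR_ALIGN mask can be used to find the distance to the next 16-byte boundary. Line 139 itself is not reproduced in this issue, so the exact expression is an assumption, and bytes_to_next_boundary is a hypothetical helper name. VECTOR_SIZE is hard-coded to 16 here so the snippet compiles on any target.

```rust
// Assumption: 16 is size_of::<__m128i>() on x86_64; hard-coded for portability.
const VECTOR_SIZE: usize = 16;
const VECTOR_ALIGN: usize = VECTOR_SIZE - 1;

/// Hypothetical helper: number of bytes to add to `addr` so that the result
/// is a multiple of VECTOR_SIZE. An already aligned address moves forward by
/// a full vector, which is fine when the first 16 bytes were already handled
/// by a separate unaligned load.
pub fn bytes_to_next_boundary(addr: usize) -> usize {
    // `addr & VECTOR_ALIGN` is the offset of `addr` within its 16-byte block.
    VECTOR_SIZE - (addr & VECTOR_ALIGN)
}
```

For example, an address ending in 0x3 is 3 bytes into its block, so it needs 13 more bytes to reach the next boundary.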

In the main loop, ptr is always advanced by VECTOR_LOOP_SIZE, which is a multiple of 16 bytes:

bstr/src/ascii.rs

Lines 120 to 122 in 3dc5939

const VECTOR_SIZE: usize = core::mem::size_of::<__m128i>();
const VECTOR_ALIGN: usize = VECTOR_SIZE - 1;
const VECTOR_LOOP_SIZE: usize = 4 * VECTOR_SIZE;

https://github.com/BurntSushi/bstr/blob/3dc5939f30daa1a8a6e5cc346bb77841f19ea415/src/ascii.rs#L180C36-L180C52

It is then further advanced by VECTOR_SIZE, which is 16 bytes:

bstr/src/ascii.rs

Line 120 in 3dc5939

const VECTOR_SIZE: usize = core::mem::size_of::<__m128i>();

bstr/src/ascii.rs

Line 191 in 3dc5939

ptr = ptr.add(VECTOR_SIZE);
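The invariant being relied on is that adding any multiple of 16 to a 16-byte-aligned address gives another 16-byte-aligned address. The sketch below checks this on plain integers rather than real pointers. The helper names is_vector_aligned and stays_aligned are hypothetical and only serve the illustration; the constants mirror the ones quoted above, with VECTOR_SIZE hard-coded to 16.

```rust
// Assumption: 16 is size_of::<__m128i>() on x86_64; hard-coded for portability.
const VECTOR_SIZE: usize = 16;
const VECTOR_ALIGN: usize = VECTOR_SIZE - 1;
const VECTOR_LOOP_SIZE: usize = 4 * VECTOR_SIZE;

/// Hypothetical helper: true if `addr` is a multiple of VECTOR_SIZE.
pub fn is_vector_aligned(addr: usize) -> bool {
    (addr & VECTOR_ALIGN) == 0
}

/// Hypothetical helper: simulates `loop_steps` advances by VECTOR_LOOP_SIZE
/// followed by `tail_steps` advances by VECTOR_SIZE, and reports whether the
/// address was aligned at the start and after every single step.
pub fn stays_aligned(start: usize, loop_steps: usize, tail_steps: usize) -> bool {
    let mut addr = start;
    let mut ok = is_vector_aligned(addr);
    for _ in 0..loop_steps {
        addr += VECTOR_LOOP_SIZE;
        ok &= is_vector_aligned(addr);
    }
    for _ in 0..tail_steps {
        addr += VECTOR_SIZE;
        ok &= is_vector_aligned(addr);
    }
    ok
}
```

Starting from an aligned address, every step keeps the address aligned; starting from a misaligned one, no step ever restores alignment, which is why the initial alignment on line 139 matters.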

So in the loop at L186, the pointer is always aligned to 16 bytes:

bstr/src/ascii.rs

Line 186 in 3dc5939

let chunk = _mm_loadu_si128(ptr as *const __m128i);

The aligned version of the function was pointed out by @anforowicz during an unsafe code audit: https://chromium-review.googlesource.com/c/chromium/src/+/5925797/comment/f08dc00c_1b24061c/
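As a rough illustration of the proposed change, here is a self-contained sketch of a simplified ASCII scan that uses _mm_loadu_si128 for the first, possibly unaligned chunk and _mm_load_si128 once the pointer has been aligned. This is not the code in ascii.rs: the function name first_non_ascii and its structure are assumptions made for this example, and the unrolled VECTOR_LOOP_SIZE loop is omitted. A scalar fallback is included so the snippet also compiles on non-x86_64 targets.

```rust
/// Hypothetical function: index of the first non-ASCII byte, or `bytes.len()`.
#[cfg(target_arch = "x86_64")]
pub fn first_non_ascii(bytes: &[u8]) -> usize {
    use core::arch::x86_64::{__m128i, _mm_load_si128, _mm_loadu_si128, _mm_movemask_epi8};

    const VECTOR_SIZE: usize = core::mem::size_of::<__m128i>();
    const VECTOR_ALIGN: usize = VECTOR_SIZE - 1;

    let len = bytes.len();
    if len < VECTOR_SIZE {
        return bytes.iter().position(|b| !b.is_ascii()).unwrap_or(len);
    }
    let start = bytes.as_ptr();
    // SAFETY: SSE2 is part of the x86_64 baseline. Every load reads 16 bytes
    // that lie fully inside `bytes`, and the aligned load is only issued on
    // addresses that are multiples of 16.
    unsafe {
        // The first chunk may start at any address, so it needs the unaligned load.
        let first = _mm_loadu_si128(start as *const __m128i);
        let mask = _mm_movemask_epi8(first);
        if mask != 0 {
            return mask.trailing_zeros() as usize;
        }
        // Skip ahead to the next 16-byte boundary (between 1 and 16 bytes).
        let mut offset = VECTOR_SIZE - (start as usize & VECTOR_ALIGN);
        while offset + VECTOR_SIZE <= len {
            let ptr = start.add(offset);
            debug_assert_eq!(ptr as usize & VECTOR_ALIGN, 0);
            // ptr is 16-byte aligned here, so the aligned load is permitted.
            let chunk = _mm_load_si128(ptr as *const __m128i);
            let mask = _mm_movemask_epi8(chunk);
            if mask != 0 {
                return offset + mask.trailing_zeros() as usize;
            }
            offset += VECTOR_SIZE;
        }
        // Fewer than 16 bytes remain; finish with a scalar scan.
        bytes[offset..]
            .iter()
            .position(|b| !b.is_ascii())
            .map_or(len, |i| offset + i)
    }
}

/// Scalar fallback for targets without the x86_64 intrinsics.
#[cfg(not(target_arch = "x86_64"))]
pub fn first_non_ascii(bytes: &[u8]) -> usize {
    bytes.iter().position(|b| !b.is_ascii()).unwrap_or(bytes.len())
}
```

The debug_assert_eq! documents the alignment invariant that the aligned load depends on; whether the aligned load is measurably faster than the unaligned one on aligned data would need benchmarking on the target hardware.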
