
Add missing vzeroupper to memset-avx2 #8171

Open
octmoraru wants to merge 1 commit into facebook:master from octmoraru:memset-fix

@octmoraru

Contributor

This diff ensures vzeroupper is executed when the memset count
is >64 and <128 bytes.

The guidance from Intel is to use vzeroupper when transitioning from
AVX to SSE code in order to avoid transition penalties:

"When the upper 128 bits of the YMM registers are set to zero by the
vzeroupper instruction, the hardware does not need to save those values,
so the hardware assists do not occur. The vzeroupper instruction must be
used after 256-bit Intel AVX code and before Intel SSE code, which will
remove both the save and the restore operations. Zeroing out the YMM
registers with other methods, such as with XORs, will not prevent
AVX-SSE transition penalties."
https://software.intel.com/en-us/articles/avoiding-avx-sse-transition-penalties

No expensive transitions were detected when using the Intel SDE AVX/SSE
transition checker.
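
For illustration only, here is a minimal C sketch of the pattern described above. It is not the memset-avx2 code touched by this diff: the function names `fill_mid` and `fill_mid_avx`, the `SKETCH_X86` macro, and the exact size bounds (64 < n <= 128) are assumptions made for the example, and it assumes GCC or Clang. It fills a mid-sized buffer with overlapping 256-bit stores and then issues vzeroupper through the `_mm256_zeroupper()` intrinsic before returning to code that may use SSE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define SKETCH_X86 1

/* Hypothetical AVX path for 64 < n <= 128 bytes (not the HHVM routine). */
__attribute__((target("avx")))
static void fill_mid_avx(unsigned char *p, int c, size_t n) {
    __m256i v = _mm256_set1_epi8((char)c);
    /* Store the first 64 bytes, then the last 64 bytes. The two pairs
       overlap whenever n < 128, which avoids a byte-granular tail loop. */
    _mm256_storeu_si256((__m256i *)p, v);
    _mm256_storeu_si256((__m256i *)(p + 32), v);
    _mm256_storeu_si256((__m256i *)(p + n - 64), v);
    _mm256_storeu_si256((__m256i *)(p + n - 32), v);
    /* Zero the upper 128 bits of the YMM registers before returning to
       code that may execute legacy SSE instructions. */
    _mm256_zeroupper();
}
#endif

/* Hypothetical entry point: uses the AVX path when the CPU supports it,
   otherwise falls back to the C library memset. */
void fill_mid(void *dst, int c, size_t n) {
    assert(n > 64 && n <= 128);
#ifdef SKETCH_X86
    if (__builtin_cpu_supports("avx")) {
        fill_mid_avx((unsigned char *)dst, c, n);
        return;
    }
#endif
    memset(dst, c, n);
}
```

In compiled intrinsics code, GCC and Clang will often insert vzeroupper on their own; the explicit call is kept here to mark the point where a hand-written routine would need it on every return path.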

@hhvm-bot left a comment

Contributor

@fredemmott has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

