[release/10.0] Fix SIGILL crash on ARM64 platforms with SME but no SVE #127518
Merged
Replace CONTEXT_GetSveLengthFromOS() calls in signal context handling with direct reads of sve->vl from the kernel-provided signal frame.

The CONTEXT_GetSveLengthFromOS function executes the SVE 'rdvl' instruction, which causes SIGILL on platforms that have SME (streaming SVE) but not standalone SVE, such as Apple M4 under macOS Virtualization.Framework with Podman/Colima.

On these platforms, the Linux kernel includes an SVE_MAGIC record in signal frames (with vl=0 and minimal size) due to SME's streaming SVE mode, but the CPU does not support SVE instructions. When a signal fires (e.g. SIGUSR1 for activation injection), CONTEXTFromNativeContext sees the SVE record and calls rdvl, which triggers SIGILL. The SIGILL handler then tries to capture context again, hitting rdvl recursively.

The fix uses sve->vl from the signal frame directly, which is always available when an SVE context record is present. On real SVE hardware, sve->vl equals what rdvl would return. On SME-only platforms, sve->vl is 0, so the SVE register save/restore is correctly skipped.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove the now-tautological _ASSERTE((sve->vl > 0) && (sve->vl % 16 == 0)) inside the 'if (sve->vl == 16)' block (janvorli).
- In CONTEXTToNativeContext, derive vq from sve->vl (the signal frame's authoritative layout) instead of lpContext->Vl (copilot-reviewer).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove unused CONTEXT_GetSveLengthFromOS definition (context2.S) and declaration (context.h) since all callers now use sve->vl (am11).
- Reword comments to not reference the deleted function (am11).
- Replace redundant assert with meaningful size check (janvorli/AndyAyersMS): _ASSERTE(sve->head.size >= SVE_SIG_CONTEXT_SIZE(sve_vq_from_vl(16)))

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
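The idea in the commit messages above, reading the vector length from the signal frame's SVE record instead of executing rdvl, can be sketched in portable C. This is a minimal illustration under stated assumptions, not the runtime's code: the sketch_ structs redeclare the record layout from the Linux arm64 asm/sigcontext.h header so the example compiles on any host, and read_sve_record and sve_vl_from_frame are hypothetical helper names. The sketch accepts any nonzero multiple of 16 bytes as a usable vector length, which is also an assumption; the commit messages mention an 'if (sve->vl == 16)' check in the actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the Linux arm64 signal-frame record layout
 * (asm/sigcontext.h). Redeclared under sketch_ names so this compiles
 * on any host; they are not the runtime's own types. */
#define SKETCH_FPSIMD_MAGIC 0x46508001u
#define SKETCH_SVE_MAGIC    0x53564501u
#define SKETCH_SVE_VQ_BYTES 16u

struct sketch_ctx_head {
    uint32_t magic; /* record type; 0 terminates the record list */
    uint32_t size;  /* total record size in bytes, header included */
};

struct sketch_sve_context {
    struct sketch_ctx_head head;
    uint16_t vl;    /* vector length in bytes as reported by the kernel */
    uint16_t flags;
    uint16_t reserved[2];
};

/* Hypothetical helper: walk the records in the frame's reserved area and
 * copy out the SVE record if one is present. Returns 1 if found, else 0. */
static int read_sve_record(const unsigned char *area, size_t len,
                           struct sketch_sve_context *out)
{
    size_t off = 0;
    assert(area != NULL && out != NULL);
    while (off + sizeof(struct sketch_ctx_head) <= len) {
        struct sketch_ctx_head head;
        memcpy(&head, area + off, sizeof head);
        /* Stop at the terminator or at a malformed record. */
        if (head.magic == 0 || head.size < sizeof head || head.size > len - off)
            return 0;
        if (head.magic == SKETCH_SVE_MAGIC && head.size >= sizeof *out) {
            memcpy(out, area + off, sizeof *out);
            return 1;
        }
        off += head.size;
    }
    return 0;
}

/* Hypothetical helper: the vector length to use for SVE save/restore,
 * or 0 when SVE state must be skipped. No SVE instruction is executed,
 * so this is safe on a CPU that has SME but lacks standalone SVE. */
static unsigned sve_vl_from_frame(const unsigned char *area, size_t len)
{
    struct sketch_sve_context sve;
    if (!read_sve_record(area, len, &sve))
        return 0; /* no SVE record in the frame */
    if (sve.vl == 0 || sve.vl % SKETCH_SVE_VQ_BYTES != 0)
        return 0; /* SME-only platform (vl=0) or unexpected length */
    return sve.vl;
}
```

A caller would pass the reserved area of the signal frame and skip SVE register handling whenever sve_vl_from_frame returns 0, which covers both a missing SVE record and the vl=0 record seen on SME-only platforms.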
@janvorli PTAL
janvorli
approved these changes
Apr 28, 2026
Backport of #127398 to release/10.0
/cc @AndyAyersMS
Customer Impact
When building .NET 10 applications in Docker on Apple M4 hardware, the .NET 10 SDK (version 10.0.101) crashes with "Illegal instruction (core dumped)" on ARM64 images. The crash occurs intermittently during various .NET CLI operations including dotnet new, dotnet add package, and dotnet build.
#122608
Regression
Regression over 9.0
Testing
Verified that creating new dotnet console projects in an Ubuntu container on an M4 works reliably (previously it failed 5-10% of the time).
Risk
Low. The change handles a situation unique to running in a Linux container on a Mac M4.