Pull requests: vllm-project/vllm
[Core] Make sleep-mode backend capability flags communicator-agnostic
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#47243
opened Jul 1, 2026 by
matteso1
Contributor
[CI/Build] Fix LoRA testing
ready
ONLY add when PR is ready to merge/full CI is needed
#47242
opened Jul 1, 2026 by
jeejeelee
Member
4 tasks
[do not merge][test only][xpu][ci]Debug Intel B50 agent
ci/build
intel-gpu
Related to Intel GPU
#47240
opened Jul 1, 2026 by
zxd1997066
Contributor
Draft
4 tasks
[BugFix][Spec Decode] Compact shared topk indices buffer after first MTP draft step
bug
Something isn't working
deepseek
Related to DeepSeek models
ready
ONLY add when PR is ready to merge/full CI is needed
speculative-decoding
v1
#47238
opened Jul 1, 2026 by
TheEpicDolphin
Collaborator
[Bugfix][INC] Fix INC quantization method selection for non-quantized layers
#47237
opened Jul 1, 2026 by
lvliang-intel
4 tasks
fix(simple_kv_offload): restore evicted CPU blocks and fix cursor advancement
v1
#47235
opened Jul 1, 2026 by
Alex-ai-future
Contributor
[KVOffload] Document CPU eviction behavior as reference (control branch)
v1
#47234
opened Jul 1, 2026 by
Alex-ai-future
Contributor
Revert "Remove more unnecessary load_weights methods" (#47058)
deepseek
Related to DeepSeek models
llama
Related to Llama models
mistral
Related to Mistral models
multi-modality
Related to multi-modality (#4194)
qwen
Related to Qwen models
speculative-decoding
#47233
opened Jul 1, 2026 by
vllm-agent
Contributor
Draft
Revert "[Platform] Replace torch.cuda.mem_get_info with torch.accelerator.get_memory_info" (#44825)
cpu
Related to CPU backends
intel-gpu
Related to Intel GPU
kv-connector
multi-modality
Related to multi-modality (#4194)
nvidia
v1
#47232
opened Jul 1, 2026 by
vllm-agent
Contributor
Draft
[XPU][CI] Add tests/v1/e2e/general/test_correctness_sliding_window.py in Intel GPU CI
ci/build
intel-gpu
Related to Intel GPU
#47231
opened Jul 1, 2026 by
zxd1997066
Contributor
4 tasks done
fix(serve): return HTTP 422 instead of 500 for image/media URL fetch errors
frontend
gpt-oss
Related to GPT-OSS models
multi-modality
Related to multi-modality (#4194)
#47230
opened Jul 1, 2026 by
aoright
[DSV4] Better MXFP8 quantization kernel
ready
ONLY add when PR is ready to merge/full CI is needed
#47229
opened Jul 1, 2026 by
zyongye
Member
fix(deepseek_v4): resolve auto kv-cache-dtype to fp8_ds_mla on SM120
deepseek
Related to DeepSeek models
nvidia
#47228
opened Jul 1, 2026 by
hclsys
Contributor
[Doc] add AI Runway to integrations
documentation
Improvements or additions to documentation
#47227
opened Jul 1, 2026 by
robert-cronin
4 tasks done
[Perf][Model] Build Phi4MM Conformer streaming mask on target device
#47225
opened Jun 30, 2026 by
Juice-XIJ
4 tasks done
[Perf] Build prompt mask with presence-only scatter in apply_penalties
#47224
opened Jun 30, 2026 by
Arthur-St-06
[Bugfix] Fix online-quant MoE loading zero weights after #47058
bug
Something isn't working
#47221
opened Jun 30, 2026 by
mgoin
Member
[AMD][EPLB] Enable EPLB for Quark OCP MXFP4 MoE
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
#47220
opened Jun 30, 2026 by
okorzh-amd
Contributor
4 tasks
[Bugfix] Include SM12x in _is_fa4_supported() compute capability check
bug
Something isn't working
#47218
opened Jun 30, 2026 by
tgmerritt
[Bugfix][Gemma4] Keep image bidirectional attention within the sliding window
bug
Something isn't working
v1
#47217
opened Jun 30, 2026 by
lucianommartins
Contributor
[Spec Decode][DSpark] Add Gemma4-12B DSpark draft model (stacked on #46995)
new-model
Requests to new models
performance
Performance-related issues
qwen
Related to Qwen models
v1
[Misc][Docs] Remove stale VLLM_MOE_DP_CHUNK_SIZE tip
documentation
Improvements or additions to documentation