vllm-project / vllm Public

Pull requests: vllm-project/vllm

599 Open 8,059 Closed

Pull requests list

[Bugfix] validate grammar and throw 400 error instead of crashing the engine when xgrammar validation fails

Labels: structured-output, v1

#17623 opened May 4, 2025 by Jason-CKY

[PERF] Speed up of prepare_inputs / mrope

#17617 opened May 3, 2025 by vadiklyutiy

Use git-path commit in hook

#17616 opened May 3, 2025 by thomasjpfan

[BugFix] Fix --disable-log-stats in V1 server mode

Labels: bug (Something isn't working), ready (ONLY add when PR is ready to merge/full CI is needed)

#17600 opened May 2, 2025 by njhill

Enable Pydantic mypy checks and convert configs to Pydantic dataclasses

Labels: frontend, structured-output, tpu (Related to Google TPUs)

#17599 opened May 2, 2025 by hmellor

[V1] Disable pickle by default for new serial_utils usage

Labels: v1

#17596 opened May 2, 2025 by russellb

[Security] Document StatelessProcessGroup security concerns

Labels: documentation (Improvements or additions to documentation)

#17591 opened May 2, 2025 by russellb

Feature/vllm/input embedding completion api

Labels: frontend

#17590 opened May 2, 2025 by Nan2018 • Draft

[Model] 1.58bits BitNet Model Support

Labels: documentation

#17588 opened May 2, 2025 by Alex4210987

[Bugfix][ROCm] Fix incorrect casting in GPTQ GEMM kernel

#17583 opened May 2, 2025 by nlzy

Make key optional for rotary embedding

Labels: ready

#17566 opened May 1, 2025 by sarckk

[WIP] Support multiple kv connectors

Labels: ci/build, frontend, v1

#17564 opened May 1, 2025 by mgoin • Draft

AMD tests updated experiment

Labels: ci/build

#17563 opened May 1, 2025 by Concurrensee

Improve configs - the rest!

Labels: frontend, structured-output

#17562 opened May 1, 2025 by hmellor • Draft

[WIP][V1][Spec Decode] EAGLE tree-attention

Labels: v1

#17560 opened May 1, 2025 by wwl2755 • Draft

3 of 9 tasks

AMD experimental all tests updated EXPERIMENT (no need to merge)

Labels: ci/build, needs-rebase

#17556 opened May 1, 2025 by Alexei-V-Ivanov-AMD

[WIP] Initial attempt to add microbatching functionality to RowParallelLinear

#17552 opened May 1, 2025 by SageMoore • Draft

[Perf] API-server scaleout with all-to-all server-engine comms

Labels: frontend, v1

#17546 opened May 1, 2025 by njhill • Draft

[Misc] add get kv cache token capacity

Labels: frontend, v1

#17538 opened May 1, 2025 by lengrongfu

[FEAT][ROCm]: Support AITER MLA on V1 Engine

Labels: ci/build, rocm (Related to AMD ROCm)

#17523 opened May 1, 2025 by vllmellm

[prototype] prioritized block soft pinning/evictions

Labels: documentation, frontend, v1

#17520 opened May 1, 2025 by simon-mo • Draft

[V1] Add num_cached_tokens stats for request output

Labels: ready

#17519 opened May 1, 2025 by simon-mo

[Bugfix][Model] vllm-v0 engine run eagle algo with qwen2.5 model, KeyError: 'norm.weight' bugfix

#17518 opened May 1, 2025 by Greatpanc

[Bugfix][V1][Spec Dec] Add generator to request even when no seed is provided.

Labels: speculative-decoding, v1

#17509 opened May 1, 2025 by luyuzhe111

[BugFix] Qwen3 tool calling failed using qwen3 reasoning parser.

Labels: documentation, frontend, tool-calling

#17506 opened Apr 30, 2025 by Xu-Wenqing
