🚨 Move `rotary_partial_emb` to RopeParams and delete unnecessary code 🔪 by zucchini-nlp · Pull Request #42255 · huggingface/transformers

zucchini-nlp · 2025-11-18T10:24:56Z

What does this PR do?

To finalize the work on the rope config, I am moving `rotary_partial_emb` into the rope parameter dict as well. Along the way, I cleaned up the standardization code, since the models we currently have let us make a few simplifying assumptions.

Note: this PR completely breaks BC. Users will no longer have access to `config.rope_theta`, because I pop it from the config kwargs manually. In my opinion that is clearer than having two rope thetas in different places.
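
For downstream code that used to read `config.rope_theta` directly, a tolerant accessor might look like the sketch below. It assumes the new layout is a flat `rope_parameters` dict with `rope_theta` and `partial_rotary_factor` keys (the second key name is taken from the follow-up vLLM PR title in this thread, not from the description above). The helper `read_rope_value` and the `SimpleNamespace` stand-ins are hypothetical and are not part of transformers.

```python
from types import SimpleNamespace


def read_rope_value(config, key, default=None):
    """Look up a RoPE setting in `rope_parameters` first, then as a legacy top-level attribute."""
    rope_parameters = getattr(config, "rope_parameters", None) or {}
    if key in rope_parameters:
        return rope_parameters[key]
    # Legacy layout: the value lived directly on the config object.
    return getattr(config, key, default)


# Stand-ins for real config objects (assumed shapes, for illustration only).
new_style = SimpleNamespace(rope_parameters={"rope_theta": 10000.0, "partial_rotary_factor": 0.5})
old_style = SimpleNamespace(rope_theta=500000.0)

theta_new = read_rope_value(new_style, "rope_theta")
theta_old = read_rope_value(old_style, "rope_theta")
factor_old = read_rope_value(old_style, "partial_rotary_factor", default=1.0)
```

The sketch covers only the flat case. Models with per-layer-type (nested) rope parameters would need an extra lookup by layer type.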

HuggingFaceDocBuilderDev · 2025-11-18T10:34:00Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp · 2025-11-26T21:57:36Z

+def get_standardized_rope_params(config):
    """
    Helper to standardize the config's rope params field by ensuring the params are defined for each
    layer type. For old models the fn will duplicate a single rope param in each layer type (backward compatibility)
    """
-    rope_parameters = getattr(config, "rope_parameters", None)
-    layer_types = getattr(config, "layer_types", None)
-    if rope_theta is None:
-        rope_theta = getattr(config, "rope_theta", None)
+    rope_parameters = getattr(config, "rope_parameters", {})

This could be simplified if we make a few assumptions, and we can make them because only 2 models have a nested rope parameterization.
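
The commit titles in this PR ("maybe better pop and break", "dont overwrite if already present!!!", "setdefault") suggest the general shape of the migration: pop the legacy top-level keys and fold them into the rope dict without clobbering values already there. The sketch below illustrates that idea on plain dicts. It is not the code in `modeling_rope_utils.py`; the function name `move_legacy_rope_kwargs` and the key list are assumptions made for illustration.

```python
def move_legacy_rope_kwargs(kwargs, legacy_keys=("rope_theta", "partial_rotary_factor")):
    """Pop legacy top-level RoPE kwargs and fold them into `rope_parameters`."""
    # Copy so the caller's dicts are not mutated.
    kwargs = dict(kwargs)
    rope_parameters = dict(kwargs.get("rope_parameters") or {})
    for key in legacy_keys:
        if key in kwargs:
            value = kwargs.pop(key)
            # setdefault keeps a value that is already present in the rope dict.
            rope_parameters.setdefault(key, value)
    kwargs["rope_parameters"] = rope_parameters
    return kwargs


# Old-style kwargs: both values live at the top level.
legacy = {"hidden_size": 64, "rope_theta": 10000.0, "partial_rotary_factor": 0.25}
migrated = move_legacy_rope_kwargs(legacy)

# Mixed kwargs: the value already inside the rope dict wins.
mixed = move_legacy_rope_kwargs({"rope_theta": 1.0, "rope_parameters": {"rope_theta": 2.0}})
```

After the call, `migrated` has no top-level `rope_theta`, which mirrors the breaking change described in the PR: there is a single theta per config, and it lives in the rope dict.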

vasqu

Just my 2 cents 😄

ArthurZucker

In general if we can put stuff in PreTrainedConfig I am also happy, but fine this way as well

ArthurZucker

RotaryEmbeddingConfigMixin is my Christmas gift! ty its a lot better

zucchini-nlp · 2025-11-28T10:54:07Z

run-slow: llama, gemma3, qwen2, qwen2_vl, mistral, mixtral, modernbert, llava

github-actions · 2025-11-28T10:54:38Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: apertus, arcee, aria, bamba, bitnet, blt, chameleon, cohere, cohere2, csm, cwm, dbrx, deepseek_v2, deepseek_v3, dia, diffllama

github-actions · 2025-11-28T10:55:16Z

This comment contains run-slow, running the specified jobs:

models: ["models/gemma3", "models/llama", "models/llava", "models/mistral", "models/mixtral", "models/modernbert", "models/qwen2", "models/qwen2_vl"]
quantizations: []

zucchini-nlp · 2025-11-28T10:57:37Z

Doc-builder and weight tying tests will fail but are not related

github-actions · 2025-11-28T11:23:24Z

CI Results

Workflow Run ⚙️

✅ No failing test specific to this PR 🎉 !

… 🔪 (huggingface#42255)

* tmp
* batch push
* maybe better pop and break, and we'll have one theta per config in the rope dict
* update a few models?
* fix tests that are easu first
* dont overwrite if already present!!!
* partial rotary factor
* more fixes to the god of fixes
* setdefault
* fix copies
* Update src/transformers/modeling_rope_utils.py Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
* Update src/transformers/models/efficientloftr/configuration_efficientloftr.py Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
* attempt one
* update all models
* fix tests
* fix tests
* oops
* fix slow tests with nested rope models
* fix copies
* deal with circular import and move the mixin to base config class
* fix copies
* fix a few tests
* update the migration guide

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

tmp

b8f8dd8

zucchini-nlp added 8 commits November 18, 2025 16:02

batch push

2ee00d0

maybe better pop and break, and we'll have one theta per config in the rope dict

b64791e

update a few models?

a2b780b

fix tests that are easu first

ccc697a

dont overwrite if already present!!!

eb282d1

partial rotary factor

5a87125

more fixes to the god of fixes

6d07c32

rebase

dfa93a1

zucchini-nlp changed the title ~~[WIP] Move rotary_partial_emb to RopeParams and delete unnecessary code 🔪~~ Nov 26, 2025

zucchini-nlp commented Nov 26, 2025

Comment thread src/transformers/modeling_rope_utils.py Outdated

zucchini-nlp commented Nov 26, 2025

Comment thread src/transformers/models/apertus/configuration_apertus.py Outdated

zucchini-nlp added 2 commits November 27, 2025 09:36

setdefault

22f94e2

fix copies

6f4ed17

vasqu mentioned this pull request Nov 27, 2025

Add support for MiniMax-M2 #42028

Merged

5 tasks

vasqu reviewed Nov 27, 2025

ArthurZucker approved these changes Nov 27, 2025

Comment thread src/transformers/models/apertus/configuration_apertus.py Outdated

zucchini-nlp and others added 6 commits November 27, 2025 16:19

Update src/transformers/modeling_rope_utils.py

b3fa5cf

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

Update src/transformers/models/efficientloftr/configuration_efficientloftr.py

b2ca2eb

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

attempt one

32adaac

update all models

5bb12c4

fix tests

f5dd9d5

fix tests

a50598e

ArthurZucker approved these changes Nov 28, 2025

Comment thread src/transformers/models/apertus/configuration_apertus.py Outdated

zucchini-nlp added 4 commits November 28, 2025 09:37

oops

e4f2b82

fix slow tests with nested rope models

f9260c4

fix copies

a1dbf30

merge main

117732b

zucchini-nlp added 4 commits November 28, 2025 11:19

deal with circular import and move the mixin to base config class

80a1283

fix copies

0bb5402

fix a few tests

3e18fd3

update the migration guide

594523e

zucchini-nlp added the for_v5? label Nov 28, 2025

zucchini-nlp changed the title ~~Move rotary_partial_emb to RopeParams and delete unnecessary code 🔪~~ Nov 28, 2025

ArthurZucker merged commit 078ff68 into huggingface:main Nov 28, 2025
20 of 25 checks passed

hmellor mentioned this pull request Nov 28, 2025

Fix RoPE failures in Transformers nightly vllm-project/vllm#29700

Merged

panyz522 mentioned this pull request Dec 1, 2025

[Bug]: rope_config_validation is removed from transformers repo, vllm/transformers_utils/configs/cohere2.py needs update wangxiongts/vllm#12

Open

1 task

This was referenced Dec 1, 2025

Add backward compatibility for methods which have been moved to RotaryEmbeddingConfigMixin #42517

Merged

Access partial_rotary_factor from rope_parameters vllm-project/vllm#29966

Merged

fxmarty-amd mentioned this pull request Dec 22, 2025

Welcome v5 #40822

Closed

BBC-Esq mentioned this pull request Mar 30, 2026

Upgrade to HF transformers >= v5 docling-project/docling#3090

Closed

jaybe1234 added a commit to jaybe1234/sglang that referenced this pull request Apr 28, 2026

fix: remove manual rope parameters injection in PretrainedConfig

bb6c36b

Upstream now handles this natively in huggingface/transformers#42255

jaybe1234 mentioned this pull request Apr 28, 2026

fix: remove manual rope parameters injection in PretrainedConfig sgl-project/sglang#23910

Merged

5 tasks

ArthurZucker merged 25 commits into huggingface:main from zucchini-nlp:rope-params
