:rotating_light: [`v5`] Remove relative position embeddings (for bert like models) by vasqu · Pull Request #41170 · huggingface/transformers

vasqu · 2025-09-25T18:28:04Z

These embedding types are barely used and only make the modeling files more complex without justifying their existence. Position embedding types still exist in a few models; this PR only addresses the `relative_key` and `relative_key_query` ones.

Some stats:

- None of the slow tests use them, except for bert
- The respective models in those tests together have fewer than 2k downloads in the last month

cc @hmellor this should remove any clashes with the kwargs you encountered in vLLM :D
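
A quick way to check whether a checkpoint is affected is to look at its config. The snippet below is only a minimal sketch: it assumes the config is available as a plain dict (for example, parsed from `config.json`), that the setting is stored under the `position_embedding_type` key, and that a missing key means the default `"absolute"` type. The helper name `uses_removed_relative_positions` is hypothetical and not part of transformers.

```python
import json

# The two values whose code paths this PR removes (taken from the deleted branch).
REMOVED_TYPES = {"relative_key", "relative_key_query"}


def uses_removed_relative_positions(config: dict) -> bool:
    """Hypothetical helper: True if the config requests a removed embedding type."""
    # Assumption: a missing key falls back to the default, "absolute".
    return config.get("position_embedding_type", "absolute") in REMOVED_TYPES


# Example with an inline JSON string standing in for a real config.json file.
example = json.loads('{"model_type": "bert", "position_embedding_type": "relative_key"}')
affected = uses_removed_relative_positions(example)
print(affected)
```

Configs that return `True` here are the ones that relied on the removed code paths.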

vasqu · 2025-09-25T18:28:50Z

run-slow: flava, instructblib, mra

vasqu · 2025-09-25T18:29:59Z

This is mostly due to me forgetting to update them in my bert refactor PR --> big diff because the whole refactor is included (same for the roberta example)

Updated: Only includes the changes here now

github-actions · 2025-09-25T18:30:34Z

This comment contains run-slow, running the specified jobs:

models: ['models/flava', 'models/mra']
quantizations: [] ...

HuggingFaceDocBuilderDev · 2025-09-25T18:37:05Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

vasqu · 2025-09-25T18:40:53Z

run-slow: instructblip

github-actions · 2025-09-25T18:42:25Z

This comment contains run-slow, running the specified jobs:

models: ['models/instructblip']
quantizations: [] ...

vasqu · 2025-09-26T11:16:26Z

run-slow: bert, roberta, albert, mra, instructblip, blip_2, flava

github-actions · 2025-09-26T11:17:59Z

This comment contains run-slow, running the specified jobs:

models: ['models/albert', 'models/bert', 'models/blip_2', 'models/flava', 'models/instructblip', 'models/mra', 'models/roberta']
quantizations: [] ...

vasqu · 2025-09-26T11:31:00Z

Failing slow tests are the same as in main 👀

zucchini-nlp

Thanks, super nice clean-up! 🧼

zucchini-nlp · 2025-09-30T17:01:54Z

        return embeddings


 def eager_attention_forward(


I think we can now copy bert from llama or another big model group? 👀 Keeping fewer sources of truth makes it easier to submit PRs.

vasqu · 2025-10-01

Let me make a follow-up PR for that. I would rather sync bert with bart, though: copying from llama would imply causal masks, which is not the case here, and would add an unnecessary GQA dependency.

vasqu · 2025-10-01

Opened #41248 for the sync.

zucchini-nlp · 2025-09-30T17:04:52Z

-        if self.position_embedding_type == "relative_key" or self.position_embedding_type == "relative_key_query":
-            seq_length = hidden_states.size()[1]
-            position_ids_l = torch.arange(seq_length, dtype=torch.long, device=hidden_states.device).view(-1, 1)
-            position_ids_r = torch.arange(seq_length, dtype=torch.long, device=hidden_states.device).view(1, -1)
-            distance = position_ids_l - position_ids_r
-            positional_embedding = self.distance_embedding(distance + self.max_position_embeddings - 1)
-            positional_embedding = positional_embedding.to(dtype=query_layer.dtype)  # fp16 compatibility
-
-            if self.position_embedding_type == "relative_key":
-                relative_position_scores = torch.einsum("bhld,lrd->bhlr", query_layer, positional_embedding)
-                attention_scores = attention_scores + relative_position_scores
-            elif self.position_embedding_type == "relative_key_query":
-                relative_position_scores_query = torch.einsum("bhld,lrd->bhlr", query_layer, positional_embedding)
-                relative_position_scores_key = torch.einsum("bhrd,lrd->bhlr", key_layer, positional_embedding)
-                attention_scores = attention_scores + relative_position_scores_query + relative_position_scores_key
-


Happy to see it, I assumed that BLIP models use relative positions haha. Now qformer can support the attention implementation API 🙌🏻

vasqu · 2025-10-01

Yeah, it's honestly a bit baffling how many models have this while not using it at all 👀
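
For readers unfamiliar with the removed branch, its index arithmetic can be sketched in plain Python. This is an illustration only, not the library code: it handles a single head and no batch dimension, uses nested lists instead of tensors, and the function names are hypothetical. It mirrors the `distance + max_position_embeddings - 1` lookup and the `relative_key` term (`"bhld,lrd->bhlr"`) from the diff above.

```python
def relative_position_indices(seq_length: int, max_position_embeddings: int) -> list[list[int]]:
    """Entry [l][r] is the lookup index for the distance l - r."""
    # The offset shifts the most negative possible distance to index 0.
    offset = max_position_embeddings - 1
    return [[(l - r) + offset for r in range(seq_length)] for l in range(seq_length)]


def relative_key_scores(query, table, max_position_embeddings):
    """Single-head 'relative_key' term: dot(query[l], table[index(l, r)])."""
    seq_length = len(query)
    idx = relative_position_indices(seq_length, max_position_embeddings)
    return [
        [sum(q * e for q, e in zip(query[l], table[idx[l][r]])) for r in range(seq_length)]
        for l in range(seq_length)
    ]


indices = relative_position_indices(seq_length=3, max_position_embeddings=4)

# Toy inputs: 2 positions, head size 2, and a table with 2 * 2 - 1 = 3 rows.
query = [[1.0, 0.0], [0.0, 1.0]]
table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
scores = relative_key_scores(query, table, max_position_embeddings=2)
```

Since distances range from `-(seq_length - 1)` to `seq_length - 1`, the shifted indices stay within `0 .. 2 * max_position_embeddings - 2` as long as `seq_length` does not exceed `max_position_embeddings`. In the removed code these scores were added to the regular attention scores, which is why the branch could not be expressed through the shared attention interface.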

ArthurZucker

Long overdue! Thanks

github-actions · 2025-10-03T00:46:19Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: albert, align, altclip, bert, bert_generation, big_bird, blip, blip_2, bridgetower, bros, camembert, canine, chinese_clip, clap, data2vec, dpr

vasqu · 2025-10-06T12:21:36Z

Merging as nothing has changed, just forgot to merge 😅

vasqu added 4 commits September 25, 2025 20:48

remove from modeling files

967cb1d

remaining changes

e0569a5

style / copies

34c3605

revert deprecated models and fixup some models

0dbd18b

vasqu force-pushed the remove-relative-positions-bert-likes branch from 6046d27 to 0dbd18b Compare September 25, 2025 18:49

vasqu marked this pull request as ready for review September 26, 2025 11:14

Merge branch 'main' into remove-relative-positions-bert-likes

9cb62af

vasqu requested review from ArthurZucker, Cyrilvallez and zucchini-nlp September 26, 2025 11:17

vasqu added 2 commits September 30, 2025 17:09

Merge branch 'main' into remove-relative-positions-bert-likes

b95b5be

oops

c626431

vasqu added the for_v5? label Sep 30, 2025

zucchini-nlp approved these changes Oct 1, 2025

vasqu mentioned this pull request Oct 1, 2025

[v5] Sync Bert and Bart eager attention #41248

Merged

ArthurZucker approved these changes Oct 2, 2025

vasqu and others added 2 commits October 2, 2025 19:45

Merge branch 'main' into remove-relative-positions-bert-likes

47142fc

Merge branch 'main' into remove-relative-positions-bert-likes

04def6b

vasqu merged commit c27b67f into huggingface:main Oct 6, 2025
25 checks passed

vasqu deleted the remove-relative-positions-bert-likes branch October 6, 2025 12:21

vasqu mentioned this pull request Oct 9, 2025

Welcome v5 #40822

Closed

BBC-Esq mentioned this pull request Mar 30, 2026

Upgrade to HF transformers >= v5 docling-project/docling#3090

Closed

vasqu merged 9 commits into huggingface:main from vasqu:remove-relative-positions-bert-likes

4 participants
