What is the difference between batch_encode_plus() and encode_plus()

Question

I am working on a project that uses the T5 Transformer. I have read the documentation for the T5 model, but while using T5Tokenizer I am confused about how to tokenize my sentences.

Can someone please help me understand the difference between batch_encode_plus() and encode_plus(), and when I should use each of these methods?

Oxbowerce · Accepted Answer · 2021-10-13 11:49:40Z

See also the Hugging Face documentation. As the names suggest, batch_encode_plus tokenizes a batch of sequences (or pairs of sequences), whereas encode_plus tokenizes a single sequence (or a single pair). According to the documentation, both methods are deprecated and you should use __call__ instead, which checks whether the inputs are batched and calls the appropriate method (see the is_batched variable and the if statement in the source code).
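To make the dispatch idea concrete, here is a minimal sketch of a tokenizer whose __call__ detects a batch and forwards to the matching method. It is an illustration only, not the transformers source: ToyTokenizer, its whitespace splitting, and its on-the-fly vocabulary are all hypothetical, and the real is_batched check handles more cases (for example pre-split words and sequence pairs).

```python
from typing import Dict, List, Sequence, Union

TextInput = Union[str, Sequence[str]]


class ToyTokenizer:
    """Hypothetical stand-in for a tokenizer; NOT the transformers implementation."""

    def __init__(self) -> None:
        self.vocab: Dict[str, int] = {}

    def _ids(self, text: str) -> List[int]:
        # Assumption: split on whitespace and assign ids in order of first appearance.
        return [self.vocab.setdefault(tok, len(self.vocab)) for tok in text.split()]

    def encode_plus(self, text: str) -> Dict[str, List[int]]:
        # Single sequence in, one flat list of ids out.
        return {"input_ids": self._ids(text)}

    def batch_encode_plus(self, texts: Sequence[str]) -> Dict[str, List[List[int]]]:
        # Batch of sequences in, one list of ids per sequence out.
        return {"input_ids": [self._ids(t) for t in texts]}

    def __call__(self, text: TextInput):
        # Simplified version of the check described above: a list or tuple is a batch.
        is_batched = isinstance(text, (list, tuple))
        if is_batched:
            return self.batch_encode_plus(text)
        return self.encode_plus(text)


tokenizer = ToyTokenizer()
single = tokenizer("translate this sentence")
batch = tokenizer(["translate this sentence", "and this one"])
print(single["input_ids"])  # flat list of ids
print(batch["input_ids"])   # list of lists, one per input sentence
```

With a real tokenizer the calls have the same shape: tokenizer("some text") for one sentence and tokenizer(["first text", "second text"], padding=True) for a batch, so in practice you rarely need to call either of the two older methods directly.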
