$\begingroup$

I am working on a project that uses the T5 Transformer. I have read the documentation for the T5 model, but I am confused about how to tokenize my sentences with T5Tokenizer.

Can someone please help me understand the difference between batch_encode_plus() and encode_plus(), and when I should use each?

$\endgroup$

1 Answer

$\begingroup$

As the names suggest, batch_encode_plus tokenizes a batch of sequences (or pairs of sequences), whereas encode_plus tokenizes a single sequence (or a single pair). See also the Hugging Face documentation: according to it, both methods are deprecated and you should use __call__ instead, i.e. call the tokenizer object directly. __call__ checks for itself whether the input is batched and then calls the appropriate method (see the is_batched variable and the if statement in the source code).
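To make the dispatch idea concrete, here is a minimal sketch of how a __call__ method can route to a single or batched encoder. It is a simplified illustration, not the Hugging Face implementation: ToyTokenizer, its whitespace splitting, and its vocabulary handling are hypothetical, and the real is_batched check covers more cases (such as pre-tokenized input).

```python
from typing import Dict, List


class ToyTokenizer:
    """Hypothetical whitespace tokenizer; NOT the Hugging Face implementation."""

    def __init__(self) -> None:
        self.vocab: Dict[str, int] = {}

    def _token_id(self, token: str) -> int:
        # Assign ids in order of first appearance.
        return self.vocab.setdefault(token, len(self.vocab))

    def encode_plus(self, text: str) -> Dict[str, List[int]]:
        # Single sequence in, one flat list of ids out.
        return {"input_ids": [self._token_id(t) for t in text.split()]}

    def batch_encode_plus(self, texts: List[str]) -> Dict[str, List[List[int]]]:
        # Batch of sequences in, one list of ids per sequence out.
        return {"input_ids": [self.encode_plus(t)["input_ids"] for t in texts]}

    def __call__(self, text):
        # Simplified version of the check: a list or tuple is treated as a batch.
        is_batched = isinstance(text, (list, tuple))
        if is_batched:
            return self.batch_encode_plus(list(text))
        return self.encode_plus(text)


tok = ToyTokenizer()
single = tok("translate this sentence")
batch = tok(["translate this sentence", "this sentence too"])
print(single)
print(batch)
```

With the real library the calling pattern is the same: tokenizer("one sentence") for a single input and tokenizer(["first sentence", "second sentence"], padding=True) for a batch, so you rarely need to pick between the two older methods yourself.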

$\endgroup$
