I’m a bit confused about how OpenAI’s API rate limits work, specifically the TPM (tokens per minute) limit.
If I have, for example, a 2 million TPM limit, is usage against that limit counted from:
only the input tokens I send in my request,
or input and output tokens combined?
I’ve seen different explanations online, so I’d love to hear from people who have tested this or know for sure. Thanks!