Frequently Asked Questions
The `sonar-reasoning-pro` model is designed to output a `<think>` section containing reasoning tokens, immediately followed by a valid JSON object. As a result, the `response_format` parameter does not remove these reasoning tokens from the output.
We recommend using a custom parser to extract the valid JSON portion. An example implementation can be found here.
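A minimal parser along these lines might look like the following sketch (this is not the linked implementation; it assumes the reasoning block is wrapped in `<think>…</think>` and the JSON follows it):

```python
import json
import re

def extract_json(raw: str) -> dict:
    """Strip the <think>...</think> reasoning block, then parse the
    remaining text as a JSON object."""
    cleaned = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL).strip()
    return json.loads(cleaned)

# Illustrative model output, not a real API response:
raw_output = '<think>step 1... step 2...</think>{"answer": "Paris"}'
print(extract_json(raw_output))  # {'answer': 'Paris'}
```

Depending on your payloads, you may also want to handle outputs where the JSON is wrapped in markdown fences or preceded by stray whitespace.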
To file a bug report, please use our GitHub repository and file the bug in issues. Once you’ve submitted your report, we kindly ask that you share the link to the issue with us via email at api@perplexity.ai so we can track it on our end.
We truly appreciate your patience, and we’ll get back to you as soon as possible. Due to the current volume of reports, it may take a little time for us to respond—but rest assured, we’re on it.
Our compute is hosted via Amazon Web Services in North America. By default, the API retains user prompt data for zero days, and that data is never used for AI training.
The only way for an account to be upgraded to the next usage tier is through all-time credit purchases.
Here are the spending criteria associated with each tier:
| Tier | Credit Purchase (all time) |
|---|---|
| Tier 0 | - |
| Tier 1 | $50 |
| Tier 2 | $250 |
| Tier 3 | $500 |
| Tier 4 | $1000 |
| Tier 5 | $5000 |
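The tier table above can be expressed as a small lookup. This is a hypothetical helper, not part of the API; it assumes a purchase exactly at a threshold qualifies for that tier:

```python
# Map an account's all-time credit purchases (USD) to its usage tier,
# mirroring the tier table above. Highest thresholds are checked first.
TIER_THRESHOLDS = [(5000, 5), (1000, 4), (500, 3), (250, 2), (50, 1)]

def usage_tier(all_time_purchases: float) -> int:
    for threshold, tier in TIER_THRESHOLDS:
        if all_time_purchases >= threshold:
            return tier
    return 0  # Tier 0: no qualifying purchases yet

print(usage_tier(250))   # 2
print(usage_tier(1500))  # 4
```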
We offer a way to track your billing per API key. You can do this by navigating to the following location:
Settings > View Dashboard > Invoice history > Invoices
Then click on any invoice; each item from the total bill will have a code at the end of it (e.g., `pro (743S)`). Those 4 characters are the last 4 characters of your API key.
A Feature Request is a suggestion to improve or add new functionality to the Perplexity Sonar API, such as:
- Requesting support for a new model or capability (e.g., image processing, fine-tuning options)
- Asking for new API parameters (e.g., additional filters, search options)
- Suggesting performance improvements (e.g., faster response times, better citation handling)
- Enhancing existing API features (e.g., improving streaming reliability, adding new output formats)
If your request aligns with these, please submit a feature request here: GitHub Feature Requests
- The API uses the same search system as the UI with differences in configuration—so their outputs may differ.
- The underlying AI model might differ between the API and the UI for a given query.
- We give users the power to tune the API to their respective use cases using sampling parameters like `presence_penalty`, `top_p`, etc. Custom tuning to specific use cases might lead to less generalization compared to the UI. We set optimized defaults and recommend not explicitly providing sampling parameters in your API requests.
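For example, a request body that relies on those optimized defaults simply omits the sampling parameters. The model name below is illustrative; substitute whichever model you actually use:

```python
import json

# Build a chat-completions request body that leaves sampling parameters
# (presence_penalty, top_p, etc.) unset so the server-side defaults apply.
payload = {
    "model": "sonar",  # assumption: replace with the model you need
    "messages": [
        {"role": "user", "content": "Summarize today's top AI news."},
    ],
    # Intentionally no presence_penalty / top_p here.
}

print(json.dumps(payload, indent=2))
```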
We collect the following types of information:
API Usage Data: We collect billable usage metadata such as the number of requests and tokens. You can view your own usage in the Perplexity API dashboard.
User Account Information: When you create an account with us, we collect your name, email address, and other relevant contact information.
We do not retain any query data sent through the API and do not train on any of your data.
Yes, the Sonar Models leverage information from Perplexity’s search index and the public internet.
You can find our rate limits here.
We email users about new developments and also post in the changelog.
A 401 error code indicates that the provided API key is invalid, deleted, or belongs to an account that has run out of credits. You likely need to purchase more credits in the Perplexity API dashboard. You can avoid this issue by configuring auto-top-up.
Currently, we do not support fine-tuning.
Please reach out to api@perplexity.ai or support@perplexity.ai for other API inquiries. You can also post on our discussion forum and we will get back to you.
We do not guarantee this at the moment.
The models are hosted in the US and we do not train on any of your data. And no, your data is not going to China.
Yes, our reasoning APIs that use DeepSeek’s models are uncensored and on par with the other APIs in terms of content moderation.
We expose the CoTs for Sonar Reasoning Pro and Sonar Reasoning. We don’t currently expose the CoTs for Deep Research.
R1-1776 is an offline chat model that does not search the web, so it might not have the most up-to-date information beyond its training cutoff date, which should be the same as R1's.
Reasoning tokens in Deep Research are a bit different from the CoTs in the answer: these tokens are used to reason through the research material before generating the final output via the CoTs.
Yes, the API offers exactly the same internet data access as Perplexity’s web platform.
The Perplexity API is designed to be broadly compatible with OpenAI's chat completions endpoint. It adopts a similar structure, including fields such as `id`, `model`, and `usage`, and supports analogous parameters like `model`, `messages`, and `stream`.
Key differences from the standard OpenAI response include:

- Response Object Structure:
  - OpenAI responses typically have an `object` value of `"chat.completion"` and a `created` timestamp, whereas our response uses `object: "response"` and a `created_at` field.
  - Instead of a `choices` array, our response content is provided under an `output` array that contains detailed message objects.
- Message Details:
  - Each message in our output includes a `type` (usually `"message"`), a unique `id`, and a `status`.
  - The actual text is nested within a `content` array that contains objects with `type`, `text`, and an `annotations` array for additional context.
- Additional Fields:
  - Our API response provides extra meta-information (such as `status`, `error`, `incomplete_details`, `instructions`, and `max_output_tokens`) that is not present in standard OpenAI responses.
  - The `usage` field also differs, offering detailed breakdowns of input and output tokens (including fields like `input_tokens_details` and `output_tokens_details`).
These differences are intended to provide enhanced functionality and additional context while maintaining broad compatibility with OpenAI’s API design.
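Based on the field names described above, the response shape can be navigated as in the following sketch. The literal values here are illustrative placeholders, not actual API output:

```python
# Hypothetical response dict using the field names described above.
response = {
    "object": "response",
    "created_at": 1700000000,
    "status": "completed",
    "output": [
        {
            "type": "message",
            "id": "msg_123",  # placeholder id
            "status": "completed",
            "content": [
                {"type": "text", "text": "Hello!", "annotations": []},
            ],
        }
    ],
    "usage": {"input_tokens_details": {}, "output_tokens_details": {}},
}

def extract_text(resp: dict) -> str:
    """Collect the text parts from every message in the output array."""
    parts = []
    for item in resp.get("output", []):
        if item.get("type") == "message":
            for block in item.get("content", []):
                if block.get("type") == "text":
                    parts.append(block["text"])
    return "".join(parts)

print(extract_text(response))  # Hello!
```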