Please refer to the Inference Providers Documentation for detailed information.
HF-Inference API is one of the many providers available on the Hugging Face Hub. It is deployed and maintained by Hugging Face itself, using, for instance, text-generation-inference for LLMs. This service used to be called “Inference API (serverless)” prior to Inference Providers.
For more details about the HF-Inference API, check out its dedicated page.
The HF-Inference API is powered by Inference Endpoints under the hood.
Some tasks may not be supported by any Inference Provider, in which case no widget is shown.
To check usage across all providers, check out your billing page.
To check your HF-Inference usage specifically, see the Inference Dashboard, which shows usage for both your serverless and dedicated endpoints.
Yes! We provide client wrappers in both JavaScript and Python: