fix(backends/vastai): route /instances/ to /api/v1/ (v0 deprecated, returns HTTP 410) #3969

Open
evolv3ai wants to merge 1 commit into dstackai:master from evolv3ai:fix/vastai-v1-instances

Conversation

@evolv3ai

Problem

VastAIAPIClient._url() in src/dstack/_internal/core/backends/vastai/api_client.py hardcodes https://console.vast.ai/api/v0 for all Vast API calls. Vast.ai has deprecated only the /api/v0/instances/ path family — the endpoint now responds with HTTP 410 deprecated_endpoint.

Reproduction (any current dstack release, including master and PyPI 0.20.24):

curl -sS -o /dev/null -w '%{http_code}\n' "https://console.vast.ai/api/v0/instances/?api_key=$VASTAI_API_KEY"
# -> 410
curl -sS -o /dev/null -w '%{http_code}\n' "https://console.vast.ai/api/v1/instances/?api_key=$VASTAI_API_KEY"
# -> 200

This breaks vastai backend registration: auth_test() calls get_instances(), the 410 is raised as an HTTPError, and dstack reports it to the operator as "Invalid credentials". The API key is fine; the URL is dead.
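
To make the failure mode concrete, here is a minimal sketch of how a status code could be mapped to an operator-facing message so that a 410 is not reported as bad credentials. `describe_auth_failure` is a hypothetical helper, not part of dstack, and the message strings are illustrative only.

```python
def describe_auth_failure(status_code: int) -> str:
    """Hypothetical helper: map an HTTP status to an operator-facing message."""
    # 401 and 403 are the statuses that actually point at a credentials problem.
    if status_code in (401, 403):
        return "Invalid credentials"
    # 410 Gone means the endpoint itself was retired; the API key may be fine.
    if status_code == 410:
        return "Endpoint deprecated (HTTP 410); the client URL needs updating"
    return f"Unexpected HTTP status {status_code}"
```

This PR does not change the error mapping; the sketch only illustrates why the current message is misleading.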

/bundles/ and /asks/ still work on v0 (v1 is not yet published for them), so offer queries and instance creation are unaffected — only the /instances/ family needs the v1 prefix.

Fix

Route paths under /instances/ (covers get_instances, destroy_instance, request_logs) to /api/v1/. Leave /bundles/ and /asks/ on v0. The v1 instances response preserves the existing schema (success flag, instances list); it adds pagination fields the existing code ignores.
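
The routing rule can be sketched roughly as follows. This is a standalone illustration, not the actual diff: `build_url` is a hypothetical stand-in for `VastAIAPIClient._url()`, whose real signature is assumed here, and the `api_key` query parameter is taken from the curl reproduction above.

```python
V0_BASE = "https://console.vast.ai/api/v0"
V1_BASE = "https://console.vast.ai/api/v1"


def build_url(path: str, api_key: str) -> str:
    """Hypothetical stand-in for VastAIAPIClient._url()."""
    path = path.lstrip("/")
    # Only the /instances/ family moves to v1; /bundles/ and /asks/ stay on v0.
    base = V1_BASE if path.startswith("instances/") else V0_BASE
    return f"{base}/{path}?api_key={api_key}"
```

For example, `build_url("/instances/", key)` resolves under `/api/v1/`, while `build_url("/bundles/", key)` stays under `/api/v0/`.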

Verification

Applied to a self-hosted dstack 0.20.23 broker. Before: backend registration fails with "Invalid credentials". After patch + restart:

  • POST /api/project/<p>/backends/create_yaml → 200
  • Project backends list includes vastai
  • dstack offer --backend vastai --gpu A6000 returns 7+ live offers
  • dstack apply of a task config successfully provisions a Vast.ai instance through the patched code path (instance transitions provisioning → running cleanly)

Notes

  • Targeted patch — only the _url() helper changes. No new dependencies, no behavioral change for bundles/asks callers.
  • Bug exists on master and on every tagged release I checked (0.20.20 through 0.20.24).
  • If Vast publishes /api/v1/bundles/ and /api/v1/asks/ later, the same conditional shape will accept additional path prefixes cleanly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Vast.ai deprecated /api/v0/instances/, returning HTTP 410
"deprecated_endpoint". dstack interprets that as "Invalid credentials"
during auth_test() and refuses to register the backend.

Only the /instances/ path family is deprecated; /bundles/ and /asks/
remain on v0 (v1 not yet published for them). The v1 instances response
preserves the existing schema (success flag, instances list); it adds
pagination fields the existing code ignores.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5dcaac677b

Comment on lines +142 to +143
if path.lstrip("/").startswith("instances/"):
    base = base.replace("/api/v0", "/api/v1")

P1: Keep non-list instance calls on v0

This prefix check sends every /instances/... URL to v1, including destroy_instance() and request_logs(). The current Vast docs only move the list endpoint to GET /api/v1/instances/; they still document destruction as DELETE /api/v0/instances/{id}/ and logs as PUT /api/v0/instances/request_logs/{id}. In VastAI runs, terminate_instance() relies on destroy_instance() and ignores a false return, so routing deletes to an unpublished v1 endpoint can leave paid instances running while also breaking log retrieval.
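
A narrower check along these lines might look like the sketch below. It assumes the endpoint split described in this comment (only `GET /instances/` on v1), which has not been verified against Vast's API here; `api_version_for` is a hypothetical helper, not an existing dstack function.

```python
def api_version_for(method: str, path: str) -> str:
    """Hypothetical helper: choose the Vast API version for a request."""
    normalized = path.strip("/")
    # Assumption from this review comment: only the list call has moved to v1.
    if method.upper() == "GET" and normalized == "instances":
        return "v1"
    # Destroy, request_logs, bundles and asks stay on v0.
    return "v0"
```

With this shape, `DELETE /instances/{id}/` and `PUT /instances/request_logs/{id}` keep using v0.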

Comment on lines +142 to +143
if path.lstrip("/").startswith("instances/"):
    base = base.replace("/api/v0", "/api/v1")

P2: Page through v1 instance results

Routing get_instances() to v1 changes the response from the old unpaginated list to Vast's paginated endpoint: the docs state that limit defaults to 25 with a maximum of 25, and that next_token is provided for additional pages. Since the existing client caches only data["instances"], accounts with more than 25 matching instances can have a dstack instance omitted, causing get_instance() to return None and update_provisioning_data() to stop discovering SSH/status data for that run.
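
One way to follow the pages is sketched below. It assumes each response carries an `instances` list and an optional `next_token`, as described in this comment. How the token is sent back to Vast is not specified here, so the request itself is left to a caller-supplied `fetch_page` callable; both `fetch_page` and `collect_instances` are hypothetical names.

```python
from typing import Any, Callable, Dict, List, Optional

Page = Dict[str, Any]


def collect_instances(
    fetch_page: Callable[[Optional[str]], Page],
    max_pages: int = 1000,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: gather instances across all pages."""
    instances: List[Dict[str, Any]] = []
    token: Optional[str] = None
    # max_pages bounds the loop in case the server keeps returning a token.
    for _ in range(max_pages):
        page = fetch_page(token)
        instances.extend(page.get("instances", []))
        token = page.get("next_token")
        if not token:
            break
    return instances
```

The client would then cache the combined list instead of the first page only.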

@r4victor

Collaborator

Thanks for reporting. This seems to be a duplicate of the #3938 fix.
