Skip to content

fix(cli): make spacy download --url actually use the custom URL#13964

Open
shaun0927 wants to merge 2 commits into
explosion:masterfrom
shaun0927:fix/download-custom-url
Open

fix(cli): make spacy download --url actually use the custom URL#13964
shaun0927 wants to merge 2 commits into
explosion:masterfrom
shaun0927:fix/download-custom-url

Conversation

@shaun0927

Copy link
Copy Markdown

Fixes #13963.

Description

The --url flag added in #13848 is broken: any value passed via --url is either silently dropped (no trailing slash) or rejected by the post-construction guard (with trailing slash), so the flag cannot succeed under any input. See #13963 for the full reproduction.

spacy/cli/download.py:180-186 had two interlocking defects:

  1. When base_url (= the user's custom URL) lacked a trailing slash, line 183 reassigned it to about.__download_url__ + "/", discarding the user's choice.
  2. The line-185 startswith(about.__download_url__) guard then rejected any URL that was preserved, because a custom mirror by definition does not start with the GitHub URL.

This change:

  • Appends / to the actual base_url instead of swapping it for the default.
  • Skips the GitHub-origin guard when custom_url is provided. The user has explicitly opted into a custom source via --url; the relative-path guard is still in force when no --url is passed (the test_download_rejects_relative_urls case continues to pass).

Adds a regression test (test_download_uses_custom_url) exercising both trailing-slash variants of --url plus the default-URL path so the relative-path rejection from #13848's review thread is preserved.

No new dependencies. No behavior change for users who do not pass --url.

Types of change

Bug fix.

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.
The custom-URL feature added in explosion#13848 has been broken since release: any
value passed via `--url` is silently dropped (when the URL has no trailing
slash) or rejected by the post-construction guard (when it does), so the
flag cannot succeed under any input.

Two interlocking defects in `download_model`:

1. When `base_url` (= the user's custom URL) lacked a trailing slash, line
   183 reassigned it to `about.__download_url__ + "/"`, discarding the
   user's choice instead of just appending `/`.
2. The line-185 `startswith(about.__download_url__)` guard then rejected
   any URL that *was* preserved, because a custom mirror by definition
   does not start with the GitHub URL.

This change:

  * Appends `/` to the actual `base_url` instead of swapping it for the
    default.
  * Skips the GitHub-origin guard when `custom_url` is provided. The user
    has explicitly opted into a custom source via `--url`; the relative-
    path guard is still in force when no `--url` is passed, preserving
    the protection from explosion#13848's review thread.

Adds a regression test exercising both trailing-slash variants of `--url`
plus the default-URL path so the relative-path rejection from
`test_download_rejects_relative_urls` keeps working.
…test

Adds the signed SCA at .github/contributors/shaun0927.md as required for
first-time contributors. Reformats the new test_download_uses_custom_url
block so `python -m ruff format spacy --check` passes (the for-loop tuple
exceeded the project's 88-char line length).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

1 participant