Skip to content

bpo-42967: coerce bytes separator to string in urllib.parse_qs(l) #24818

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 11, 2021

Conversation

Fidget-Spinner
Copy link
Member

@Fidget-Spinner Fidget-Spinner commented Mar 10, 2021

urllib.parse_qsl previously (and currently) deals with bytes query string by coercing them to string internally. After bpo-42967 was fixed, a new separator argument was added. This patch follows the rest of parse_qsl by coercing the separator to string too.

https://bugs.python.org/issue42967

@Fidget-Spinner
Copy link
Member Author

Off topic: wow the bot is more alert than I thought. I never knew mentioning a bpo in the description would also cause it to link

@encukou
Copy link
Member

encukou commented Mar 10, 2021

Would it make sense to disallow mixing types (e.g. string qs with bytes separator)?
qs, separator, _coerce_result = _coerce_args(qs, separator) should do that.

@Fidget-Spinner
Copy link
Member Author

Fidget-Spinner commented Mar 10, 2021

Would it make sense to disallow mixing types (e.g. string qs with bytes separator)?
qs, separator, _coerce_result = _coerce_args(qs, separator) should do that.

Yeap I thought about that for a while :). The problem is it will break existing code:

# currently this works
urllib.parse_qsl(b"qwerty")

# but after patch:
urllib.parse_qsl(b"qwerty")
# TypeError: Cannot mix str and non-str arguments

Because the default separator argument is a string "&", passing in bytes will make _coerce_args complain about mixing types.

@orsenthil orsenthil self-assigned this Mar 10, 2021
Copy link
Member

@orsenthil orsenthil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fidget-Spinner - This change looks good to me.

Ouch! - We missed this in the original PR. Using the same ticket or a new ticket is fine with with me.

@Fidget-Spinner Fidget-Spinner changed the title coerce bytes separator to string in urllib.parse_qs(l) Mar 10, 2021
@Fidget-Spinner
Copy link
Member Author

Fidget-Spinner commented Mar 10, 2021

@Fidget-Spinner - This change looks good to me.

Ouch! - We missed this in the original PR. Using the same ticket or a new ticket is fine with with me.

Good thing this isn't a regression against prior versions (IMO, I may be wrong here). Because currently, the following still works, and all existing user code should look like this:

urllib.parse_qsl("qwerty")
urllib.parse_qsl(b"qwerty")

It's only triggered when separator is bytes. And this can only happen when specifically using the new argument in the new patch. Though it's still less-than-ideal so I'm glad Petr caught it.

Edit: Hmm considering this is bugfix. I'm guessing backport to 3.8 only?

@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Apr 10, 2021
@Fidget-Spinner
Copy link
Member Author

A very gentle ping @orsenthil

@orsenthil
Copy link
Member

Oh, thanks @Fidget-Spinner. Sorry, I missed this.

@orsenthil orsenthil merged commit b38601d into python:master Apr 11, 2021
@bedevere-bot
Copy link

@orsenthil: Please replace # with GH- in the commit message next time. Thanks!

@miss-islington
Copy link
Contributor

Thanks @Fidget-Spinner for the PR, and @orsenthil for merging it 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Thanks @Fidget-Spinner for the PR, and @orsenthil for merging it 🌮🎉.. I'm working now to backport this PR to: 3.8.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Apr 11, 2021
…thonGH-24818)

* coerce bytes separator to string

* Add news

* Update Misc/NEWS.d/next/Library/2021-03-11-00-31-41.bpo-42967.2PeQRw.rst
(cherry picked from commit b38601d)

Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
@bedevere-bot bedevere-bot removed the needs backport to 3.9 only security fixes label Apr 11, 2021
@bedevere-bot
Copy link

GH-25344 is a backport of this pull request to the 3.9 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Apr 11, 2021
…thonGH-24818)

* coerce bytes separator to string

* Add news

* Update Misc/NEWS.d/next/Library/2021-03-11-00-31-41.bpo-42967.2PeQRw.rst
(cherry picked from commit b38601d)

Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
@bedevere-bot
Copy link

GH-25345 is a backport of this pull request to the 3.8 branch.

miss-islington added a commit that referenced this pull request Apr 11, 2021
…-24818)

* coerce bytes separator to string

* Add news

* Update Misc/NEWS.d/next/Library/2021-03-11-00-31-41.bpo-42967.2PeQRw.rst
(cherry picked from commit b38601d)

Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
orsenthil pushed a commit that referenced this pull request Apr 16, 2021
…-24818) (#25345)

* coerce bytes separator to string

* Add news

* Update Misc/NEWS.d/next/Library/2021-03-11-00-31-41.bpo-42967.2PeQRw.rst
(cherry picked from commit b38601d)

Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>

Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
@Fidget-Spinner Fidget-Spinner deleted the urllibparse-bytes branch May 16, 2021 07:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Stale PR or inactive for long period of time.
6 participants