Preserve percent-encoded reserved characters during normalization #589
Open
cvx wants to merge 1 commit into
RFC 3986 section 2.2 states that a reserved character and its percent-encoded form are not equivalent, and section 6.2.2.2 only permits decoding unreserved characters during normalization.

Previously, decoding and re-encoding URLs turned `%3A` into `:`, `%2F` into `/`, etc.

Both normalization methods (`URI.normalized_encode` and `URI#normalize`) now pass the reserved characters to `normalize_component`. Decoding of unreserved characters is unchanged.

---

NOTE: This is considered a breaking change.

If anyone depended on percent-encoded reserved characters being decoded, they'll now get different output (more "correct", but still).

Also, the non-public, undocumented constants `PCHAR`, `FRAGMENT`, and `QUERY` are removed since they're no longer used internally. (fwiw I didn't find any references to those in public repos on GitHub)

---

Fixes sporkmonger#472
Fixes sporkmonger#424
Fixes sporkmonger#386
Fixes sporkmonger#366
Fixes sporkmonger#295
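For reviewers who want the rule from RFC 3986 section 6.2.2.2 in concrete form, here is a minimal standalone sketch. It is not the code in this PR: `normalize_percent_encoding` and `UNRESERVED` are hypothetical names used only for illustration, and the sketch ignores everything else that normalization does (case folding of scheme and host, dot-segment removal, and so on).

```ruby
# Unreserved characters per RFC 3986 section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = /\A[A-Za-z0-9\-._~]\z/

# Hypothetical helper for illustration only; not part of the library's API.
# Decodes percent-encoded octets only when they represent unreserved
# characters, and leaves every other escape encoded with uppercase hex digits.
def normalize_percent_encoding(component)
  component.gsub(/%([0-9A-Fa-f]{2})/) do
    char = $1.hex.chr
    if char.match?(UNRESERVED)
      char             # unreserved: decoding yields an equivalent URI
    else
      "%#{$1.upcase}"  # reserved or other octet: keep the escape as-is
    end
  end
end

puts normalize_percent_encoding("a%2fb%7Ec%3a")
```

Under these assumptions, `%7E` decodes to `~` because it is unreserved, while `%2f` and `%3a` stay encoded and only have their hex digits uppercased.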