# SerializationTest.txt # Date: 2025-06-27 # © 2025 Unicode®, Inc. # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries. # For terms of use and license, see https://www.unicode.org/terms_of_use.html # # For documentation and usage, see https://www.unicode.org/reports/tr58/ # # Field 0: Pre-Path portion of URL # Field 1: Path # Field 2: Query # Field 3: Fragment # Field 4: Expected result # # The input is 4 separate fields, 0..4. It represents the internal form of a URL, pre-escaping. # The result is an escaped string for output. # # The reason the input is represented as 4 different fields is that the algorithm applies different escaping # to each piece. # # Notes: # - The # character is represented by \x{23} when it is part of a field (instead of introducing a comment in the data file) # - Leading and trailing spaces in a field are to be omitted, but interior spaces are retained # Path only https://example.com; α; ; ; https://example.com/α # Query only https://example.com; ; α; ; https://example.com?α # Fragment only https://example.com; ; ; α; https://example.com\x{23}α # All parts https://example.com; αβγ/δεζ; θ=ικλ&μ=γξο; πρς; https://example.com/αβγ/δεζ?θ=ικλ&μ=γξο\x{23}πρς # Escape ? in Path https://example.com; α?μπ; ; ; https://example.com/α%3Fμπ # Escape # in Path/Query https://example.com; α\x{23}β; γ=δ\x{23}ε; ; https://example.com/α%23β?γ=δ%23ε # Escape hard (' ') https://example.com; αβ γ/δεζ; θ=ικ λ&=γξο; πρ σ; https://example.com/αβ%20γ/δεζ?θ=ικ%20λ&=γξο\x{23}πρ%20σ # Escape soft ('.') unless followed by include https://example.com; αβγ./δεζ.; θ=ικ.λ&=γξο.; πρς.; https://example.com/αβγ./δεζ.?θ=ικ.λ&=γξο.\x{23}πρς%2E # Escape unmatched brackets https://example.com; α(β)); γ(δ)); ε(ζ)); https://example.com/α(β)%29?γ(δ)%29\x{23}ε(ζ)%29