
The Complete URL Encoding Guide

From RFC 3986 to real-world development: master percent-encoding, reserved vs unreserved characters, choosing between encodeURI and encodeURIComponent, common pitfalls, real-world scenarios and best practices for handling API parameters, links and form submissions correctly.

📖 ~10 min read 📅 Updated 2026-06-20 ✍️ Tudousi Team
🔧 Try Our URL Encoder / Decoder
Percent-encode and decode online. Supports Chinese characters, special symbols and query parameters. All computation happens locally in your browser to protect your privacy.
#01

What Is URL Percent-Encoding? Core Rules of RFC 3986

URL percent-encoding was originally proposed by Tim Berners-Lee in 1994 for the URI specification, and was later standardized as RFC 3986 (Uniform Resource Identifier: Generic Syntax). It solves a very practical problem: a URL may contain only a limited set of ASCII characters. Unreserved characters can appear literally anywhere, reserved characters can appear literally only when they act as delimiters, and everything else (Chinese characters, emoji, spaces, other symbols) must be "escaped".

A URL can be abstracted into the following components:

scheme:[//authority]path[?query][#fragment]

Each component has its own allowed character set, and percent-encoding is the "universal interface" between them: any character can be written as one or more %XX groups, where each XX is the two-digit hexadecimal value of one byte of the character's UTF-8 encoding (so a multi-byte character becomes several %XX groups).

Examples:

  • A space " " is encoded as %20; in application/x-www-form-urlencoded form data it may, for historical reasons, also appear as +.
  • The characters "土豆" in UTF-8 are 0xE5 0x9C 0x9F 0xE8 0xB1 0x86, hence they encode as %E5%9C%9F%E8%B1%86.
  • The reserved character & encodes as %26.
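
The byte-level rule behind these examples can be sketched in a few lines of JavaScript. This is an illustration only, assuming a runtime that provides TextEncoder (modern browsers and Node.js). percentEncode is a hypothetical helper name, and it is stricter than the built-in encodeURIComponent because it escapes everything except the RFC 3986 unreserved characters.

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte that is not an
// RFC 3986 unreserved character (A-Z a-z 0-9 - . _ ~).
function percentEncode(text) {
  const unreserved = /[A-Za-z0-9\-._~]/;
  let out = "";
  for (const byte of new TextEncoder().encode(text)) {
    const ch = String.fromCharCode(byte);
    out += byte < 0x80 && unreserved.test(ch)
      ? ch
      : "%" + byte.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

console.log(percentEncode(" "));    // "%20"
console.log(percentEncode("土豆")); // "%E5%9C%9F%E8%B1%86"
console.log(percentEncode("&"));    // "%26"
```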

With our tool, you can paste any text, choose "Encode" or "Decode", and instantly see its percent-encoded form or original characters.

#02

Reserved vs Unreserved Characters: What Must Be Encoded

RFC 3986 divides characters into two categories. Understanding them helps you decide when to encode.

1. Unreserved Characters — Never require encoding

A-Z a-z 0-9 - . _ ~

These can appear literally anywhere in a URL. Encoding them is legal but produces a fully equivalent result — no semantic difference.

2. Reserved Characters — Must be encoded when used as data

There are two sub-groups:

  • Generic Delimiters (separate URL layers): : / ? # [ ] @
  • Sub-Delimiters (separate internal parameter structure): ! $ & ' ( ) * + , ; =

When they serve as delimiters, they must remain literal; when they appear inside data values, they must be encoded. For example:

  • In ?a=1&b=2, & is a parameter separator and must stay literal.
  • In ?q=Tom%26Jerry, the user's "&" is part of the value and must be encoded as %26; otherwise the server would read it as an extra parameter.

Rule of thumb: when building URL parameter values, always use encodeURIComponent and let the browser / server decide what to decode.
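
The difference between a delimiter "&" and a data "&" can be observed with the standard URLSearchParams parser. The snippet below is a minimal sketch; the variable names are illustrative, and a real server may use a different parser with the same splitting behavior.

```javascript
// Value supplied by the user; it contains a literal "&".
const rawValue = "Tom&Jerry";

// Naive concatenation: the "&" is read as a parameter separator.
const naiveParsed = new URLSearchParams("q=" + rawValue);

// Encoded value: the "&" travels as %26 and is restored on parsing.
const safeQuery = "q=" + encodeURIComponent(rawValue);
const safeParsed = new URLSearchParams(safeQuery);

console.log(naiveParsed.get("q"));     // "Tom"
console.log(naiveParsed.has("Jerry")); // true (a spurious extra parameter)
console.log(safeQuery);                // "q=Tom%26Jerry"
console.log(safeParsed.get("q"));      // "Tom&Jerry"
```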

#03

encodeURI vs encodeURIComponent — The Right Choice

Browsers expose two encoding functions, and they are not interchangeable. Here is the comparison:

1. encodeURI

Purpose: encode a complete URL. It leaves the characters that give a URL its structure untouched: ; / ? : @ & = + $ , # as well as - _ . ! ~ * ' ( ). Note that [ and ] are not in this set, so encodeURI turns them into %5B and %5D.

Typical scenario: you have a URL containing non-ASCII characters or emoji, and you want to turn it into pure ASCII that browsers can open. E.g. turning https://示例.com/搜索 土豆 into https://%E7%A4%BA%E4%BE%8B.com/%E6%90%9C%E7%B4%A2%20%E5%9C%9F%E8%B1%86.

2. encodeURIComponent

Purpose: encode a single parameter value or path segment. It encodes every reserved character except ! * ' ( ), so the only characters left literal are letters, digits, and - . _ ~ ! * ' ( ).

Typical scenario: when building ?keyword=user-input, redirect_uri=target-address, etc. — always encode the "value" with encodeURIComponent.
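
Running both functions over the same string makes the contrast concrete. The URL is a placeholder, and encodeRfc3986 is a hypothetical helper (not a built-in) sketched here for cases where ! ' ( ) * must also be escaped.

```javascript
const fullUrl = "https://example.com/a b?x=1&y=2#top";

// encodeURI keeps URL structure and escapes only what cannot appear literally.
console.log(encodeURI(fullUrl));
// "https://example.com/a%20b?x=1&y=2#top"

// encodeURIComponent treats the whole string as one opaque value.
console.log(encodeURIComponent(fullUrl));
// "https%3A%2F%2Fexample.com%2Fa%20b%3Fx%3D1%26y%3D2%23top"

// Hypothetical stricter variant: also escape ! ' ( ) * so that only
// RFC 3986 unreserved characters remain literal.
function encodeRfc3986(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeURIComponent("it's (ok)!*")); // "it's%20(ok)!*"
console.log(encodeRfc3986("it's (ok)!*"));      // "it%27s%20%28ok%29%21%2A"
```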

Counterparts on the server side:

  • Node.js: decodeURIComponent(...) or querystring.unescape;
  • Python: urllib.parse.unquote / unquote_plus;
  • Java: URLDecoder.decode(value, StandardCharsets.UTF_8);
  • PHP: urldecode / rawurldecode.

Note: most server-side frameworks already perform one URL-decoding pass on query strings. Do NOT decode the value a second time in your own code; otherwise "double-decoding" attacks become possible: %2526 → %26 → &, which can let an attacker slip a payload past WAF rules that only inspect the once-decoded form.
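
The decoding chain behind this attack can be reproduced directly. The snippet is only a demonstration of the mechanism; the variable names are illustrative.

```javascript
// What the attacker sends on the wire.
const wireValue = "%2526";

// First pass (normally done by the framework): "%25" becomes "%".
const onceDecoded = decodeURIComponent(wireValue);

// Accidental second pass in application code: "%26" becomes "&".
const twiceDecoded = decodeURIComponent(onceDecoded);

console.log(onceDecoded);  // "%26"
console.log(twiceDecoded); // "&"
```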

#04

Common Encoding Pitfalls & Debugging Techniques

The following mistakes are the most common in production. When you face "links that won't open, lost parameters, garbled characters", check this list first.

Pitfall 1: Running the whole URL through encodeURIComponent

The result is that :, /, ? all get encoded as %3A %2F %3F, turning the URL into a long hex string that the browser cannot resolve. Correct approach: only encode the "parameter values" and "path segments" individually, then assemble them with template strings.
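
A minimal sketch of the "encode the parts, then assemble" approach follows. buildUrl is a hypothetical helper and the origin is a placeholder; it assumes every segment and parameter is raw, not yet encoded.

```javascript
// Hypothetical helper: encode each path segment and parameter on its own,
// then assemble the pieces around literal delimiters.
function buildUrl(origin, segments, params) {
  const path = segments.map((s) => encodeURIComponent(s)).join("/");
  const query = Object.entries(params)
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join("&");
  return `${origin}/${path}${query ? "?" + query : ""}`;
}

const good = buildUrl("https://api.example.com", ["files", "a/b c"], { q: "Tom & Jerry" });
console.log(good);
// "https://api.example.com/files/a%2Fb%20c?q=Tom%20%26%20Jerry"

// The pitfall: encoding the assembled URL destroys its structure.
const bad = encodeURIComponent("https://api.example.com/files?q=Tom & Jerry");
console.log(bad);
// "https%3A%2F%2Fapi.example.com%2Ffiles%3Fq%3DTom%20%26%20Jerry"
```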

Pitfall 2: Double-Encoding

The classic symptom: "Tom & Jerry" → "Tom%20%26%20Jerry" → then someone encodes it again → "Tom%2520%2526%2520Jerry". After the server performs its single decode, it still sees percent-signs in the value. Solution: encode exactly once, in the layer closest to the user input.
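
The symptom is easy to reproduce, which also makes it easy to recognize in logs: a %25 in front of another escape is the usual fingerprint. The variable names below are illustrative.

```javascript
const input = "Tom & Jerry";
const once = encodeURIComponent(input);
const twice = encodeURIComponent(once); // the accidental second pass

console.log(once);  // "Tom%20%26%20Jerry"
console.log(twice); // "Tom%2520%2526%2520Jerry"

// A single decode of the double-encoded value only gets back to the
// once-encoded string, not to the original input.
console.log(decodeURIComponent(twice) === once); // true
```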

Pitfall 3: Spaces encoded as + instead of %20

+ is a legacy convention of application/x-www-form-urlencoded. Strictly speaking, it is not part of RFC 3986. In paths, general query strings, or OAuth signatures, spaces must use %20. When debugging APIs, always check the raw query string in browser DevTools, not the human-friendly preview.
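
The two conventions show up in the standard APIs themselves: URLSearchParams follows the form-encoding rules, while encodeURIComponent and decodeURIComponent follow plain percent-encoding. A short comparison, with illustrative variable names:

```javascript
// Form-style serialization: space becomes "+".
const formStyle = new URLSearchParams({ q: "a b" }).toString();
// Percent-encoding style: space becomes "%20".
const rfcStyle = "q=" + encodeURIComponent("a b");

console.log(formStyle); // "q=a+b"
console.log(rfcStyle);  // "q=a%20b"

// The decoders differ in the same way.
console.log(decodeURIComponent("a+b"));               // "a+b" (plus is left alone)
console.log(new URLSearchParams("q=a+b").get("q"));   // "a b" (plus means space)
console.log(new URLSearchParams("q=a%2Bb").get("q")); // "a+b" (a real plus must be %2B)
```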

Pitfall 4: Using decodeURI on encodeURIComponent output

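decodeURI is the inverse of encodeURI, so it deliberately leaves escapes of reserved characters such as %2F or %3F untouched. It does not throw on them; the value simply comes back still partially encoded. (Both decoders throw a URIError only on malformed sequences, such as a lone % or a truncated UTF-8 byte group.) Always use decodeURIComponent to decode values produced by encodeURIComponent.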
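
A quick way to see the mismatch is to run both decoders over the same value. safeDecode is a hypothetical wrapper, included only to show one way of handling the URIError that malformed input triggers.

```javascript
const encodedValue = encodeURIComponent("a/b?c");

console.log(encodedValue);                     // "a%2Fb%3Fc"
console.log(decodeURI(encodedValue));          // "a%2Fb%3Fc" (reserved escapes are kept)
console.log(decodeURIComponent(encodedValue)); // "a/b?c"

// Hypothetical wrapper: return null instead of throwing on malformed input.
function safeDecode(value) {
  try {
    return decodeURIComponent(value);
  } catch (err) {
    if (err instanceof URIError) return null; // caller decides what to do next
    throw err;
  }
}

console.log(safeDecode("%E4%BD")); // null (truncated UTF-8 sequence)
```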

Debugging Checklist

  • Copy the URL from the browser address bar into our online tool, pick "Decode", and check whether the raw values match expectations;
  • In the Chrome DevTools Network panel, open a request's query string parameters and toggle between the decoded and URL-encoded views to spot double-encoding;
  • URL-safe Base64 uses its own alphabet (+ → -, / → _), which is a different rule set from percent-encoding. OAuth 1.0a signatures, by contrast, use strict RFC 3986 percent-encoding in which everything except unreserved characters is escaped.

#05

Real-World Scenarios: APIs, OAuth and Deep Links

Here are several scenarios where correct URL encoding is critical in production systems.

Scenario 1: REST API Query Parameters

When a user searches for "Tom & Jerry", simple string concatenation would result in ?q=Tom & Jerry. The server would split the query at the literal &, read q as only "Tom" (plus a stray space) and treat "Jerry" as a spurious extra parameter. Correct form:

const url = `/search?q=${encodeURIComponent(keyword)}`;

Scenario 2: OAuth 2.0 redirect_uri

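Most authorization servers compare the redirect_uri against the registered value by exact string match. Because the redirect_uri is itself a URL carried inside a query string, the client must run it through encodeURIComponent so that it appears as https%3A%2F%2Fapp.example.com%2Fcallback. A common bug: skipping this step lets the ?, & or # characters inside the redirect_uri corrupt the outer query string, and the authorization server rejects the request with a redirect URI mismatch error (the exact error code varies by provider).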
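
As a sketch, an authorization URL might be assembled as follows. The endpoint, client ID and state value are placeholders, and the parameter list is kept to a minimum; a real provider's documentation defines the required set.

```javascript
// All endpoint and client values below are placeholders.
const authorizeEndpoint = "https://auth.example.com/authorize";
const redirectUri = "https://app.example.com/callback";

const authUrl =
  authorizeEndpoint +
  "?response_type=code" +
  "&client_id=" + encodeURIComponent("my-client-id") +
  "&redirect_uri=" + encodeURIComponent(redirectUri) +
  "&state=" + encodeURIComponent("xyz 123");

console.log(authUrl);
// "https://auth.example.com/authorize?response_type=code&client_id=my-client-id&redirect_uri=https%3A%2F%2Fapp.example.com%2Fcallback&state=xyz%20123"

// After one decode on the receiving side, the original value is intact.
console.log(new URL(authUrl).searchParams.get("redirect_uri") === redirectUri); // true
```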

Scenario 3: Mobile Deep Links

When a WebView navigates to myapp://product?id=123&ref=search, any spaces, ampersands or non-ASCII characters inside "ref" must be individually encoded; otherwise iOS / Android parsers will truncate at the first literal &. Typical pattern: encodeURIComponent each field value, join with &, then append to the scheme.

Scenario 4: Open-Redirect & Phishing

An often-overlooked risk: if a server accepts a redirect parameter without validation, an attacker can craft ?redirect=//evil.com and forward users to a phishing site. Encoding alone does not solve this, but correct encoding combined with a strict host whitelist (only allow redirects to trusted domains) is the industry recommendation.
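
One possible shape for such a host check is sketched below. ALLOWED_HOSTS, APP_ORIGIN and isSafeRedirect are hypothetical names, and the policy (HTTPS only, exact hostname match) is an assumption; adapt it to your own rules.

```javascript
// Hypothetical allowlist and application origin.
const ALLOWED_HOSTS = new Set(["app.example.com"]);
const APP_ORIGIN = "https://app.example.com";

function isSafeRedirect(target) {
  let url;
  try {
    // Resolve relative and protocol-relative targets the way a browser would.
    url = new URL(target, APP_ORIGIN);
  } catch {
    return false; // unparseable input is rejected
  }
  return url.protocol === "https:" && ALLOWED_HOSTS.has(url.hostname);
}

console.log(isSafeRedirect("/dashboard"));           // true
console.log(isSafeRedirect("//evil.com"));           // false
console.log(isSafeRedirect("https://evil.com/x"));   // false
console.log(isSafeRedirect("javascript:alert(1)"));  // false
```

Parsing the target before checking it matters: a string comparison on the raw parameter would not notice that //evil.com resolves to a different host.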

#06

Multilingual Characters, Symbols and the UTF-8 Effect

In modern URL specifications, characters are always encoded in UTF-8. Most programming languages and frameworks use this by default, but you still need to watch the following details.

1. Non-ASCII characters are always encoded

All Chinese characters, Japanese kana, emoji, Greek letters, diacritics, etc. must be written as %XX%XX... groups. Examples:

  • "你好" → %E4%BD%A0%E5%A5%BD
  • emoji "🎉" → %F0%9F%8E%89 (4 bytes → 4 %XX groups)

2. Length ≠ Character Count

A single Chinese character expands to 9 URL characters (3 bytes × 3 chars per byte: e.g. %E4%BD%A0). In constrained URL-length environments (such as the legacy 2083-char IE limit, or WAF header-length limits), long non-ASCII parameters can cause truncation. Recommendations:

  • Transfer extremely long parameters in a POST body instead;
  • When a URL is the only option, base64-encode the string first (ideally with the URL-safe alphabet, so that + / = do not need escaping), then percent-encode;
  • Always verify the server-side max-query-string configuration.
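
The size difference can be measured directly. The sketch below assumes a runtime with TextEncoder and btoa (modern browsers and recent Node.js versions); toBase64Url is a hypothetical helper, and the receiving side must of course apply the matching decoding.

```javascript
// Hypothetical helper: UTF-8 encode, Base64 encode, then switch to the
// URL-safe alphabet (+ → -, / → _) and drop the "=" padding.
function toBase64Url(text) {
  let binary = "";
  for (const byte of new TextEncoder().encode(text)) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary)
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

const sample = "你好";
console.log(encodeURIComponent(sample));        // "%E4%BD%A0%E5%A5%BD"
console.log(encodeURIComponent(sample).length); // 18
console.log(toBase64Url(sample));               // "5L2g5aW9"
console.log(toBase64Url(sample).length);        // 8
```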

3. Internationalized Domain Names (IDN / Punycode)

When a domain contains non-ASCII characters (e.g. 示例.com), browsers internally convert it to its Punycode form (xn--fsq092h.com) before performing DNS lookups. This is a separate mechanism from "percent-encoding of path / query parameters", so do not mix them up:

  • Domain name → Punycode (xn--...);
  • Path / Query → Percent-Encoding (%XX).
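
The standard URL parser applies both mechanisms at once, which makes the split easy to observe. A small sketch, assuming a runtime with the WHATWG URL API (browsers and Node.js); the URL itself is a made-up example.

```javascript
const parsed = new URL("https://示例.com/搜索 土豆?q=你好");

// Host: converted to Punycode, so it starts with "xn--".
console.log(parsed.hostname);

// Path and query: percent-encoded as UTF-8 bytes.
console.log(parsed.pathname); // "/%E6%90%9C%E7%B4%A2%20%E5%9C%9F%E8%B1%86"
console.log(parsed.search);   // "?q=%E4%BD%A0%E5%A5%BD"
```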

When you paste a complete non-ASCII URL into our online tool, the path and query portions get correctly percent-encoded while structural characters (".", "/", "?") are preserved.

#07

Data Security & Privacy: Why Locally-Run Online Tools Matter

URLs often carry sensitive information: internal API paths, user IDs, redirect targets, OAuth state, search keywords, and more. Sending them to a third-party server means they can be logged, analyzed, or used for ad targeting.

Our tool follows a strict "100% frontend-only" principle:

  • All encode / decode logic calls the browser's native encodeURIComponent / decodeURIComponent or equivalent;
  • No input, output, or intermediate result is sent to any backend;
  • Nothing is persisted via localStorage, cookies, or similar mechanisms;
  • You can disconnect from the network and keep using it.

For users who need to process URLs containing tokens, internal addresses, or user data, we additionally recommend:

  • Work in an offline or otherwise controlled environment, or manually redact sensitive fields before pasting;
  • Avoid pasting production addresses on public computers;
  • Never share sensitive URLs over social media or IM — use separate, revocable short links or internal documents instead.

Final reminder: when evaluating any online URL processing tool, first confirm that it does not send your input to a server. A quick way to verify: open your browser's DevTools Network panel, then click "Execute" and see whether new requests are made — our tool makes none. Feel free to try it yourself at our URL encoder / decoder.