Handling Special Characters in URL
About
When working with REST APIs, it's common to pass data in the URL using path parameters or query parameters. However, certain characters are reserved or have special meaning in URLs. If not handled properly, these characters can cause errors or change the meaning of the URL.
Why It Matters
Some characters are reserved in URLs, such as:
/
(used to separate path segments)?
(starts query string)&
(separates query parameters)#
(fragment identifier)%
(used for encoding)+
,=
,:
,@
, and others
If these appear in data we are passing as part of the URL, they must be encoded to avoid breaking the URL structure.
Allowed Characters in a URL (Unencoded)
According to the URI specification (RFC 3986), only a specific set of characters are allowed unencoded in a URL. These characters are considered safe or unreserved, and everything else should be percent-encoded.
1. Unreserved Characters (Allowed Anywhere in URL)
These are always safe to use in URLs without encoding:
Alphanumeric characters:
A–Z
,a–z
,0–9
Special characters:
-
(hyphen),_
(underscore),.
(period),~
(tilde)
These characters have no special meaning in the URL structure.
2. Reserved Characters (Allowed but Must Be Encoded if Used as Data)
These characters are allowed in a URL, but they have special meanings, so they must be encoded if used as literal data
3. Examples of a Valid URL (Unencoded parts only)
https://example.com/users/john_doe
https
→ uses unreserved letters/users/john_doe
→ uses/
, letters,_
No special characters like
&
,@
, or=
are present unencoded
Special Characters which are Not Allowed
Technically, URLs can contain only a specific set of characters. Other characters must be percent-encoded using URL encoding (also known as percent-encoding). These characters are either:
Reserved characters – used to control URL structure
Unsafe characters – cause parsing issues or are not allowed
Non-ASCII characters – must always be encoded
1. Reserved Characters (Need Encoding if Used as Data)
These characters have special meaning in URLs and must be encoded when used as data (e.g., in path or query parameter values):
:
Protocol delimiter or port
%3A
/
Path separator
%2F
?
Begins query string
%3F
#
Fragment identifier
%23
[
IPv6 literal
%5B
]
IPv6 literal
%5D
@
User info (before hostname)
%40
!
Sub-delimiter
%21
$
Sub-delimiter
%24
&
Query param separator
%26
'
Often used in queries
%27
(
Grouping
%28
)
Grouping
%29
*
Wildcard
%2A
+
Space in query params
%2B
,
Multiple values separator
%2C
;
Params in paths
%3B
=
Assigning values in query
%3D
2. Unsafe Characters (Always Encode)
These characters are either control characters or not valid in URLs at all:
<
Unsafe in HTML
%3C
>
Unsafe in HTML
%3E
"
Delimiter in attributes
%22
{
Not valid in URLs
%7B
}
Not valid in URLs
%7D
`
`
Not valid in URLs
\
Escape character
%5C
^
Not valid in URLs
%5E
`
Not valid in URLs
%60
space
Breaks URLs
%20
or +
3. Non-ASCII Characters (Always Encode)
Any character outside the ASCII range (0–127), such as:
Accented letters:
é
,ü
,ç
Non-English letters:
ñ
,ø
,å
Symbols and emojis:
©
,✓
,🚀
These must be percent-encoded using UTF-8.
Example:
URLEncoder.encode("naïve café", StandardCharsets.UTF_8)
// Result: naive+caf%C3%A9
Path Parameters and Special Characters
Scenario
We are building a REST endpoint to fetch user details by username. Usernames can include special characters like @
, +
, or even slashes in some systems.
Example API
GET /users/{username}
Now, suppose the username is:
Problem
If we make a request like:
GET /users/[email protected]
It may not work as expected because:
+
could be interpreted as a space@
has special meaning in URLsIf the username has
/
, the server may interpret it as a path separator
Solution: Encode the Path Parameter
Before placing the value in the URL, it must be URL-encoded.
Example in Java (client-side)
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
String rawUsername = "[email protected]";
String encoded = URLEncoder.encode(rawUsername, StandardCharsets.UTF_8);
// Result: john.doe%2Badmin%40example.com
Use the encoded value in the URL:
GET /users/john.doe%2Badmin%40example.com
Example in Java (Spring Controller)
@GetMapping("/users/{username}")
public ResponseEntity<User> getUser(@PathVariable String username) {
return ResponseEntity.ok(userService.findByUsername(username));
}
Spring automatically decodes the path parameter, so in the above case, username
will be "[email protected]"
.
Query Parameters and Special Characters
Query parameters are more forgiving, but encoding is still important. For example:
GET /search?query=coffee & cream
This is invalid due to the space and &
in the value.
Correct Version
GET /search?query=coffee%20%26%20cream
Last updated