What is URL Parsing?
URL parsing is the process of breaking a Uniform Resource Locator into its individual components: protocol (scheme), hostname, port, pathname, query parameters, and fragment identifier. Every URL follows a structure defined by RFC 3986 and the WHATWG URL Standard. A URL parser reads the raw string, identifies each segment by its delimiter characters (://, :, /, ?, #, &, =), and returns them as separate, accessible fields.
Browsers perform URL parsing every time you type an address or click a link. The JavaScript URL constructor, Python's urllib.parse module, and Go's net/url package all implement parsers that follow the same basic structure, although they differ in edge cases. Parsing is a different operation from URL encoding: instead of transforming characters for safe transport, you decompose an already-formed URL into the parts that compose it.
A typical URL like https://api.example.com:8080/v1/users?page=2&limit=10#section contains six distinct components. The delimiter characters — ://, :, /, ?, &, =, and # — are what make parsing deterministic: each one signals a boundary and allows a parser to extract fields without ambiguity.
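To make the role of each delimiter concrete, here is a minimal Python sketch that splits the example URL by hand. The function name `split_url` is hypothetical, and the code is deliberately naive: it ignores credentials, IPv6 literals, and percent-decoding, so it illustrates the boundaries rather than replacing a real parser.

```python
def split_url(raw: str) -> dict:
    """Naive delimiter-based split; illustrative only, not spec-compliant."""
    # The fragment starts at the first '#', so peel it off first.
    rest, _, fragment = raw.partition("#")
    # The query starts at the first '?' in what remains.
    rest, _, query = rest.partition("?")
    # '://' separates the scheme from the authority and path.
    scheme, _, rest = rest.partition("://")
    # The first '/' after the authority begins the path.
    authority, slash, path = rest.partition("/")
    # A ':' inside the authority separates hostname from port.
    hostname, _, port = authority.partition(":")
    # '&' separates query pairs and '=' separates each key from its value.
    params = [tuple(pair.split("=", 1)) for pair in query.split("&") if pair]
    return {
        "scheme": scheme,
        "hostname": hostname,
        "port": port,
        "path": slash + path,
        "query": params,
        "fragment": fragment,
    }


parts = split_url("https://api.example.com:8080/v1/users?page=2&limit=10#section")
print(parts)
```

The order of the splits matters: the fragment and query are removed before the path is examined, so a `/` or `:` inside them cannot be mistaken for a path or port boundary.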
Why Use an Online URL Parser?
Manually splitting a URL by eye is error-prone, especially when the string contains encoded characters, multiple query parameters, or non-standard ports. This tool parses the URL using the same WHATWG-compliant algorithm that browsers use and displays every component in a clear, copyable table.
URL Parser Use Cases
URL Component Reference
The table below shows the properties returned by the JavaScript URL constructor when parsing a URL. Most of these components also exist in Python's urlparse result, Go's url.URL struct, and PHP's parse_url output, though property names differ across languages and derived values such as origin usually have to be assembled by hand.
| Property | Example | Description |
|---|---|---|
| protocol | https: | Scheme including the trailing colon |
| hostname | api.example.com | Domain name or IP address |
| port | 8080 | Port number (empty string if omitted or equal to the scheme's default) |
| pathname | /v1/users | Path starting with / |
| search | ?page=2&limit=10 | Query string including the leading ? |
| hash | #section | Fragment identifier including the leading # |
| origin | https://api.example.com:8080 | protocol + // + host (scheme, hostname, and port) |
| host | api.example.com:8080 | hostname + port |
| username | admin | Credentials before @ (rarely used in practice) |
| password | secret | Credentials before @ (avoid in production URLs) |
| href | (full URL) | The complete, serialized URL string |
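For readers working outside JavaScript, the following Python sketch maps the output of `urllib.parse.urlsplit` onto the property names in the table. The helper name `whatwg_style_fields` and the `DEFAULT_PORTS` table are assumptions made for illustration. The result only approximates WHATWG behavior: it does not normalize percent-encoding, re-bracket IPv6 literals, or produce a serialized href.

```python
from urllib.parse import urlsplit

# Default ports that a WHATWG-style parser reports as an empty string.
DEFAULT_PORTS = {"http": 80, "https": 443, "ws": 80, "wss": 443, "ftp": 21}


def whatwg_style_fields(raw: str) -> dict:
    """Map urlsplit() output onto the table's property names (approximation)."""
    parts = urlsplit(raw)
    hostname = parts.hostname or ""
    # Report the port as a string, empty when absent or the scheme default.
    if parts.port in (None, DEFAULT_PORTS.get(parts.scheme)):
        port = ""
    else:
        port = str(parts.port)
    host = f"{hostname}:{port}" if port else hostname
    return {
        "protocol": parts.scheme + ":",
        "hostname": hostname,
        "port": port,
        "pathname": parts.path or "/",
        "search": "?" + parts.query if parts.query else "",
        "hash": "#" + parts.fragment if parts.fragment else "",
        "origin": f"{parts.scheme}://{host}",
        "host": host,
        "username": parts.username or "",
        "password": parts.password or "",
    }


fields = whatwg_style_fields(
    "https://admin:secret@api.example.com:8080/v1/users?page=2&limit=10#section"
)
for name, value in fields.items():
    print(f"{name:9} {value}")
```

Passing a URL with an explicit default port, such as https://example.com:443/, yields an empty port and a host without the port suffix, which matches the "empty string if default" behavior described in the table.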
WHATWG URL Standard vs RFC 3986
Two specifications define how URLs should be parsed. They agree on the basic structure but diverge in edge cases — and that divergence is usually the culprit when your browser handles a URL differently than your server does.
In practice, most differences appear when parsing URLs with international domain names (IDN), missing schemes, or unusual characters. The WHATWG parser converts IDN hostnames to Punycode automatically, while strict RFC 3986 parsers may reject them. If you paste a URL into this tool and see different results than your server-side code produces, the WHATWG vs RFC difference is the most likely cause.
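The two cases mentioned above can be observed from Python, whose urllib.parse module follows the RFC-style generic syntax rather than the WHATWG algorithm. This is a small sketch, and the hostname münchen.example is a made-up example. The Punycode conversion uses Python's built-in idna codec (IDNA 2003), which is an assumption about how you might reproduce the browser result and can differ from the WHATWG mapping for some inputs.

```python
from urllib.parse import urlsplit

# 1. International domain name: urlsplit keeps the Unicode hostname as-is.
idn = urlsplit("https://münchen.example/path")
unicode_host = idn.hostname

# A WHATWG parser would report the Punycode form instead; Python's built-in
# "idna" codec can produce it explicitly.
punycode_host = unicode_host.encode("idna").decode("ascii")

print(unicode_host)   # münchen.example
print(punycode_host)  # xn--mnchen-3ya.example

# 2. Missing scheme: urlsplit does not raise; the whole string becomes the path.
bare = urlsplit("example.com/path")
print(repr(bare.scheme), repr(bare.netloc), repr(bare.path))
```

For the second input, the JavaScript URL constructor throws a TypeError when no base URL is supplied, so code that accepts user-entered addresses typically has to decide on a default scheme before parsing.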
Code Examples
Most major languages ship a URL parser in their standard library. The examples below parse the same URL and extract its components. Note the minor naming differences across languages: Python uses scheme instead of protocol, and Go exposes RawQuery instead of search.
```javascript
const url = new URL('https://api.example.com:8080/v1/users?page=2&limit=10#section')

url.protocol // → "https:"
url.hostname // → "api.example.com"
url.port     // → "8080"
url.pathname // → "/v1/users"
url.search   // → "?page=2&limit=10"
url.hash     // → "#section"

// Iterate over query parameters
for (const [key, value] of url.searchParams) {
  console.log(`${key} = ${value}`)
}
// → "page = 2"
// → "limit = 10"

// Modify and re-serialize
url.searchParams.set('page', '3')
url.toString()
// → "https://api.example.com:8080/v1/users?page=3&limit=10#section"
```

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse

result = urlparse('https://api.example.com:8080/v1/users?page=2&limit=10#section')

result.scheme    # → 'https'
result.hostname  # → 'api.example.com'
result.port      # → 8080
result.path      # → '/v1/users'
result.query     # → 'page=2&limit=10'
result.fragment  # → 'section'

# Parse query string into a dict
params = parse_qs(result.query)
params['page']   # → ['2']
params['limit']  # → ['10']

# Reconstruct with modifications
new_query = urlencode({'page': '3', 'limit': '10'})
urlunparse(result._replace(query=new_query))
# → 'https://api.example.com:8080/v1/users?page=3&limit=10#section'
```

```go
package main

import (
	"fmt"
	"net/url"
)

func main() {
	u, err := url.Parse("https://api.example.com:8080/v1/users?page=2&limit=10#section")
	if err != nil {
		panic(err)
	}

	fmt.Println(u.Scheme)     // → "https"
	fmt.Println(u.Hostname()) // → "api.example.com"
	fmt.Println(u.Port())     // → "8080"
	fmt.Println(u.Path)       // → "/v1/users"
	fmt.Println(u.RawQuery)   // → "page=2&limit=10"
	fmt.Println(u.Fragment)   // → "section"

	// Query params as map
	q := u.Query()
	fmt.Println(q.Get("page"))  // → "2"
	fmt.Println(q.Get("limit")) // → "10"
}
```

```php
<?php
$url = 'https://api.example.com:8080/v1/users?page=2&limit=10#section';
$parts = parse_url($url);

$parts['scheme'];   // → "https"
$parts['host'];     // → "api.example.com"
$parts['port'];     // → 8080
$parts['path'];     // → "/v1/users"
$parts['query'];    // → "page=2&limit=10"
$parts['fragment']; // → "section"

// Parse query string into an array
parse_str($parts['query'], $params);
$params['page'];  // → "2"
$params['limit']; // → "10"
```