What is URL Parsing?

URL parsing is the process of breaking a Uniform Resource Locator into its individual components: protocol (scheme), hostname, port, pathname, query parameters, and fragment identifier. Every URL follows a structure defined by RFC 3986 and the WHATWG URL Standard. A URL parser reads the raw string, identifies each segment by its delimiter characters (://, :, /, ?, #, &, =), and returns them as separate, accessible fields.

Browsers perform URL parsing every time you type an address or click a link. The JavaScript URL constructor, Python's urllib.parse module, and Go's net/url package all implement parsers that follow the same basic structural rules, although they differ in edge cases. Parsing is a different operation from URL encoding: rather than transforming characters for safe transport, it decomposes an already-formed URL into the parts that compose it.

A typical URL like https://api.example.com:8080/v1/users?page=2&limit=10#section contains six distinct components. The delimiter characters — ://, :, /, ?, &, =, and # — are what make parsing deterministic: each one signals a boundary and allows a parser to extract fields without ambiguity.
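To make the role of the delimiters concrete, here is a minimal Python sketch that splits the example URL with the regular expression given in RFC 3986, Appendix B. The helper name split_uri is hypothetical, and the host/port split at the end is a simplification that assumes no credentials and no IPv6 literal. Production code should rely on a real parser such as urllib.parse.

```python
import re

# Regular expression from RFC 3986, Appendix B. It splits a URI reference on
# the delimiters ":", "//", "/", "?" and "#" without validating the pieces.
URI_PATTERN = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri: str) -> dict:
    """Hypothetical helper: return the five top-level URI components."""
    match = URI_PATTERN.match(uri)  # every group is optional, so this always matches
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


parts = split_uri("https://api.example.com:8080/v1/users?page=2&limit=10#section")
print(parts["scheme"])     # https
print(parts["authority"])  # api.example.com:8080
print(parts["path"])       # /v1/users
print(parts["query"])      # page=2&limit=10
print(parts["fragment"])   # section

# Simplified host/port split: assumes no "user:pass@" prefix and no IPv6 literal.
host, _, port = parts["authority"].partition(":")
print(host, port)          # api.example.com 8080
```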

Why Use an Online URL Parser?

Manually splitting a URL by eye is error-prone, especially when the string contains encoded characters, multiple query parameters, or non-standard ports. This tool parses the URL using the same WHATWG-compliant algorithm that browsers use and displays every component in a clear, copyable table.

⚡

Parse instantly in your browser

Paste any URL and see all components broken down immediately. No page reload, no server call, no waiting.

🔒

Keep your URLs private

Parsing runs entirely in your browser using the native URL API. The URL you enter never leaves your machine.

🔍

Inspect every detail

See protocol, hostname, port, pathname, query string, hash, and each individual query parameter with its decoded value.

📋

Copy individual components

Click the copy button next to any field to grab the exact value. No need to manually select and trim substrings.

URL Parser Use Cases

Frontend routing debugging

Check that path segments and hash fragments match your router configuration. Spot misplaced slashes or unexpected query parameters before they cause 404s.

Backend API endpoint validation

Verify that incoming request URLs contain the correct hostname, port, and path structure before writing route handlers or middleware.

DevOps redirect rule testing

When writing Nginx, Apache, or CDN redirect rules, parse the original and target URLs to confirm each component maps correctly.

QA link verification

Parse URLs from test reports or bug tickets to isolate which query parameter or fragment is causing the wrong page to load.

Data pipeline URL extraction

Extract hostnames or path segments from URLs in log files or analytics data to build domain-level reports or filter traffic by endpoint.

Learning URL structure

Students and developers new to web protocols can paste real URLs and immediately see which delimiter marks which boundary.
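As an illustration of the data pipeline use case, the following Python sketch counts requests per hostname and per path. The log_urls list is made-up sample data. A real pipeline would read the URLs from a log file or analytics export.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical sample of URLs as they might appear in an access log.
log_urls = [
    "https://api.example.com:8080/v1/users?page=2&limit=10",
    "https://api.example.com:8080/v1/orders?page=1",
    "https://cdn.example.com/assets/app.js",
    "https://api.example.com:8080/v1/users?page=3",
]

# Count requests per hostname (the port is not part of hostname).
hits_per_host = Counter(urlsplit(u).hostname for u in log_urls)

# Count requests per path, ignoring the query string.
hits_per_path = Counter(urlsplit(u).path for u in log_urls)

print(hits_per_host.most_common())
print(hits_per_path.most_common())
```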

URL Component Reference

The table below shows every property returned by the JavaScript URL constructor when parsing a URL. The same components exist in Python's urlparse result, Go's url.URL struct, and PHP's parse_url output, though property names differ across languages.

Property	Example	Description
protocol	https:	Scheme including the trailing colon
hostname	api.example.com	Domain name or IP address
port	8080	Port number (empty string if omitted or equal to the scheme's default)
pathname	/v1/users	Path starting with /
search	?page=2&limit=10	Query string including the leading ?
hash	#section	Fragment identifier including the leading #
origin	https://api.example.com:8080	protocol + // + host (read-only)
host	api.example.com:8080	hostname + port
username	admin	Credentials before @ (rarely used in practice)
password	secret	Credentials before @ (avoid in production URLs)
href	(full URL)	The complete, serialized URL string
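For comparison, this sketch shows how the same properties map onto the result of Python's urlsplit. The credentials admin:secret are added to the example URL purely for illustration, so that username and password have values.

```python
from urllib.parse import urlsplit

# Example URL with hypothetical credentials added for illustration.
parts = urlsplit(
    "https://admin:secret@api.example.com:8080/v1/users?page=2&limit=10#section"
)

print(parts.scheme)    # https            (JS protocol, without the trailing colon)
print(parts.netloc)    # admin:secret@api.example.com:8080  (credentials + JS host)
print(parts.hostname)  # api.example.com  (JS hostname)
print(parts.port)      # 8080             (an int, not a string)
print(parts.path)      # /v1/users        (JS pathname)
print(parts.query)     # page=2&limit=10  (JS search, without the leading ?)
print(parts.fragment)  # section          (JS hash, without the leading #)
print(parts.username)  # admin
print(parts.password)  # secret
print(parts.geturl())  # the full URL, comparable to JS href
```

Python has no direct counterpart to origin. It can be assembled from scheme, hostname, and port when needed.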

WHATWG URL Standard vs RFC 3986

Two specifications define how URLs should be parsed. They agree on the basic structure but diverge in edge cases — and that divergence is usually the culprit when your browser handles a URL differently than your server does.

WHATWG URL Standard

Used by all modern browsers and the JavaScript URL constructor. Accepts and normalizes sloppy input: backslashes as path separators, stray tabs and newlines, and non-ASCII hostnames (converted to Punycode). A missing scheme is still an error unless a base URL is supplied. Defined as a living standard at url.spec.whatwg.org.

RFC 3986

The formal IETF specification, published in 2005. Its grammar is stricter than WHATWG's, so parsers built on it may reject, or interpret differently, some inputs that browsers accept. Many server-side libraries, including Go's net/url and Python's urllib.parse, are modeled on RFC 3986.

In practice, most differences appear when parsing URLs with international domain names (IDN), missing schemes, or unusual characters. The WHATWG parser converts IDN hostnames to Punycode automatically, while RFC 3986-based parsers may reject them or leave them unconverted. If you paste a URL into this tool and see different results than your server-side code produces, the WHATWG vs RFC difference is the most likely cause.
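The sketch below shows two of these edge cases from the server side, using Python's urllib.parse. The hostname münchen.example is a made-up example. The comments describe CPython's behavior and are not a statement about every RFC 3986-based library.

```python
from urllib.parse import urlsplit

# Missing scheme: nothing is rejected. The whole input ends up in path,
# while scheme and netloc stay empty.
no_scheme = urlsplit("example.com/path")
print(repr(no_scheme.scheme))  # ''
print(repr(no_scheme.netloc))  # ''
print(no_scheme.path)          # example.com/path

# International domain name: the hostname is returned as typed,
# not converted to Punycode.
idn = urlsplit("https://münchen.example/")
print(idn.hostname)            # münchen.example
```

A WHATWG parser treats both inputs differently: new URL('example.com/path') throws because there is no scheme, and the IDN hostname comes back in its xn-- form.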

Code Examples

Every major language has a built-in URL parser. The examples below parse the same URL and extract its components. Note the minor naming differences across languages: Python uses scheme instead of protocol, and Go exposes RawQuery instead of search.

JavaScript (browser / Node.js)

const url = new URL('https://api.example.com:8080/v1/users?page=2&limit=10#section')

url.protocol  // → "https:"
url.hostname  // → "api.example.com"
url.port      // → "8080"
url.pathname  // → "/v1/users"
url.search    // → "?page=2&limit=10"
url.hash      // → "#section"

// Iterate over query parameters
for (const [key, value] of url.searchParams) {
  console.log(`${key} = ${value}`)
}
// → "page = 2"
// → "limit = 10"

// Modify and re-serialize
url.searchParams.set('page', '3')
url.toString()
// → "https://api.example.com:8080/v1/users?page=3&limit=10#section"

Python

from urllib.parse import urlparse, parse_qs

result = urlparse('https://api.example.com:8080/v1/users?page=2&limit=10#section')

result.scheme    # → 'https'
result.hostname  # → 'api.example.com'
result.port      # → 8080
result.path      # → '/v1/users'
result.query     # → 'page=2&limit=10'
result.fragment  # → 'section'

# Parse query string into a dict
params = parse_qs(result.query)
params['page']   # → ['2']
params['limit']  # → ['10']

# Reconstruct with modifications
from urllib.parse import urlencode, urlunparse
new_query = urlencode({'page': '3', 'limit': '10'})
urlunparse(result._replace(query=new_query))
# → 'https://api.example.com:8080/v1/users?page=3&limit=10#section'

Go

package main

import (
	"fmt"
	"net/url"
)

func main() {
	u, err := url.Parse("https://api.example.com:8080/v1/users?page=2&limit=10#section")
	if err != nil {
		panic(err)
	}

	fmt.Println(u.Scheme)     // → "https"
	fmt.Println(u.Hostname()) // → "api.example.com"
	fmt.Println(u.Port())     // → "8080"
	fmt.Println(u.Path)       // → "/v1/users"
	fmt.Println(u.RawQuery)   // → "page=2&limit=10"
	fmt.Println(u.Fragment)   // → "section"

	// Query params as map
	q := u.Query()
	fmt.Println(q.Get("page"))  // → "2"
	fmt.Println(q.Get("limit")) // → "10"
}

PHP

<?php
$url = 'https://api.example.com:8080/v1/users?page=2&limit=10#section';
$parts = parse_url($url);

$parts['scheme'];   // → "https"
$parts['host'];     // → "api.example.com"
$parts['port'];     // → 8080
$parts['path'];     // → "/v1/users"
$parts['query'];    // → "page=2&limit=10"
$parts['fragment']; // → "section"

// Parse query string into an array
parse_str($parts['query'], $params);
$params['page'];    // → "2"
$params['limit'];   // → "10"

Frequently Asked Questions

What is the difference between a URL and a URI?

A URL (Uniform Resource Locator) is a specific type of URI (Uniform Resource Identifier) that includes both the identifier and the access mechanism (the scheme, like https://). All URLs are URIs, but not all URIs are URLs. A URN like urn:isbn:0451450523 is a URI that identifies a resource by name without specifying how to retrieve it. In web development, the terms are often used interchangeably because almost every URI you encounter is a URL.

How does the URL constructor handle relative URLs?

The JavaScript URL constructor requires a base URL when parsing relative paths. Calling new URL('/path?q=1') throws a TypeError. You must provide a base: new URL('/path?q=1', 'https://example.com'). Python's urljoin and Go's url.ResolveReference serve the same purpose. This tool expects complete, absolute URLs with a scheme.
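A short Python sketch of the same idea, using urljoin to resolve relative references against a base. The base URL here is an arbitrary example.

```python
from urllib.parse import urljoin, urlsplit

base = "https://example.com/docs/guide"

# An absolute path replaces the base path entirely.
absolute_path = urljoin(base, "/path?q=1")
print(absolute_path)  # https://example.com/path?q=1

# A bare name is resolved against the base's directory (/docs/).
sibling = urljoin(base, "intro")
print(sibling)        # https://example.com/docs/intro

# ".." climbs one directory before appending.
parent = urljoin(base, "../api")
print(parent)         # https://example.com/api

# Unlike new URL(), urlsplit does not raise on a relative reference:
# scheme and netloc are simply empty.
relative = urlsplit("/path?q=1")
print(repr(relative.scheme), repr(relative.netloc), relative.path, relative.query)
```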

What happens when a URL has no port number?

When the port is omitted, the parser returns an empty string for the port property. The browser assumes the default port for the scheme: 443 for https, 80 for http, 21 for ftp. The same happens when the URL spells out the default port explicitly (https://example.com:443): the WHATWG parser drops it during normalization. The host and origin properties omit a default port as well, so to get the effective port you have to map the scheme to its default yourself.
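Python behaves similarly: urlsplit reports port as None when the URL contains no explicit port. Here is a minimal sketch of resolving the effective port, assuming a small hypothetical DEFAULT_PORTS table that covers only the three schemes mentioned above.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical lookup table; extend it for other schemes as needed.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def effective_port(url: str) -> Optional[int]:
    """Return the explicit port, else the scheme's default, else None."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


print(urlsplit("https://example.com/").port)       # None
print(effective_port("https://example.com/"))      # 443
print(effective_port("http://example.com:8080/"))  # 8080
print(effective_port("ftp://example.com/"))        # 21
```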

Can a URL contain Unicode characters?

Yes, but they must be encoded for transmission. The WHATWG URL Standard handles this automatically: international domain names are converted to Punycode (xn-- prefix), and path/query characters outside the ASCII range are percent-encoded. If you paste a URL with Unicode into this tool, you will see the normalized, ASCII-safe version in the parsed output.
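Parsers that do not normalize automatically leave this step to you. The sketch below approximates it in Python with the built-in idna codec and percent-encoding of non-ASCII characters. The function to_ascii_url is hypothetical and the URL is made up. It assumes an absolute URL, and it drops any credentials. Python's idna codec also implements an older IDNA version than the one browsers use, so results can differ for some hostnames.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def _encode_non_ascii(text: str) -> str:
    """Percent-encode only characters outside ASCII (as UTF-8 bytes)."""
    return "".join(ch if ord(ch) < 128 else quote(ch) for ch in text)


def to_ascii_url(url: str) -> str:
    """Hypothetical helper: return an ASCII-only form of an absolute URL."""
    parts = urlsplit(url)
    # Convert the hostname to Punycode; any credentials are dropped here.
    host = parts.hostname.encode("idna").decode("ascii")
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((
        parts.scheme,
        netloc,
        _encode_non_ascii(parts.path),
        _encode_non_ascii(parts.query),
        _encode_non_ascii(parts.fragment),
    ))


ascii_url = to_ascii_url("https://münchen.example/café?q=naïve")
print(ascii_url)  # https://xn--mnchen-3ya.example/caf%C3%A9?q=na%C3%AFve
```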

What is the maximum length of a URL?

No standard defines a maximum URL length, and RFC 3986 is silent on the topic. In practice, browsers enforce limits: Chrome supports up to roughly 2 MB in the address bar, while Internet Explorer (legacy) had a 2,083-character limit. Servers are usually stricter: Nginx and Apache both default to about 8 KB for the request line. If you need to pass large data, use POST request bodies instead of query strings.

How do I parse just the query string without the full URL?

In JavaScript, use new URLSearchParams('page=2&limit=10') to parse a bare query string. In Python, use urllib.parse.parse_qs('page=2&limit=10'). URLSearchParams is iterable as key-value pairs, while parse_qs returns a dict that maps each key to a list of values. This is useful when you have the query string in isolation, for example from a form submission or a log entry that only captured the query portion.
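Repeated keys and empty values are where the details matter. This small Python sketch, using a made-up query string, compares parse_qs with its sibling parse_qsl, which keeps the original order as a list of pairs.

```python
from urllib.parse import parse_qs, parse_qsl

query = "tag=a&tag=b&page=2&empty="

# Dict of lists: repeated keys are grouped, blank values are dropped by default.
grouped = parse_qs(query)
print(grouped)     # {'tag': ['a', 'b'], 'page': ['2']}

# List of (key, value) tuples in their original order.
pairs = parse_qsl(query)
print(pairs)       # [('tag', 'a'), ('tag', 'b'), ('page', '2')]

# Keep parameters that have an empty value.
with_blank = parse_qs(query, keep_blank_values=True)
print(with_blank)  # {'tag': ['a', 'b'], 'page': ['2'], 'empty': ['']}
```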

Is URL parsing the same as URL decoding?

No. URL parsing splits a URL into structural components (scheme, host, path, query, fragment). URL decoding converts percent-encoded characters back to their original form (%20 becomes a space, %26 becomes &). The two operations are complementary: you typically parse the URL first, then decode individual component values. Decoding before parsing can break the URL structure because decoded delimiters like & and = would be misinterpreted.
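The effect of getting the order wrong is easy to reproduce. In this Python sketch the URL is a made-up example whose q parameter carries the encoded value a&b=c.

```python
from urllib.parse import parse_qs, unquote, urlsplit

url = "https://example.com/search?q=a%26b%3Dc&page=2"

# Correct order: split the URL first, then let parse_qs decode each value.
parsed_first = parse_qs(urlsplit(url).query)
print(parsed_first)   # {'q': ['a&b=c'], 'page': ['2']}

# Wrong order: decoding the whole URL turns %26 and %3D into real delimiters.
decoded = unquote(url)
print(decoded)        # https://example.com/search?q=a&b=c&page=2
decoded_first = parse_qs(urlsplit(decoded).query)
print(decoded_first)  # {'q': ['a'], 'b': ['c'], 'page': ['2']}
```

Decoding first produces a spurious b parameter and truncates q.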
