XML Minifier
Minify XML by removing whitespace and comments
What is XML Minification?
XML minification is the process of removing all unnecessary characters from an XML document without changing its meaning. An XML minifier strips whitespace between tags, removes comments, eliminates line breaks, and collapses indentation to produce compact, single-line output. The result is an XML string that parsers read identically to the original formatted version, producing the same data model.
The XML 1.0 specification (W3C Recommendation, Fifth Edition) defines whitespace handling in section 2.10. Whitespace that appears between tags purely for formatting, such as indentation and line breaks, is commonly called "insignificant whitespace." A conforming parser still reports it to the application, but the application is free to discard it. Whitespace inside text content is significant, and an element can ask for all whitespace within it to be kept by declaring xml:space="preserve" (xml:space="default" hands the decision back to the application). A correct XML minifier distinguishes between these cases and only removes what is safe to remove.
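A minimal sketch of that distinction, assuming Python's standard-library ElementTree is acceptable for the job. The names `strip_insignificant` and `minify` are illustrative, not part of any library. The sketch drops whitespace-only text nodes, leaves text that contains other characters untouched, and skips any subtree where xml:space="preserve" is in effect.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is bound to this namespace, so ElementTree exposes
# xml:space under this expanded attribute name.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def strip_insignificant(elem, preserve=False):
    """Drop whitespace-only text nodes unless xml:space="preserve" applies."""
    # xml:space is inherited by descendants until an element overrides it.
    mode = elem.get(XML_SPACE)
    if mode == "preserve":
        preserve = True
    elif mode == "default":
        preserve = False
    # elem.text sits inside elem, so elem's own setting governs it.
    if not preserve and elem.text is not None and not elem.text.strip():
        elem.text = None
    for child in elem:
        strip_insignificant(child, preserve)
        # child.tail follows the child's end tag, so it is part of elem's content.
        if not preserve and child.tail is not None and not child.tail.strip():
            child.tail = None


def minify(xml_text):
    root = ET.fromstring(xml_text)  # the default parser discards comments
    strip_insignificant(root)
    return ET.tostring(root, encoding="unicode")


sample = """<doc>
  <title>  Spaced  title </title>
  <pre xml:space="preserve">
    <line>keep</line>
  </pre>
</doc>"""
print(minify(sample))
```

In the printed result the indentation around `<title>` and `<pre>` is gone, the spaces inside the title text survive, and everything inside `<pre>` keeps its original line breaks.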
Minification differs from compression. Gzip or Brotli reduce size at the transport layer and require decompression before parsing. Minification reduces the raw document size itself, so the XML remains valid and readable by any parser without a decompression step. In practice, minifying before compressing yields the best results: you eliminate redundant characters first, then the compression algorithm works on a tighter input.
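To see how the two steps combine on your own data, a rough measurement like the following can help. It is a sketch that assumes a simple standard-library minifier and a synthetic sample document; the helper names `minify` and `sizes` are made up for illustration, and real savings depend heavily on the input.

```python
import gzip
import xml.etree.ElementTree as ET


def minify(xml_text):
    """Strip whitespace-only text nodes; the default parser drops comments."""
    root = ET.fromstring(xml_text)
    for el in root.iter():
        if el.text is not None and not el.text.strip():
            el.text = None
        if el.tail is not None and not el.tail.strip():
            el.tail = None
    return ET.tostring(root, encoding="unicode")


def sizes(xml_text):
    """Return byte counts for the raw and minified forms, with and without gzip."""
    raw = xml_text.encode("utf-8")
    small = minify(xml_text).encode("utf-8")
    return {
        "raw": len(raw),
        "minified": len(small),
        "raw+gzip": len(gzip.compress(raw)),
        "minified+gzip": len(gzip.compress(small)),
    }


# Hypothetical sample: 200 indented records, each with a comment.
rows = "".join(
    f'  <row id="{i}">\n    <!-- record {i} -->\n    <v>{i * 7}</v>\n  </row>\n'
    for i in range(200)
)
sample = f"<data>\n{rows}</data>"

for label, n in sizes(sample).items():
    print(f"{label:>14}: {n} bytes")
```

Running this against a representative document from your own system gives a more reliable picture than any general rule of thumb.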
Before minification:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Product catalog for Q1 2026 -->
<catalog>
  <product id="p101">
    <name>Widget A</name>
    <price currency="USD">29.99</price>
    <!-- Temporarily discounted -->
    <stock>142</stock>
  </product>
  <product id="p102">
    <name>Widget B</name>
    <price currency="EUR">19.50</price>
    <stock>87</stock>
  </product>
</catalog>

After minification:
<?xml version="1.0" encoding="UTF-8"?><catalog><product id="p101"><name>Widget A</name><price currency="USD">29.99</price><stock>142</stock></product><product id="p102"><name>Widget B</name><price currency="EUR">19.50</price><stock>87</stock></product></catalog>
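One way to gain confidence that a minified document still carries the same data is to parse both versions and compare the trees. The sketch below assumes that whitespace-only text and comments are the only things allowed to differ; `same_data` is an illustrative helper, and the inputs are a shortened version of the catalog example.

```python
import xml.etree.ElementTree as ET


def _text(s):
    """Map None and whitespace-only strings to '' so formatting is ignored."""
    return "" if s is None or not s.strip() else s


def same_data(a, b):
    """Compare two elements recursively: tags, attributes, text, and children."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and _text(a.text) == _text(b.text)
        and _text(a.tail) == _text(b.tail)
        and len(a) == len(b)
        and all(same_data(x, y) for x, y in zip(a, b))
    )


formatted = """<?xml version="1.0" encoding="UTF-8"?>
<!-- Product catalog for Q1 2026 -->
<catalog>
  <product id="p101">
    <name>Widget A</name>
    <price currency="USD">29.99</price>
  </product>
</catalog>"""

minified = (
    '<catalog><product id="p101"><name>Widget A</name>'
    '<price currency="USD">29.99</price></product></catalog>'
)

# Bytes input lets the parser honour the declared encoding.
print(same_data(ET.fromstring(formatted.encode("utf-8")), ET.fromstring(minified)))
# → True
```

A check like this fits naturally into a test suite that guards a minification step in a build pipeline.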
Why Use an XML Minifier?
Formatted XML with indentation and comments is ideal for development and code review. For storage, transmission, and machine consumption, that extra formatting adds bytes with no benefit. An XML minifier closes that gap.
XML Minifier Use Cases
What XML Minification Removes
Not everything in an XML document can be safely removed. This reference table shows each type of removable content and whether discarding it is always safe or conditional on your use case.
| Item | Example | Safety |
|---|---|---|
| Indentation | Spaces/tabs before tags | Always safe to remove |
| Line breaks | \n and \r\n between tags | Always safe to remove |
| Comments | <!-- ... --> | Safe unless parsed by app |
| XML declaration | <?xml version="1.0"?> | Keep if encoding is non-UTF-8 |
| Processing instructions | <?xml-stylesheet ...?> | Keep if consumed downstream |
| Trailing whitespace | Spaces after closing tags | Always safe to remove |
| Text node whitespace | Spaces inside text content | Keep whitespace within text; remove only whitespace-only nodes between tags |
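The conditional rows in the table can be turned into options. The following sketch uses Python's `xml.dom.minidom`, which keeps comments and processing instructions as nodes, so the caller can decide what to drop. The function name `minify` and its `strip_comments` flag are assumptions made for illustration.

```python
from xml.dom import minidom


def minify(xml_text, strip_comments=True):
    doc = minidom.parseString(xml_text)

    def prune(node):
        # Iterate over a copy because children are removed during the walk.
        for child in list(node.childNodes):
            if child.nodeType == child.COMMENT_NODE and strip_comments:
                node.removeChild(child)
            elif child.nodeType == child.TEXT_NODE and not child.data.strip():
                # Whitespace-only text between tags; CDATA sections are a
                # different node type and are left alone.
                node.removeChild(child)
            else:
                prune(child)

    prune(doc)
    # Serialize top-level nodes one by one: this omits the XML declaration
    # but keeps processing instructions that precede the root element.
    return "".join(n.toxml() for n in doc.childNodes)


sample = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="style.xsl"?>
<!-- header comment -->
<root>
  <!-- note -->
  <item>hello</item>
</root>"""

print(minify(sample))
# → <?xml-stylesheet type="text/xsl" href="style.xsl"?><root><item>hello</item></root>
```

Passing `strip_comments=False` keeps both comments in place while still removing the indentation. Dropping the declaration here is a simplification that is only appropriate when the output is UTF-8.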
Minification vs Gzip vs Binary Formats
Minification, compression, and binary encoding each target a different layer of the size problem. Minification keeps the output as valid, human-readable XML. Compression (gzip, Brotli) shrinks further but requires a decompression step before parsing. Binary formats go furthest, but both ends of the connection need a compatible encoder and decoder, which makes them practical mainly for embedded systems or WSDL-heavy enterprise services.
Code Examples
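Minifying XML programmatically follows the same pattern in every language: parse the document into a tree, drop whitespace-only text nodes, optionally remove comment nodes, then serialize without indentation. Parsing and re-serializing alone is not enough, because most parsers keep the whitespace between tags as text nodes.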
JavaScript (browser)
// Minify XML by parsing, pruning formatting nodes, and re-serializing
const raw = `<root>
  <item id="1">
    <!-- note -->
    <name>Test</name>
  </item>
</root>`
const parser = new DOMParser()
const doc = parser.parseFromString(raw, 'application/xml')
// Collect comment nodes and whitespace-only text nodes
const walker = doc.createTreeWalker(doc, NodeFilter.SHOW_COMMENT | NodeFilter.SHOW_TEXT)
const removable = []
while (walker.nextNode()) {
  const node = walker.currentNode
  if (node.nodeType === Node.COMMENT_NODE || !node.nodeValue.trim()) removable.push(node)
}
// Remove after the walk so the traversal is not disturbed
removable.forEach(n => n.parentNode.removeChild(n))
const minified = new XMLSerializer().serializeToString(doc)
// → "<root><item id=\"1\"><name>Test</name></item></root>"

Python
from lxml import etree
xml = """<root>
  <item id="1">
    <!-- note -->
    <name>Test</name>
  </item>
</root>"""
# Drop whitespace-only text nodes and comments while parsing
parser = etree.XMLParser(remove_blank_text=True, remove_comments=True)
tree = etree.fromstring(xml.encode(), parser)
# Serialize without pretty-print (minified)
result = etree.tostring(tree, xml_declaration=False).decode()
# → '<root><item id="1"><name>Test</name></item></root>'

# With xml.etree (stdlib, no lxml needed)
import xml.etree.ElementTree as ET
root = ET.fromstring(xml)  # the default parser already discards comments
# Clear whitespace-only text and tail strings left over from indentation
for el in root.iter():
    if el.text is not None and not el.text.strip():
        el.text = None
    if el.tail is not None and not el.tail.strip():
        el.tail = None
print(ET.tostring(root, encoding='unicode'))

Go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

func minifyXML(input string) (string, error) {
	decoder := xml.NewDecoder(strings.NewReader(input))
	var out strings.Builder
	encoder := xml.NewEncoder(&out)
	// No Indent call on the encoder = minified output
	for {
		tok, err := decoder.Token()
		if err == io.EOF {
			break // end of input
		}
		if err != nil {
			return "", err // malformed XML
		}
		// Skip comments
		if _, ok := tok.(xml.Comment); ok {
			continue
		}
		// Skip whitespace-only char data
		if cd, ok := tok.(xml.CharData); ok {
			if strings.TrimSpace(string(cd)) == "" {
				continue
			}
		}
		if err := encoder.EncodeToken(tok); err != nil {
			return "", err
		}
	}
	if err := encoder.Flush(); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	out, err := minifyXML("<a>\n  <b>1</b>\n</a>")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // <a><b>1</b></a>
}

Bash
# Minify XML with xmllint (part of libxml2)
xmllint --noblanks input.xml > minified.xml

# Minify from stdin
echo '<root> <item>hello</item> </root>' | xmllint --noblanks -
# → <?xml version="1.0"?>
# → <root><item>hello</item></root>

# Strip comments too (combine with sed or xmlstarlet)
xmlstarlet ed -d '//comment()' input.xml | xmllint --noblanks -

# Check size reduction
echo "Before: $(wc -c < input.xml) bytes"
echo "After: $(xmllint --noblanks input.xml | wc -c) bytes"