public class XmlSlurper
extends DefaultHandler
Parses XML into a document tree that can be traversed with expressions similar to XPath. For example:
import groovy.xml.XmlSlurper
def rootNode = new XmlSlurper().parseText(
'<root><one a1="uno!"/><two>Some text!</two></root>' )
assert rootNode.name() == 'root'
assert rootNode.one[0].@a1 == 'uno!'
assert rootNode.two.text() == 'Some text!'
rootNode.children().each { assert it.name() in ['one','two'] }
Note that in some cases, a 'selector' expression may not resolve to a single node. For example:
import groovy.xml.XmlSlurper
def rootNode = new XmlSlurper().parseText(
'''<root>
<a>one!</a>
<a>two!</a>
</root>''' )
assert rootNode.a.size() == 2
rootNode.a.each { assert it.text() in ['one!','two!'] }
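Because a selector such as rootNode.a stands for every matching element, it can help to see how indexing, text() and collect behave on a multi-node result. The following is a minimal sketch that reuses the sample document above on a single line; the variable name texts is illustrative only.

```groovy
import groovy.xml.XmlSlurper

def rootNode = new XmlSlurper().parseText('<root><a>one!</a><a>two!</a></root>')

// Indexing picks a single node out of the multi-node result.
assert rootNode.a[0].text() == 'one!'
assert rootNode.a[1].text() == 'two!'

// Calling text() on the whole selector concatenates the text of every match.
assert rootNode.a.text() == 'one!two!'

// collect turns the matches into an ordinary list of strings.
def texts = rootNode.a.collect { it.text() }
assert texts == ['one!', 'two!']
```

When exactly one element is expected, indexing with [0] makes that expectation explicit.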
A more realistic example is a book catalog. Given this XML:
<catalog>
<book id="b1">
<title>Programming Groovy 3</title>
<author>Venkat Subramaniam</author>
<year>2024</year>
</book>
<book id="b2">
<title>Groovy in Action</title>
<author>Dierk Koenig</author>
<year>2015</year>
</book>
</catalog>
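the following Groovy slurps it and navigates the tree:
import groovy.xml.XmlSlurper
// xml is assumed to be a String holding the catalog document shown above
def catalog = new XmlSlurper().parseText(xml)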
assert catalog.book.size() == 2
assert catalog.book[0].title.text() == 'Programming Groovy 3'
catalog.book.findAll { it.year.text().toInteger() >= 2020 }.each { book ->
println "${book.title} by ${book.author}"
}
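The catalog can also be queried by attribute value or searched depth-first. The sketch below is self-contained, so it repeats the catalog XML as a string; the variable names second and titles are illustrative, and the '**' shorthand is the GPath depth-first selector.

```groovy
import groovy.xml.XmlSlurper

def xml = '''<catalog>
  <book id="b1">
    <title>Programming Groovy 3</title>
    <author>Venkat Subramaniam</author>
    <year>2024</year>
  </book>
  <book id="b2">
    <title>Groovy in Action</title>
    <author>Dierk Koenig</author>
    <year>2015</year>
  </book>
</catalog>'''

def catalog = new XmlSlurper().parseText(xml)

// Select a book by attribute value; .@id reads the id attribute.
def second = catalog.book.find { it.@id == 'b2' }
assert second.title.text() == 'Groovy in Action'

// '**' walks the tree depth-first, so nested elements can be
// reached without spelling out the full path.
def titles = catalog.'**'.findAll { it.name() == 'title' }*.text()
assert titles == ['Programming Groovy 3', 'Groovy in Action']
```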
Navigation through the returned GPathResult is lazy, so selectors are
evaluated on demand rather than exposing an eager groovy.util.Node tree.
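One practical effect of selector-style navigation, offered here as an observation about GPathResult rather than something the text above states, is that a selector matching nothing still yields an empty result object instead of null. A minimal sketch, with the element names b, c and d chosen to be absent from the sample document:

```groovy
import groovy.xml.XmlSlurper

def rootNode = new XmlSlurper().parseText('<root><a>one!</a><a>two!</a></root>')

// A selector that matches nothing is still a usable result.
def missing = rootNode.b
assert missing.size() == 0
assert missing.isEmpty()
assert missing.text() == ''

// Chaining through absent elements therefore does not throw.
assert rootNode.b.c.d.size() == 0
```

Checking size() or isEmpty() is thus the way to detect an absent element; a null check would not catch it.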
| Constructor | Description |
|---|---|
| XmlSlurper() | Creates a non-validating and namespace-aware XmlSlurper which does not allow DOCTYPE declarations in documents. |
| XmlSlurper(boolean validating, boolean namespaceAware) | Creates an XmlSlurper which does not allow DOCTYPE declarations in documents. |
| XmlSlurper(boolean validating, boolean namespaceAware, boolean allowDocTypeDeclaration) | Creates an XmlSlurper. |
| XmlSlurper(XMLReader reader) | Creates a slurper backed by the supplied SAX reader. |
| XmlSlurper(SAXParser parser) | Creates a slurper backed by the supplied SAX parser. |
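The practical difference between the first three constructors is whether a document carrying a DOCTYPE declaration is accepted. The sketch below assumes the underlying JAXP parser reports a disallowed DOCTYPE as a SAXException (this is how the JDK's built-in parser behaves); the sample document and the entity name greeting are made up for illustration.

```groovy
import groovy.xml.XmlSlurper
import org.xml.sax.SAXException

def xml = '''<!DOCTYPE root [ <!ENTITY greeting "hello"> ]>
<root>&greeting;</root>'''

// The no-arg constructor disallows DOCTYPE declarations,
// so parsing this document is expected to fail.
def rejected = false
try {
    new XmlSlurper().parseText(xml)
} catch (SAXException expected) {
    rejected = true
}
assert rejected

// Opting in through the three-argument constructor
// (validating = false, namespaceAware = true, allowDocTypeDeclaration = true)
// lets the same document parse.
def root = new XmlSlurper(false, true, true).parseText(xml)
assert root.text() == 'hello'
```

Keeping DOCTYPE support off unless a document genuinely needs it is the safer choice for untrusted input.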
| Type Params | Return Type | Name | Description |
|---|---|---|---|
| | public void | characters(char[] ch, int start, int length) | Buffers character data until the surrounding element boundary is reached. |
| | public void | endDocument() | Receives the end-of-document callback. |
| | public void | endElement(String namespaceURI, String localName, String qName) | Flushes buffered text and restores the parent node when an end tag is reached. |
| | public DTDHandler | getDTDHandler() | Returns the SAX DTD handler configured on the underlying reader. |
| | public GPathResult | getDocument() | |
| | public EntityResolver | getEntityResolver() | Returns the SAX entity resolver configured on the underlying reader. |
| | public ErrorHandler | getErrorHandler() | Returns the SAX error handler configured on the underlying reader. |
| | public boolean | getFeature(String uri) | Looks up a SAX feature on the underlying reader. |
| | public Object | getProperty(String uri) | Looks up a SAX property on the underlying reader. |
| | public void | ignorableWhitespace(char[] buffer, int start, int len) | Receives ignorable whitespace and optionally preserves it as text content. |
| | public boolean | isAllowDocTypeDeclaration() | Determine if DOCTYPE declarations are allowed. |
| | public boolean | isKeepIgnorableWhitespace() | |
| | public boolean | isNamespaceAware() | Determine if namespace handling is enabled. |
| | public boolean | isValidating() | Determine if the parser validates documents. |
| | public GPathResult | parse(InputSource input) | Parse the content of the specified input source into a GPathResult object. |
| | public GPathResult | parse(File file) | Parses the content of the given file as XML, turning it into a GPathResult object. |
| | public GPathResult | parse(InputStream input) | Parse the content of the specified input stream into a GPathResult object. |
| | public GPathResult | parse(Reader in) | Parse the content of the specified reader into a GPathResult object. |
| | public GPathResult | parse(String uri) | Parse the content of the specified URI into a GPathResult object. |
| | public GPathResult | parse(Path path) | Parses the content of the file at the given path as XML, turning it into a GPathResult object. |
| | public GPathResult | parseText(String text) | A helper method to parse the given text as XML. |
| | public void | setAllowDocTypeDeclaration(boolean allowDocTypeDeclaration) | Enable and/or disable DOCTYPE declaration support. |
| | public void | setDTDHandler(DTDHandler dtdHandler) | Sets the SAX DTD handler on the underlying reader. |
| | public void | setEntityBaseUrl(URL base) | Resolves entities using the supplied URL as the base for relative URLs. |
| | public void | setEntityResolver(EntityResolver entityResolver) | Sets the SAX entity resolver on the underlying reader. |
| | public void | setErrorHandler(ErrorHandler errorHandler) | Sets the SAX error handler on the underlying reader. |
| | public void | setFeature(String uri, boolean value) | Enables or disables a SAX feature on the underlying reader. |
| | public void | setKeepIgnorableWhitespace(boolean keepIgnorableWhitespace) | |
| | public void | setKeepWhitespace(boolean keepWhitespace) | |
| | public void | setNamespaceAware(boolean namespaceAware) | Enable and/or disable namespace handling. |
| | public void | setProperty(String uri, Object value) | Sets a SAX property on the underlying reader. |
| | public void | setValidating(boolean validating) | Enable and/or disable validation. |
| | public void | startDocument() | Resets the current slurped document before SAX events for a new parse begin. |
| | public void | startElement(String namespaceURI, String localName, String qName, Attributes atts) | Creates a slurper node for the current element and pushes it onto the parse stack. |
| | public void | startPrefixMapping(String tag, String uri) | Records namespace prefix hints for later GPathResult navigation. |
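The parse overloads listed above differ only in where the XML comes from. For the Reader and InputStream variants the caller is responsible for closing the source, which the sketch below handles with Groovy's withCloseable. The in-memory sources are an assumption made to keep the example self-contained; real code would typically open a file or network stream instead.

```groovy
import groovy.xml.XmlSlurper

def text = '<root><two>Some text!</two></root>'

// parse(Reader): withCloseable closes the reader once the closure returns.
def fromReader = new StringReader(text).withCloseable { reader ->
    new XmlSlurper().parse(reader)
}
assert fromReader.two.text() == 'Some text!'

// parse(InputStream): the same pattern for a byte stream.
def fromStream = new ByteArrayInputStream(text.getBytes('UTF-8')).withCloseable { stream ->
    new XmlSlurper().parse(stream)
}
assert fromStream.two.text() == 'Some text!'

// parseText(String): no stream handling is needed.
assert new XmlSlurper().parseText(text).two.text() == 'Some text!'
```

The tree is fully built by the time parse returns, so the result remains usable after the source has been closed.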
| Methods inherited from class | Name |
|---|---|
| class DefaultHandler | characters, declaration, endDocument, endElement, endPrefixMapping, equals, error, fatalError, getClass, hashCode, ignorableWhitespace, notationDecl, notify, notifyAll, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping, toString, unparsedEntityDecl, wait, wait, wait, warning |
Creates a non-validating and namespace-aware XmlSlurper which does not allow DOCTYPE declarations in documents.
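Parser options can be changed through the setters, provided this happens before the first parse call. Groovy's named-parameter syntax invokes those setters right after construction: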
// Using Groovy named parameters:
def slurper = new XmlSlurper(namespaceAware: false, keepIgnorableWhitespace: true)
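To make the effect of these two options concrete, the sketch below parses the same input with and without each one. The sample documents and the urn:example namespace are invented for illustration, and the expected values reflect how XmlSlurper is understood to name elements and store whitespace-only text; treat them as assumptions to verify against your Groovy version.

```groovy
import groovy.xml.XmlSlurper

// namespaceAware: with namespace handling off, the prefix stays part of the element name.
def nsXml = '<r xmlns:p="urn:example"><p:item>v</p:item></r>'
def aware = new XmlSlurper().parseText(nsXml)
def unaware = new XmlSlurper(namespaceAware: false).parseText(nsXml)
assert aware.item.text() == 'v'
assert unaware.'p:item'.text() == 'v'

// keepIgnorableWhitespace: whitespace-only text between elements
// is dropped unless the option is switched on.
def wsXml = '<root> <a>x</a> </root>'
def trimmed = new XmlSlurper().parseText(wsXml)
def kept = new XmlSlurper(keepIgnorableWhitespace: true).parseText(wsXml)
assert trimmed.text() == 'x'
assert kept.text() == ' x '
```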
Creates an XmlSlurper which does not allow DOCTYPE declarations in documents.
validating - true if the parser should validate documents as they are parsed; false otherwise.
namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.

Creates an XmlSlurper.
validating - true if the parser should validate documents as they are parsed; false otherwise.
namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
allowDocTypeDeclaration - true if the parser should provide support for DOCTYPE declarations; false otherwise.

Creates a slurper backed by the supplied SAX reader.
reader - the XML reader whose features, properties, and handlers will be used

Buffers character data until the surrounding element boundary is reached.
ch - the character buffer supplied by SAX
start - the start offset in the buffer
length - the number of characters to read

Receives the end-of-document callback. The built tree remains available through the one-shot getDocument() result.
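Since XmlSlurper extends DefaultHandler, the SAX callbacks described here can also be driven by a reader you create yourself, with getDocument() collecting the result afterwards. The sketch below assumes a plain JAXP reader as the event source and calls getDocument() exactly once, in line with the one-shot note above.

```groovy
import groovy.xml.XmlSlurper
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.InputSource

def slurper = new XmlSlurper()

// A plain JAXP reader stands in for any SAX event source (an assumption for this sketch).
def reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader()
reader.setContentHandler(slurper)
reader.parse(new InputSource(new StringReader('<root><two>Some text!</two></root>')))

// getDocument() hands back the tree built from the callbacks.
def rootNode = slurper.getDocument()
assert rootNode.name() == 'root'
assert rootNode.two.text() == 'Some text!'
```

In everyday use the parse and parseText methods do this wiring for you; driving the handler directly is mainly useful when the SAX events come from a pipeline you already control.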
Flushes buffered text and restores the parent node when an end tag is reached.
namespaceURI - the namespace URI, or an empty string if namespaces are unavailable
localName - the local element name
qName - the qualified element name as reported by SAX

Returns the SAX DTD handler configured on the underlying reader, or null if none has been set.

Returns the SAX entity resolver configured on the underlying reader, or null if none has been set.

Returns the SAX error handler configured on the underlying reader, or null if none has been set.

Looks up a SAX feature on the underlying reader.
uri - the fully qualified SAX feature URI
Returns true if the feature is enabled.

Looks up a SAX property on the underlying reader.
uri - the fully qualified SAX property URI

Receives ignorable whitespace and optionally preserves it as text content.
buffer - the character buffer supplied by SAX
start - the start offset in the buffer
len - the number of characters to read

Determine if DOCTYPE declarations are allowed.

Determine if namespace handling is enabled.

Determine if the parser validates documents.
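The three query methods above report the configuration chosen at construction time. A minimal sketch, assuming the defaults documented for the no-arg constructor (non-validating, namespace-aware, DOCTYPE declarations not allowed, whitespace discarded); the variable names defaults and custom are illustrative.

```groovy
import groovy.xml.XmlSlurper

// Defaults of the no-arg constructor.
def defaults = new XmlSlurper()
assert !defaults.isValidating()
assert defaults.isNamespaceAware()
assert !defaults.isAllowDocTypeDeclaration()
assert !defaults.isKeepIgnorableWhitespace()

// Explicit settings: validating, not namespace-aware, DOCTYPE allowed.
def custom = new XmlSlurper(true, false, true)
assert custom.isValidating()
assert !custom.isNamespaceAware()
assert custom.isAllowDocTypeDeclaration()
```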
Parse the content of the specified input source into a GPathResult object.
input - the InputSource to parse

Parses the content of the given file as XML, turning it into a GPathResult object.
file - the File to parse

Parse the content of the specified input stream into a GPathResult object. Note that using this method will not provide the parser with any URI for which to find DTDs etc. It is up to you to close the InputStream after parsing is complete (if required).
input - the InputStream to parse

Parse the content of the specified reader into a GPathResult object. Note that using this method will not provide the parser with any URI for which to find DTDs etc. It is up to you to close the Reader after parsing is complete (if required).
in - the Reader to parse

Parse the content of the specified URI into a GPathResult object.
uri - a String containing the URI to parse

Parses the content of the file at the given path as XML, turning it into a GPathResult object.
path - the path of the File to parse

A helper method to parse the given text as XML.
text - a String containing XML to parse

Enable and/or disable DOCTYPE declaration support. Must be set before the first parse call.
allowDocTypeDeclaration - the new desired value

Sets the SAX DTD handler on the underlying reader.
dtdHandler - the DTD handler to receive notation and unparsed entity callbacks

Resolves entities using the supplied URL as the base for relative URLs.
base - the URL used to resolve relative URLs

Sets the SAX entity resolver on the underlying reader.
entityResolver - the resolver to use for external entities

Sets the SAX error handler on the underlying reader.
errorHandler - the handler to receive parser warnings and errors

Enables or disables a SAX feature on the underlying reader.
uri - the fully qualified SAX feature URI
value - the value to apply
keepIgnorableWhitespace - If true then ignorable whitespace (i.e. whitespace before elements) is kept. The default is to discard the whitespace.

keepWhitespace - If true then whitespace before elements is kept. The default is to discard the whitespace.

Enable and/or disable namespace handling. Must be set before the first parse call.
namespaceAware - the new desired value

Sets a SAX property on the underlying reader.
uri - the fully qualified SAX property URI
value - the value to apply

Enable and/or disable validation. Must be set before the first parse call.
validating - the new desired value

Resets the current slurped document before SAX events for a new parse begin.

Creates a slurper node for the current element and pushes it onto the parse stack.
namespaceURI - the namespace URI, or an empty string if namespaces are unavailable
localName - the local element name
qName - the qualified element name as reported by SAX
atts - the element attributes

Records namespace prefix hints for later GPathResult navigation.
tag - the declared prefix
uri - the namespace URI bound to the prefix

Copyright © 2003-2026 The Apache Software Foundation. All rights reserved.