Package groovy.xml
Class XmlSlurper
java.lang.Object
org.xml.sax.helpers.DefaultHandler
groovy.xml.XmlSlurper
- All Implemented Interfaces:
ContentHandler, DTDHandler, EntityResolver, ErrorHandler
Parses XML into a document tree that may be traversed in a way similar to XPath
expressions. For example:
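A minimal sketch of such a traversal, assuming a small illustrative document (the element and attribute names are made up for demonstration):

```groovy
import groovy.xml.XmlSlurper

// Illustrative input; the element and attribute names are assumptions.
def rootNode = new XmlSlurper().parseText(
    '<root><one a1="uno!"/><two>Some text!</two></root>')

// Child elements are reached by name, attributes with the @ prefix.
def rootName   = rootNode.name()
def attribute  = rootNode.one[0].@a1.text()
def bodyText   = rootNode.two.text()
def childNames = rootNode.children().collect { it.name() }
```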
Note that in some cases, a 'selector' expression may not resolve to a single node. For example:
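A sketch of a selector that matches more than one node, again using an invented document. The selector yields every matching element, and `text()` on the result concatenates their text:

```groovy
import groovy.xml.XmlSlurper

// Illustrative input with a repeated element; names are assumptions.
def rootNode = new XmlSlurper().parseText(
    '<root><a>one!</a><a>two!</a></root>')

// rootNode.a refers to both <a> elements, not to a single node.
def matches  = rootNode.a.size()
def each     = rootNode.a.collect { it.text() }
def combined = rootNode.a.text()
def first    = rootNode.a[0].text()
```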
A more realistic example — a book catalog. Given this XML:
the equivalent Groovy to slurp it and navigate the tree:
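A sketch of this kind of navigation, assuming a hypothetical catalog layout (the `catalog`, `book`, `title`, and `price` names, the `id` attribute, and all values are invented for illustration):

```groovy
import groovy.xml.XmlSlurper

// Hypothetical catalog; structure and values are assumptions.
def xml = '''
<catalog>
  <book id="b1"><title>First Title</title><price>10.50</price></book>
  <book id="b2"><title>Second Title</title><price>25.00</price></book>
  <book id="b3"><title>Third Title</title><price>30.25</price></book>
</catalog>
'''

def catalog = new XmlSlurper().parseText(xml)

// Collect the text of every <title> under every <book>.
def titles = catalog.book.collect { it.title.text() }

// findAll filters the books; the ids are read from the filtered result.
def cheapIds = catalog.book.findAll { it.price.toBigDecimal() < 20 }
                           .collect { it.@id.text() }

// find returns the first book whose id attribute matches.
def third = catalog.book.find { it.@id.text() == 'b3' }

// Sum all prices as BigDecimal values.
def total = catalog.book.price.collect { it.toBigDecimal() }.sum()
```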
Navigation through the returned GPathResult is lazy, so selectors are
evaluated on demand rather than exposing an eager groovy.util.Node tree.
Constructor Summary
Constructors
Constructor - Description
XmlSlurper() - Creates a non-validating and namespace-aware XmlSlurper which does not allow DOCTYPE declarations in documents.
XmlSlurper(boolean validating, boolean namespaceAware) - Creates an XmlSlurper which does not allow DOCTYPE declarations in documents.
XmlSlurper(boolean validating, boolean namespaceAware, boolean allowDocTypeDeclaration) - Creates an XmlSlurper.
XmlSlurper(SAXParser parser) - Creates a slurper backed by the supplied SAX parser.
XmlSlurper(XMLReader reader) - Creates a slurper backed by the supplied SAX reader.
-
Method Summary
Modifier and Type, Method - Description
void characters(char[] ch, int start, int length) - Buffers character data until the surrounding element boundary is reached.
void endDocument() - Receives the end-of-document callback.
void endElement(String namespaceURI, String localName, String qName) - Flushes buffered text and restores the parent node when an end tag is reached.
GPathResult getDocument()
DTDHandler getDTDHandler() - Returns the SAX DTD handler configured on the underlying reader.
EntityResolver getEntityResolver() - Returns the SAX entity resolver configured on the underlying reader.
ErrorHandler getErrorHandler() - Returns the SAX error handler configured on the underlying reader.
boolean getFeature(String uri) - Looks up a SAX feature on the underlying reader.
Object getProperty(String uri) - Looks up a SAX property on the underlying reader.
void ignorableWhitespace(char[] buffer, int start, int len) - Receives ignorable whitespace and optionally preserves it as text content.
boolean isAllowDocTypeDeclaration() - Determine if DOCTYPE declarations are allowed.
boolean isKeepIgnorableWhitespace()
boolean isNamespaceAware() - Determine if namespace handling is enabled.
boolean isValidating() - Determine if the parser validates documents.
GPathResult parse(File file) - Parses the content of the given file as XML, turning it into a GPathResult object.
GPathResult parse(InputStream input) - Parse the content of the specified input stream into a GPathResult object.
GPathResult parse(Reader in) - Parse the content of the specified reader into a GPathResult object.
GPathResult parse(String uri) - Parse the content of the specified URI into a GPathResult object.
GPathResult parse(Path path) - Parses the content of the file at the given path as XML, turning it into a GPathResult object.
GPathResult parse(InputSource input) - Parse the content of the specified input source into a GPathResult object.
GPathResult parseText(String text) - A helper method to parse the given text as XML.
void setAllowDocTypeDeclaration(boolean allowDocTypeDeclaration) - Enable and/or disable DOCTYPE declaration support.
void setDTDHandler(DTDHandler dtdHandler) - Sets the SAX DTD handler on the underlying reader.
void setEntityBaseUrl(URL base) - Resolves entities using the supplied URL as the base for relative URLs.
void setEntityResolver(EntityResolver entityResolver) - Sets the SAX entity resolver on the underlying reader.
void setErrorHandler(ErrorHandler errorHandler) - Sets the SAX error handler on the underlying reader.
void setFeature(String uri, boolean value) - Enables or disables a SAX feature on the underlying reader.
void setKeepIgnorableWhitespace(boolean keepIgnorableWhitespace)
void setKeepWhitespace(boolean keepWhitespace) - Deprecated. Use setKeepIgnorableWhitespace.
void setNamespaceAware(boolean namespaceAware) - Enable and/or disable namespace handling.
void setProperty(String uri, Object value) - Sets a SAX property on the underlying reader.
void setValidating(boolean validating) - Enable and/or disable validation.
void startDocument() - Resets the current slurped document before SAX events for a new parse begin.
void startElement(String namespaceURI, String localName, String qName, Attributes atts) - Creates a slurper node for the current element and pushes it onto the parse stack.
void startPrefixMapping(String tag, String uri) - Records namespace prefix hints for later GPathResult navigation.
Methods inherited from class org.xml.sax.helpers.DefaultHandler
endPrefixMapping, error, fatalError, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, unparsedEntityDecl, warning
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.xml.sax.ContentHandler
declaration
-
Constructor Details
-
XmlSlurper
Creates a non-validating and namespace-aware XmlSlurper which does not allow DOCTYPE declarations in documents. Parser options can be configured via setters before the first parse call:
// Using Groovy named parameters:
def slurper = new XmlSlurper(namespaceAware: false, keepIgnorableWhitespace: true)
- Throws:
ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
SAXException - for SAX errors.
-
XmlSlurper
public XmlSlurper(boolean validating, boolean namespaceAware) throws ParserConfigurationException, SAXException
Creates an XmlSlurper which does not allow DOCTYPE declarations in documents.
- Parameters:
validating - true if the parser should validate documents as they are parsed; false otherwise.
namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
- Throws:
ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
SAXException - for SAX errors.
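As a sketch of what the namespaceAware flag enables, the following parses a namespaced document and then binds a prefix of the caller's choosing with `GPathResult.declareNamespace`. The `urn:example:books` URI, the `bk` prefix, and the element names are assumptions made for illustration:

```groovy
import groovy.xml.XmlSlurper

// Hypothetical document; the namespace URI and element names are assumptions.
def xml = '<root xmlns:b="urn:example:books"><b:title>Sample</b:title></root>'

// Non-validating, namespace-aware slurper.
def root = new XmlSlurper(false, true).parseText(xml)

// A plain name matches the element regardless of its namespace.
def byLocalName = root.title.text()

// declareNamespace maps a prefix of our choosing to the namespace URI,
// so the prefix need not match the one used in the document.
def byPrefix = root.declareNamespace(bk: 'urn:example:books').'bk:title'.text()
```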
-
XmlSlurper
public XmlSlurper(boolean validating, boolean namespaceAware, boolean allowDocTypeDeclaration) throws ParserConfigurationException, SAXException
Creates an XmlSlurper.
- Parameters:
validating - true if the parser should validate documents as they are parsed; false otherwise.
namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
allowDocTypeDeclaration - true if the parser should provide support for DOCTYPE declarations; false otherwise.
- Throws:
ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
SAXException - for SAX errors.
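A sketch of the effect of allowDocTypeDeclaration, assuming the JDK's bundled SAX parser, which rejects a DOCTYPE with a SAX exception when declarations are disallowed. The `note` document is invented for illustration:

```groovy
import groovy.xml.XmlSlurper
import org.xml.sax.SAXException

// Hypothetical document carrying an internal DTD subset.
def xml = '<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]><note>hello</note>'

// The default slurper does not allow DOCTYPE declarations.
def rejected = false
try {
    new XmlSlurper().parseText(xml)
} catch (SAXException expected) {
    rejected = true
}

// Explicitly allowing DOCTYPE declarations lets the same document parse.
def note = new XmlSlurper(false, true, true).parseText(xml)
def body = note.text()
```

Allowing DOCTYPE declarations widens the attack surface for untrusted input, so the permissive form is best reserved for documents from a trusted source.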
-
XmlSlurper
Creates a slurper backed by the supplied SAX reader.
- Parameters:
reader - the XML reader whose features, properties, and handlers will be used
-
XmlSlurper
Creates a slurper backed by the supplied SAX parser.
- Parameters:
parser - the SAX parser providing the XMLReader used for parsing
- Throws:
SAXException - if the parser cannot provide an XML reader
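A sketch of supplying a pre-configured parser or reader, assuming the standard JAXP `SAXParserFactory` is used to create them. The sample document is illustrative:

```groovy
import groovy.xml.XmlSlurper
import javax.xml.parsers.SAXParserFactory

// Configure the JAXP factory as needed before creating parsers.
def factory = SAXParserFactory.newInstance()
factory.setNamespaceAware(true)

def xml = '<root><a>1</a></root>'

// Slurper backed by a SAXParser.
def fromParser = new XmlSlurper(factory.newSAXParser()).parseText(xml)

// Slurper backed directly by an XMLReader.
def reader = factory.newSAXParser().getXMLReader()
def fromReader = new XmlSlurper(reader).parseText(xml)
```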
-
-
Method Details
-
setKeepWhitespace
Deprecated. Use setKeepIgnorableWhitespace.
- Parameters:
keepWhitespace - If true then whitespace before elements is kept. The default is to discard the whitespace.
-
setKeepIgnorableWhitespace
public void setKeepIgnorableWhitespace(boolean keepIgnorableWhitespace)
- Parameters:
keepIgnorableWhitespace - If true then ignorable whitespace (i.e. whitespace before elements) is kept. The default is to discard the whitespace.
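A sketch of the difference this setting makes, using an invented document with whitespace around a child element:

```groovy
import groovy.xml.XmlSlurper

// Hypothetical input: a single space before and after <item>.
def xml = '<root> <item>a</item> </root>'

// Default behaviour: whitespace-only text around elements is discarded.
def discarded = new XmlSlurper().parseText(xml).text()

// With the flag set, that whitespace is kept as text content.
def keeper = new XmlSlurper()
keeper.setKeepIgnorableWhitespace(true)
def kept = keeper.parseText(xml).text()
```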
-
isKeepIgnorableWhitespace
public boolean isKeepIgnorableWhitespace()
- Returns:
- true if ignorable whitespace is kept
-
isNamespaceAware
public boolean isNamespaceAware()
Determine if namespace handling is enabled.
- Returns:
- true if namespace handling is enabled
- Since:
- 6.0.0
-
setNamespaceAware
public void setNamespaceAware(boolean namespaceAware)
Enable and/or disable namespace handling. Must be set before the first parse call.
- Parameters:
namespaceAware - the new desired value
- Throws:
IllegalStateException - if called after parsing has started
- Since:
- 6.0.0
-
isValidating
public boolean isValidating()
Determine if the parser validates documents.
- Returns:
- true if validation is enabled
- Since:
- 6.0.0
-
setValidating
public void setValidating(boolean validating)
Enable and/or disable validation. Must be set before the first parse call.
- Parameters:
validating - the new desired value
- Throws:
IllegalStateException - if called after parsing has started
- Since:
- 6.0.0
-
isAllowDocTypeDeclaration
public boolean isAllowDocTypeDeclaration()
Determine if DOCTYPE declarations are allowed.
- Returns:
- true if DOCTYPE declarations are allowed
- Since:
- 6.0.0
-
setAllowDocTypeDeclaration
public void setAllowDocTypeDeclaration(boolean allowDocTypeDeclaration)
Enable and/or disable DOCTYPE declaration support. Must be set before the first parse call.
- Parameters:
allowDocTypeDeclaration - the new desired value
- Throws:
IllegalStateException - if called after parsing has started
- Since:
- 6.0.0
-
getDocument
- Returns:
- The GPathResult instance created by consuming a stream of SAX events. Note: if one of the parse methods has been called, this returns null. Note: if this is called more than once, all calls after the first will return null.
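A sketch of the scenario this accessor serves: the slurper is registered as the content handler of an externally driven `XMLReader`, and the resulting tree is collected once afterwards. The JAXP-created reader and the sample document are assumptions for illustration:

```groovy
import groovy.xml.XmlSlurper
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.InputSource

// The slurper only acts as a SAX ContentHandler here.
def slurper = new XmlSlurper()

// An independently created reader drives the SAX events.
def reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader()
reader.setContentHandler(slurper)
reader.parse(new InputSource(new StringReader('<root><a>1</a></root>')))

// Collect the tree once; the result is one-shot, so keep the reference.
def document = slurper.getDocument()
def rootName = document.name()
def value = document.a.text()
```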
-
parse
Parse the content of the specified input source into a GPathResult object.
- Parameters:
input - the InputSource to parse
- Returns:
- An object which supports GPath expressions
- Throws:
SAXException - Any SAX exception, possibly wrapping another exception.
IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
-
parse
Parses the content of the given file as XML, turning it into a GPathResult object.
- Parameters:
file - the File to parse
- Returns:
- An object which supports GPath expressions
- Throws:
SAXException - Any SAX exception, possibly wrapping another exception.
IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
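A sketch of parsing from a file, assuming a temporary file with invented content (the `config` and `port` names are hypothetical):

```groovy
import groovy.xml.XmlSlurper

// Write a small hypothetical document to a temporary file.
def file = File.createTempFile('slurper-demo', '.xml')
file.deleteOnExit()
file.text = '<config><port>8080</port></config>'

// Parse the file and convert the element text to an integer.
def config = new XmlSlurper().parse(file)
def port = config.port.toInteger()
```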
-
parse
Parse the content of the specified input stream into a GPathResult object. Note that using this method will not provide the parser with any URI for which to find DTDs etc. It is up to you to close the InputStream after parsing is complete (if required).
- Parameters:
input - the InputStream to parse
- Returns:
- An object which supports GPath expressions
- Throws:
SAXException - Any SAX exception, possibly wrapping another exception.
IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
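Because closing the stream is left to the caller, one option is Groovy's `withStream`, which closes the stream when the closure returns. A sketch with an in-memory stream standing in for a real source:

```groovy
import groovy.xml.XmlSlurper

// In-memory bytes stand in for a real stream; the content is illustrative.
def bytes = '<root><a>1</a></root>'.getBytes('UTF-8')

// withStream closes the stream after the closure completes.
def doc = new ByteArrayInputStream(bytes).withStream { stream ->
    new XmlSlurper().parse(stream)
}
def value = doc.a.text()
```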
-
parse
Parse the content of the specified reader into a GPathResult object. Note that using this method will not provide the parser with any URI for which to find DTDs etc. It is up to you to close the Reader after parsing is complete (if required).
- Parameters:
in - the Reader to parse
- Returns:
- An object which supports GPath expressions
- Throws:
SAXException - Any SAX exception, possibly wrapping another exception.
IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
-
parse
Parse the content of the specified URI into a GPathResult object.
- Parameters:
uri - a String containing the URI to parse
- Returns:
- An object which supports GPath expressions
- Throws:
SAXException - Any SAX exception, possibly wrapping another exception.
IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
-
parse
Parses the content of the file at the given path as XML, turning it into a GPathResult object.
- Parameters:
path - the path of the File to parse
- Returns:
- An object which supports GPath expressions
- Throws:
SAXException - Any SAX exception, possibly wrapping another exception.
IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
-
parseText
A helper method to parse the given text as XML.
- Parameters:
text - a String containing XML to parse
- Returns:
- An object which supports GPath expressions
- Throws:
SAXException - Any SAX exception, possibly wrapping another exception.
IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
-
getDTDHandler
Returns the SAX DTD handler configured on the underlying reader.
- Returns:
- the configured DTD handler, or null if none has been set
-
getEntityResolver
Returns the SAX entity resolver configured on the underlying reader.
- Returns:
- the configured entity resolver, or null if none has been set
-
getErrorHandler
Returns the SAX error handler configured on the underlying reader.
- Returns:
- the configured error handler, or null if none has been set
-
getFeature
Looks up a SAX feature on the underlying reader.
- Parameters:
uri - the fully qualified SAX feature URI
- Returns:
- true if the feature is enabled
- Throws:
SAXNotRecognizedException - if the feature name is not recognized
SAXNotSupportedException - if the feature is recognized but not supported
-
getProperty
Looks up a SAX property on the underlying reader.
- Parameters:
uri - the fully qualified SAX property URI
- Returns:
- the current value of the property
- Throws:
SAXNotRecognizedException - if the property name is not recognized
SAXNotSupportedException - if the property is recognized but not supported
-
setDTDHandler
Sets the SAX DTD handler on the underlying reader.
- Parameters:
dtdHandler - the DTD handler to receive notation and unparsed entity callbacks
-
setEntityResolver
Sets the SAX entity resolver on the underlying reader.
- Parameters:
entityResolver - the resolver to use for external entities
-
setEntityBaseUrl
Resolves entities using the supplied URL as the base for relative URLs.
- Parameters:
base - The URL used to resolve relative URLs
-
setErrorHandler
Sets the SAX error handler on the underlying reader.
- Parameters:
errorHandler - the handler to receive parser warnings and errors
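A sketch of a handler that records parser diagnostics, built by coercing a map of closures to the SAX `ErrorHandler` interface. It assumes the JDK's bundled parser, which reports malformed input through `fatalError` and then aborts the parse with a SAX exception; the malformed sample document is invented:

```groovy
import groovy.xml.XmlSlurper
import org.xml.sax.ErrorHandler
import org.xml.sax.SAXException
import org.xml.sax.SAXParseException

// Collects every diagnostic the parser reports.
def problems = []
def collector = [
    warning   : { SAXParseException e -> problems << ('warning: ' + e.message) },
    error     : { SAXParseException e -> problems << ('error: ' + e.message) },
    fatalError: { SAXParseException e -> problems << ('fatal: ' + e.message) }
] as ErrorHandler

def slurper = new XmlSlurper()
slurper.setErrorHandler(collector)

// Mismatched tags make this document malformed.
def failed = false
try {
    slurper.parseText('<root><item></root>')
} catch (SAXException e) {
    failed = true
}
```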
-
setFeature
public void setFeature(String uri, boolean value) throws SAXNotRecognizedException, SAXNotSupportedException
Enables or disables a SAX feature on the underlying reader.
- Parameters:
uri - the fully qualified SAX feature URI
value - the value to apply
- Throws:
SAXNotRecognizedException - if the feature name is not recognized
SAXNotSupportedException - if the feature is recognized but not supported
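A sketch of toggling a feature and reading it back, using the standard SAX feature URI for external general entities and assuming the underlying parser recognizes it (the JDK's bundled parser does):

```groovy
import groovy.xml.XmlSlurper

def slurper = new XmlSlurper()

// Standard SAX feature URI controlling external general entities.
def externalEntities = 'http://xml.org/sax/features/external-general-entities'

// Disable the feature, then confirm the value on the underlying reader.
slurper.setFeature(externalEntities, false)
def enabled = slurper.getFeature(externalEntities)

// Parsing proceeds as usual with the feature applied.
def doc = slurper.parseText('<root>ok</root>')
def body = doc.text()
```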
-
setProperty
public void setProperty(String uri, Object value) throws SAXNotRecognizedException, SAXNotSupportedException
Sets a SAX property on the underlying reader.
- Parameters:
uri - the fully qualified SAX property URI
value - the value to apply
- Throws:
SAXNotRecognizedException - if the property name is not recognized
SAXNotSupportedException - if the property is recognized but not supported
-
startDocument
Resets the current slurped document before SAX events for a new parse begin.
- Specified by:
startDocument in interface ContentHandler
- Overrides:
startDocument in class DefaultHandler
- Throws:
SAXException - if the SAX pipeline reports an error
-
startPrefixMapping
Records namespace prefix hints for later GPathResult navigation.
- Specified by:
startPrefixMapping in interface ContentHandler
- Overrides:
startPrefixMapping in class DefaultHandler
- Parameters:
tag - the declared prefix
uri - the namespace URI bound to the prefix
- Throws:
SAXException - if the SAX pipeline reports an error
-
startElement
public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws SAXException
Creates a slurper node for the current element and pushes it onto the parse stack.
- Specified by:
startElement in interface ContentHandler
- Overrides:
startElement in class DefaultHandler
- Parameters:
namespaceURI - the namespace URI, or an empty string if namespaces are unavailable
localName - the local element name
qName - the qualified element name as reported by SAX
atts - the element attributes
- Throws:
SAXException - if node creation fails
-
ignorableWhitespace
Receives ignorable whitespace and optionally preserves it as text content.
- Specified by:
ignorableWhitespace in interface ContentHandler
- Overrides:
ignorableWhitespace in class DefaultHandler
- Parameters:
buffer - the character buffer supplied by SAX
start - the start offset in the buffer
len - the number of characters to read
- Throws:
SAXException - if the SAX pipeline reports an error
-
characters
Buffers character data until the surrounding element boundary is reached.
- Specified by:
characters in interface ContentHandler
- Overrides:
characters in class DefaultHandler
- Parameters:
ch - the character buffer supplied by SAX
start - the start offset in the buffer
length - the number of characters to read
- Throws:
SAXException - if the SAX pipeline reports an error
-
endElement
Flushes buffered text and restores the parent node when an end tag is reached.
- Specified by:
endElement in interface ContentHandler
- Overrides:
endElement in class DefaultHandler
- Parameters:
namespaceURI - the namespace URI, or an empty string if namespaces are unavailable
localName - the local element name
qName - the qualified element name as reported by SAX
- Throws:
SAXException - if text handling fails
-
endDocument
Receives the end-of-document callback. The built tree remains available through the one-shot getDocument() result.
- Specified by:
endDocument in interface ContentHandler
- Overrides:
endDocument in class DefaultHandler
- Throws:
SAXException - if the SAX pipeline reports an error
-