Class XMLParser
- Direct Known Subclasses:
HTMLParser
Parser class used to parse an XML document into a DOM object (Element). This code was originally developed to parse HTML; as a result it is not as strict as most XML parsers and can parse many HTML documents out of the box. The parser is mostly stateful (although it also has an event callback API) and is modeled closely on the Java DOM APIs.
In this sample an XML hierarchy is displayed using a com.codename1.ui.tree.Tree:
class XMLTreeModel implements TreeModel {
    private Element root;
    public XMLTreeModel(Element e) {
        root = e;
    }
    public Vector getChildren(Object parent) {
        if(parent == null) {
            Vector c = new Vector();
            c.addElement(root);
            return c;
        }
        Vector result = new Vector();
        Element e = (Element)parent;
        for(int iter = 0 ; iter
@author Ofir Leitner
-
Constructor Summary
Constructors
XMLParser()
-
Method Summary
Modifier and Type, Method: Description
void addCharEntitiesRange(String[] symbols, int startcode): Adds the given symbols array to the user defined char entities table with the startcode provided as the code of the first string, startcode+1 for the second etc.
void addCharEntity(String symbol, int code): Adds the given symbol and code to the user defined char entities table. See http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
protected void attribute(String tag, String attributeName, String value): Invoked for every attribute value of the given tag. This callback method is invoked only on the eventParser.
protected String convertCharEntity(String charEntity): Converts a char entity to the matching character.
protected Element createNewElement(String name): Creates a new element.
protected Element createNewTextElement(String text): Creates a new text element.
protected void endTag(String tag): Invoked when a tag ends. This callback method is invoked only on the eventParser.
void eventParser(Reader r): The event parser requires deriving this class and overriding callback methods to work effectively.
protected String getSupportedStandardName(): Returns a string identifying the document type this parser supports.
boolean isCaseSensitive(): Returns whether the parser is case sensitive and retains case; otherwise it converts all data to lower case.
protected boolean isEmptyTag(String tagName): Checks whether the specified tag is an empty tag.
protected boolean isSupported(Element element): Returns true if this element is supported, false otherwise. In XMLParser this always returns true, but subclasses can determine if an element is supported in their context according to its name etc.
protected boolean isWhiteSpace(char ch): Checks if the specified character is a white space or not.
protected void notifyError(int errorId, String tag, String attribute, String value, String description): A utility method used to notify the ParserCallback of an error and throw an IllegalArgumentException if parsingError returned false.
Element parse(Reader is): This is the entry point for parsing a document and the only non-private member method in this class.
protected Element parseCommentOrXMLDeclaration(Reader is, String endTag): This utility method is used to parse comments and XML declarations in the XML.
protected Element parseTag(Reader is): This method collects the tag name and all of its attributes.
protected void parseTagContent(Element element, Reader is): Parses a tag's content, accumulating text and child elements.
void setCaseSensitive(boolean caseSensitive): Sets the parser to be case sensitive and retain case, otherwise it will convert all data to lower case.
void setIncludeWhitespacesBetweenTags(boolean include)
void setParserCallback(ParserCallback parserCallback): Sets the specified callback to serve as the callback for parsing errors.
protected boolean shouldEvaluate(Element element): Checks if this element should be evaluated by the parser. This can be overridden by subclasses to skip certain elements.
protected boolean startTag(String tag): Invoked when a tag is opened, this method should return true to process the tag or return false to skip the tag.
protected void textElement(String text): Invoked when the event parser encounters a text element.
-
Constructor Details
-
XMLParser
public XMLParser()
-
-
Method Details
-
getSupportedStandardName
Returns a string identifying the document type this parser supports. This should be overridden by subclassing parsers.
Returns
a string identifying the document type this parser supports.
-
addCharEntity
Adds the given symbol and code to the user defined char entities table. See http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
Parameters
symbol: The symbol to add
code: The symbol's code
-
-
addCharEntitiesRange
Adds the given symbols array to the user defined char entities table with the startcode provided as the code of the first string, startcode+1 for the second etc. Some strings in the symbols array may be null thus skipping code numbers.
Parameters
symbols: The symbols to add
startcode: The code of the first symbol in the array
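The numbering rule described above can be illustrated with a minimal, self-contained sketch. The class CharEntityRangeSketch and its codeOf method are hypothetical stand-ins written for this illustration; they mirror only the documented behaviour (first string gets startcode, null entries skip a code) and are not the XMLParser implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a user defined char entities table.
// It mirrors the documented numbering only; it is NOT the XMLParser implementation.
class CharEntityRangeSketch {
    private final Map<String, Integer> table = new HashMap<>();

    // Registers a single symbol under the given code.
    void addCharEntity(String symbol, int code) {
        table.put(symbol, code);
    }

    // symbols[i] receives code startcode + i; a null entry leaves that code unassigned.
    void addCharEntitiesRange(String[] symbols, int startcode) {
        for (int i = 0; i < symbols.length; i++) {
            if (symbols[i] != null) {
                table.put(symbols[i], startcode + i);
            }
        }
    }

    // Hypothetical lookup helper: returns the registered code, or -1 if the symbol is unknown.
    int codeOf(String symbol) {
        Integer code = table.get(symbol);
        return code == null ? -1 : code;
    }
}
```

With the array {"nbsp", "iexcl", null, "pound"} and a startcode of 160, this sketch assigns 160 to nbsp, 161 to iexcl and 163 to pound, leaving 162 unassigned.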
-
-
convertCharEntity
Converts a char entity to the matching character. This handles both numbered and symbol char entities (The latter is done via getCharEntityCode)
Parameters
charEntity: The char entity to convert
Returns
A string containing a single char, or the original char entity string (with & and ;) if the char entity couldn't be resolved
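As a rough illustration of the resolution order described above, the following sketch resolves numbered and symbol entities and falls back to the original entity text. It assumes the entity is passed without the surrounding & and ; and that hexadecimal references (#x...) are accepted; both are assumptions of this sketch, and the class name and its small symbol table are hypothetical rather than part of XMLParser.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of char entity resolution; NOT the XMLParser implementation.
class ConvertCharEntitySketch {
    private final Map<String, Integer> symbols = new HashMap<>();

    ConvertCharEntitySketch() {
        // A tiny symbol table for the sketch; a real table would be much larger.
        symbols.put("amp", 38);
        symbols.put("lt", 60);
        symbols.put("gt", 62);
        symbols.put("quot", 34);
    }

    // charEntity is assumed to arrive without the leading '&' and trailing ';'.
    String convertCharEntity(String charEntity) {
        int code = -1;
        try {
            if (charEntity.startsWith("#x") || charEntity.startsWith("#X")) {
                code = Integer.parseInt(charEntity.substring(2), 16); // hexadecimal reference
            } else if (charEntity.startsWith("#")) {
                code = Integer.parseInt(charEntity.substring(1)); // decimal reference
            } else {
                Integer known = symbols.get(charEntity); // symbol reference
                if (known != null) {
                    code = known;
                }
            }
        } catch (NumberFormatException e) {
            code = -1; // malformed number: treat as unresolved
        }
        if (code < 0 || code > Character.MAX_VALUE) {
            return "&" + charEntity + ";"; // unresolved: give back the original entity text
        }
        return String.valueOf((char) code);
    }
}
```

Returning the original entity text on failure keeps unknown entities visible in the output instead of silently dropping them.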
-
parse
-
createNewElement
-
createNewTextElement
-
setIncludeWhitespacesBetweenTags
public void setIncludeWhitespacesBetweenTags(boolean include)
-
eventParser
The event parser requires deriving this class and overriding callback methods to work effectively. To stop the event parser midway, a callback can simply throw an IOException on purpose.
Parameters
r: the reader from which the data should be parsed
Throws
java.io.IOException: if an exception is thrown by the reader
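The callback pattern can be sketched without the library as follows. EventParserSketch and RecordingHandlerSketch are hypothetical classes written for this illustration: the tokenizer is deliberately simplistic (no attributes, comments, entities or self-closing tags) and is not the XMLParser implementation. It only shows the documented contract, in which startTag returning false skips a tag and a callback throwing an IOException stops the parse.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an event-style parser; NOT the XMLParser implementation.
abstract class EventParserSketch {
    protected abstract boolean startTag(String tag) throws IOException;
    protected abstract void endTag(String tag) throws IOException;
    protected abstract void textElement(String text) throws IOException;

    void eventParser(Reader r) throws IOException {
        StringBuilder text = new StringBuilder();
        int skipDepth = 0; // > 0 while inside a subtree rejected by startTag
        int c;
        while ((c = r.read()) != -1) {
            if (c != '<') {
                if (skipDepth == 0) {
                    text.append((char) c);
                }
                continue;
            }
            if (text.length() > 0) { // report text collected before this tag
                textElement(text.toString());
                text.setLength(0);
            }
            StringBuilder raw = new StringBuilder();
            while ((c = r.read()) != -1 && c != '>') {
                raw.append((char) c);
            }
            String name = raw.toString().trim();
            boolean closing = name.startsWith("/");
            if (closing) {
                name = name.substring(1);
            }
            if (skipDepth > 0) {
                skipDepth += closing ? -1 : 1; // track nesting inside the skipped subtree
            } else if (closing) {
                endTag(name);
            } else if (!startTag(name)) {
                skipDepth = 1; // ignore everything up to the matching end tag
            }
        }
        if (text.length() > 0) {
            textElement(text.toString());
        }
    }
}

// Example handler: skips <skip> subtrees and aborts on <stop>.
class RecordingHandlerSketch extends EventParserSketch {
    final List<String> events = new ArrayList<>();

    protected boolean startTag(String tag) throws IOException {
        if (tag.equals("stop")) {
            throw new IOException("stopped on purpose"); // deliberate abort
        }
        if (tag.equals("skip")) {
            return false; // skip this tag and its content
        }
        events.add("start:" + tag);
        return true;
    }

    protected void endTag(String tag) {
        events.add("end:" + tag);
    }

    protected void textElement(String text) {
        events.add("text:" + text);
    }

    // Runs the sketch over a string and records a deliberate abort as an event.
    static List<String> run(String xml) {
        RecordingHandlerSketch handler = new RecordingHandlerSketch();
        try {
            handler.eventParser(new StringReader(xml));
        } catch (IOException e) {
            handler.events.add("aborted:" + e.getMessage());
        }
        return handler.events;
    }
}
```

Throwing from a callback is a simple way to end a scan early, for example once the one value of interest has been found, without reading the rest of the input.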
-
textElement
Invoked when the event parser encounters a text element. This callback method is invoked only on the eventParser.
Parameters
text: the text encountered
-
startTag
Invoked when a tag is opened, this method should return true to process the tag or return false to skip the tag. This callback method is invoked only on the eventParser.
Parameters
tag: the tag name
Returns
true to process the tag, false to skip the tag
-
endTag
Invoked when a tag ends. This callback method is invoked only on the eventParser.
Parameters
tag: the tag name
-
attribute
-
parseTagContent
Parses a tag's content, accumulating text and child elements. Upon encountering a start tag character it calls the parseTag method. This method is first called from the parse method, and later from parseTag (which creates the recursion).
Parameters
element: The current parent element
is: The Reader containing the XML
Throws
IOException: if an I/O error in the stream is encountered
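The mutual recursion between the content parser and the tag parser can be sketched in isolation. NodeSketch and RecursiveParseSketch are hypothetical types written for this illustration; the sketch ignores attributes, comments, self-closing tags and mismatched end tags, and is not the XMLParser implementation.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal node type standing in for Element.
class NodeSketch {
    final String name; // tag name, or null for a text node
    final String text; // text content, or null for a tag node
    final List<NodeSketch> children = new ArrayList<>();

    NodeSketch(String name, String text) {
        this.name = name;
        this.text = text;
    }
}

// Hypothetical illustration of the parseTagContent/parseTag recursion.
class RecursiveParseSketch {
    // Entry point: wraps everything in an artificial root and starts the content parser.
    static NodeSketch parse(String xml) {
        NodeSketch root = new NodeSketch("ROOT", null);
        try {
            parseTagContent(root, new StringReader(xml));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return root;
    }

    // Accumulates text and child elements until the element's end tag or end of input.
    static void parseTagContent(NodeSketch element, Reader is) throws IOException {
        StringBuilder text = new StringBuilder();
        int c;
        while ((c = is.read()) != -1) {
            if (c != '<') {
                text.append((char) c);
                continue;
            }
            flushText(element, text);
            NodeSketch child = parseTag(is); // start tag character: hand over to parseTag
            if (child == null) {
                return; // an end tag closes the current element
            }
            element.children.add(child);
        }
        flushText(element, text);
    }

    // Reads the tag name; for a start tag it recurses into parseTagContent for the children.
    static NodeSketch parseTag(Reader is) throws IOException {
        StringBuilder raw = new StringBuilder();
        int c;
        while ((c = is.read()) != -1 && c != '>') {
            raw.append((char) c);
        }
        String tag = raw.toString().trim();
        if (tag.startsWith("/")) {
            return null; // end tag: signal the caller to stop
        }
        NodeSketch element = new NodeSketch(tag, null);
        parseTagContent(element, is); // this call creates the recursion
        return element;
    }

    private static void flushText(NodeSketch parent, StringBuilder text) {
        if (text.length() > 0) {
            parent.children.add(new NodeSketch(null, text.toString()));
            text.setLength(0);
        }
    }
}
```

Each nesting level of the document corresponds to one pair of nested calls, so the depth of the call stack follows the depth of the markup.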
-
-
isWhiteSpace
protected boolean isWhiteSpace(char ch)
Checks if the specified character is a white space or not. Exposed to the package since it is used by HTMLComponent as well.
Parameters
ch: The character to check
Returns
true if the character is a white space, false otherwise
-
parseTag
This method collects the tag name and all of its attributes. For comments and XML declarations this will call the parseCommentOrXMLDeclaration method. Note that this method returns an Element with a name and attributes, but not its content/children, which are handled by parseTagContent.
Parameters
is: The Reader containing the XML
Returns
The parsed element
Throws
IOException: if an I/O error in the stream is encountered
-
parseCommentOrXMLDeclaration
This utility method is used to parse comments and XML declarations in the XML. The comment/declaration is returned as an Element, but is flagged as a comment since both comments and XML declarations are not part of the XML DOM. This method can be overridden to process specific XML declarations.
Parameters
is: The Reader containing the XML
endTag: The end tag to look for
Returns
An Element representing the comment or XML declaration
Throws
IOException
-
-
isEmptyTag
Checks whether the specified tag is an empty tag
Parameters
tagName: The tag name to check
Returns
true if that tag is defined as an empty tag, false otherwise
-
notifyError
protected void notifyError(int errorId, String tag, String attribute, String value, String description)
A utility method used to notify the ParserCallback of an error and throw an IllegalArgumentException if parsingError returned false.
Parameters
errorId: The error ID, one of the ERROR_* constants in ParserCallback
tag: The tag in which the error occurred (can be null for non-tag related errors)
attribute: The attribute in which the error occurred (can be null for non-attribute related errors)
value: The value in which the error occurred (can be null for non-value related errors)
description: A verbal description of the error
Throws
IllegalArgumentException: If the parser callback returned false on this error
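The contract between the error notification and the callback can be sketched as follows. ErrorCallbackSketch is a hypothetical interface standing in for ParserCallback, modelling only a parsingError method that returns a boolean; the behaviour when no callback is set (the error is ignored) is an assumption of this sketch, not something the text above states.

```java
// Hypothetical callback type standing in for ParserCallback; only the boolean contract is modelled.
interface ErrorCallbackSketch {
    // Return true to keep parsing, false to abort.
    boolean parsingError(int errorId, String tag, String attribute, String value, String description);
}

// Hypothetical illustration of the notify-then-maybe-throw pattern; NOT the XMLParser implementation.
class NotifyErrorSketch {
    private ErrorCallbackSketch callback;

    void setParserCallback(ErrorCallbackSketch callback) {
        this.callback = callback;
    }

    void notifyError(int errorId, String tag, String attribute, String value, String description) {
        // Assumption of this sketch: with no callback installed the error is silently ignored.
        if (callback != null && !callback.parsingError(errorId, tag, attribute, value, description)) {
            throw new IllegalArgumentException(description);
        }
    }
}
```

A lenient callback that always returns true lets parsing continue past malformed input, while a strict one that returns false turns the first reported error into an IllegalArgumentException.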
-
-
isSupported
Returns true if this element is supported, false otherwise. In XMLParser this always returns true, but subclasses can determine if an element is supported in their context according to its name etc. Unsupported elements will be skipped by the parser and excluded from the resulting DOM object.
Parameters
element: The element to check
Returns
true if the element is supported, false otherwise
-
shouldEvaluate
Checks if this element should be evaluated by the parser. This can be overridden by subclasses to skip certain elements.
Parameters
element: The element to check
Returns
true if this element should be evaluated by the parser, false to skip it completely
-
setParserCallback
Sets the specified callback to serve as the callback for parsing errors
Parameters
parserCallback: The callback to use for parsing errors
-
isCaseSensitive
public boolean isCaseSensitive()
Returns whether the parser is case sensitive and retains case; otherwise it converts all data to lower case.
Returns
true if the parser is case sensitive, false otherwise
-
setCaseSensitive
public void setCaseSensitive(boolean caseSensitive)
Sets the parser to be case sensitive and retain case, otherwise it will convert all data to lower case.
Parameters
caseSensitive: true to make the parser case sensitive and retain case, false to convert all data to lower case
-