Class XMLParser

java.lang.Object
com.codename1.xml.XMLParser
Direct Known Subclasses:
HTMLParser

public class XMLParser extends Object

Parser class used to parse an XML document into a DOM object (Element). This code was originally developed to parse HTML; as a result it isn't as strict as most XML parsers and can parse many HTML documents out of the box. The parser is mostly stateful (although it also has an event callback API) and is modeled closely on the Java DOM APIs.

In this sample an XML hierarchy is displayed using a com.codename1.ui.tree.Tree:

class XMLTreeModel implements TreeModel {
    private Element root;
    public XMLTreeModel(Element e) {
        root = e;
    }

    public Vector getChildren(Object parent) {
        if(parent == null) {
            Vector c = new Vector();
            c.addElement(root);
            return c;
        }
        Vector result = new Vector();
        Element e = (Element)parent;
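        for(int iter = 0 ; iter < e.getNumChildren() ; iter++) {
            result.addElement(e.getChildAt(iter));
        }
        return result;
    }

    public boolean isLeaf(Object node) {
        Element e = (Element)node;
        return e.getNumChildren() == 0;
    }
}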

@author Ofir Leitner
  • Constructor Details

    • XMLParser

      public XMLParser()
  • Method Details

    • getSupportedStandardName

      protected String getSupportedStandardName()

      Returns a string identifying the document type this parser supports. This should be overridden by subclassing parsers.

      Returns

      a string identifying the document type this parser supports.

    • addCharEntity

      public void addCharEntity(String symbol, int code)

      Adds the given symbol and code to the user-defined char entities table. See http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references

      Parameters
      • symbol: The symbol to add

      • code: The symbol's code

    • addCharEntitiesRange

      public void addCharEntitiesRange(String[] symbols, int startcode)

      Adds the given symbols array to the user defined char entities table with the startcode provided as the code of the first string, startcode+1 for the second etc. Some strings in the symbols array may be null thus skipping code numbers.

      Parameters
      • symbols: The symbols to add

      • startcode: The code of the first symbol in the array
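
      The code assignment described above can be illustrated with a minimal sketch. CharEntityTableSketch, its map, and codeOf (including its -1 convention) are hypothetical stand-ins written for this example; they are not the library's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the user-defined char entities table (not library code).
class CharEntityTableSketch {
    private final Map<String, Integer> userEntities = new HashMap<>();

    // Follows the documented contract of addCharEntity(symbol, code).
    void addCharEntity(String symbol, int code) {
        userEntities.put(symbol, code);
    }

    // symbols[i] receives code startcode + i; a null entry is skipped
    // but still uses up its code number.
    void addCharEntitiesRange(String[] symbols, int startcode) {
        for (int i = 0; i < symbols.length; i++) {
            if (symbols[i] != null) {
                addCharEntity(symbols[i], startcode + i);
            }
        }
    }

    // Returns the code for a symbol, or -1 if unknown (assumed convention).
    int codeOf(String symbol) {
        Integer code = userEntities.get(symbol);
        return code == null ? -1 : code;
    }
}
```

      With symbols {"Alpha", null, "Gamma"} and startcode 913, this sketch maps Alpha to 913 and Gamma to 915, leaving 914 unassigned.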

    • convertCharEntity

      protected String convertCharEntity(String charEntity)

      Converts a char entity to the matching character. This handles both numbered and symbol char entities (the latter via getCharEntityCode).

      Parameters
      • charEntity: The char entity to convert
      Returns

      A string containing a single char, or the original char entity string (with & and ;) if the char entity couldn't be resolved
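
      The conversion contract above might look roughly like the following sketch. It assumes the entity is passed without the surrounding & and ;, that hexadecimal references (#x41) are accepted, and that symbols live in a small hard-coded table; all three are assumptions of this example rather than confirmed library behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical converter illustrating the documented contract (not library code).
class CharEntityConverterSketch {
    private final Map<String, Integer> symbols = new HashMap<>();

    CharEntityConverterSketch() {
        symbols.put("amp", 38);
        symbols.put("lt", 60);
        symbols.put("gt", 62);
    }

    // Assumes charEntity arrives without the surrounding '&' and ';'.
    String convertCharEntity(String charEntity) {
        int code = -1;
        try {
            if (charEntity.startsWith("#x") || charEntity.startsWith("#X")) {
                code = Integer.parseInt(charEntity.substring(2), 16); // hex reference
            } else if (charEntity.startsWith("#")) {
                code = Integer.parseInt(charEntity.substring(1));     // decimal reference
            } else {
                Integer known = symbols.get(charEntity);              // symbolic entity
                if (known != null) {
                    code = known;
                }
            }
        } catch (NumberFormatException e) {
            code = -1; // malformed number: treat as unresolved
        }
        if (code < 0 || code > Character.MAX_VALUE) {
            // Unresolved: hand back the original entity with its delimiters.
            return "&" + charEntity + ";";
        }
        return String.valueOf((char) code);
    }
}
```

      In this sketch "amp" yields "&", "#65" yields "A", and an unknown name such as "bogus" comes back as "&bogus;".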

    • parse

      public Element parse(Reader is)

      This is the entry point for parsing a document into a DOM.

      Parameters
      • is: The Reader containing the XML
      Returns

      an Element object describing the parsed document (essentially its DOM)

    • createNewElement

      protected Element createNewElement(String name)

      Creates a new element. This should be overridden by parsers that use a subclass of Element.

      Parameters
      • name: The new element's name
      Returns

      a new instance of the element

    • createNewTextElement

      protected Element createNewTextElement(String text)

      Creates a new text element. This should be overridden by parsers that use a subclass of Element.

      Parameters
      • text: The new element's text
      Returns

      a new instance of the element

    • setIncludeWhitespacesBetweenTags

      public void setIncludeWhitespacesBetweenTags(boolean include)
    • eventParser

      public void eventParser(Reader r) throws IOException

      The event parser requires deriving this class and overriding callback methods to work effectively. To stop the event parser midway, a callback can simply throw an IOException on purpose.

      Parameters
      • r: the reader from which the data should be parsed
      Throws
      • java.io.IOException: if an exception is thrown by the reader
    • textElement

      protected void textElement(String text)

      Invoked when the event parser encounters a text element. This callback method is invoked only on the eventParser.

      Parameters
      • text: the text encountered
    • startTag

      protected boolean startTag(String tag)

      Invoked when a tag is opened. This method should return true to process the tag or false to skip it. This callback method is invoked only by the eventParser.

      Parameters
      • tag: the tag name
      Returns

      true to process the tag, false to skip the tag

    • endTag

      protected void endTag(String tag)

      Invoked when a tag ends. This callback method is invoked only by the eventParser.

      Parameters
      • tag: the tag name
    • attribute

      protected void attribute(String tag, String attributeName, String value)

      Invoked for every attribute value of the given tag. This callback method is invoked only by the eventParser.

      Parameters
      • tag: the tag name

      • attributeName: the name of the attribute

      • value: the value of the attribute
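
      The derive-and-override pattern used by the event callbacks above can be illustrated with a self-contained sketch. EventParserSketch is a hypothetical miniature parser written for this example. It only understands plain opening tags, closing tags and text, ignores attributes, and does not implement tag skipping, so it is not a substitute for XMLParser. EventLogger is likewise hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature event parser (not the library's implementation).
class EventParserSketch {
    // Callbacks with empty defaults; subclasses override the ones they need.
    // Note: this sketch ignores the startTag return value.
    protected boolean startTag(String tag) { return true; }
    protected void endTag(String tag) { }
    protected void textElement(String text) { }

    public void eventParser(Reader r) throws IOException {
        StringBuilder text = new StringBuilder();
        int c;
        while ((c = r.read()) != -1) {
            if (c != '<') {
                text.append((char) c);
                continue;
            }
            flushText(text); // report text collected before this tag
            StringBuilder name = new StringBuilder();
            while ((c = r.read()) != -1 && c != '>') {
                name.append((char) c);
            }
            String tag = name.toString();
            if (tag.startsWith("/")) {
                endTag(tag.substring(1));
            } else {
                startTag(tag);
            }
        }
        flushText(text); // trailing text after the last tag
    }

    private void flushText(StringBuilder text) {
        if (text.length() > 0) {
            textElement(text.toString());
            text.setLength(0);
        }
    }
}

// Example subclass: records every event in order.
class EventLogger extends EventParserSketch {
    final List<String> events = new ArrayList<>();

    @Override protected boolean startTag(String tag) { events.add("start:" + tag); return true; }
    @Override protected void endTag(String tag) { events.add("end:" + tag); }
    @Override protected void textElement(String text) { events.add("text:" + text); }
}
```

      Feeding this sketch the input <a>hi<b>x</b></a> through a StringReader records start:a, text:hi, start:b, text:x, end:b, end:a, in that order.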
    • parseTagContent

      protected void parseTagContent(Element element, Reader is) throws IOException

      Parses a tag's content, accumulating text and child elements. Upon encountering a start tag character it calls the parseTag method. This method is first called from the parse method, and later from parseTag (which creates the recursion).

      Parameters
      • element: The current parent element

      • is: The Reader containing the XML

      Throws
      • IOException: if an I/O error in the stream is encountered
    • isWhiteSpace

      protected boolean isWhiteSpace(char ch)

      Checks whether the specified character is white space. Exposed to the package since it is also used by HTMLComponent.

      Parameters
      • ch: The character to check
      Returns

      true if the character is a white space, false otherwise

    • parseTag

      protected Element parseTag(Reader is) throws IOException

      This method collects the tag name and all of its attributes. For comments and XML declarations it calls the parseCommentOrXMLDeclaration method. Note that this method returns an Element with a name and attributes, but not its content/children, which are handled by parseTagContent.

      Parameters
      • is: The Reader containing the XML
      Returns

      The parsed element

      Throws
      • IOException: if an I/O error in the stream is encountered
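
      The mutual recursion between parseTagContent and parseTag can be sketched as follows. NodeSketch and RecursiveParserSketch are hypothetical classes written for this example. The sketch handles only plain tags and text (no attributes, comments, or empty tags), and the synthetic "ROOT" node and the null return for closing tags are choices of the sketch, not confirmed library behavior.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical DOM node: either a tag (name set) or a text node (text set).
class NodeSketch {
    final String name;
    final String text;
    final List<NodeSketch> children = new ArrayList<>();

    NodeSketch(String name, String text) {
        this.name = name;
        this.text = text;
    }
}

// Hypothetical recursive parser (not the library's implementation).
class RecursiveParserSketch {
    NodeSketch parse(Reader is) throws IOException {
        NodeSketch root = new NodeSketch("ROOT", null); // synthetic root of this sketch
        parseTagContent(root, is);                      // first call comes from parse
        return root;
    }

    // Accumulates text and child elements until the parent's closing tag or end of input.
    void parseTagContent(NodeSketch parent, Reader is) throws IOException {
        StringBuilder text = new StringBuilder();
        int c;
        while ((c = is.read()) != -1) {
            if (c != '<') {
                text.append((char) c);
                continue;
            }
            addText(parent, text);
            NodeSketch child = parseTag(is); // start tag character found
            if (child == null) {
                return;                      // closing tag: this level is finished
            }
            parent.children.add(child);
        }
        addText(parent, text);
    }

    // Called right after '<'. Returns null for a closing tag.
    NodeSketch parseTag(Reader is) throws IOException {
        StringBuilder name = new StringBuilder();
        int c;
        while ((c = is.read()) != -1 && c != '>') {
            name.append((char) c);
        }
        if (name.length() > 0 && name.charAt(0) == '/') {
            return null;
        }
        NodeSketch element = new NodeSketch(name.toString(), null);
        parseTagContent(element, is); // later calls come from parseTag: the recursion
        return element;
    }

    private void addText(NodeSketch parent, StringBuilder text) {
        if (text.length() > 0) {
            parent.children.add(new NodeSketch(null, text.toString()));
            text.setLength(0);
        }
    }
}
```

      For the input <a>hi<b>x</b></a>, the sketch produces a root with one child a, whose children are the text node "hi" and the element b, which in turn holds the text node "x".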
    • parseCommentOrXMLDeclaration

      protected Element parseCommentOrXMLDeclaration(Reader is, String endTag) throws IOException

      This utility method is used to parse comments and XML declarations in the XML. The comment/declaration is returned as an Element, but is flagged as a comment since both comments and XML declarations are not part of the XML DOM. This method can be overridden to process specific XML declarations.

      Parameters
      • is: The Reader containing the XML

      • endTag: The end tag to look for

      Returns

      An Element representing the comment or XML declaration

      Throws
      • IOException
    • isEmptyTag

      protected boolean isEmptyTag(String tagName)

      Checks whether the specified tag is an empty tag

      Parameters
      • tagName: The tag name to check
      Returns

      true if that tag is defined as an empty tag, false otherwise

    • notifyError

      protected void notifyError(int errorId, String tag, String attribute, String value, String description)

      A utility method that reports an error to the ParserCallback and throws an IllegalArgumentException if parsingError returned false.

      Parameters
      • errorId: The error ID, one of the ERROR_* constants in ParserCallback

      • tag: The tag in which the error occurred (can be null for non-tag related errors)

      • attribute: The attribute in which the error occurred (can be null for non-attribute related errors)

      • value: The value in which the error occurred (can be null for non-value related errors)

      • description: A verbal description of the error

      Throws
      • IllegalArgumentException: If the parser callback returned false on this error
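
      The notify-then-throw behavior described above can be sketched in isolation. ErrorCallbackSketch and ErrorNotifierSketch are hypothetical types shaped after this description; the real ParserCallback interface may differ, and doing nothing when no callback is set is an assumption of this example.

```java
// Hypothetical callback shaped after the description above (not the real ParserCallback).
interface ErrorCallbackSketch {
    // Return true to continue parsing, false to abort.
    boolean parsingError(int errorId, String tag, String attribute, String value, String description);
}

// Hypothetical holder for the notify logic (not library code).
class ErrorNotifierSketch {
    private ErrorCallbackSketch callback;

    void setParserCallback(ErrorCallbackSketch callback) {
        this.callback = callback;
    }

    void notifyError(int errorId, String tag, String attribute, String value, String description) {
        // Assumption: with no callback registered, errors are silently tolerated.
        if (callback != null && !callback.parsingError(errorId, tag, attribute, value, description)) {
            throw new IllegalArgumentException(description);
        }
    }
}
```

      A callback that returns true lets parsing continue, while one that returns false turns the reported error into an IllegalArgumentException carrying the description.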
    • isSupported

      protected boolean isSupported(Element element)

      Returns true if this element is supported, false otherwise. In XMLParser this always returns true, but subclasses can determine whether an element is supported in their context according to its name etc. Unsupported elements will be skipped by the parser and excluded from the resulting DOM object.

      Parameters
      • element: The element to check
      Returns

      true if the element is supported, false otherwise

    • shouldEvaluate

      protected boolean shouldEvaluate(Element element)

      Checks whether this element should be evaluated by the parser. This can be overridden by subclasses to skip certain elements.

      Parameters
      • element: The element to check
      Returns

      true if this element should be evaluated by the parser, false to skip it completely

    • setParserCallback

      public void setParserCallback(ParserCallback parserCallback)

      Sets the specified callback to serve as the callback for parsing errors

      Parameters
      • parserCallback: The callback to use for parsing errors
    • isCaseSensitive

      public boolean isCaseSensitive()

      Indicates whether the parser is case sensitive and retains case; otherwise it converts all data to lower case.

      Returns

      true if the parser is case sensitive, false otherwise

    • setCaseSensitive

      public void setCaseSensitive(boolean caseSensitive)

      Sets the parser to be case sensitive and retain case, otherwise it will convert all data to lower case

      Parameters
      • caseSensitive: true to make the parser case sensitive and retain case, false to convert all data to lower case