如何使用SAX解析器解析XML?

内容来源于 Stack Overflow,并遵循CC BY-SA 3.0许可协议进行翻译与使用

  • 回答 (2)
  • 关注 (0)
  • 查看 (51)

我希望它返回一个包含所有字符串的数组,而不是一个带有最后一个元素的字符串。

提问于
用户回答回答于

<? xml version = "1.0" encoding = "UTF-8" standalone = "no"?> 
<doc> 
<id> 3 </ id> 
<fam> Ivanov </ fam> 
<name> Ivan </ name> 
<otc> I. </ otc> 
<dateb> 10-03-2005 </ dateb> 
<datep> 10-03-2005 </ datep> 
<datev> 10-03-2005 </ datev> 
<datebegin> 09-06-2009 </ datebegin> 
<dateend> 10-03-2005 </ dateend> 
<vdolid> 1 </ vdolid> 
<specid> 1 </ specid> 
<klavid> 1 </ klavid> 
<stav> 2.0 </ stav> 
<progid> 1 </ progid> 
</ doc> 

如您所知,要成功地使用解析器,我们需要重写所需的DefaultHandler方法。首先,连接所需的包。

import org.xml.sax.helpers.DefaultHandler; 
import org.xml.sax. *; 

现在我们可以开始编写解析器了。

public class SAXPars extends DefaultHandler {
   ... 
} 

让我们从startDocument()方法开始。顾名思义,他对文档开头的事件作出反应。在这里,您可以挂起各种操作,例如内存分配,或者重置值​​,但是我们的示例非常简单,因此只需标记适当消息的开始工作:

Override 
public void startDocument () throws SAXException {
   System.out.println ("Start parse XML ..."); 
} 

解析器通过文档满足其结构的元素。启动方法startElement()。事实上,他的出现如下:startElement(StringNampaceURI,StringlocalName,StringqName,Attribute atts)。在这里,名称空间URI-命名空间,localName-元素的本地名称,qName-一个本地名称与名称空间(由冒号分隔)和atts的组合-该元素的属性。在这种情况下,一切都很简单。只需使用qName‘OM并将其放入某个服务行这个元素即可。因此,我们标记了我们目前所处的元素。

@Override 
public void startElement (String namespaceURI, String localName, String qName, Attributes atts) throws SAXException {
   thisElement = qName; 
} 

接下来,会议项目我们得到了它的意义。这里包括方法字符()。他的形式是:字符(char[]ch,int start,int ength)。在这里,一切都是清楚的。ch--一个包含字符串本身的文件,在这个元素中具有自我重要性。开始和长度-服务的数量,指示行和长度中的起始点。

@Override 
public void characters (char [] ch, int start, int length) throws SAXException {
   if (thisElement.equals ("id")) {
      doc.setId (new Integer (new String (ch, start, length))); 
   } 
   if (thisElement.equals ("fam")) {
      doc.setFam (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("name")) {
      doc.setName (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("otc")) {
      doc.setOtc (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("dateb")) {
      doc.setDateb (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("datep")) {
      doc.setDatep (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("datev")) {
      doc.setDatev (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("datebegin")) {
      doc.setDatebegin (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("dateend")) {
      doc.setDateend (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("vdolid")) {
      doc.setVdolid (new Integer (new String (ch, start, length))); 
   } 
   if (thisElement.equals ("specid")) {
      doc.setSpecid (new Integer (new String (ch, start, length))); 
   } 
   if (thisElement.equals ("klavid")) {
      doc.setKlavid (new Integer (new String (ch, start, length))); 
   } 
   if (thisElement.equals ("stav")) {
      doc.setStav (new Float (new String (ch, start, length))); 
   } 
   if (thisElement.equals ("progid")) {
      doc.setProgid (new Integer (new String (ch, start, length))); 
   } 
} 

这个类是定义的,并且拥有所有必需的setters-getter。下一个明显的元素结束,然后是下一个元素。负责结束endElement()。它向我们发出信号,这个项目已经结束,你可以在这个时候做任何事情。将继续进行。

@Override 
public void endElement (String namespaceURI, String localName, String qName) throws SAXException {
   thisElement = ""; 
} 

所以整个文件,我们就到了文件的末尾。Work endDocument()。在它中,我们可以释放内存,做一些诊断,如打印等。在我们的例子中,只需写一下解析的结尾。

@Override 
public void endDocument () {
   System.out.println ("Stop parse XML ..."); 
} 

所以我们有一个类来解析XML我们的格式。全文如下:

import org.xml.sax.helpers.DefaultHandler; 
import org.xml.sax. *; 
 
public class SAXPars extends DefaultHandler {
 
Doctors doc = new Doctors (); 
String thisElement = ""; 
 
public Doctors getResult () {
   return doc; 
} 
 
@Override 
public void startDocument () throws SAXException {
   System.out.println ("Start parse XML ..."); 
} 
 
@Override 
public void startElement (String namespaceURI, String localName, String qName, Attributes atts) throws SAXException {
   thisElement = qName; 
} 
 
@Override 
public void endElement (String namespaceURI, String localName, String qName) throws SAXException {
   thisElement = ""; 
} 
 
@Override 
public void characters (char [] ch, int start, int length) throws SAXException {
   if (thisElement.equals ("id")) {
      doc.setId (new Integer (new String (ch, start, length))); 
   } 
   if (thisElement.equals ("fam")) {
      doc.setFam (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("name")) {
      doc.setName (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("otc")) {
      doc.setOtc (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("dateb")) {
      doc.setDateb (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("datep")) {
      doc.setDatep (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("datev")) {
      doc.setDatev (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("datebegin")) {
      doc.setDatebegin (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("dateend")) {
      doc.setDateend (new String (ch, start, length)); 
   } 
   if (thisElement.equals ("vdolid")) {
      doc.setVdolid (new Integer (new String (ch, start, length))); 
   } 
   if (thisElement.equals ("specid")) {
      doc.setSpecid (new Integer (new String (ch, start, length))); 
   } 
   if (thisElement.equals ("klavid")) {
      doc.setKlavid (new Integer (new String (ch, start, length))); 
   } 
   if (thisElement.equals ("stav")) {
      doc.setStav (new Float (new String (ch, start, length))); 
   } 
   if (thisElement.equals ("progid")) {
      doc.setProgid (new Integer (new String (ch, start, length))); 
   } 
} 
 
@Override 
public void endDocument () {
   System.out.println ("Stop parse XML ..."); 
} 
} 

要运行这个解析器,可以使用以下代码:

SAXParserFactory factory = SAXParserFactory.newInstance (); 
SAXParser parser = factory.newSAXParser (); 
SAXPars saxp = new SAXPars (); 
 
parser.parse (new File ("..."), saxp); 

用户回答回答于

因此,您希望构建一个XML解析器来解析类似于此的RSS提要。

<rss version="0.92">
<channel>
    <title>MyTitle</title>
    <link>http://myurl.com</link>
    <description>MyDescription</description>
    <lastBuildDate>SomeDate</lastBuildDate>
    <docs>http://someurl.com</docs>
    <language>SomeLanguage</language>

    <item>
        <title>TitleOne</title>
        <description><![CDATA[Some text.]]></description>
        <link>http://linktoarticle.com</link>
    </item>

    <item>
        <title>TitleTwo</title>
        <description><![CDATA[Some other text.]]></description>
        <link>http://linktoanotherarticle.com</link>
    </item>

</channel>
</rss>

现在您可以使用两个SAX实现。或者使用org.xml.sax或者android.sax执行。我将解释的正反两方面后,张贴了一个短手的例子。

android.SAX:

Channel.java

public class Channel implements Serializable {

    private Items items;
    private String title;
    private String link;
    private String description;
    private String lastBuildDate;
    private String docs;
    private String language;

    public Channel() {
        setItems(null);
        setTitle(null);
        // set every field to null in the constructor
    }

    public void setItems(Items items) {
        this.items = items;
    }

    public Items getItems() {
        return items;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getTitle() {
        return title;
    }
    // rest of the class looks similar so just setters and getters
}

Items.java

public class Items extends ArrayList<Item> {

    public Items() {
        super();
    }

}

这是我们的物品容器。我们现在需要一个类来保存每个项目的数据。

Item.java

public class Item implements Serializable {

    private String title;
    private String description;
    private String link;

    public Item() {
        setTitle(null);
        setDescription(null);
        setLink(null);
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getTitle() {
        return title;
    }

    // same as above.

}

例子:

public class Example extends DefaultHandler {

    private Channel channel;
    private Items items;
    private Item item;

    public Example() {
        items = new Items();
    }

    public Channel parse(InputStream is) {
        RootElement root = new RootElement("rss");
        Element chanElement = root.getChild("channel");
        Element chanTitle = chanElement.getChild("title");
        Element chanLink = chanElement.getChild("link");
        Element chanDescription = chanElement.getChild("description");
        Element chanLastBuildDate = chanElement.getChild("lastBuildDate");
        Element chanDocs = chanElement.getChild("docs");
        Element chanLanguage = chanElement.getChild("language");

        Element chanItem = chanElement.getChild("item");
        Element itemTitle = chanItem.getChild("title");
        Element itemDescription = chanItem.getChild("description");
        Element itemLink = chanItem.getChild("link");

        chanElement.setStartElementListener(new StartElementListener() {
            public void start(Attributes attributes) {
                channel = new Channel();
            }
        });

        // Listen for the end of a text element and set the text as our
        // channel's title.
        chanTitle.setEndTextElementListener(new EndTextElementListener() {
            public void end(String body) {
                channel.setTitle(body);
            }
        });

        // Same thing happens for the other elements of channel ex.

        // On every <item> tag occurrence we create a new Item object.
        chanItem.setStartElementListener(new StartElementListener() {
            public void start(Attributes attributes) {
                item = new Item();
            }
        });

        // On every </item> tag occurrence we add the current Item object
        // to the Items container.
        chanItem.setEndElementListener(new EndElementListener() {
            public void end() {
                items.add(item);
            }
        });

        itemTitle.setEndTextElementListener(new EndTextElementListener() {
            public void end(String body) {
                item.setTitle(body);
            }
        });

        // and so on

        // here we actually parse the InputStream and return the resulting
        // Channel object.
        try {
            Xml.parse(is, Xml.Encoding.UTF_8, root.getContentHandler());
            return channel;
        } catch (SAXException e) {
            // handle the exception
        } catch (IOException e) {
            // handle the exception
        }

        return null;
    }

}

org.xml.SAX实现

public class ExampleHandler extends DefaultHandler {

    private Channel channel;
    private Items items;
    private Item item;
    private boolean inItem = false;

    private StringBuilder content;

    public ExampleHandler() {
        items = new Items();
        content = new StringBuilder();
    }

    public void startElement(String uri, String localName, String qName, 
            Attributes atts) throws SAXException {
        content = new StringBuilder();
        if(localName.equalsIgnoreCase("channel")) {
            channel = new Channel();
        } else if(localName.equalsIgnoreCase("item")) {
            inItem = true;
            item = new Item();
        }
    }

    public void endElement(String uri, String localName, String qName) 
            throws SAXException {
        if(localName.equalsIgnoreCase("title")) {
            if(inItem) {
                item.setTitle(content.toString());
            } else {
                channel.setTitle(content.toString());
            }
        } else if(localName.equalsIgnoreCase("link")) {
            if(inItem) {
                item.setLink(content.toString());
            } else {
                channel.setLink(content.toString());
            }
        } else if(localName.equalsIgnoreCase("description")) {
            if(inItem) {
                item.setDescription(content.toString());
            } else {
                channel.setDescription(content.toString());
            }
        } else if(localName.equalsIgnoreCase("lastBuildDate")) {
            channel.setLastBuildDate(content.toString());
        } else if(localName.equalsIgnoreCase("docs")) {
            channel.setDocs(content.toString());
        } else if(localName.equalsIgnoreCase("language")) {
            channel.setLanguage(content.toString());
        } else if(localName.equalsIgnoreCase("item")) {
            inItem = false;
            items.add(item);
        } else if(localName.equalsIgnoreCase("channel")) {
            channel.setItems(items);
        }
    }

    public void characters(char[] ch, int start, int length) 
            throws SAXException {
        content.append(ch, start, length);
    }

    public void endDocument() throws SAXException {
        // you can do something here for example send
        // the Channel object somewhere or whatever.
    }

}

扫码关注云+社区