Monday, July 8, 2013

Use Document class to parse XML/HTML content

Recently I got an assignment that requires XML parsing on an Android device. It's part of a project that follows redirected XML files, extracts the embedded streaming information, and then plays those streams on the device. I started by searching for ways to parse XML on Android, and luckily there are a couple of options. One is the XmlPullParser class, which I'll leave for a future post because I'm not using it in this project. The other, which this post covers, is the Document class; I found it easier and more flexible to use than XmlPullParser.

I haven't dug through all the details of the Document class, so this post is not a comprehensive guide; see the reference links at the end for more. In short, a Document (org.w3c.dom.Document, produced by a DocumentBuilder) holds parsed XML, or well-formed HTML, as a tree that you can search for the information you need. To get what I needed from an XML file, I copied and modified the following code snippet from one of the reference links listed at the end of this post:

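import java.io.IOException;
import java.io.StringReader;
import java.io.UnsupportedEncodingException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import android.util.Log;

public class XMLParser {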

    public XMLParser() {

    }
   
    public String getXmlFromUrl(String url) {
        String xml = null;

        try {
           
            // Fetch the target XML from a known URL through an HTTP client
            DefaultHttpClient httpClient = new DefaultHttpClient();
            HttpPost httpPost = new HttpPost(url);

            HttpResponse httpResponse = httpClient.execute(httpPost);
            HttpEntity httpEntity = httpResponse.getEntity();
            // Fall back to UTF-8 when the response does not declare a charset
            xml = EntityUtils.toString(httpEntity, "UTF-8");

        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return xml;
    }
    

    // Transform an XML string into a DOM Document; returns null on failure
    public Document getDomElement(String xml) {
        if (xml == null) {
            // getXmlFromUrl returns null when the download fails
            return null;
        }
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder db = dbf.newDocumentBuilder();
            InputSource is = new InputSource();
            is.setCharacterStream(new StringReader(xml));
            return db.parse(is);
        } catch (ParserConfigurationException e) {
            // Pass the exception itself: e.getMessage() may be null
            Log.e("XMLParser", "Could not create a DocumentBuilder", e);
        } catch (SAXException e) {
            Log.e("XMLParser", "Could not parse the XML", e);
        } catch (IOException e) {
            Log.e("XMLParser", "Could not read the XML string", e);
        }
        return null;
    }
    

    // Find the descendants of item whose tag name is str and return the text of the first match ("" if there is none)
    public String getValue(Element item, String str) {   
        NodeList n = item.getElementsByTagName(str);     
        return this.getElementValue(n.item(0));
    }
     

    // Return the value of the first text node directly under elem, or "" if there is none
    public final String getElementValue(Node elem) {
        if (elem != null && elem.hasChildNodes()) {
            for (Node child = elem.getFirstChild(); child != null; child = child.getNextSibling()) {
                if (child.getNodeType() == Node.TEXT_NODE) {
                    return child.getNodeValue();
                }
            }
        }
        return "";
    }
}
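
One detail worth knowing about the first-text-node approach used by getElementValue: it only looks at TEXT_NODE children. The sketch below is not part of the original snippet; it is a plain-JVM illustration (no Android classes) with hypothetical names TextValueDemo, firstTextNode and firstElement. It assumes a default-configured DocumentBuilderFactory, where CDATA content is kept as a separate CDATA section node, and compares the result with the standard Node.getTextContent() call.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TextValueDemo {

    // Same idea as getElementValue: value of the first TEXT_NODE child, or "".
    static String firstTextNode(Node elem) {
        if (elem != null) {
            for (Node c = elem.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (c.getNodeType() == Node.TEXT_NODE) {
                    return c.getNodeValue();
                }
            }
        }
        return "";
    }

    // Parse a small XML string and return the first element named tag (null if absent).
    static Node firstElement(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).item(0);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Node plain = firstElement("<item><title>Top Trailers</title></item>", "title");
        Node cdata = firstElement("<item><title><![CDATA[Q&A]]></title></item>", "title");

        System.out.println(firstTextNode(plain));           // plain text child is found
        System.out.println(firstTextNode(cdata).isEmpty()); // CDATA child is not a TEXT_NODE
        System.out.println(cdata.getTextContent());         // concatenated text, CDATA included
    }
}
```

For the feed in this post the two approaches should agree, because the title and url elements contain plain text only. If a feed wraps its values in CDATA, getTextContent() (or enabling coalescing on the factory) is probably the safer choice.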


The following code snippet shows how this class can be used:

private String xml = null;
private String url = null;
private Document doc = null;

XMLParser parser = new XMLParser();
url = "http://...";
xml = parser.getXmlFromUrl(url);   // network call: run it off the main (UI) thread
doc = parser.getDomElement(xml);   // null if parsing failed
displayNavXml();

private void displayNavXml() {
    NodeList nl = doc.getElementsByTagName("navigationItem");

    for (int i = 0; i < nl.getLength(); i++) {
        Element e = (Element) nl.item(i);
        String title = parser.getValue(e, "title");
        String url = parser.getValue(e, "url");
        System.out.println("<" + title + "," + url + ">");
    }
}
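
The usage above downloads the XML into a String first and parses it afterwards. As an alternative sketch (not what this project does), DocumentBuilder.parse also accepts an InputStream, which lets the parser pick up the encoding from the XML declaration itself. The class name StreamParseDemo, the helper parseStream and the sample payload are made up for illustration; a ByteArrayInputStream stands in for a real response body so the example runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class StreamParseDemo {

    // Parse XML straight from a byte stream; the parser decodes the bytes
    // using the encoding named in the XML declaration.
    static Document parseStream(InputStream in) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML stream", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical payload standing in for an HTTP response body.
        byte[] body = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><item><title>Caf\u00e9</title></item>"
                .getBytes(StandardCharsets.UTF_8);
        Document doc = parseStream(new ByteArrayInputStream(body));
        System.out.println(doc.getElementsByTagName("title").item(0).getTextContent());
    }
}
```

On a device the stream would come from whatever HTTP client the app uses; that part is left out here on purpose.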


Target XML in this example:

<?xml version="1.0" encoding="UTF-8"?>
<atv>
    <head>
        <script src="http://trailers.apple.com/appletv/us/js/main.js"/>
    </head>
    <body>
        <viewWithNavigationBar id="com.trailers.navigation-bar">
            <navigation currentIndex="0">
                <navigationItem id="nav1">
                    <title>Top Trailers</title>
                    <url>http://trailers.apple.com/appletv/us/index.xml</url>
                </navigationItem>
                <navigationItem id="nav2">
                    <title>Calendar</title>
                    <url>http://trailers.apple.com/appletv/us/calendar.xml</url>
                </navigationItem>
                <navigationItem id="nav3">
                    <title>Browse</title>
                    <url>http://trailers.apple.com/appletv/us/browse.xml</url>
                </navigationItem>
                <navigationItem id="nav4">
                    <title>Search</title>
                    <url>http://trailers.apple.com/appletv/us/searchtrailers.xml</url>
                </navigationItem>
            </navigation>
        </viewWithNavigationBar>
    </body>
</atv>


logcat output:

<Top Trailers,http://trailers.apple.com/appletv/us/index.xml>
<Calendar,http://trailers.apple.com/appletv/us/calendar.xml>
<Browse,http://trailers.apple.com/appletv/us/browse.xml>
<Search,http://trailers.apple.com/appletv/us/searchtrailers.xml>
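
To try the same lookup without a device or a network connection, here is a minimal self-contained sketch for a plain JVM. It is not the project code: the class NavXmlDemo and its helpers are hypothetical, the XML is a trimmed copy of the target file (two of the four items), and it additionally reads the id attribute of each navigationItem with Element.getAttribute, which the XMLParser class above does not do.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NavXmlDemo {

    // Trimmed copy of the target XML, inlined so no network access is needed.
    static final String SAMPLE =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<atv><body><viewWithNavigationBar id=\"com.trailers.navigation-bar\">"
        + "<navigation currentIndex=\"0\">"
        + "<navigationItem id=\"nav1\"><title>Top Trailers</title>"
        + "<url>http://trailers.apple.com/appletv/us/index.xml</url></navigationItem>"
        + "<navigationItem id=\"nav2\"><title>Calendar</title>"
        + "<url>http://trailers.apple.com/appletv/us/calendar.xml</url></navigationItem>"
        + "</navigation></viewWithNavigationBar></body></atv>";

    // Text of the first descendant of parent named tag, or "" if there is none.
    static String firstText(Element parent, String tag) {
        NodeList matches = parent.getElementsByTagName(tag);
        return matches.getLength() == 0 ? "" : matches.item(0).getTextContent();
    }

    // Parse xml and return one "<id,title,url>" line per navigationItem.
    static List<String> listNavItems(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList items = doc.getElementsByTagName("navigationItem");
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                lines.add("<" + item.getAttribute("id") + ","
                        + firstText(item, "title") + ","
                        + firstText(item, "url") + ">");
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        for (String line : listNavItems(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

Each printed line has the same shape as the logcat output above, with the item id added in front.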

Reference
http://developer.android.com/reference/org/w3c/dom/Document.html
http://www.androidhive.info/2011/11/android-xml-parsing-tutorial/
