This discussion is archived
13 Replies. Latest reply: May 7, 2009 12:58 AM by 843851

Parsing XML

843851 Newbie
Hi,

Which library is most suitable for parsing an XML file in MHP? I've tried NanoXML, but the STB blocks for too long. I think that's because NanoXML reads the whole file and builds its internal representation; in my case the file is large because I'm parsing an RSS feed, and memory is a limited resource on an STB. Then I tried kXML 2 (but documentation for it is nonexistent) and kXML, following the suggestion on this page: http://developers.sun.com/mobility/midp/articles/parsingxml/ . I also tried adapting this to the MHP environment, and it doesn't work either; it seems that the thread that reads each item doesn't work.

Help please

thanks in advance
Greetings
  • 1. Re: Parsing XML
    843851 Newbie
    Yes, NanoXML reads the whole file up front, and that can take a while. Still, it shouldn't block the STB for too long; maybe the XML is too big? You could use a thread, even with a synchronized method, so that the application can keep working on other things, such as the GUI, while the XML loads.
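
A minimal sketch of the threading idea above, assuming nothing about the parser itself: the slow load runs on a worker thread and hands its result to a callback, so the calling thread is not blocked. `BackgroundLoader`, `Loader`, and `Listener` are hypothetical names; the real parse call would go inside `load()`.

```java
// Sketch: run a slow load (for example an XML parse) off the calling thread.
// All class and interface names here are hypothetical.
class BackgroundLoader {

    /** The slow job; a real application would parse the XML file here. */
    interface Loader {
        Object load() throws Exception;
    }

    /** Called on the worker thread when the job finishes or fails. */
    interface Listener {
        void loaded(Object result);
        void failed(Exception error);
    }

    /** Starts the job on a new thread and returns immediately. */
    static Thread start(final Loader loader, final Listener listener) {
        Thread worker = new Thread(new Runnable() {
            public void run() {
                Object result;
                try {
                    result = loader.load();
                } catch (Exception e) {
                    listener.failed(e);
                    return;
                }
                listener.loaded(result);
            }
        }, "xml-loader");
        worker.start();
        return worker;
    }
}
```

Note that the listener runs on the worker thread, so any GUI update it triggers still has to be handed over to the UI in whatever way the toolkit requires.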
  • 2. Re: Parsing XML
    843851 Newbie
    Thanks for your answer. Yes, the XML is in fact an RSS feed, so the file is large. I've read about other types of parsers, and I think pull or push model parsers should be more suitable, because I only want to show the latest 20 news items and it isn't necessary to read the entire RSS for that. Is NanoXML the most common parser, or is there any other? I've had problems trying others, so could someone suggest one for this purpose? Why doesn't the MHP middleware include a library for this?

    Greetings
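
To illustrate the pull model mentioned above: with a pull parser the application asks for the next event, so it can simply stop asking once it has enough items. The sketch below uses StAX (`javax.xml.stream`) from Java SE purely as an illustration; it is not part of MHP, and kXML's pull API uses different names. The class name `FirstTitles` is hypothetical.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: pulls events only until enough item titles are collected.
class FirstTitles {

    /** Returns at most maxItems titles taken from <item> elements. */
    static List<String> read(Reader in, int maxItems) throws Exception {
        List<String> titles = new ArrayList<String>();
        XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(in);
        boolean inItem = false;
        try {
            // The loop ends as soon as the limit is reached; the rest is never read.
            while (titles.size() < maxItems && xml.hasNext()) {
                int event = xml.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = xml.getLocalName();
                    if ("item".equals(name)) {
                        inItem = true;
                    } else if (inItem && "title".equals(name)) {
                        titles.add(xml.getElementText().trim());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "item".equals(xml.getLocalName())) {
                    inItem = false;
                }
            }
        } finally {
            xml.close();
        }
        return titles;
    }
}
```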
  • 3. Re: Parsing XML
    843851 Newbie
    I think kXML has pull-mode parsing. I've never tested it, so if you do, please tell us about the loading times and the sizes of the classes.

    I suppose MHP implementations leave out XML parsers to save space and CPU. NanoXML is the most common because it's the smallest parser you can find. But if someone knows of a smaller one, please tell!

    PS: apenz, are you Spanish? Me too ;)
  • 4. Re: Parsing XML
    843851 Newbie
    I have been working on XML parsing for a long time, and I have tried both the NanoXML and kXML parsers. These parsers are fairly large. I am currently using a parser that is just 20 KB in size and works fine. You can get it by downloading the MHP 1.2 stubs from the following link.

    http://www.code4tv.com/res/stubs/code4tv-MHP-1.1.2-stubs-v1.1.rar

    Just extract the archive, take the j2me_xml_ri.jar file, and add it to the classpath of your application. There may be some problems using it, but it is easy to use. If you have any problem with this parser, ask me.
    Cheers!
  • 5. Re: Parsing XML
    843851 Newbie
    Hi,

    Thank you very much for your answers @Buddhdev.Sahu and @Zuarko.

    @Zuarko I've tested kXML and the newer version of this library, kXML 2 (this is the forum thread: http://forums.sun.com/thread.jspa?threadID=5378272 ; I haven't received any answer, and there isn't any documentation), but I get an unresolved exception with both, although for kXML I just followed the sample code from the link I posted above, adapting it to the MHP environment. To answer your final question, I'm from the Basque Country, more exactly from San Sebastian, so of course I speak Spanish. From what I've seen, a lot of us from Spain are working on topics related to iTV, MHP, etc. I hope we can help each other create good interactive applications for television, although I don't know when home users will be able to enjoy them, since set-top boxes that support MHP are not currently on sale. Regards.

    @Buddhdev.Sahu thank you very much. I will try j2me_xml_ri as you say, and I hope I finally succeed with it. From your nickname I think you're not Spanish, but the link you pointed to belongs to a Catalan company, so am I wrong?

    Cheers

    Edited by: apenz on Apr 21, 2009 7:59 AM
  • 6. Re: Parsing XML
    843851 Newbie
    Hi!
    Thanks for answering back. I am from India and I work with MHP technology. The parser I linked is based on SAXParser. To use it, you need to extend DefaultHandler or implement DocumentHandler, create an instance of SAXParser, and pass it the XML FileReader object and the handler object. If you have any problem parsing with the parser API, you can contact me at sahumail.sahu@gmail.com; I am online at that address most of the time.
    Good luck!
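
The steps described above (subclass DefaultHandler, obtain a SAXParser, pass it the input and the handler) might look like the sketch below. It uses the standard JAXP packages `javax.xml.parsers` and `org.xml.sax`; whether j2me_xml_ri.jar exposes exactly the same classes is an assumption. The class name `ElementCounter` is made up for the example.

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts how many <item> elements the parser reports.
class ElementCounter extends DefaultHandler {
    private int items = 0;

    // Called by the parser once for every opening tag.
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        if ("item".equals(qName)) {
            items++;
        }
    }

    int getItems() {
        return items;
    }

    /** Parses the stream with a SAX parser and returns the <item> count. */
    static int countItems(InputStream in) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        ElementCounter handler = new ElementCounter();
        parser.parse(in, handler);   // events reach the handler while the input is read
        return handler.getItems();
    }
}
```

Because the handler only keeps a counter, memory use does not grow with the size of the feed, which is the main difference from a tree-building parser.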
  • 7. Re: Parsing XML
    843851 Newbie
    Hi @Buddhdev.Sahu and others,

    Following @Buddhdev.Sahu's advice, I'm using the j2me_xml_ri library to parse an RSS file. But when the parser reaches a blank line, I get this error: "SAXException= Unexpected end of file after pubDate".

    The function that parses the RSS file:
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            inpStrm = new FileInputStream(xmlFile);
            parser = factory.newSAXParser();
            handler = new SAXHandler();
            Logger.debug(this, "parseNewsFile(): before parse()");
            Logger.debug(this, "parseNewsFile(): isValidating= " + parser.isValidating());
            parser.parse(inpStrm, handler);
            news = handler.getValues();
        } catch (ParserConfigurationException parserConfExcept) {
            Logger.error(this, "parseNewsFile(): ParserConfigurationException= " + parserConfExcept.getMessage());
        } catch (SAXException SAXExcept) {
            Logger.error(this, "parseNewsFile(): SAXException= " + SAXExcept.getMessage());
        }
    SAXHandler code:
        public void startElement(String uri, String localName,
                                 String qName, Attributes attrs)
                throws SAXParseException {
            Logger.trace(this, "startElement(): TAG= " + tag);
            tag = qName;
        }

        // Value
        public void characters(char[] ch, int start, int length)
                throws SAXParseException {
            Logger.trace(this, "characters(): VALUE= " + value);
            value = new String(ch, start, length);
        }

        // Closing tag
        public void endElement(String uri, String localName,
                               String qName, Attributes attrs)
                throws SAXParseException {

            Logger.trace(this, "endElement(): uri= " + uri + ", localName= " + localName + ", qname= " + qName + ", attributes= " + attrs.toString());
            if (qName.equals("channel")) {
                return;
            }
            if (qName.equals("item")) {
                Logger.trace(this, "endElement(): title");
                currentTitle = qName;
            } else if (qName.equals("link")) {
                Logger.trace(this, "endElement(): link");
                currentLink = qName;
            } else if (qName.equals("description")) {
                Logger.trace(this, "endElement(): description");
                currentDescpt = qName;
            } else if (qName.equals("pubDate")) {
                Logger.trace(this, "endElement(): pubDate");
                currentPubDate = qName;
            } else if (qName.equals("item")) {
                StringTokenizer hourTokenizer;
                int dayMonth, monthNumb, year, hour, min, seg;
                Date newDate;
                Logger.trace(this, "endElement(): item");
                StringTokenizer strTokenizer = new StringTokenizer(currentPubDate);
                strTokenizer.nextToken();
                dayMonth = Integer.parseInt(strTokenizer.nextToken());
                monthNumb = 0;
                String month = strTokenizer.nextToken();
                for (monthNumb = 0; monthNumb < monthsEn.length; monthNumb++) {
                    if (monthsEn[monthNumb].equals(month)) {
                        break;
                    }
                }
                year = Integer.parseInt(strTokenizer.nextToken());
                hourTokenizer = new StringTokenizer(strTokenizer.nextToken());
                Logger.debug(this, "parseNewsFile(): before getting hour");
                hour = Integer.parseInt(hourTokenizer.nextToken(":"));
                min = Integer.parseInt(hourTokenizer.nextToken(":"));
                seg = Integer.parseInt(hourTokenizer.nextToken(":"));
                auxCal.set(year, monthNumb, dayMonth, hour, min, seg);
                newDate = auxCal.getTime();
                currentNew = new New(currentTitle, currentLink, currentDescpt, newDate);
                news.add(currentNew);
            }
        }
    It doesn't even reach the endElement function.

    I get the RSS from this link: http://www.gipuzkoa.net/rss/gnet_es.xml


    How can I solve it?

    thanks in advance,
    Greetings

    Edited by: apenz on Apr 28, 2009 8:48 AM
  • 8. Re: Parsing XML
    843851 Newbie
    Hi Apenz,
    I have read your code and it looks fine. Maybe you have some small mistake, or maybe it is a memory problem. Here is the basic code I tried; it gets the content from the RSS XML file. Try it out, and if it runs fine on your system, then look for the mistake in your own code.
        SAXParser parser = null;
        try {
            parser = SAXParserFactory.newInstance().newSAXParser();
            System.out.println("validate: " + parser.isValidating());
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (FactoryConfigurationError e) {
            e.printStackTrace();
        }
        SAXHandler handler = new SAXHandler();
        try {
            parser.parse(new InputSource(new URL("http://www.gipuzkoa.net/rss/gnet_es.xml").openStream()), handler);
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    In the handler class I am just printing the content of the XML file. It prints all the content correctly.
        public void startElement(String uri, String localName, String qName, Attributes attribs) throws SAXException {
            System.out.print("<" + qName + " " + attribs + ">");
        }

        public void characters(char[] ch, int start, int length) throws SAXException {
            System.out.print(new String(ch, start, length).trim());
        }

        public void endElement(String uri, String localName, String qName) throws SAXException {
            System.out.println("</" + qName + ">");
        }
    In your handler class, instead of comparing the tag like this:
        if (qName.equals("item")) {
            // some code
        }
    try comparing it like this:
        if ("item".equals(qName.trim())) {
            // some code
        }
    Sometimes qName contains extra characters such as spaces and line breaks, so trimming it before the comparison removes them.
    Try the code above and let me know the output. If you still have a problem, you can send me your code by mail and I will try to solve it (sahumail.sahu@gmail.com).
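
For comparison between the two handlers posted in this thread, here is a sketch of an RSS handler written against the standard SAX API. Two details are standard SAX behaviour rather than assumptions: `DefaultHandler.endElement` takes three parameters (no `Attributes`), so a four-parameter method is a separate overload that the parser never calls, and `characters()` may be invoked several times for one text node, so the text is appended to a buffer. The class name `TitleHandler` is hypothetical.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of every <title> inside an <item>.
class TitleHandler extends DefaultHandler {
    private final List<String> titles = new ArrayList<String>();
    private final StringBuffer text = new StringBuffer();
    private boolean inItem = false;

    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        if ("item".equals(qName)) {
            inItem = true;
        }
        text.setLength(0);                // start collecting text for this element
    }

    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);   // may be called more than once per text node
    }

    // Three parameters: this is the signature the parser actually calls.
    public void endElement(String uri, String localName, String qName) {
        if (inItem && "title".equals(qName)) {
            titles.add(text.toString().trim());
        } else if ("item".equals(qName)) {
            inItem = false;
        }
    }

    /** Parses the stream and returns the item titles in document order. */
    static List<String> parseTitles(InputStream in) throws Exception {
        TitleHandler handler = new TitleHandler();
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler.titles;
    }
}
```

The value is read in `endElement`, once the element's text is complete, rather than copied from `qName`, which only holds the tag name.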
  • 9. Re: Parsing XML
    843851 Newbie
    Hi Buddhdev.Sahu ,

    Thank you very much for your help; I have finally succeeded in parsing the RSS.
    But one last thing: do you know how to stop the parser after it has read the latest 20 items?

    Thanks,
    Greetings

    Edited by: apenz on Apr 30, 2009 7:01 AM
  • 10. Re: Parsing XML
    843851 Newbie
    Hi!
    To read a limited number of items from the XML file, you can use either of the following methods, both of which I have tried. In the first, check whether you have read enough items, throw an exception, and handle it where you call the parser.
        public void startElement(String uri, String localName, String qName, Attributes attribs)
                throws SAXException {
            if ("item".equals(qName.trim())) {
                if (tempItemNo >= noofItems)
                    throw new SAXException("limit reached exception.");
                else {
                    tempItemNo += 1;
                    System.out.print(tempItemNo + "<" + qName + " " + attribs + ">");
                }
            }
        }
    OR
    just store as much content as you want, and the rest will be ignored:
        if (tempItemNo < noofItems) {
            // handle xml content
            tempItemNo += 1;
        } else {
            // do nothing
        }
    Greetings
    Sahu

    Edited by: Buddhdev.Sahu on 30 Apr, 2009 4:07 PM
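
A sketch of the first option, including the calling side that the post leaves out. It assumes a flag on the handler (hypothetical field `limitReached`) so that the caller can tell the deliberate early stop apart from a genuine parse error; the class name `LimitedItemHandler` is also hypothetical.

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts <item> elements and aborts the parse at a limit.
class LimitedItemHandler extends DefaultHandler {
    private final int maxItems;
    private int itemsSeen = 0;
    private boolean limitReached = false;

    LimitedItemHandler(int maxItems) {
        this.maxItems = maxItems;
    }

    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) throws SAXException {
        if ("item".equals(qName)) {
            if (itemsSeen >= maxItems) {
                limitReached = true;
                throw new SAXException("item limit reached");   // aborts parse()
            }
            itemsSeen++;
        }
    }

    /** Reads at most maxItems items and returns how many were seen. */
    static int readItems(InputStream in, int maxItems) throws Exception {
        LimitedItemHandler handler = new LimitedItemHandler(maxItems);
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (SAXException e) {
            if (!handler.limitReached) {
                throw e;   // a real parse error, not the deliberate stop
            }
        }
        return handler.itemsSeen;
    }
}
```

With the second option the parser still walks the whole document, so on a large feed the early stop saves the time spent reading everything after the last wanted item.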
  • 11. Re: Parsing XML
    843851 Newbie
    I have taken the second option, but I think the first must be more efficient.
    Thank you very much,
    Cheers
  • 12. Re: Parsing XML
    843851 Newbie
    Hi apenz,

    So how about the speed when using the second option? Is it a lot faster than NanoXML?
  • 13. Re: Parsing XML
    843851 Newbie
    Hi nedved.yang,

    Yes, either of the two options is faster than NanoXML, because NanoXML builds the entire XML tree, and I think that is not advisable for resource-limited devices such as an STB. There can also be a small difference between the two options: the one that ignores the extra items still goes through the whole document, which in my case (an RSS feed) is quite large, while the one that throws an exception stops once the specified number of items has been parsed. After trying the exception option, I'm now using another approach: when the maximum number of parsed items is reached, I raise an event instead of throwing an exception, because I think exceptions are better reserved for errors.

    Greetings