developers

How to check the data for duplicates in xml

User_CROFU Member Posts: 40 Blue Ribbon
edited Aug 10, 2013 3:51AM in XQuery

Hi all,

I have XML similar to the sample below, and I need an XQuery that drops elements whose data is repeated. For example, in the first <customer>, the <houseno> of the first <address> and the <houseno> of the second <address> are the same; in that case the output XML should contain only one <houseno> element with that value. Please see the input and output XML formats below.

I am able to produce the output XML, but with the same <houseno> repeated. I cannot find a way to check the data and stop the duplicate element from being created in the output.

Could you please suggest how I can do this? It would be a great help. Thanks a ton in advance.

Input XML

<customers>

     <customer>

          <address>

               <houseno>212</houseno>

                <phone>121221</phone>

          </address>

          <address>

               <houseno>212</houseno>

               <phone>42334</phone>             

          </address>

     </customer>

     <customer>

          <address>

               <houseno>3243</houseno>

               <phone>6565</phone>

          </address>

          <address>

               <houseno>3434</houseno>

                <phone>78778</phone>

          </address>

     </customer>

</customers>

Output XML Expected

<customers>

<customer>

          <address>

               <houseno>212</houseno>

                <phone>121221</phone>

                  <phone>42334</phone>              

          </address>

     </customer>

<customer>

          <address>

               <houseno>3243</houseno>

               <phone>6565</phone>

               <houseno>3434</houseno>

                <phone>78778</phone>

          </address>

     </customer>

</customers>

Output XML Which I am getting

<customers>

<customer>

          <address>

               <houseno>212</houseno>

                <houseno>212</houseno>        

                <phone>121221</phone>

                  <phone>42334</phone>              

          </address>

     </customer>

<customer>

          <address>

               <houseno>3243</houseno>

               <phone>6565</phone>

               <houseno>3434</houseno>

                <phone>6565</phone>

          </address>

     </customer>

</customers>

Regards

Answers

  • tsuji
    tsuji Member Posts: 179 Bronze Badge

    First of all, the desired output:

    [quote]

    <customers>

    <customer>

              <address>

                   <houseno>212</houseno>

                    <phone>121221</phone>

                      <phone>42334</phone>              

              </address>

         </customer>

    <customer>

              <address>

                   <houseno>3243</houseno>

                   <phone>6565</phone>

                   <houseno>3434</houseno>

                    <phone>78778</phone>

              </address>

         </customer>

    </customers>

    [/quote]

    I don't think this is a very good choice; it will cause no end of trouble at a later stage of using the data, because nothing but document order ties each <phone> to its <houseno>.

    I would rather propose what I think is a better structure, like this.

    [code]

    <customers>

        <customer>

              <address>

                   <house houseno="212">

                       <phone>121221</phone>

                       <phone>42334</phone>

                   </house>

              </address>

         </customer>

        <customer>

            <address>

                <house houseno="3243">

                    <phone>6565</phone>

                </house>

                <house houseno="3434">

                    <phone>78778</phone>

                </house>

           </address>

         </customer>

    </customers>

    [/code]

    In that case, the following query is capable of producing the regrouped output.

    [code]

    <customers>{

        (: "your_data.xml" is a placeholder for the input document :)

        let $doc:=doc("your_data.xml")

        for $customer in $doc/customers/customer

        return

        <customer>

            <address>{

                (: one <house> per distinct houseno within this customer :)

                for $houseno in distinct-values($customer/address/houseno)

                return

                <house houseno="{$houseno}">{

                    for $phone in $customer/address[houseno=$houseno]/phone

                    return

                    <phone>{data($phone)}</phone>

                }</house>

            }</address>

        </customer>

    }</customers>

    [/code]
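
If the flat layout from the original question is still required, a minimal sketch along the same lines might look like the query below. It is only an illustration, not a tested solution: it assumes the same placeholder file name your_data.xml and a well-formed input in which every <customer> is properly closed. Note also that distinct-values() does not guarantee any particular order, so the order of the <houseno> groups may vary by processor.

```xquery
<customers>{
    (: "your_data.xml" is a placeholder for the input document :)
    for $customer in doc("your_data.xml")/customers/customer
    return
    <customer>
        <address>{
            (: emit each distinct houseno once, followed by all of its phones :)
            for $houseno in distinct-values($customer/address/houseno)
            return (
                <houseno>{$houseno}</houseno>,
                $customer/address[houseno = $houseno]/phone
            )
        }</address>
    </customer>
}</customers>
```

For the sample input, the first customer would yield a single <houseno>212</houseno> followed by both of its <phone> elements, instead of the repeated <houseno>.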

This discussion has been closed.