1 2 Previous Next 27 Replies Latest reply on Sep 5, 2014 9:00 AM by chris227 Go to original post
      • 15. Re: Querying CLOB column (with XML content)
        chris227
        1. SELECT X.* 
        2. FROM  t 
        3.     , xmltable('(//DomainName[1])[1]' 
        4.         PASSING xmlparse(document t.XML_BLOB) 
        5.         COLUMNS DOMAINNAME    VARCHAR2(15)  PATH '.' 
        6.       ) X 
        7. where t.no=317663815 
        • 16. Re: Querying CLOB column (with XML content)
          chris227

          Adiitional your reuirement to omit domains with public

           

          2729533 wrote:

           

          Your 1st sql returln all domainname (p099 and public - seems searches all node in one xml). I only need 1st domain name (and omit domains with PUBLIC)


          1. SELECT X.*  
          2. FROM   t 
          3.      , xmltable('(//*[DomainName!="PUBLIC"]/DomainName)[1]'  
          4.          PASSING xmlparse(document t.XML_BLOB)  
          5.          COLUMNS DOMAINNAME    VARCHAR2(15)  PATH '.'  
          6.       ) X  
          7. where t.no=317663815 
          • 17. Re: Querying CLOB column (with XML content)
            chris227

            Sorry, just saw that you gave a neater xpath expression already.

             

            But why do you think it could be non-deterministic.

            Until now i thought that every xml-document has an intrinsic order given by the order of the tags itself.

            Isnt that what for example you expect from the order sax-parsers parse xml-documents?

            I would expect the same from the xpath expression in this case.

            I am wrong?


            Regards

            • 18. Re: Querying CLOB column (with XML content)
              padders

              Perhaps 'non-deterministic' isn't quite the right word.

               

              I see no guarantee that the sub-nodes/siblings will come in the same order. If two siblings can contain different DomainName and can come in any order then just picking the first DomainName isn't consistent.

               

              If the document can only ever contain 'PUBLIC' and one other DomainName anywhere then I guess this doesn't matter.

              • 19. Re: Querying CLOB column (with XML content)
                chris227

                I am not convinced yet ;-)

                 

                padders wrote:

                 

                If two siblings can contain different DomainName and can come in any order

                Isn't there at least order given by the occurance in the document? (In other words, the order the nodes appear in the xml-serialization of the xml-document).

                Let's take a look at the specs:

                 

                XQuery 1.0 and XPath 2.0 Data Model (XDM) (SecondEdition)

                 

                "Within a tree, document order satisfies the following constraints:

                1. The root node is the first node.
                2. Every node occurs before all of its children and descendants.
                3. Namespace Nodes immediately follow the Element Node with which they are associated. The relative order of Namespace Nodes is stable but implementation-dependent.
                4. Attribute Nodes immediately follow the Namespace Nodes of the element with which they are associated. If there are no Namespace Nodes associated with a given element, then the Attribute Nodes associated with that element immediately follow the element. The relative order of Attribute Nodes is stable but implementation-dependent.
                5. The relative order of siblings is the order in which they occur in the children property of their parent node.
                6. Children and descendants occur before following siblings."

                 

                The xml-document being subject in this thrread has, as far as i understand only one main tree, since there is exactly one root element.

                 

                So the follwoing might not be applicable to this kind of documents (but to be honest, i am not 100% sure)

                 

                "The relative order of nodes in distinct trees is stable but implementation-dependent, subject to the following constraint: If any node in a given tree, T1, occurs before any node in a different tree, T2, then all nodes inT1 are before all nodes in T2."

                 

                Therefore my conclusion would be (in simple words):

                The order of the "real" nodes (children, parents, root, siblings, descandents) is determinstic.

                The order of other snodes like attributes, namespace-nodes e.g. is deterministic but implementation dependent.

                 

                Am i wrong?

                • 20. Re: Querying CLOB column (with XML content)
                  padders

                  I have to say I was/am not 100% sure either but merely expressing concern.

                   

                  It still seems to me that (for example) under ShipmentHeader there are multiple (three) InvolvedParty nodes which contain DomainName. If these nodes can come in any order and can have different DomainName(s) then picking the first would not necessarily be correct.

                   

                  More generally without seeing the XSD we cannot be sure what nodes can occur more than once (or what nodes are not even present in this example), in which case we cannot be sure that we are not in the situation above.

                  • 21. Re: Re: Re: Querying CLOB column (with XML content)
                    odie_63

                    Your 1st sql returln all domainname (p099 and public - seems searches all node in one xml).

                    Yes, sorry, untested and too quickly posted example...

                     

                    I meant something like this (amended to match the additional requirement) :

                    /descendant::DomainName[.!="PUBLIC"][1]

                     

                    // is a short for descendant-or-self axis actually and //DomainName[1] doesn't select only one node but all descendants being the first DomainName child of its parent.

                     

                    I agree with padders, you have to be aware of some business rules you didn't tell us so that you can safely keep the first non-PUBLIC occurrence, otherwise it doesn't make much sense.

                    • 22. Re: Querying CLOB column (with XML content)
                      chris227

                      May be we are talking about different things.

                      I only refered to the xml-instance given in this thread.

                      And for this i think DomainName[1] will be deterministic.

                       

                      It's obvious that DomainName[1] might return another node for other xml-instances (although being valid to the same xsd as the former instance); if this particular node is absence for some reason or another one occurs somewhere before.

                       

                      But that will be true for many (relative) path expressions if the xml is not fixed defined.

                       

                      So if you did refer to that obvious thing, than i misunderstood you completely.

                      • 23. Re: Querying CLOB column (with XML content)
                        Veera_V

                        what I needed was if any option other than using DBMS_LOB.SUBSTR (as its very slow but will process 10000 records quicker).

                         

                        But the below sql using xmltable is very slow when I do a count/ when I do a group by based on status.

                         

                        SELECT COUNT(*) FROM (

                        SELECT I_NO,X.*, STATUS

                        FROM  t

                             , xmltable('(//*[DomainName!="PUBLIC"]/DomainName)[1]'

                                 PASSING xmlparse(document t.XML_BLOB)

                                 COLUMNS DOMAINNAME    VARCHAR2(15)  PATH '.'

                              ) X

                        WHERE STATUS  IN ('FRESH', 'REPRO','PROC')

                             AND INSERT_DATE BETWEEN TO_DATE('04-SEP-2014 02:00:00','DD-MON-YYYY HH24:MI:SS') AND TO_DATE('04-SEP-2014 02:30:00','DD-MON-YYYY HH24:MI:SS')

                        )

                        • 24. Re: Re: Querying CLOB column (with XML content)
                          odie_63

                          2729533 wrote:

                          But the below sql using xmltable is very slow when I do a count/ when I do a group by based on status.

                           

                          SELECT COUNT(*) FROM (

                          SELECT I_NO,X.*, STATUS

                          FROM  t

                              , xmltable('(//*[DomainName!="PUBLIC"]/DomainName)[1]'

                                  PASSING xmlparse(document t.XML_BLOB)

                                  COLUMNS DOMAINNAME    VARCHAR2(15)  PATH '.'

                                ) X

                          WHERE STATUS  IN ('FRESH', 'REPRO','PROC')

                              AND INSERT_DATE BETWEEN TO_DATE('04-SEP-2014 02:00:00','DD-MON-YYYY HH24:MI:SS') AND TO_DATE('04-SEP-2014 02:30:00','DD-MON-YYYY HH24:MI:SS')

                          )

                          This is slow because you're storing XML in a CLOB instead of XMLType.

                          Furthermore the XQuery expression is unnecessary complicated.

                          Try with the expression padders suggested earlier :

                          (//DomainName[.!="PUBLIC"])[1]


                          If the column is queried very frequently, I strongly suggest you migrate it to XMLTYPE (SECUREFILE BINARY XML) as soon as possible, and all performance issues will be gone.


                          • 25. Re: Querying CLOB column (with XML content)
                            chris227

                            I would do the filtering before querying the xml:

                            SELECT COUNT(*) FROM (

                            select

                              XML_BLOB

                            from t

                            WHERE STATUS  IN ('FRESH', 'REPRO','PROC')

                                 AND INSERT_DATE BETWEEN TO_DATE('04-SEP-2014 02:00:00','DD-MON-YYYY HH24:MI:SS') AND TO_DATE('04-SEP-2014 02:30:00','DD-MON-YYYY HH24:MI:SS')

                            ) t

                            , xmltable('(//DomainName[.!="PUBLIC"])[1]'

                                     PASSING xmlparse(document t.XML_BLOB)

                                     COLUMNS DOMAINNAME    VARCHAR2(15)  PATH '.'

                                  ) X

                            You should also use the simpler expression (//DomainName[.!="PUBLIC"])[1], although i suspect that the inline view is the key point.

                            • 26. Re: Querying CLOB column (with XML content)
                              odie_63

                              I would do the filtering before querying the xml

                              Yes, I've seen that before. For some reasons (unknown cardinalities?), the CBO may perform the XMLTable join first and the filter predicate later.

                              An explain plan should confirm that.

                               

                              If that is what happens here, a NO_MERGE hint might help too.

                               

                              Veera_V,

                              Could you post the explain plan of the query, including access and predicate filters?

                              • 27. Re: Querying CLOB column (with XML content)
                                chris227

                                odie_63 wrote:

                                 

                                I would do the filtering before querying the xml

                                Yes, I've seen that before. For some reasons (unknown cardinalities?), the CBO may perform the XMLTable join first and the filter predicate later.

                                An explain plan should confirm that.

                                 

                                If that is what happens here, a NO_MERGE hint might help too.

                                Absolutly! That's possible, thanks for remembering me. CBO will always take some constant guess on cardinality of the xquery if i remember correctly.

                                In my simple test case it does a nested loop driven by the inline view.

                                1 2 Previous Next