7 Replies Latest reply: Jul 31, 2012 8:11 PM by Brian Tkatch RSS

    Help with XQueriyng an AWS XML file

    Brian Tkatch
      --Oracle Database 11g Express Edition Release 11.2.0.2.0 - Beta

      I am trying to query an XML file returned from Amazon AWS. I'm learning this slowly, and i think examples would really help . I'm reading and looking through the documentation " [Using XQuery with Oracle XML DB|http://docs.oracle.com/cd/E11882_01/appdev.112/e23094/xdb_xquery.htm] ".

      I made an batch of 2 ItemLookup requests to AWS, which returned one item for the first lookup, and two for the second, which i think is a good example. I used a bogus account id (it's used for tracking) and a key pair which i deleted after using it, so it is the actual reply. It is too large to be passed as a literal, so i put it on pastebin. It shows a valid request. There are two levels of validity, one is no error, the other IsValid. Had there been an error, like a missing parameter, the following would be between the </Arguments> and <RequestProcessingTime> tags:
          <Errors>
            <Error>
              <Code>AWS.MissingParameters</Code>
              <Message>Your request is missing required parameters. Required parameters include ItemId.</Message>
            </Error>
          </Errors>
      Meaning, first the XML document must be checked for the Errors tag; second, the IsValid element must be True; third, actual data can be perused. The Errors tag mean there may be multiple errors. The IsValid is per Items (not Item). Then individual details may be grabbed, such as ListPrice.

      I'm guessing XQuery is the right way to go here, using XMLTABLE to create records to be queried. Here is what i have so far:
      SELECT
           *
      FROM
           XMLTABLE
           (
            XMLNAMESPACES(DEFAULT 'http://webservices.amazon.com/AWSECommerceService/2011-08-01'),
            '/ItemLookupResponse/OperationRequest/Errors/Error,
           for $Error in /ItemLookupResponse/OperationRequest/Errors/Error
             return
                <Error>
                {
                 $Error/Code,
                 $Error/Message
                }
                </Error>,
            for $Item in /ItemLookupResponse/Items
             return
                <Item>
                {
                 $Item/Request/IsValid
                }
                </Item>'
            PASSING
                (code to get XML document: e.g. ItemLookup('036500101794', 'UPC', 'OfficeProducts', '5011363525517', 'EAN', 'Toys'))
            COLUMNS
                Id FOR ORDINALITY,
                Error_Code     VARCHAR2(10) PATH 'Code',
                Error_Message     VARCHAR2(10) PATH 'Message',
                IsValid          VARCHAR2(05) PATH 'IsValid'
           );
      Of which the response is:
              ID ERROR_CODE ERROR_MESS ISVAL
      ---------- ---------- ---------- -----
               1                       True
               2                       True
      Ostensibly, there are no errors, so the first record is the IsValid element for the first Items tag, and the second for the second. Now, i need to loop inside each Items for all of its Item tags.

      1) Is this a good approach. I ask because this is really my first XQuery.
      2) Is the for loop for Errors good? I am assuming it will only have a record if there is an error.
      3) Do i nest for loops to get each Item in each Items?
      4) How do i know which Items is being used when in the (sub) Item tag?

      I would appreciate any help. It just hasn't "clicked" yet, and i am having a hard time knowing what to do.
        • 1. Re: Help with XQueriyng an AWS XML file
          odie_63
          Hi Brian,
          1) Is this a good approach. I ask because this is really my first XQuery.
          Yes, it's a good start.

          My first suggestion would be to store the XML response file in an XMLType table (could be a GTT if you want), and query from it. It'll greatly boost XQuery evaluation.
          CREATE TABLE tmp_xml OF XMLType;
          Since you're on 11.2.0.2, the default storage will be BINARY XML, which is optimized to work with XQuery.

          2) Is the for loop for Errors good? I am assuming it will only have a record if there is an error.
          Actually, you're doing the same thing twice :
          /ItemLookupResponse/OperationRequest/Errors/Error,
          for $Error in /ItemLookupResponse/OperationRequest/Errors/Error
          return
           ...
          You could also simplify it with the node-set union operator | :
          SQL> SELECT x.*
            2  FROM tmp_xml t
            3     , XMLTable (
            4         XMLNamespaces(default 'http://webservices.amazon.com/AWSECommerceService/2011-08-01')
            5       , '/ItemLookupResponse/OperationRequest/Errors/Error
            6        | /ItemLookupResponse/Items/Request'
            7       PASSING t.object_value
            8       COLUMNS
            9         Id             FOR ORDINALITY
           10       , Error_Code     VARCHAR2(10) PATH 'Code'
           11       , Error_Message  VARCHAR2(10) PATH 'Message'
           12       , IsValid                VARCHAR2(05) PATH 'IsValid'
           13           ) x
           14  ;
           
                  ID ERROR_CODE ERROR_MESSAGE ISVALID
          ---------- ---------- ------------- -------
                   1                          True
                   2                          True
           
          3) Do i nest for loops to get each Item in each Items?
          4) How do i know which Items is being used when in the (sub) Item tag?
          The usual approach is :
          SQL> SELECT x1.IdType
            2       , x1.ItemId
            3       , x2.ASIN
            4       , x2.Binding
            5       , x2.Brand
            6  FROM tmp_xml t
            7     , XMLTable (
            8         XMLNamespaces(default 'http://webservices.amazon.com/AWSECommerceService/2011-08-01')
            9       , '/ItemLookupResponse/Items'
           10         PASSING t.object_value
           11         COLUMNS
           12           IdType         varchar2(3)  path 'Request/ItemLookupRequest/IdType'
           13         , ItemId         varchar2(12) path 'Request/ItemLookupRequest/ItemId'
           14         , ItemList       xmltype      path 'Item'
           15           ) x1
           16     , XMLTable(
           17         XMLNamespaces(default 'http://webservices.amazon.com/AWSECommerceService/2011-08-01')
           18       , '/Item'
           19         PASSING x1.ItemList
           20         COLUMNS
           21           ASIN           varchar2(10) path 'ASIN'
           22         , Binding        varchar2(30) path 'ItemAttributes/Binding'
           23         , Brand          varchar2(30) path 'ItemAttributes/Brand'
           24       ) x2
           25  ;
           
          IDTYPE ITEMID       ASIN       BINDING                        BRAND
          ------ ------------ ---------- ------------------------------ ------------------------------
          UPC    036500101794 B004WL0L9S Office Product                 Georgia-Pacific
          EAN    501136352551 B0042ET8OO Toy                            Mattel
          EAN    501136352551 B00004TQMQ Toy                            Mattel
           
          <Items> elements are extracted first (x1), then the corresponding collection of <Item> elements is passed to a second XMLTable (x2) to be processed in the same manner.
          That way, the parent/child relationship is preserved.
          • 2. Re: Help with XQueriyng an AWS XML file
            Brian Tkatch
            Wow, thanx! That was very helpful. I didn't even realize one query could return an XMLTYPE for the second one to work on. That's pretty neat. And the set union operator, i didn't even see that. I guess there's a lot to read.

            So, now i'm confused (other than before..) when would i use a for? In my original tests, whenever i tried pull up a repeated element, i received an error, which is why i used the for loop. I see the query works, i'm just not "getting it" (yet).

            Should i converge the two queries--for error and for data--? They are separate in your example.

            As for storing it, thanx for the tip. While i am currently running it inline, it is for testing. Ultimately, it will be put to the speed test, and would likely end up in a TABLE (or AQ) to keep at a rate of one per second. Why is there such a difference?
            Actually, you're doing the same thing twice :
            I did it twice, because without the first line (which can be anything) the for loop failed. I do not know why.
            • 3. Re: Help with XQueriyng an AWS XML file
              odie_63
              So, now i'm confused (other than before..) when would i use a for?
              To quote from the documentation :
              XQuery Is About Sequences
              In particular, this :
              /ItemLookupResponse/OperationRequest/Errors/Error
              is equivalent to :
              for $i in /ItemLookupResponse/OperationRequest/Errors/Error
              return $i
              And this (in case it would be possible to have <Errors> and <Items> at the same time) :
              /ItemLookupResponse/OperationRequest/Errors/Error  | /ItemLookupResponse/Items/Request
              ultimately resolves to this sequence :
              <Error> ... </Error>
              <Error> ... </Error>
              <Error> ... </Error>
              ...
              <Request> ... </Request>
              <Request> ... </Request>
              <Request> ... </Request>
              Each item of the sequence is then processed as a relational row.

              Should i converge the two queries--for error and for data--? They are separate in your example.
              I don't know if it's a good idea.
              I guess it depends on how you plan to work with the extracted data.

              As for storing it, thanx for the tip. While i am currently running it inline, it is for testing. Ultimately, it will be put to the speed test, and would likely end up in a TABLE (or AQ) to keep at a rate of one per second. Why is there such a difference?
              When stored as binary XML, the data is available in a post-parsed format. Oracle therefore knows the data better and can perform streaming evaluation (way faster) instead of DOM-based processing.

              BTW, is there an XML schema available for AWS response?
              • 4. Re: Help with XQueriyng an AWS XML file
                Brian Tkatch
                Thanx. that helps.

                I do not know if there is an XML schema (pardon my ignorance here), but the programming guide refers to a WSDL: http://docs.amazonwebservices.com/AWSECommerceService/latest/DG/UnderstandingResponses.html

                I don't know if it's a good idea.
                I guess it depends on how you plan to work with the extracted data.
                The extracted data is to be stored with the item searched for. The actual XML file will likely be removed due to space constraints. That's a decision to be made later.

                If there is an error, i want to record it. If there is no error, i want to store the data. So, my first idea would be one query.
                When stored as binary XML, the data is available in a post-parsed format. Oracle therefore knows the data better and can perform streaming evaluation (way faster) instead of DOM-based processing.
                Each XML file will probably be queried just once, then tossed. If stored, it would be to shift the processing to another process, so the main process can keep to a faster rate.
                • 5. Re: Help with XQueriyng an AWS XML file
                  Brian Tkatch
                  Here's the XSD: http://webservices.amazon.com/AWSECommerceService/AWSECommerceService.xsd
                  • 6. Re: Help with XQueriyng an AWS XML file
                    Brian Tkatch
                    OK, so going further, writing one query. This returns any errors (and nothing else) if there are errors, otherwise, it returns one records for each Items tag.

                    The problem is, the XML file has two Items tags. But the first Items tag has one Item tag, and the second Items tag has two Item tags. This query returns two records instead of three. Also, how do i know which Item "belongs" to which Items?
                    COLUMN Code FORMAT A40
                    COLUMN Message FORMAT A50 WORD_WRAPPED
                    
                    WITH
                         XML(Document)
                    AS
                         (
                          SELECT
                              Amazon_PAPI.Get_Response(Amazon_PAPI.ItemLookup('036500101794', 'UPC', 'OfficeProducts', '5011363525517', 'EAN', 'Toys'))
                          FROM
                              Dual
                         )
                    SELECT
                         Error.Code,
                         Error.Message,
                         Items.Id,
                         Items.Isvalid,
                         Item.Id     Item_Id,
                         Item.ASIN
                    FROM
                         XMLTABLE
                         (
                          XMLNAMESPACES(DEFAULT 'http://webservices.amazon.com/AWSECommerceService/2011-08-01'),
                          '/ItemLookupResponse/OperationRequest/Errors/Error'
                          PASSING
                              (SELECT Document FROM XML)
                          COLUMNS
                              Code     VARCHAR2(0050) PATH 'Code',
                              Message     VARCHAR2(4000) PATH 'Message'
                         ) Error
                    FULL JOIN
                         XMLTABLE
                         (
                          XMLNAMESPACES(DEFAULT 'http://webservices.amazon.com/AWSECommerceService/2011-08-01'),
                          '/ItemLookupResponse/Items'
                          PASSING
                              (SELECT Document FROM XML)
                          COLUMNS
                              Id          FOR ORDINALITY,
                              IsValid          VARCHAR2(005)     PATH 'Request/IsValid',
                              Item          XMLTYPE          PATH 'Item'
                         ) Items
                     ON
                         Items.IsValid = Error.Code
                    FULL JOIN
                         XMLTABLE
                         (
                          XMLNAMESPACES(DEFAULT 'http://webservices.amazon.com/AWSECommerceService/2011-08-01'),
                          '/Item'
                          PASSING
                              Items.Item
                          COLUMNS
                              Id                    FOR ORDINALITY,
                              ASIN                    VARCHAR2(0010)     PATH 'ASIN'
                         ) Item
                     ON
                         Item.Id = Items.Id
                    ORDER BY
                         Items.Id,
                         Item.Id;
                    Results in:
                    CODE                                     MESSAGE                                                    ID ISVAL    ITEM_ID ASIN
                    ---------------------------------------- -------------------------------------------------- ---------- ----- ---------- ----------
                                                                                                                         1 True       1 B004WL0L9S
                                                                                                                         2 True       2 B00004TQMQ
                    I wanted the result something like:
                    CODE                                     MESSAGE                                                    ID ISVAL    ITEM_ID ASIN
                    ---------------------------------------- -------------------------------------------------- ---------- ----- ---------- ----------
                                                                                                                         1 True       1 B004WL0L9S
                                                                                                                         2 True       1 B0042ET8OO
                                                                                                                         2 True       2 B00004TQMQ
                    B0042ET8OO is returned in the XML file, but ignored by the query. (The actual query has many more COLUMNs in Item. I removed them for clarity.)
                    • 7. Re: Help with XQueriyng an AWS XML file
                      Brian Tkatch
                      OK, figured it out....The XMLTABLE does the join, so the FULL JOIN is simply not required:

                      Making the final code:
                           WITH
                                XML(Document)
                           AS
                                (
                                 SELECT
                                     Amazon_PAPI.Get_Response(Amazon_PAPI.ItemLookup('036500101794', 'UPC', 'OfficeProducts', '5011363525517', 'EAN', 'Toys'))
                                 FROM
                                     Dual
                                )
                           SELECT
                                Items.Id,
                                Items.Code,
                                Items.Message,
                                Items.Isvalid,
                                Item.Id Item_Id,
                                Item.ASIN
                           FROM
                                XMLTABLE
                                (
                                 XMLNAMESPACES(DEFAULT 'http://webservices.amazon.com/AWSECommerceService/2011-08-01'),
                                 '/ItemLookupResponse/OperationRequest/Errors/Error
                                 | /ItemLookupResponse/Items'
                                 PASSING
                                     (SELECT Document FROM XML)
                                 COLUMNS
                                      Id          FOR ORDINALITY,
                                     Code          VARCHAR2(0050)     PATH 'Code',
                                     Message          VARCHAR2(4000)     PATH 'Message',
                                     IsValid          VARCHAR2(005)     PATH 'Request/IsValid',
                                     Item          XMLTYPE          PATH 'Item'
                                ) Items
                           LEFT JOIN
                                XMLTABLE
                                (
                                 XMLNAMESPACES(DEFAULT 'http://webservices.amazon.com/AWSECommerceService/2011-08-01'),
                                 '/Item'
                                 PASSING
                                     Items.Item
                                 COLUMNS
                                     Id                    FOR ORDINALITY,
                                     ASIN                    VARCHAR2(0010)     PATH 'ASIN'
                                ) Item
                            ON
                                1 = 1
                           ORDER BY
                                Items.Id,
                                Item.Id;
                      With the result, when successful:
                              ID CODE                                     MESSAGE                                            ISVAL    ITEM_ID ASIN
                      ---------- ---------------------------------------- -------------------------------------------------- ----- ---------- ----------
                               1                                                                                             True       1 B004WL0L9S
                               2                                                                                             True       1 B0042ET8OO
                               2                                                                                             True       2 B00004TQMQ
                      And when not successful:
                              ID CODE                                     MESSAGE                                            ISVAL    ITEM_ID ASIN
                      ---------- ---------------------------------------- -------------------------------------------------- ----- ---------- ----------
                               1 AWS.InvalidEnumeratedParameter           The value you specified for IdType is invalid.
                                                                          Valid values include ['ASIN', 'SKU',
                                                                          'UPC', 'EAN','ISBN'].
                      
                               2 AWS.RestrictedParameterValueCombination  Your request contained a restricted parameter
                                                                          combination.  When IdType equals UPCa, SearchIndex
                                                                          cannot be present.
                      Still looking to test a no-error IsValid = False case though. :)