10 Replies Latest reply on Jun 28, 2020 4:09 PM by Tubby

    How to enriched data from one data source with another?

    steffi

      So within the same database but different schemas I have two sets of tables.

       

      Now I have a view defined  that looks like this

       

      A, B, C, D,

      func('field1',A, B, C, D) E,

      func('field2', A, B, C, D) F,

      func('field3', A, B, C, D) G

       

      func interrogates the second schema and performs queries using data values from the first schema as keys (A, B, C, D) to do effectively a kind of lookup.

       

      I have about 15 of these derived columns where their value is from a lookup from another datasource (different schema, same database)

       

      Now because of inconsistencies in the data these tables just aren't joined because if I did that I'd need a lot of conditional logic to control

      the join depending on the values of A, B, C, D and so instead multiple cursors are sometimes walked in order to find the data to ensure I never have to worry about cartesian products.

       

      It's done this way because the path to finding the data can vary or require different qualifiers so that code looks like this

       

      field value = lookup data using one set of parameters

      if no results

        field value = lookup data using a different set of parameters

          if no results

             field value = lookup data using yet another different set of parameters

       

      This means that each field is essentially a separate set of lookups but effectively all derived from the same row of data just different columns being returned. This is horribly inefficient as I should only need to get this row of data once and then use it to enrich the row of data I got from the first set of tables but there isn't really a join key between the two sets of data.

       

      So, the goal would be to still make use a view but rather than do a separate call for each field I'd rather return multiple fields from the row at the same time.

       

      I'm aware of TABLE views and the notion of parameterized views but what's not clear is  if I join to the table returned how I relate that row data back to the original row

      from the first set of tables which I assume I have to do in order to enrich A with B where values from A determine B and I'm expected to join A with B.

        • 1. Re: How to enriched data from one data source with another?
          John Thorton

          What exactly do you expect & desire after posting the saga above since you provided use no DDL to reproduce your tables, no DML to populate data into any table,

          and you failed to show or tell us what exactly the results that are needed or expected.

           

          Simply put we don't know what you really have or  what the results need to be; let alone the process requirements to transform what you have into what you need.

           

          Please click on URL below & provide details as stated in #5 - #9 inclusive

           

          How do I ask a question on the forums?


          • 2. Re: How to enriched data from one data source with another?
            Stefan Jager

            Well, as John says it's a bit difficult to say anything without more detailed information about your datamodels.

             

            That said: can you modify either both or at least one of the datamodels? If so, you could add definition columns and lookup-ID columns in one  or both of them. That might help you use case instead of a function, thereby turning the views into pure sql. Or use an intermediate table that has identifying info from both data models (which would enable 1-n relations too, if that is desireable). Both options would add more maintenance overhead though to keep everything in synch, which complicates matters.

            If you cannot add or modify either datamodel, then it becomes virtually impossible to say anything that makes sense, we'd need a sample if you can create it.

            • 3. Re: How to enriched data from one data source with another?
              steffi

              Yeah apologies for being as brief as I have been ...

               

              I was hoping what I'm doing is a typical use case/pattern that somebody else can see I could do a better way so I only wanted to conceptually get across what I was doing.

               

              In general though the lookup logic is very arbitrary it just tries to do a number of queries to find the data element in question.

              • 4. Re: How to enriched data from one data source with another?
                Tubby

                steffi wrote:

                 

                Now because of inconsistencies in the data these tables just aren't joined because if I did that I'd need a lot of conditional logic to control

                the join depending on the values of A, B, C, D and so instead multiple cursors are sometimes walked in order to find the data to ensure I never have to worry about cartesian products.

                That portion scares me quite a bit. You should never have inconsistent data, because you should only store data in a single place. Unless perhaps you are working in a data mart or warehouse?

                 

                As others have mentioned it's not entirely clear what you are working with. Perhaps you could spend 10 minutes and mock up a simple example of the data we can see to better understand and better advise you.

                 

                Cheers,

                • 5. Re: How to enriched data from one data source with another?
                  steffi

                  The inconsistencies refer to the fact that different qualifiers are needed to fetch the data from the second data set which is where there's multiple fetch attempts to fetch the data.

                   

                  I can break my problem down to a simpler question.

                   

                  I have a driving table that has (A,B,C,D) columns let's call it TABLE T1

                   

                  and I want to enriched rows from T1 with a table function that returns multiple values (instead of having to have a separate function for each field)

                   

                  how do I use T1 as my driving table in the Select and also have the table function in the FROM clause and still be able to relate the two back again

                  in the join?

                   

                  Am I expected to pass the PK of the row of the driving table as a parameter to the table function and make it available in the Object Type so I can relate the two sets back again?

                   

                   

                  I cannot just have

                   

                  SELECT A, B, C, D from T1, TABLE(SOME_TABLE_FUNCTION(A, B, C, D)) TF

                   

                  as that's a cartesian product.

                  • 6. Re: How to enriched data from one data source with another?
                    Stefan Jager

                    If I'm honest that does not help very much. Multiple values as in multiple columns? Or as in multiple rows?

                     

                    I have worked with situations like this, but they are really data-dependend so I can't just give you the solutions I chose in those situations because they are guaranteed not to work for yours (unless you are actually working for the same clients I worked for at the time...). And in some situations the solution wasn't SQL at all.

                     

                    So a simplified example of BOTH tables, with one or two examples of what matches and why, and why not, would go a long way. It may be you are over-complicating things, or we are discussing an xy-problem. But there's not much more to say without actually seeing some representative data.

                    • 7. Re: How to enriched data from one data source with another?
                      Tubby

                      steffi wrote:

                       

                      The inconsistencies refer to the fact that different qualifiers are needed to fetch the data from the second data set which is where there's multiple fetch attempts to fetch the data.

                       

                      I can break my problem down to a simpler question.

                       

                      I have a driving table that has (A,B,C,D) columns let's call it TABLE T1

                       

                      and I want to enriched rows from T1 with a table function that returns multiple values (instead of having to have a separate function for each field)

                       

                      how do I use T1 as my driving table in the Select and also have the table function in the FROM clause and still be able to relate the two back again

                      in the join?

                       

                      Am I expected to pass the PK of the row of the driving table as a parameter to the table function and make it available in the Object Type so I can relate the two sets back again?

                       

                       

                      I cannot just have

                       

                      SELECT A, B, C, D from T1, TABLE(SOME_TABLE_FUNCTION(A, B, C, D)) TF

                       

                      as that's a cartesian product.

                      Sorry but that doesn't simplify things for me I know you likely think it does because you're been working with the data you have for a long time but to others not familiar with your environment we'll likely need to see some actual examples.

                       

                      To your point of "that's a cartesian product" it's not. You have 1 row in to the table function and X rows out (we have no idea what your proposed table function is doing), it's a basic parent child relationship if anything.

                       

                      Cheers,

                      • 8. Re: How to enriched data from one data source with another?
                        steffi

                        today I have a function that has a parameter that says which field to return and input parameters used as keys for the lookup.

                         

                        if I have 10 fields in my output

                         

                        that's 10 function calls in my select clause.

                         

                        where the values of the input parameters are from the row of T1

                         

                        select a,b,c,d,

                        func(field1, a, b, c, d),

                        func(field2, a, b, c ,d)

                        func(field3, a, b, c, d)

                        ....

                        func(field10, a b, c, d)

                        from T1

                         

                        the func has takes the keys of a, b, c, d and narrowed to the same row of T2 for each field invocation.

                         

                        func(field_param, a, b, c, d)

                        {

                        select field_param from

                        (

                        select field1, field2, field3, field4, field5, field6, field7, field8, field9, field10

                        from T2

                        where some condition derived from field1, field2, field3, field4 ,field5, field6, field7, field8, field9, field10

                        )

                        }

                         

                        today the function returns any field as a varchar2

                         

                        the idea is how can I make this function be more like

                         

                        select table_object(field1, field2, field3, field4, field5, field6, field7, field8, field9, field10)

                        from T2

                        where some condition derived from field1, field2, field3, field4 ,field5, field6, field7, field8, field9, field10

                         

                        where eventually I don't need a separate invocation per field.

                         

                        select a,b,c,d,

                        eT2.field1,

                        eT2.field2,

                        eT2.field3

                        ....

                        eT3.field10

                         

                        from T1, table(func(a, b, c, d)) eT2

                        where T1...... = eT2 ......

                         

                        however there is not direct 1:1 relation between T1 and eT2

                         

                        the only way to relate the two would be to pass something common between T1 and T2 to the function.

                        ......

                         

                        Again, the current implementation uses a function to look up each fields value but never uses join outside of the function so I don't have to worry about how the values relate to each other they just do because what comes back from the function is one value per field related to the keys I passed the function.

                         

                        Soon as you change it into a set problem then it comes more complicated to manage the join but the goal is to have the function return more than one value to limit the number of times it has to be called because this function actually performs N fetches using N cursors depending on the contents of a, b, c, d.

                         

                        Conceptually imagine if my function instead of returning one specified field value today instead returned a string of values instead.

                        crude implementation being concatenated string of values.

                         

                        select a,b,c,d,

                        func(a, b, c, d) all_fields_seperated_by_special_nonoccuring_character_for_a_given_row_of_T1

                        from T1

                         

                        Not unlike what's discussed as a "how not to do it" in

                         

                        http://stevenfeuersteinonplsql.blogspot.com/2015/04/table-functions-returning-complex-non.html

                         

                        from a performance perspective this would still be a significant improvement of what I have right now.

                        • 9. Re: How to enriched data from one data source with another?
                          steffi

                          It looks like I should look into xml or json as that also appears to be way to return multiple column values in one row from a function and still be able to parse it in plain SQL.

                          • 10. Re: How to enriched data from one data source with another?
                            Tubby

                            Thanks for the additional details. Please don't take this offensively, but it sounds like there is something fundamentally flawed with the data model here, do you have the ability to change the data model or are you stuck with what you have? I ask because from what you've described it sounds like this is going to be a performance and maintenance nightmare regardless of what you end up doing.

                             

                            If you're stuck with what you have, please try posting a very small example of the data you're working with (2 columns would be enough to demonstrate the example, along with some data from the table you are currently employing within the function).

                             

                            Cheers,