3 Replies Latest reply: Nov 20, 2012 11:24 PM by Alexus67 RSS

    Best way to get several rows from SCD for each key

    Alexus67
      Hi All!

      We have large (millions of records) Slow changind dimension (SCD) type 2 (see "Creating another dimension record " - http://www.oracle.com/webfolder/technetwork/tutorials/obe/db/10g/r2/owb/owb10gr2_gs/owb/lesson3/slowlychangingdimensions.htm)

      We need to get several rows from this SCD for each key (AGREEMENT_ID) in a SQL query - to join to facts table and get several data points of each agreement (on several different points in time) stored in SCD.

      Here is SCD table structure:

      CREATE TABLE AGREEMENT
      (
      "AGREEMENT_ID" NUMBER(*,0) NOT NULL ENABLE,
      "ACTUAL_DATE" DATE NOT NULL ENABLE,
      "ACTUAL_END_DATE" DATE NOT NULL ENABLE,
      "OPEN_DATE" DATE NOT NULL ENABLE,
      "LIMIT" NUMBER(23,8)
      --++ a lot of other fields not needed for this task ....
      CONSTRAINT "PK_MD_AGREEMENT" PRIMARY KEY ("AGREEMENT_ID", "ACTUAL_DATE") USING INDEX
      )
      ;

      The 1st simple approach would be to join facts to SCD as many (N) times as many different points of time you need - resulting in N Full Table Scans for SCD:

      select ...
      from fact,
      AGREEMENT agr1,
      AGREEMENT agr2,
      AGREEMENT agr3
      where
      fact.AGREEMENT_ID = agr1.AGREEMENT_ID and agr1.open_date between actual_date and actual_end_date
      and fact.AGREEMENT_ID = agr2.AGREEMENT_ID and :dateBOP between actual_date and actual_end_date
      and fact.AGREEMENT_ID = agr3.AGREEMENT_ID and :dateEOP between actual_date and actual_end_date



      2nd approach: 1 Full Table Scan for SCD + group by:

      select ...
      from fact,
      (
      Select AGREEMENT_ID,

      max(case when open_date between actual_date and actual_end_date then LIMIT end) LIMIT_At_Open_DATE,

      max(case when :dateBOP between actual_date and actual_end_date then LIMIT end) LIMIT_At_BeginOfPeriod_DATE,

      max(case when :dateEOP between actual_date and actual_end_date then LIMIT end) LIMIT_At_EndOfPeriod_DATE

      from agreement

      -- ++optionally WHERE for those 3 dates, but possibly with no effect on non-partitioned table? Or WHERE could help to put less data on MAX() input (3 row for each agreement instead of 4...1000 without WHERE?)

      group by AGREEMENT_ID

      ) agr

      where fact.AGREEMENT_ID = agr.AGREEMENT_ID


      Which is better? Why?

      Are there any other approaches discussed and proven?

      Re: Best way to get several rows from SCD for each key [message #571034 is a reply to message #571033] Tue, 20 November 2012 05:41
      Alexus12
      Messages: 2
      Registered: November 2012 Junior Member

      This is DWH.
      Full scan (for both facts and SCD) is the only approach allowed - we have millions of records in both tables, so any index/NL will give hours of calculations instead of minutes.

      Edited by: Alexus67 on 20.11.2012 3:42
        • 1. Re: Best way to get several rows from SCD for each key
          Toon.Koppelaars2-Oracle
          I do not fully understand what it is you are asking.
          But, isn't this query:
          select ...
          from fact,
          AGREEMENT agr1,
          AGREEMENT agr2,
          AGREEMENT agr3
          where fact.AGREEMENT_ID = agr1.AGREEMENT_ID and agr1.open_date between actual_date and actual_end_date
            and fact.AGREEMENT_ID = agr2.AGREEMENT_ID and :dateBOP between actual_date and actual_end_date
            and fact.AGREEMENT_ID = agr3.AGREEMENT_ID and :dateEOP between actual_date and actual_end_date
          Rewritable into / the same as, this query:
          select ...
          from fact,
          AGREEMENT agr1
          where fact.AGREEMENT_ID = agr1.AGREEMENT_ID and agr1.open_date between actual_date and actual_end_date
                                                      and :dateBOP between actual_date and actual_end_date
                                                      and :dateEOP between actual_date and actual_end_date
          • 2. Re: Best way to get several rows from SCD for each key
            Solomon Yakobson
            Toon.Koppelaars2 wrote:
            But, isn't this query rewritable into / the same as, this query:
            I don't think so. It looks like OP wants each match in a separate column. Your query is pretty much what OP would use in his 2nd approach and group by results.

            SY.
            • 3. Re: Best way to get several rows from SCD for each key
              Alexus67
              no, I need 3 points in different periods of time.
              you propose to combine them all in one witn ANDs, while only ORs give the result needed (my 2nd example)