This discussion is archived
28 Replies. Latest reply: Dec 19, 2012 7:18 PM by user13328581
  • 15. Re: creating your own materialized view utlizing tables and a job
    user13328581 Explorer
    Alright... sorry about wasting your time. I will close the thread and mark it as answered.
  • 16. Re: creating your own materialized view utlizing tables and a job
    rp0428 Guru
    >
    However, what is the best way to check if changes have occurred in a table?
    >
    You said you wanted to manually implement a materialized view but you didn't say anything about creating your own log process.

    So create a materialized view log on the table and it will log all of the changes. Then you can try to create your own code to apply those changes to your 'replica' MV.

    That will give you an understanding of what has to be done to keep a table up to date.
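    The log-based approach suggested above might be sketched as follows (a minimal sketch; the table name emp is hypothetical, Oracle syntax assumed):

```sql
-- 1. Create the log: Oracle then records every DML change to EMP
--    in a system-maintained table named MLOG$_EMP.
CREATE MATERIALIZED VIEW LOG ON emp
  WITH PRIMARY KEY, SEQUENCE
  INCLUDING NEW VALUES;

-- 2. Inspect the logged changes (DMLTYPE$$: I = insert, U = update,
--    D = delete) in the order they occurred, and write your own code
--    to apply them to your replica table.
SELECT empno, dmltype$$, old_new$$, sequence$$
FROM   mlog$_emp
ORDER  BY sequence$$;
```

    A real fast-refresh process would also have to purge applied rows from the log, which is the part Oracle normally manages for you.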
  • 17. Re: creating your own materialized view utlizing tables and a job
    BillyVerreynne Oracle ACE
    user13328581 wrote:

    This is just for learning purposes. What if I would like to create my own materialized view utilizing only tables and a job? Is this possible? A simple example would help. Thank you.
    I have my own custom version of materialised views. The reasons are basically performance and management of the materialised object.

    I have lots of data streaming into the database. This data needs to be processed and aggregated. E.g. "raw" data is processed into various hourly aggregated tables. Hourly tables are aggregated into daily tables. Etc.

    In our case, the optimal way to do this aggregation is to use a CTAS (create table as select) to aggregate data into a staging table (one that would, for example, contain the aggregation for a specific hour). The next step is to use a partition exchange to swap that staging table's content into the hourly partitioned table (the "materialised" object).

    The custom PL/SQL interface that does this needs only the following parameters: the SQL select statement that performs the aggregation, the target table name, and the partition name. That's it. It then:
    - uses a (dynamic SQL) CTAS to create the staging table
    - inspects the indexes and constraints of the target partition table
    - applies the same indexes and constraints to the staging table
    - performs the partition exchange
    - drops the staging table
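    A minimal sketch of these steps, with hypothetical names (RAW_SALES as source, AGG_SALES_HOURLY as target, partition P_0500, staging table STAGE_0500), Oracle syntax assumed:

```sql
-- 1. CTAS the aggregation into a staging table (Parallel Query capable).
CREATE TABLE stage_0500 NOLOGGING PARALLEL AS
  SELECT TRUNC(sale_time, 'HH24') AS sale_hour,
         product_id,
         SUM(amount) AS total_amount,
         COUNT(*)    AS sale_count
  FROM   raw_sales
  WHERE  sale_time >= TIMESTAMP '2012-12-06 05:00:00'
  AND    sale_time <  TIMESTAMP '2012-12-06 06:00:00'
  GROUP  BY TRUNC(sale_time, 'HH24'), product_id;

-- 2./3. Here the interface would read the target's indexes and
--       constraints from the data dictionary and create matching
--       ones on STAGE_0500 (elided for brevity).

-- 4. Swap the staging table's segment into the target partition
--    (pure DDL, no row movement).
ALTER TABLE agg_sales_hourly
  EXCHANGE PARTITION p_0500 WITH TABLE stage_0500
  INCLUDING INDEXES WITHOUT VALIDATION;

-- 5. Drop the staging table, which now holds the partition's old data.
DROP TABLE stage_0500 PURGE;
```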

    As no DML is used (only DDL), and as Parallel Query is supported for both the CTAS and the creation of indexes, performance is significantly better than with actual materialised views.

    E.g. 93 seconds to process and aggregate around 30 million rows into a staging table of 9.6 million rows, add the indexes, and then perform a partition swap to replace the existing partition with the new refresh of 9.6 million rows.

    A materialised view using delete and insert DMLs, even if done using parallel DML, would have been a lot slower with a much greater resource footprint.

    So this custom approach of "materialising" aggregation tables works well in our case for the type of data we need to process.
  • 18. Re: creating your own materialized view utlizing tables and a job
    rp0428 Guru
    >
    I have my own custom version of materialised views. The reasons are basically performance and management of the materialised object
    >
    Nothing in what you describe is similar to an MV. You are merely using staging tables the way they have been used for years.
    >
    The custom PL/SQL interface that does this only needs the following parameters. The source code SQL select statement for doing the aggregation. The target table name and partition name. That's it. It then:
    - uses a (dynamic SQL) CTAS to create the staging table
    - inspects the indexes and constraints of the target partition table
    - applies the same indexes and constraints to the staging table
    - performs the partition exchange
    - drops the staging table
    >
    Why are you using CTAS? From what you described I don't see where it is necessary.

    You describe only using the NAME of the target table which means you must already know the structure. So why not just create a permanent table with the correct structure and do a direct-path insert into it?
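    The alternative described here, as a hedged sketch (hypothetical names, Oracle syntax): a pre-created permanent staging table loaded with a direct-path insert instead of being recreated by a CTAS each time.

```sql
-- APPEND requests a direct-path load: data is written above the
-- high-water mark, bypassing conventional buffer-cache inserts.
INSERT /*+ APPEND */ INTO stage_sales
SELECT product_id, SUM(amount), COUNT(*)
FROM   raw_sales
GROUP  BY product_id;
-- A direct-path load must be committed before the session can
-- query the table again.
COMMIT;
```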
    >
    A materialised view using delete and insert DMLs, even if done using parallel DML, would have been a lot slower with a much greater resource footprint
    >
    Yes - MVs would be slower for the use you describe. But that use isn't what MVs were designed for. So using staging tables and NOT using MVs isn't the same thing as creating an MV replacement. MVs are designed to be used over and over again for the problem of slowly changing data.

    You've presented a nice solution for your particular use case, but I don't see what it has to do with the question asked in this thread.
  • 19. Re: creating your own materialized view utlizing tables and a job
    BillyVerreynne Oracle ACE
    rp0428 wrote:
    >
    I have my own custom version of materialised views. The reasons are basically performance and management of the materialised object
    >
    Nothing in what you describe is similar to an MV. You are merely using staging tables the way they have been used for years.
    Not really. A MV is often used in data warehousing to maintain summarised/aggregated data sets, and that is the functionality we need. We used MVs originally for this system. However, we ran into a bunch of issues in terms of usability and performance.
    Why are you using CTAS? From what you described I don't see where it is necessary.
    A MV is defined by a SQL statement. E.g.
    select 
      deptno, avg(salary) as AVG_SAL, count(*) as EMPLOYEES, ...etc
    from emp
    group by deptno
    And this is the SQL statement that needs to be supplied to our customised interface. A MV DDL also defines the name of the MV, which is the target table in our interface call. The only additional parameter is the name of the target partition, as the SQL defines the summarised data set for a specific partition in the table, not the summarised data set for the entire table as a MV's SQL would.
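    For comparison, the MV DDL built on the SQL above would look something like this (a sketch; the MV name is hypothetical):

```sql
CREATE MATERIALIZED VIEW dept_salary_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
  SELECT deptno, AVG(salary) AS avg_sal, COUNT(*) AS employees
  FROM   emp
  GROUP  BY deptno;
```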

    A MV would use this SQL to determine the new data set and replace the existing data set with the new data set (full refresh). It does so using a SQL insert.

    We need to do the exact same steps. But we identify the new data set via a CTAS, and instead of using an insert, we do a partition exchange to refresh the table partition.
    You describe only using the NAME of the target table which means you must already know the structure. So why not just create a permanent table with the correct structure and do a direct-path insert into it?
    Because then you need to first truncate or delete the data in that table (or table partition). And this is the problem with using a MV in our case.

    We can have the MV do a truncate and then insert. The insert may run for 20 minutes, which means that between the truncate and the commit of the insert, there is no data in that partition. That is not acceptable from a reporting perspective.

    We can have the MV do a delete and insert. That means tons more I/O for the delete, an even longer time for the partition (or table) to be refreshed, and significantly more expense. Also not acceptable.
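    The truncate-versus-delete trade-off described above corresponds to the atomic_refresh flag of DBMS_MVIEW.REFRESH (a sketch; the MV name is hypothetical):

```sql
BEGIN
  -- atomic_refresh => FALSE: truncate + direct-path insert. Fast, but
  -- the MV is empty between the truncate and the end of the insert.
  -- atomic_refresh => TRUE: delete + conventional insert in a single
  -- transaction. Readers never see an empty MV, but at the cost of
  -- far more I/O and undo.
  DBMS_MVIEW.REFRESH(list           => 'DEPT_SALARY_MV',
                     method         => 'C',   -- complete refresh
                     atomic_refresh => FALSE);
END;
/
```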
    Yes - MVs would be slower for the use you describe. But that use isn't what MVs were designed for. So using staging tables and NOT using MVs isn't the same thing as creating an MV replacement. MVs are designed to be used over and over again for the problem of slowly changing data.
    Beg to differ. We have a classic data warehouse issue: a massive fact table being updated with data, requiring summarised and aggregated tables to be refreshed to reflect the changes to the fact table.

    And this is exactly what MVs are intended to be used for - as described in Oracle® Database 2 Day + Data Warehousing Guide.
  • 20. Re: creating your own materialized view utlizing tables and a job
    rp0428 Guru
    >
    We can have the MV do a truncate and then insert. The insert may run for 20 minutes, which means that between the truncate and the commit of the insert, there is no data in that partition. That is not acceptable from a reporting perspective.

    We can have the MV do a delete and insert. That means tons more I/O for the delete, an even longer time for the partition (or table) to be refreshed, and significantly more expense. Also not acceptable.
    >
    Yes - you could do that. But you can also use an MV and do your own management of it. You get the benefits of the MV (Oracle has special knowledge for things like query rewrite) and avoid the issues you just described.

    Sounds like you haven't benchmarked or used partitioned MVs. An MV is, in many respects, just another table. Go ahead and create your new summary data in a staging table and then partition-exchange it with the MV.
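    The combination suggested here might look like this (a hedged sketch with hypothetical names): the MV is built over an existing partitioned table, so it can be refreshed manually via partition exchange while keeping MV benefits such as query rewrite.

```sql
-- AGG_SALES_HOURLY is an existing partitioned summary table; with
-- ON PREBUILT TABLE the MV reuses its segment instead of creating one.
CREATE MATERIALIZED VIEW agg_sales_hourly
  ON PREBUILT TABLE
  REFRESH COMPLETE ON DEMAND
  ENABLE QUERY REWRITE
AS
  SELECT TRUNC(sale_time, 'HH24') AS sale_hour,
         product_id,
         SUM(amount) AS total_amount,
         COUNT(*)    AS sale_count
  FROM   raw_sales
  GROUP  BY TRUNC(sale_time, 'HH24'), product_id;

-- Refresh one partition manually by exchanging in a staging table
-- built with a CTAS, exactly as before:
ALTER TABLE agg_sales_hourly
  EXCHANGE PARTITION p_0500 WITH TABLE stage_0500;
```

    Note that an exchange performed behind Oracle's back may mark the MV stale for rewrite purposes, which would have to be managed (e.g. with stale-tolerated rewrite settings), so this is a design sketch rather than a drop-in recipe.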
    >
    I have my own custom version of materialised views. The reasons are basically performance and management of the materialised object.
    >
    Yes - and it isn't necessary to have your own 'custom version', since you can create an on-demand MV and manage it manually the way you describe without giving up the benefits of it being an MV. MVs have been used that way for years at the large financial institution where I implemented this, and there hasn't been any performance or management issue.
  • 21. Re: creating your own materialized view utlizing tables and a job
    user13328581 Explorer
    Hi Billy, what exactly do you mean by aggregated? And I really don't understand why you still need a CTAS. You could create several permanent staging tables ahead of time, with the indexes and partitions matching their associated target tables, then just dump the aggregate data into them, transfer the data to the target table, and finally delete the contents of the permanent staging table. Also, if any of the indexes or partitions change in the target table, you can always check and update the permanent staging table with the new change.


    Edited by: user13328581 on Dec 6, 2012 6:16 PM

  • 22. Re: creating your own materialized view utlizing tables and a job
    BillyVerreynne Oracle ACE
    You are missing the point that a MV refresh (even if it touches only a single partition) can only be done using a truncate of the data to be refreshed, or a delete of that data.

    The truncate is a problem as it zaps data that is still needed for reporting up to the second that the newly refreshed data becomes available. The delete is a problem as it adds serious overhead. A single partition can easily contain 100 million rows.

    Yes, we have tried partitioned materialised views. No, they benchmarked and performed poorly (in this specific situation, with these specific requirements).

    We use MVs extensively... just not in this case, where the MV concept (refreshing of aggregated data) is what is needed, but the MV itself failed to meet the requirements.
  • 23. Re: creating your own materialized view utlizing tables and a job
    BillyVerreynne Oracle ACE
    user13328581 wrote:
    Hi Billy, what exactly do you mean by aggregated? And I really don't understand why you still need a CTAS. You could create several permanent staging tables ahead of time, with the indexes and partitions matching their associated target tables, then just dump the aggregate data into them, transfer the data to the target table, and finally delete the contents of the permanent staging table. Also, if any of the indexes or partitions change in the target table, you can always check and update the permanent staging table with the new change.
    Unworkable and inefficient approach. If there is a single permanent staging table, then only a single partition can be refreshed at a time. If the staging table is created dynamically, then 30 refreshes can run at the same time on the same table, each against a different partition.

    Think flexibility.

    Edited by: Billy Verreynne on Dec 10, 2012 5:02 AM
  • 24. Re: creating your own materialized view utlizing tables and a job
    user13328581 Explorer
    What if the refresh takes a while? Wouldn't that cause a table deadlock, or are you preventing deadlocks from occurring? This is why I was suggesting a single permanent staging table instead.

    Edited by: user13328581 on Dec 7, 2012 6:58 AM
  • 25. Re: creating your own materialized view utlizing tables and a job
    APC Oracle ACE
    Billy  Verreynne  wrote:
    Unworkable and efficient approach.
    I think you mean "Unworkable and +in+ efficient approach. "

    Cheers, APC
  • 26. Re: creating your own materialized view utlizing tables and a job
    BillyVerreynne Oracle ACE
    APC wrote:
    Billy  Verreynne  wrote:
    Unworkable and efficient approach.
    I think you mean "Unworkable and +in+ efficient approach. "

    Cheers, APC
    Thanks Andrew.

    These days it happens more and more that I think one thing and muscle memory types something else. :-)
  • 27. Re: creating your own materialized view utlizing tables and a job
    BillyVerreynne Oracle ACE
    user13328581 wrote:
    What if the refresh takes a while? Wouldn't that cause a table deadlock, or are you preventing deadlocks from occurring? This is why I was suggesting a single permanent staging table instead.
    Deadlocks are self-induced (by the application), so I'm not sure what exactly you have in mind as a potential problem.

    Simple example of what I'm talking about: a marketing/sales warehouse. The fact table holds the actual sales as they happen. The aggregated table contains sales per salesperson, per product category, per day, and includes measures such as gross sale, net sale, discount, volume, customer count, etc.

    The aggregated table is designed as a daily range partition table (partition per day).

    A MV can be built on this pre-built table. And it would work fine most of the time.

    However, if sales figures from some distant stores are typically a few days late, then a MV refresh will need to refresh a specific day multiple times.

    And that in a nutshell is a problem. In our case the aggregated table's partition is hourly. Each hour can have over 200 million fact rows that need to be aggregated. The aggregation needs to be done as soon as possible after the hour, to make that hour's aggregated data available for reporting. Some hours may need to be re-aggregated for a number of reasons (such as data arriving late, changes in the business layer, etc).

    In this case a MV is not able to efficiently deal with the data volumes and meet the requirements.

    So our custom approach is being able to aggregate/re-aggregate any hour's data at any time - and if need be, run a number of these aggregations in parallel.

    So if hours 5AM and 6AM need to be re-aggregated, two processes run at the same time, each using a CTAS, and each then doing a partition exchange - one with the 5AM hourly partition and the other with the 6AM hourly partition. All of this can happen at the exact same time, without impacting one another, as the processes do not step across partition boundaries.
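    The two concurrent refreshes might be sketched as follows (hypothetical names; each job runs independently against its own partition):

```sql
-- Job 1: re-aggregate the 5AM hour into its own staging table.
CREATE TABLE stage_0500 NOLOGGING AS
  SELECT product_id, SUM(amount) AS total_amount
  FROM   raw_sales
  WHERE  sale_time >= TIMESTAMP '2012-12-06 05:00:00'
  AND    sale_time <  TIMESTAMP '2012-12-06 06:00:00'
  GROUP  BY product_id;
ALTER TABLE agg_hourly EXCHANGE PARTITION p_0500 WITH TABLE stage_0500;
DROP TABLE stage_0500 PURGE;

-- Job 2 runs the same three steps concurrently for the 6AM hour, with
-- STAGE_0600 and partition P_0600. Neither job touches the other's
-- partition or staging table, so they do not block or deadlock each other.
```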
  • 28. Re: creating your own materialized view utlizing tables and a job
    user13328581 Explorer
    Thanks Billy for the explanation so far. If I have any more questions, I will let you know.