session lived synonym

Dmitrii Dunaev Member Posts: 76
edited May 22, 2019 12:17PM in SQL & PL/SQL

Hello everyone.

We use Oracle 11.2 on a Linux machine.

I have a very large table with millions of rows, let's call it "trx". Based on that table we build an aggregate table, "trx_hours", which is partitioned by hours. We also have a buffer table, trx_buffer. We fill the buffer table for a specific hour and then exchange it with the relevant partition.

Everything works great, but the transaction table gets updated very often, and every time it is updated we rebuild all relevant hours. Recently we added a few workers to be able to rebuild multiple hours simultaneously. Since each worker calls the same procedure, I had to create multiple copies of the trx_buffer table, one for each worker. I also had to rewrite the code to use dynamic SQL so that each worker inserts into its own buffer table and contention is avoided. Now that it is dynamic, it is very hard to maintain and change.

The only workaround I could think of is to create an additional GTT buffer table, fill it with non-dynamic SQL, and then have a small piece of dynamic SQL that copies everything from the GTT into the appropriate buffer table and exchanges it (sketched after the code below). But that would add extra overhead. Materialized views are not a solution either; they are very slow. Is there a better way?

before the change:

begin
  insert into trx_buffer
    (src_code
    ,dst_code
    ,amount)
  select src_code
        ,dst_code
        ,sum(amount)
    from trx
   where trx.date >= to_date('01.01.2019','dd.mm.yyyy')
     and trx.date <  to_date('01.01.2019','dd.mm.yyyy') + 1 / 24
   group by src_code
           ,dst_code;

  execute immediate 'alter table TRX_HOURS exchange partition ' || l_part_name || ' with table trx_buffer';
end;

after the change:

begin
  execute immediate 'insert into ' || l_trx_buffer || '
                       (src_code
                       ,dst_code
                       ,amount)
                     select src_code
                           ,dst_code
                           ,sum(amount)
                       from trx
                      where trx.date >= to_date(''01.01.2019'',''dd.mm.yyyy'')
                        and trx.date <  to_date(''01.01.2019'',''dd.mm.yyyy'') + 1 / 24
                      group by src_code
                              ,dst_code';

  execute immediate 'alter table TRX_HOURS exchange partition ' || l_part_name || ' with table ' || l_trx_buffer;
end;
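For illustration, the GTT workaround mentioned above could look roughly like this. It is only a sketch: the GTT name trx_buffer_gtt and the column definitions are assumptions, not our actual code.

-- One-time DDL: a global temporary table shaped like the buffer tables
create global temporary table trx_buffer_gtt (
  src_code varchar2(30),
  dst_code varchar2(30),
  amount   number
) on commit delete rows;

begin
  -- The large query stays static and easy to maintain
  insert into trx_buffer_gtt (src_code, dst_code, amount)
  select src_code, dst_code, sum(amount)
    from trx
   where trx.date >= to_date('01.01.2019','dd.mm.yyyy')
     and trx.date <  to_date('01.01.2019','dd.mm.yyyy') + 1 / 24
   group by src_code, dst_code;

  -- Only these two small statements need to be dynamic
  execute immediate 'insert into ' || l_trx_buffer || ' select * from trx_buffer_gtt';
  execute immediate 'alter table TRX_HOURS exchange partition ' || l_part_name ||
                    ' with table ' || l_trx_buffer;
end;

The overhead is the extra pass that copies every row from the GTT into the real buffer table.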

Thank you in advance.

Dmitrii Dunaev

Best Answer

  • Cookiemonster76 Member Posts: 3,410
    edited May 22, 2019 11:44AM Accepted Answer

    You could possibly do this:

    Maintain the code in one place.

    Install the code in 4 separate schemas. Have a synonym in each that points to the appropriate buffer table.

    Have the code that spins up the workers call the appropriate copy in one of the 4 schemas.
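    A minimal sketch of that setup, assuming the data lives in a schema called APP and the four buffer tables already exist (all names here are illustrative):

    -- In each worker schema (WORKER1 .. WORKER4), a private synonym points
    -- at that worker's own buffer table. For example, in WORKER1:
    create synonym trx_buffer for app.trx_buffer_1;

    -- The same procedure source is compiled in all 4 schemas; the big insert
    -- stays static because DML resolves the local synonym:
    create or replace procedure rebuild_hour(p_hour in date, p_part_name in varchar2) as
      l_buffer varchar2(65);
    begin
      insert into trx_buffer (src_code, dst_code, amount)
      select src_code, dst_code, sum(amount)
        from app.trx trx
       where trx.date >= p_hour
         and trx.date <  p_hour + 1 / 24
       group by src_code, dst_code;

      -- DDL does not follow synonyms, so look up the real table name
      -- for the one remaining (tiny) dynamic statement:
      select table_owner || '.' || table_name into l_buffer
        from user_synonyms where synonym_name = 'TRX_BUFFER';

      execute immediate 'alter table app.trx_hours exchange partition ' ||
                        p_part_name || ' with table ' || l_buffer;
    end;

    The 1000-line insert then exists in exactly one source file, and each worker simply calls the copy in its own schema.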


Answers

  • Cookiemonster76 Member Posts: 3,410
    edited May 22, 2019 10:41AM

    I think you're going to need to explain the process in more detail - specifically how you rebuild hours.

    It's not at all obvious to me why you needed multiple copies of the buffer table or dynamic SQL.

    Examples of the tables in question would help.

  • Dmitrii Dunaev Member Posts: 76
    edited May 22, 2019 10:59AM

    I have added an example to the original question. I have 4 workers which may run simultaneously. When that happens they all start inserting into trx_buffer at the same time. Then whichever session gets to the exchange statement exchanges it with its relevant partition.

  • Cookiemonster76 Member Posts: 3,410
    edited May 22, 2019 11:01AM

    Your example does the partition exchange on trx; I assume it should be trx_hours. Simplified examples should still make sense.

    How many rows in trx?

    How many rows in trx_hours?

    How many partitions in trx_hours?

    How many partitions do you routinely have to rebuild at once?

    How often do you have to rebuild them?

    How long is it taking at the moment?

    Why can't you use a materialized view?

  • John Thorton Member Posts: 14,493 Silver Crown
    edited May 22, 2019 11:14AM

    The design & implementation need major changes to accommodate actual multi-user, real-time processing.

    Instead of duplicating & moving data, consider using a VIEW to access the desired rows.

  • Dmitrii Dunaev Member Posts: 76
    edited May 22, 2019 11:27AM

    You are right, it's trx_hours.

    1. trx holds about 8 billion rows. It is also partitioned by hours, and we do not keep all the data around for long; we drop old partitions, so 8 billion is pretty much a constant.

    2. Each hour partition is about 5 million rows

    3. trx_hours is relatively small too. It holds only about 7 days' worth of data, so it has about 5 * 24 * 7 = 840 million rows across 168 partitions.

    4. Every time the trx table is changed we enqueue that specific hour into a queue table. Every 5 minutes or so we spawn 4 jobs. Each job dequeues and rebuilds one hour at a time until the queue is empty (see the sketch below). The queue sometimes holds up to 20 hours for rebuild, so it's not uncommon to see all 4 workers run at the same time.

    5. It takes a single worker a minute or two to rebuild one hour

    6. We have tried materialized views and partitioned them in different ways, but they just won't run as fast as the manual load.

    I have also thought about partitioning the trx_buffer table, but I don't believe you can exchange a partition of one table directly with a partition of another table; exchange works between a partition and a non-partitioned table.
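    One common shape for the worker loop described in item 4, as a sketch only (the queue table trx_rebuild_queue, its hour_key column, and the rebuild_hour procedure are assumptions): FOR UPDATE SKIP LOCKED, available in 11.2, lets the 4 jobs pull from the same queue without blocking each other.

    declare
      cursor c_next is
        select hour_key
          from trx_rebuild_queue
         order by hour_key
           for update skip locked;
      l_hour date;
    begin
      loop
        open c_next;
        fetch c_next into l_hour;   -- with SKIP LOCKED the row is locked as it is fetched
        if c_next%notfound then
          close c_next;
          exit;                     -- nothing left that another worker isn't already doing
        end if;
        close c_next;               -- the lock is held until commit, not until close

        rebuild_hour(l_hour, 'P_' || to_char(l_hour, 'yyyymmddhh24'));

        delete from trx_rebuild_queue where hour_key = l_hour;
        commit;                     -- releases the lock and publishes the dequeue
      end loop;
    end;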

  • Dmitrii Dunaev Member Posts: 76
    edited May 22, 2019 11:24AM

    The volume is very large. It would take a very long time to query against the trx table directly. We have even created a trx_days table on top of trx_hours, because it's not fast enough to query against trx_hours either.

  • Cookiemonster76 Member Posts: 3,410
    edited May 22, 2019 11:31AM

    Others may be able to come up with better general approaches.

    Based on your current approach I would say this:

    There is no upside to adding a GTT.

    If the dynamic SQL is that big a pain (and is it really that big a deal?) then the other approach would be to have 4 separate procedures that all have the same insert, each using a different buffer table.

    From a code maintenance point of view I'd rather have the dynamic SQL, and the way you're using it there will be no performance overhead vs. static SQL.

  • Dmitrii Dunaev Member Posts: 76
    edited May 22, 2019 11:39AM

    The only reason I posted this question is that the dynamic SQL is a big pain to maintain. It's very large, with lots of parameters, and spans multiple screens: about 1,000 lines of code. Is there a session-based synonym or something like that? Maybe a session context could help me somehow?
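    For reference, what I had in mind with a session context is something like this (made-up names). As far as I can tell, a context can only carry a per-session value such as a buffer table name; it cannot redirect which table a static INSERT writes into, so some dynamic SQL would remain:

    -- A context is bound to the single package allowed to set it:
    create or replace context trx_ctx using trx_ctx_pkg;

    create or replace package trx_ctx_pkg as
      procedure set_buffer(p_table in varchar2);
    end trx_ctx_pkg;
    /
    create or replace package body trx_ctx_pkg as
      procedure set_buffer(p_table in varchar2) is
      begin
        dbms_session.set_context('trx_ctx', 'buffer_table', p_table);
      end set_buffer;
    end trx_ctx_pkg;
    /

    -- Each worker would set its value once at startup:
    --   trx_ctx_pkg.set_buffer('TRX_BUFFER_1');
    -- and any code in the session reads it back with:
    --   sys_context('trx_ctx', 'buffer_table')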

  • Cookiemonster76 Member Posts: 3,410
    edited May 22, 2019 11:42AM

    1000 lines?

    That's some massive oversimplification you've done.

    Is the insert/select 1000 lines?

    Or is there other stuff going on that doesn't necessarily need to be dynamic?

    And if it is just insert/select, can you move the select into a view? The select doesn't change, does it? Just the table you're inserting into?
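    Sketching that idea, assuming the select really is static apart from the hour: the aggregation moves into a view that also exposes the hour, so only a one-line dynamic insert remains. View and variable names are illustrative, and this relies on the optimizer pushing the hour predicate into the view's GROUP BY:

    create or replace view trx_hour_agg as
    select trunc(trx.date, 'hh24') as trx_hour,
           src_code,
           dst_code,
           sum(amount) as amount
      from trx
     group by trunc(trx.date, 'hh24'), src_code, dst_code;

    -- Inside the worker procedure the dynamic part shrinks to:
    execute immediate
      'insert into ' || l_trx_buffer || ' (src_code, dst_code, amount)
       select src_code, dst_code, amount
         from trx_hour_agg
        where trx_hour = :h'
      using l_hour;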
