Forum Stats

  • 3,769,605 Users
  • 2,252,991 Discussions
  • 7,875,117 Comments

Discussions

Performance tuning for tables without indexes

4fcc5f03-2cb2-49a2-9716-55327ad4fa8c
edited Aug 2, 2017 3:13PM in SQL & PL/SQL

Hello All,

I am a newbie in performance tuning. I am facing following issue:

1. I have main table with around 300K records, this table has no constraints or indexes (client is reluctant to add any).

2. I have staging table which receives around 10-15K records every 3 hours and I have to insert and/or update these in main table.

3. I can not use MERGE as they have data issues and client is not ready to change the data so I have to modify the logic from MERGE to Insert-Update.

4. I tried following ways but both took more than 60 minutes.

Update main_table m

set (column list) = (select column_list from staging s where m.id = s.id) ;

Update main_table m

set (column list) = (select column_list from staging s where m.id = s.id)  where exists (select 1 from staging s where s.id = m.id); /* Added exists clause to get records present in staging table only */

I am on Oracle 12.1.0.2

Can you please help me to get out of this?

LazarGary_A

Answers

  • John Thorton
    John Thorton Member Posts: 14,493 Silver Crown
    edited Aug 2, 2017 10:50AM
    4fcc5f03-2cb2-49a2-9716-55327ad4fa8c wrote:Hello All,I am a newbie in performance tuning. I am facing following issue:1. I have main table with around 300K records, this table has no constraints or indexes (client is reluctant to add any).2. I have staging table which receives around 10-15K records every 3 hours and I have to insert and/or update these in main table.3. I can not use MERGE as they have data issues and client is not ready to change the data so I have to modify the logic from MERGE to Insert-Update.4. I tried following ways but both took more than 60 minutes.Update main_table mset (column list) = (select column_list from staging s where m.id = s.id) ;Update main_table mset (column list) = (select column_list from staging s where m.id = s.id) where exists (select 1 from staging s where s.id = m.id); /* Added exists clause to get records present in staging table only */I am on Oracle 12.1.0.2Can you please help me to get out of this?

    You tie your left wrist to your right ankle & then ask how to improve the time it take you to run the 100 meter dash.

    You can't push a string!

    3. How to improve the performance of my query? / My query is running slow.

    Lazar
  • Cookiemonster76
    Cookiemonster76 Member Posts: 3,410
    edited Aug 2, 2017 10:47AM

    Well the first update will update all the rows in main_table and if the sub-query doesn't find a match in staging it'll set all the columns to null. So never mind how long that one takes, it's trashing your data, don't do it.

    Why can't you use merge? If you can make it work as an insert-update you should generally be able to do merge. It might well be that it's possible even though you don't know how. So if you explain exactly what is stopping you we may be able to show you how to get around the issue.

    Have you tried tracing the session to see where the time is being spent. Updating 10-15K ought to take a lot less than an hour on half decent hardware.

  • 4fcc5f03-2cb2-49a2-9716-55327ad4fa8c
    edited Aug 2, 2017 11:07AM

    I can't use merge as there are duplicate values for joining condition. Strange but true. I have email_id as joining condition and they have two different names for same email id.

    When I first saw this, it took me couple of minutes to believe that such data exists in real life. I tried by best to convince client to have unique name for a person but it was of no use. I hate the name "Cathy" so much now because of this issue. 

  • Cookiemonster76
    Cookiemonster76 Member Posts: 3,410
    edited Aug 2, 2017 11:10AM

    Isn't the joining condition id? And if id has duplicates then that update should throw ORA-01427: single-row subquery returns more than one row

  • Cookiemonster76
    Cookiemonster76 Member Posts: 3,410
    edited Aug 2, 2017 11:55AM

    Or are there only duplicates in main and never in staging? In which case merge should work just fine

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited Aug 2, 2017 12:14PM
    4fcc5f03-2cb2-49a2-9716-55327ad4fa8c wrote:I can't use merge as there are duplicate values for joining condition. Strange but true. I have email_id as joining condition and they have two different names for same email id. When I first saw this, it took me couple of minutes to believe that such data exists in real life. I tried by best to convince client to have unique name for a person but it was of no use. I hate the name "Cathy" so much now because of this issue. 

    If there were duplicates then your scalar subquery method would fail.

    You have no index so the only way to read the data is with a full table scan.

    If you use a scalar subquery and it doesn't get transformed (and I doubt it would without a suitable unique constraint), your query has to do a full scan of staging for each row in main_table that gets updated.

    If you use a merge then it could do a hash join to join the tables together to perform the update.

  • Gary_A
    Gary_A Member Posts: 624 Bronze Badge
    edited Aug 2, 2017 12:37PM
    1. I have main table with around 300K records, this table has no constraints or indexes (client is reluctant to add any).

    Sounds like another clueless client. There is no shortage of those.

  • Paulzip
    Paulzip Member Posts: 8,494 Blue Diamond
    edited Aug 2, 2017 12:55PM

    You have ridiculous restrictions that you shouldn't have to work around.  Clients shouldn't govern solutions you provide!  Sort out duplicates in the main and staging tables, whack PK on the id cols of both tables, then run Stew Ashton's delta merge approach (should take seconds / a minute or so)...

    Can be extended to n fields

    merge into main_table o

    using (

      select *

      from (

        select id, field1, field2,

              count(*) over (partition by id) - sum(z##_cnt) z##iud_flag

        from (

          select id, field1, field2, -1 z##_cnt

          from main_table

          union all

          select id, field1, field2, 1 z##_cnt

          from  staging

        )

        group by id, field1, field2

        having sum(z##_cnt) != 0

      )

      where z##iud_flag in (0, 1) -- 0 = INSERT – in staging, not in main, 1 = UPDATE – put this in main

    ) n

    on (o.id = n.id)

    when matched then

      update set

        o.field1 = n.field1,

        o.field2 = n.field2

    when not matched then

      insert (o.id, o.field1, o.field2)

      values (n.id, n.field1, n.field2)

  • Unknown
    edited Aug 2, 2017 3:13PM
    I am a newbie in performance tuning. I am facing following issue:

    The FIRST 'rule' you need to learn is that ONLY ONE entity can be in charge of things.

    1. I have main table with around 300K records, this table has no constraints or indexes (client is reluctant to add any).

    So tell us who is responsible for performance tuning? You (your org)? Or the client?

    If YOU (your org) is in charge of tuning then the client HAS NO SAY as to whether indexes or constraints are used.

    If your client is in charge of tuning then you (your org) should tell the client they are on their own since it is NOT possible for you to help them if they reject steps that are necessary to do the tuning.

    Can you please help me to get out of this?

    I just did that above. It is NOT POSSIBLE to tune queries that must use large tables that have no indexes or primary keys.

    That is because Oracle must do a full table scan to find even a single row in the table. Then to find the next row it must do another full table scan.

    The only way I know of to do any such 'tuning' is using the old-style 'tape merge' method and use pl/sql to:

    1. sort the tables involved

    2. sort the change records to match the table sort

    3. perform slow-by-slow (row by row) processing to manually perform key matching, updating and rewriting of the data to a new table.

    Gary_A
This discussion has been closed.