8 Replies Latest reply: Mar 19, 2012 12:00 PM by Gamblesk

    compute statistics performance problems

    855577
      Hi all,

      we run COMPUTE STATISTICS on one table which is completely deleted every day, refilled, and then analyzed. However, the COMPUTE STATISTICS step takes more time every day. Does the table keep some kind of log/history of statistics that were computed earlier on the deleted data? Or is there any other reason why computing statistics on approximately the same amount of data every day should take progressively longer? We are running 10g.

      Some notes:
      I cannot truncate the table because I need to rollback in case of problems. I could drop and recreate it once if it would help.
      Before deletion and filling I disable all indexes and then recreate them.
      We transfer around 10 million rows each day to this table.
      One year ago the whole process (filling + analyze) took 3 hours, now it takes 18 hours.
      We also got this error on the table:
      ORA-01555: snapshot too old: rollback segment number with name °° too small
      ORA-02063: preceding line from T_SRC


      This is the script in procedure:
          execute immediate 'alter index IND_TAB1 unusable';
      
          delete from TAB1;
      
          insert /*+ APPEND */ into TAB1
          select
            COL1
           ,COL2
            from V_SRC1
           where change_date   between sysdate-1 and sysdate
              or creation_date between sysdate-1 and sysdate;
      
          execute immediate 'alter index IND_TAB1  rebuild';
      
          execute immediate 'analyze table TAB1 compute statistics';
      V_SRC1 is view to table T_SRC on remote system.

      Thank you for ideas.
        • 1. Re: compute statistics performance problems
          Nikolay Savvinov
          Hi,

          1) what's your Oracle version?
          2) any particular reason for COMPUTE instead of ESTIMATE?
          3) why ANALYZE and not DBMS_STATS?
          4) what problems are you referring to when you're saying you need to be able to rollback?
          5) is flashback enabled on the database and accessible to you?

          Now regarding your question -- one reason for the performance degradation of the stats job
          could be a high watermark that is never reset. If that's the case then yes, re-creating the table would help.

          There may be other reasons as well, but unless you keep AWR reports from a year ago (which is always a good idea),
          it's impossible to tell. Maybe your database is becoming busier, maybe your disks or storage are, who can tell?

          Best regards,
          Nikolay
          • 2. Re: compute statistics performance problems
            855577
            Hi Nikolay, thanx for reply and ideas.

            1) version 10g
            2) no reason, just that all other tables use COMPUTE
            3) Is there some advantage to DBMS_STATS that could help me?
            4) When I delete and refill the table in the procedure, I need to be able to roll back the changes if it fails, so users can still access the old data
            5) no idea, I'm not a DBA

            I don't know what the high watermark is or how to reset it; could you please give short instructions on what to do? I'm afraid that going off in search of something this new to me will just lead me astray.

            yavvie
            • 3. Re: compute statistics performance problems
              Nikolay Savvinov
              Hi,

              when you insert and delete a lot of data in a table, it becomes fragmented and a significant amount of unused disk space can accumulate. When doing a full table scan, Oracle scans the segment up to the high watermark (HWM), which points to the highest block ever used, i.e. it scans all the empty blocks as well. To reset the HWM you can issue ALTER TABLE <table_name> MOVE. Despite what the name of the command suggests, your table won't be moved anywhere; the command just rebuilds the segment and reclaims the unused free space. Any indexes defined on the table will have to be rebuilt afterwards as well (ALTER INDEX <index name> REBUILD).
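              Using the table and index names from the script earlier in the thread, the HWM reset described above would be a two-step sketch (best run in a maintenance window, since MOVE locks the table and leaves its indexes unusable until rebuilt):

```sql
-- Rebuild the table segment in place; this resets the high watermark
-- and reclaims the unused space below it.
ALTER TABLE TAB1 MOVE;

-- MOVE changes rowids, which marks every index on the table UNUSABLE,
-- so each index must be rebuilt afterwards.
ALTER INDEX IND_TAB1 REBUILD;
```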

              Alternatively, you can consider range partitioning by day (if your license allows that). With partitioning everything would become much simpler:

              1) you load your data into the table and check everything
              2) if ok, you just drop the previous partition; if not, rollback.
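              A hedged sketch of the daily-partition idea (the partition and column names here are illustrative, not from the thread; it requires the Partitioning option, and on 10g the daily partitions have to be added manually rather than created automatically):

```sql
-- Illustrative only: range-partition the load table by day.
CREATE TABLE TAB1 (
  COL1      NUMBER,
  COL2      NUMBER,
  LOAD_DATE DATE          -- assumed partitioning key, not in the original script
)
PARTITION BY RANGE (LOAD_DATE) (
  PARTITION P20120318 VALUES LESS THAN (TO_DATE('2012-03-19','YYYY-MM-DD')),
  PARTITION P20120319 VALUES LESS THAN (TO_DATE('2012-03-20','YYYY-MM-DD'))
);

-- After today's load has been checked, discard yesterday's data
-- in a single dictionary operation instead of a mass DELETE:
ALTER TABLE TAB1 DROP PARTITION P20120318;
```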

              Regarding COMPUTE -- it goes through all of the data to calculate statistics, instead of using a small sample (typically a few percent). It's slower than ESTIMATE and in most cases provides no benefit over it.

              Regarding DBMS_STATS vs ANALYZE -- using ANALYZE for stats collection is obsolete and no longer recommended. DBMS_STATS offers the same functionality and much more; since your version is 10g, you should be using DBMS_STATS instead of ANALYZE.
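              For reference, a DBMS_STATS call that could replace the ANALYZE line in the original script (the owner is assumed to be the current schema; AUTO_SAMPLE_SIZE lets 10g choose a sample instead of a full COMPUTE-style scan):

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER,                          -- schema owning TAB1 (assumption)
    tabname          => 'TAB1',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,   -- sampled, not a full scan
    cascade          => TRUE                           -- gather index statistics too
  );
END;
/
```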

              Best regards,
              Nikolay
              • 4. Re: compute statistics performance problems
                855577
                Nikolay, thank you very much for the detailed information. I'll try MOVE on the table and then change the procedure to DBMS_STATS with ESTIMATE; hopefully that helps.

                yavvie
                • 5. Re: compute statistics performance problems
                  Gamblesk
                  I think you will find your table is a lot bigger than it needs to be, which would make the compute statistics take longer as it is scanning a lot of empty blocks.

                  Your load process of deleting all the rows and then using INSERT /*+ APPEND */ will never reuse the space you just deleted. A direct-path (append) insert puts data above the high water mark.

                  Essentially what is happening is: you delete data, then insert new data above the high water mark (the table just got bigger), then you delete again and again insert above the high water mark (the table got bigger again). You never reuse the space you just deleted.

                  You might want to rethink the option of truncating the table, and whether rollback is really that important when it sounds like you are replacing all the data anyway.

                  Dropping and recreating the table (or ALTER TABLE MOVE, as someone else suggested) will solve the short-term problem, but unless you change the delete/insert-append process it will begin taking longer and longer again, and you will end up dropping/moving the table again eventually.
                  • 6. Re: compute statistics performance problems
                    855577
                    889915,
                    thanx for information.

                    If I change from DELETE to TRUNCATE, will I be able to release the allocated disk space and write into the freed space, i.e. without increasing the size of my table?

                    Rollback is important for the users - in case something goes wrong during the transfer they will still have day-1 data available instead of an empty table. But I'll probably try loading into a temporary staging table which will replace the production table after a successful load. Then I'll have a fresh table every day and no excess allocated space.
                    • 7. Re: compute statistics performance problems
                      Sven W.
                      yavvie wrote:
                      If I change from delete to truncate will I be able to release my allocated disk space and write in the freed space, thus without increasing the filesize of my table?
                      Yes. And the truncate is way faster than any delete.
                      Rollback is important for users - in case something goes wrong during transfer they will have day-1 data available instead of empty table. But I'll probably try loading into TEMP table which will after successful loading replace the production table. Then I'll have a fresh table every day and no excess allocated space.
                      That is a good idea and standard practice for many ETL jobs.
                      1) Load into a temp table (can be a permanent table, that is used only during the load process and that can be truncated at the start of the load).
                      2) If the load was ok, then Truncate the target table
                      3a) insert /*+append */ into target_table select * from temp_table;
                      3b) if you have the Partitioning option licensed, consider using EXCHANGE PARTITION instead of insert + select
                      4) commit
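                      The four steps above might look like this (the staging table name is a placeholder, and the EXCHANGE PARTITION variant from step 3b is omitted; note that TRUNCATE is DDL and cannot be rolled back, which is why the load is verified first):

```sql
-- 1) Load into a permanent staging table, truncated at the start of the load.
TRUNCATE TABLE TAB1_STAGE;
INSERT /*+ APPEND */ INTO TAB1_STAGE
SELECT COL1, COL2
  FROM V_SRC1
 WHERE change_date   BETWEEN SYSDATE-1 AND SYSDATE
    OR creation_date BETWEEN SYSDATE-1 AND SYSDATE;
COMMIT;

-- 2) Once the load is verified, truncate the target (this resets the HWM).
TRUNCATE TABLE TAB1;

-- 3a) Direct-path copy into the target.
INSERT /*+ APPEND */ INTO TAB1 SELECT * FROM TAB1_STAGE;

-- 4) Commit.
COMMIT;
```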
                      • 8. Re: compute statistics performance problems
                        Gamblesk
                        Truncate essentially lowers the high water mark, so yes, you would not have this problem.

                        If the table is larger than it needs to be and you want to get rid of the extra space it's taking up, you might have to use the 'drop storage' clause. I cannot remember what the default is.
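                        For what it's worth, DROP STORAGE is in fact the default for TRUNCATE TABLE; the two explicit forms look like this:

```sql
-- Deallocate the extents back to the tablespace (the default behaviour):
TRUNCATE TABLE TAB1 DROP STORAGE;

-- Keep the allocated extents for reuse by subsequent loads:
TRUNCATE TABLE TAB1 REUSE STORAGE;
```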