1 2 Previous Next 20 Replies Latest reply: May 2, 2012 8:49 AM by Ishan RSS

    Gather Stats taking time

    Ishan
      Hi All,

      I know this question has been asked many times but there is a brand new problem that I am facing.

      We pre-create the parititions for one month for some of the tables. But the issue is that when we gather stats, even after specifiying partition name, it even gather stats for the empty parititions too. And it takes a lot of time to gathers stats for those empty partitions. Not sure why this happens.

      I am not against of gathering stats for empty partitions but why does it take a lot of time to gather stats for them.

      This process in-turn is taking a lot of time to complete.

      Any ideas how to stop this?

      Regards,
      Ishan
        • 1. Re: Gather Stats taking time
          sybrand_b
          Ok,

          Let's see what we don't have and what would be required to say really anything

          1 four digit database version
          2 exact dbms_stats call which takes 'lot of time'
          3 What 'lot of time' is
          4 definition of the table which takes 'lot of time'
          5 setting of pga_aggregate_target.


          If you are not prepared to provide this, please don't ask this question. Also please specify this information upfront and don't have people to 'interview' you to get the information out of you.
          You are wasting everyone's time.

          ----------
          Sybrand Bakker
          Senior Oracle DBA
          • 2. Re: Gather Stats taking time
            Ishan
            You could have said these things in a nice way.. rather than accusing me as if I have done some crime by asking the question...

            If it's wasting your time, please don't bother to reply.....

            PS: Currently, I don't have as much knowledge as you have(the reason why I am asking questions here) but neither does it mean that I can never reach there....
            • 3. Re: Gather Stats taking time
              sb92075
              Ishan wrote:
              Hi All,

              I know this question has been asked many times but there is a brand new problem that I am facing.

              We pre-create the parititions for one month for some of the tables. But the issue is that when we gather stats, even after specifiying partition name, it even gather stats for the empty parititions too. And it takes a lot of time to gathers stats for those empty partitions. Not sure why this happens.

              I am not against of gathering stats for empty partitions but why does it take a lot of time to gather stats for them.

              This process in-turn is taking a lot of time to complete.

              Any ideas how to stop this?

              Regards,
              Ishan
              You: Doctor is hurts when I poke myself in the eye. How do I stop the pain?
              Doctor: You need to stop poking yourself in the eye.

              By the way with V10+ Oracle collects needed statistics once every 24 hours;
              so you don't need to do anything.
              • 4. Re: Gather Stats taking time
                Karan
                You really need to provide what code you are executing at what version is it .. Because it depends what version you are using, like estimate_percent value in 10G changed from 100% sample to auto_sample_size, just like this lot of things matter, not even this may be there is some other job running on your system which is eating your resources like anything, causing this job to hang as you said, please atleast mention what options you are using, and even in partitioning tables you have several options for gathering stats like for a single part or sub part..

                Also in 10G if the data in some partitions change, not only do we need to scan the changed partitions to gather the partition level statistics, but also we have to scan the entire table again to gather the table level statistics. Bit in oracle 11g there is one pass distinct sampling based method which can accurately derive the table level statistics from the partition level statistic

                Regards
                Karan
                • 5. Re: Gather Stats taking time
                  Karan
                  And can you tell also that why you need to gather statistics on empty partitions. Must be any specific reason.. Please provide the reason.. Moreover if there are still 1 or 2 partitions which are volatile not the empty ones, delete the statistics for the table and let the optimizer dynamic sampling kick in, which is from 9i onwards i guess.. And we are still waiting for the version information from you ...

                  Regards
                  Karan
                  • 6. Re: Gather Stats taking time
                    Ishan
                    This is the code that we are using to gather stats.

                    Collecting stats for partitioned tables runs daily and runs on Sunday for the complete schema.

                    @Karan: We are not collecting stats on empty partition, it's Oracle who is doing that. We are specifiying the partition name but still it collect stats for the pre-created partition. And even if it collects why does it take lot of time to complete the gathering stats process.

                    Below is the code that we are using:

                    Oracle Version No: 11.1.0.7.0


                    create or replace procedure Gather_stats
                    (IN_STAT_OPTION in varchar2)
                    is
                    IF IN_STAT_OPTION = 'ALL'
                         THEN


                              DBMS_STATS.GATHER_SCHEMA_STATS (
                                                  OWNNAME =>V_OWNNAME
                                                  ,ESTIMATE_PERCENT=>20
                                                  ,DEGREE=>4
                                                  ,CASCADE=>TRUE
                                                  ,OPTIONS=>'GATHER EMPTY'
                                                  ,NO_INVALIDATE=>FALSE
                                                  ) ;
                              

                              DBMS_STATS.GATHER_SCHEMA_STATS     (
                                                  OWNNAME =>V_OWNNAME
                                                  ,ESTIMATE_PERCENT=> DBMS_STATS.AUTO_SAMPLE_SIZE
                                                  ,DEGREE=>4
                                                  ,GRANULARITY => 'ALL'
                                                  ,OPTIONS => 'GATHER'
                                                  ,CASCADE =>TRUE
                                                  ,NO_INVALIDATE =>FALSE
                                                  ) ;

                              

                         ELSIF IN_STAT_OPTION = 'PARTITION'
                         THEN

                         

                         STAT_TABLE_NAMES(1) := 'A';
                         STAT_TABLE_NAMES(2) := 'B';
                         STAT_TABLE_NAMES(3) := 'C';
                         STAT_TABLE_NAMES(4) := 'D';
                         STAT_TABLE_NAMES(5) := 'E';


                              FOR I IN STAT_TABLE_NAMES.FIRST .. STAT_TABLE_NAMES.LAST
                              LOOP

                                        FOR J IN(SELECT TABLE_NAME,
                                   PARTITION_NAME,
                                   HIGH_VALUE-1 HIGH_VALUE
                                   FROM (SELECT      TABLE_NAME,
                                        PARTITION_NAME,
                                        TO_DATE (TRIM ('''' FROM REGEXP_SUBSTR( EXTRACTVALUE (
                                   DBMS_XMLGEN.GETXMLTYPE (
                                   ' SELECT HIGH_VALUE '
                              || ' FROM USER_TAB_PARTITIONS '
                              || ' WHERE TABLE_NAME = ''' || TABLE_NAME
                              ||''' AND PARTITION_NAME = ''' || PARTITION_NAME
                              ||''''
                                   )
                                        , '//text()'
                                                                                          )
                                   , '''.*?'''
                                                                                )
                                             ), 'SYYYY-MM-DD HH24:MI:SS') HIGH_VALUE
                              FROM USER_TAB_PARTITIONS
                              WHERE TABLE_NAME = STAT_TABLE_NAMES(I)
                                             )
                                   WHERE TO_DATE(TO_CHAR((HIGH_VALUE - 1), 'DD-MON-RRRR')) = '31-MAR-2012'
                                   AND PARTITION_NAME LIKE '%SYS_%'
                                   ORDER BY HIGH_VALUE DESC
                                             )

                                        LOOP

                                             V_RETURNCODE := 'H' ;

                                             DBMS_STATS.SET_TABLE_PREFS(V_OWNNAME,STAT_TABLE_NAMES(I),'INCREMENTAL','TRUE');

                                             V_RETURNCODE := 'I' ;

                                             DBMS_STATS.GATHER_TABLE_STATS(OWNNAME           => V_OWNNAME
                                                            ,TABNAME          => STAT_TABLE_NAMES(I)
                                                            ,PARTNAME          => J.PARTITION_NAME
                                                            ,GRANULARITY     => 'AUTO'
                                                            );

                                        END LOOP;

                              END LOOP;
                    END IF;


                    @SB: Going by your logic, i should say that if we are not getting the solution we should forget the question.

                    @Karan: I know that was a mistake on my part where I didnn't provide the complete information. But there are better ways to communicate the same.

                    Thanks,
                    Ishan
                    • 7. Re: Gather Stats taking time
                      sybrand_b
                      Your remarks are incorrect, as I didn't accuse you of anything.

                      I request you try to not be offended without sufficient reason.

                      You had NO reason to feel offended!!!
                      ---------
                      Sybrand Bakker
                      Senior Oracle DBA
                      • 8. Re: Gather Stats taking time
                        Karan
                        Ok, fine .. i saw your code... The thing you have to understand is that if tables are big statistics generation might even take days, buy the way how much is the workload when you gather statistics for these tables, or else let say you a job which executes your given procedure, how much resources you allocate to this job by resource manager, and let me tell you even for small estimates some required operations can take large time for large tables. An interesting technique to speed things up is to gather only "stale" stats usually... and also compare this bad period with the last known good period, and see what has changed.... and who will tell what you wrote in IN_STAT_OPTION of your above procedure.. Also mention the long wait events you see in your reports...

                        Also mention why you are using these fixed values, any specific reason ?? why you are deviating from the default of estimate percent... Mention that too

                        Regards
                        Karan
                        • 9. Re: Gather Stats taking time
                          Karan
                          Can you post the size of all partitions in the number of blocks...lets see the details of your all partitions including your empty ones....

                          Regards
                          Karan
                          • 10. Re: Gather Stats taking time
                            Ishan
                            Hi Karan,

                            I think I got what the issue is. I'll give u explanation. Please confirm if that is a valid one or not.

                            We have 12 Partitioned table.
                            Total No of partitions: 4188(for all the tables)
                            Avg Size of Each Partition: 1 GB

                            And I think this is the cause of gathering stats taking lot of time.

                            It's the second part of the code, 'PARTITION', is taking time inspite of coding it the best possible way....

                            Only reason could be the no's of partitions that it has to scan to gather the global stats...

                            Thanks,
                            Ishan

                            Edited by: Ishan on Apr 30, 2012 10:17 PM
                            • 11. Re: Gather Stats taking time
                              AdamMartin
                              1. Stop gathering global stats when you only need stats for one partition. Check the stats preferences on your table. You can avoid always gathering global stats if incremental global stats is enabled. You need to use AUTO for granularity.

                              Quote from Oracle documentation:

                              INCREMENTAL - This value determines whether or not the global statistics of a partitioned table will be maintained without doing a full table scan. With partitioned tables it is very common to load new data into a new partition. As new partitions are added and data loaded, the global table statistics need to be kept up to date. Oracle will update the global table statistics by scanning only the partitions that have been changed instead of the entire table if the following conditions hold:

                              * the INCREMENTAL value for the partitioned table is set to TRUE;
                              * the PUBLISH value for the partitioned table is set to TRUE;
                              * the user specifies AUTO_SAMPLE_SIZE for ESTIMATE_PERCENT and AUTO for GRANULARITY when gathering statistics on the table.

                              If the INCREMENTAL value for the partitioned table was set to FALSE (default value), a full table scan is used to maintain the global statistics which is a much more resource intensive and time-consuming operation for large tables.


                              2. Consider gathering stats in parallel if they are still taking a long time.
                              • 12. Re: Gather Stats taking time
                                Ishan
                                I need both the partition-level and global level stats.

                                By using the INCREMENTAL option, Oracle creates synopsis of each partition and stores it in some table in SYSAUX tablespace. And then while computing global statistics, it simply aggregates the values from the synopsis.

                                But the issue which we are facing is, that whenever I gather stats using INCREMENTAL option, the LAST_ANALYZED date for all the table partitions is today's date(effectively the date on which I gather the stats), which means it's gathering stats for all the partitions and not just the newly added partition.

                                This is my real concern. Why all the partitions get analyzed while using INCREMENTAL option. It should simply gather stats for that partition and add it to the global statistics.

                                We are giving the partition name as we thought it might just gather stats for only that partition(I know if we use INCREMENTAL option, giving partition name is of no use).

                                PS: The main culprit is the PARTITON block. Please focus on this to fine tune it.

                                In a nutshell, at the end of the day what I want is this.

                                Global and Partition Level stats for all the partitioned tables.

                                Inspite of using the INCREMENTAL option, it's taking more than a day to complete the process. Am I missing something or is there some other better way to achieve this.?

                                Regards,
                                Ishan

                                Edited by: Ishan on May 1, 2012 1:18 AM
                                • 13. Re: Gather Stats taking time
                                  AdamMartin
                                  Are you sure you are gathering incremental global stats?

                                  Are you using AUTO_SAMPLE_SIZE for ESTIMATE_PERCENT? In the 'ALL' section you specify 20 percent, which is fine, but I do not see the estimate_percent for the 'PARTITION' section. Is the default stat pref still AUTO_SAMPLE_SIZE?

                                  It is fine to specify the partition name or to leave it off. That should not be the issue.

                                  How often do you run the 'ALL' option? The reason I ask is because the way I understand incremental global stats is that the first time it runs, it must gather stats for every historical partition. After that the incremental stats gathering really becomes incremental (usually just gathering stats on a single new partition).

                                  You did not address running stats in parallel. Whether the incremental stats issue is resolved or not, if you have the available hardware resources, you may as well use them.
                                  • 14. Re: Gather Stats taking time
                                    Ishan
                                    Adam Martin wrote:
                                    Are you sure you are gathering incremental global stats?
                                    Yes
                                    >
                                    Are you using AUTO_SAMPLE_SIZE for ESTIMATE_PERCENT? In the 'ALL' section you specify 20 percent, which is fine, but I do not see the estimate_percent for the 'PARTITION' section. Is the default stat pref still AUTO_SAMPLE_SIZE?
                                    Yes, the AUTO_SAMPLE_SIZE is set for ESTIMATE_PERCENT as the default value in the 'PARTITION' section.
                                    >
                                    It is fine to specify the partition name or to leave it off. That should not be the issue.

                                    How often do you run the 'ALL' option? The reason I ask is because the way I understand incremental global stats is that the first time it runs, it must gather stats for every historical partition. After that the incremental stats gathering really becomes incremental (usually just gathering stats on a single new partition).
                                    ALL option runs every Sunday. But that's not the issue. The partition section itself is taking lot of time to complete and as a result we have to keep it on hold as it doesn't seem to complete. At present, I am not worried about the 'ALL' section. It's the PARTITION section along that is a bottleneck.

                                    Even I think so that incremental option should use historical stats and thats why it was introduced in 11g. And also it should not depend on the amount of data in all the partitions except the new partition that is being loaded. But then I am unable to understand why it's taking eternity to gather stats for partitioned tables.
                                    >
                                    You did not address running stats in parallel. Whether the incremental stats issue is resolved or not, if you have the available hardware resources, you may as well use them.
                                    I have read somewhere that while using INCREMENTAL option, only valid parameters are SCHEMA_NAME and the TABLE_NAME. Rest all are ignored. Can't specify the degree here.

                                    Edited by: Ishan on May 1, 2012 11:35 AM
                                    1 2 Previous Next