1 2 Previous Next 19 Replies Latest reply on Mar 26, 2016 9:55 AM by Andrew Sayer Go to original post
      • 15. Re: Re: What is the best index for the below update query?
        3131275

        Lewis,

         

        After doing some research I realized that,

         

        The statement where the claim_status is set to Rej_this_3 is quicker because of two reasons the number of rows processed is less than 10% hence the oracle chooses Index scan. Also the number of rows associated with  table used in the not exists is very less.

         

        Where as the scenario is different with the statement where we set claim_status to Rej_this_4 . In this case though the number of rows processed was 60M (Total rows in table 600 M) there was no index in place hence the table went for a Full Table Scan. Additionally the number of rows associated with table used in the not exists was pretty high (145M).

         

        I believe because of this the 4th update was taking pretty long!

         

        Thanks!

        • 16. Re: What is the best index for the below update query?
          Jonathan Lewis

          The 10% figure is irrelevant, and there's no point in telling us that "there is no index" when the sample code you've supplied shows us that the access predicates for the first of the two statements I quoted are identical to the predicates for the second.  It may be significant that originally you gave us an indication that the second statement included an IN list, perhaps the first doesn't. Perhaps what you're really telling us is that the first query has an index for the subquery that allows Oracle to do a nested loop anti-join or filter subquery.

           

          How do you know how many rows are selected by the simple (equality) predicates, and how many rows are subsequently discarded by the subquery effect, and how many are actually updated ?  Seeing 60M in one plan and a smaller number (why not tell us how many) in the other isn't necessarily proof of anything important.

           

          It is reasonably likely that the most significant difference in performance is related to the size of the table in the subquery combined with the mechanism Oracle uses for updates through a tablescan. The underlying tablescan itself is NOT going to be the reason why one update takes 30 minutes and the other takes days.

           

          Please supply the execution plan for the first of the two queries, and give us an indication of how many rows each statement updates.

          (In passing - if this is a system for processing claims I would guess that VERY FEW rows are makred as bad claims because they have been made by someone who is not a "member; and in general I would hope that your data collection was such that at every stage the number of rows chaning from "good claim" to something else would be very small.  What should that tell you about these update statements ?)

           

           

          Regards

          Jonathan Lewis

          • 17. Re: What is the best index for the below update query?
            3131275

            Jonathan Lewis wrote:

             

            The 10% figure is irrelevant, and there's no point in telling us that "there is no index" when the sample code you've supplied shows us that the access predicates for the first of the two statements I quoted are identical to the predicates for the second.  It may be significant that originally you gave us an indication that the second statement included an IN list, perhaps the first doesn't. Perhaps what you're really telling us is that the first query has an index for the subquery that allows Oracle to do a nested loop anti-join or filter subquery.

             

            How do you know how many rows are selected by the simple (equality) predicates, and how many rows are subsequently discarded by the subquery effect, and how many are actually updated ?  Seeing 60M in one plan and a smaller number (why not tell us how many) in the other isn't necessarily proof of anything important.

             

            It is reasonably likely that the most significant difference in performance is related to the size of the table in the subquery combined with the mechanism Oracle uses for updates through a tablescan. The underlying tablescan itself is NOT going to be the reason why one update takes 30 minutes and the other takes days.

             

            Please supply the execution plan for the first of the two queries, and give us an indication of how many rows each statement updates.

            (In passing - if this is a system for processing claims I would guess that VERY FEW rows are makred as bad claims because they have been made by someone who is not a "member; and in general I would hope that your data collection was such that at every stage the number of rows chaning from "good claim" to something else would be very small.  What should that tell you about these update statements ?)

             

             

            Regards

            Jonathan Lewis

             

            Lewis,

             

            Yup the first of the two statements are identical how ever, In the first couple of update statements the select clause inside the exists had more access predicates than the REJ_MEM update statement. 

             

            And for the first of two update statement we had a composite index including the columns in the where clause of the outer query and the columns we used inside the exists clause.

             



            • 18. Re: What is the best index for the below update query?
              3131275

              Guys,

               

              Now that we decided to use a temporary table. So, that all the claims which we receive on a monthly basis are inserted to this table (via SQl*loader) and we would run the process( set of updates on the temporary table). Once we have the process finished we are finally going to insert all the claims to the historic table (Ref_uhc_Cos_phy_clms). Also, we would be truncating the temporary table every month before we start a new load.

               

              So, we expect this process to be faster. Also, to fine tune even more we also decided to create appropriate indexes on the temporary table for all the updates in the process.

               

              But the problem is every time we run each update statement in the process considerable amount of data in the temporary table is updated. So, do we need to collect the statistics on the temporary table in between each update?

               

              Thanks in Advance!

              • 19. Re: What is the best index for the below update query?
                Andrew Sayer

                3131275 wrote:

                 

                Guys,

                 

                Now that we decided to use a temporary table. So, that all the claims which we receive on a monthly basis are inserted to this table (via SQl*loader) and we would run the process( set of updates on the temporary table). Once we have the process finished we are finally going to insert all the claims to the historic table (Ref_uhc_Cos_phy_clms). Also, we would be truncating the temporary table every month before we start a new load.

                 

                So, we expect this process to be faster. Also, to fine tune even more we also decided to create appropriate indexes on the temporary table for all the updates in the process.

                 

                But the problem is every time we run each update statement in the process considerable amount of data in the temporary table is updated. So, do we need to collect the statistics on the temporary table in between each update?

                 

                Thanks in Advance!

                As I've said before. Indexes probably wont help you when you are expecting to visit most of the table. They will probably just make your updates slower.

                 

                For the stats, it completely depends. If your main predicates on your main table don't change (you always filter on a certain status and date) and the expected return doesn't change that much (being vague on purpose) then you may not need to gather stats.
                I suspect that each update will only remove a few rows (if any) from being candidates for insertion so I doubt your stats will need to be regathered

                 

                Do remember what I said about combining updates into less update statements. If each full table scan takes x seconds then doing it 6 times will be 6x seconds, you can probably take off 5x by combining updates . You'd have to use case statements to get the correct updated values but its definitely possible and more efficient.

                Please post your new execution plans with row source statistics so we can actually see what we're to make judgements on.

                1 2 Previous Next