8 Replies Latest reply: Nov 6, 2007 4:35 AM by Jonathan Lewis

    Long db file sequential read wait

    Luis Cabral
      Hello

      I have Oracle 10.2.0 running on Solaris.

      Please take a look at the query below. The table customer has around 2 million rows and customer_totals 50 million rows.

      When I tried to run it, it waited on the "db file sequential read" wait event for 30 minutes before starting to do anything!

      Is this normal? Is it possible to know what could be causing this wait?

      I have faced this problem before, in the same database but with other queries.

      On the other hand, other much more complex queries, involving much bigger tables (100+ million rows), run without problems.

      Sorry if this is not much information, but I am not an expert in this area and would simply like a general opinion (if possible) about this situation.

      Thanks
      Luis


      SELECT a1.user_no, a1.user_name
          FROM customer a1
          WHERE a1.language_id = 'es'
            AND a1.created >= to_date('01/01/2006', 'dd/mm/rrrr')
            AND a1.created < to_date('31/05/07', 'dd/mm/rrrr') + 1
            AND EXISTS (SELECT 1
                            FROM customer_totals a3
                            WHERE a3.site = a1.site
                              AND a3.key = a1.key
                              AND a3.ref_date >= to_date('01/01/2006', 'dd/mm/rrrr')
                              AND a3.ref_date <= to_date('31/05/07', 'dd/mm/rrrr')
                              AND a3.value > 0)
            AND NOT EXISTS (SELECT 1
                            FROM customer_totals a2
                            WHERE a2.site = a1.site
                              AND a2.key = a1.key
                              AND a2.ref_date >= to_date('01/06/07', 'dd/mm/rrrr')
                              AND a2.value > 0)
      Explain plan:
      SELECT STATEMENT 
        FILTER 
          NESTED LOOPS ANTI 
            NESTED LOOPS SEMI 
              TABLE ACCESS FULL CUSTOMER TABLE  
              TABLE ACCESS BY INDEX ROWID CUSTOMER_TOTALS
                INDEX RANGE SCAN CUSTOMER_TOTALS_PK INDEX (UNIQUE)
            TABLE ACCESS BY INDEX ROWID CUSTOMER_TOTALS TABLE
              INDEX RANGE SCAN CUSTOMER_TOTALS_I1 INDEX
        • 1. Re: Long db file sequential read wait
          153119
          What one would need is information on the indexes on these tables, and explain plan output using dbms_xplan (demos at http://www.psoug.org, 'Morgans Library').

          General opinions aka Silver Bullets aka Rules of Thumb are dangerous as they so often don't apply in your situation.

          --
          Sybrand Bakker
          Senior Oracle DBA
          • 2. Re: Long db file sequential read wait
            Luis Cabral
            Thanks Sybrand for the reply.

            I will take a look at that website (to be honest I have no idea what dbms_xplan is) and try to get more details about this problem -- if I am able to duplicate it!

            Cheers,
            Luis
            • 3. Re: Long db file sequential read wait
              Jonathan Lewis
              This looks roughly like a query for "customers who have made payments in the past, but have not paid recently".

              What does that mean in terms of the business - is it a few out of a large number, or is it most of your customers?

              Based on the execution plan and your comments about "waiting a long time" for the first output, I'm going to guess that it's the former.

              Possibly what's happening is that Oracle is scanning through thousands of customers who have payments in the distant past (exists) and finding that they also have payments in the recent past (failing the not-exists).

              Driven by the business requirement, it may be more efficient to swap the order of the two subqueries (Oracle executes them top to bottom) so that you first look for customers without recent payments - and then check to see if just those customers have also got payments in the more distant past. This way you run the 'not exists' once for each customer, and the exists for the interesting customers. As it stands you may be running both subqueries for nearly all customers - and if customers have a long history in the TOTALS table, the exists subquery is the expensive one.

              Note - if my assumption about business intent and actual data are wrong, this analysis is completely invalid.
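              A hedged sketch of that swap, using the tables and aliases from the original query. Note that the optimizer may still transform and reorder the subqueries itself, so the resulting plan needs to be checked with dbms_xplan:

              ```sql
              -- Sketch only: same predicates as the original query, but with
              -- the NOT EXISTS subquery written before the EXISTS subquery so
              -- that (when executed as filter subqueries, top to bottom) the
              -- cheap "no recent payments" test runs first for each customer.
              SELECT a1.user_no, a1.user_name
                FROM customer a1
               WHERE a1.language_id = 'es'
                 AND a1.created >= to_date('01/01/2006', 'dd/mm/yyyy')
                 AND a1.created <  to_date('31/05/2007', 'dd/mm/yyyy') + 1
                 AND NOT EXISTS (SELECT 1
                                   FROM customer_totals a2
                                  WHERE a2.site = a1.site
                                    AND a2.key  = a1.key
                                    AND a2.ref_date >= to_date('01/06/2007', 'dd/mm/yyyy')
                                    AND a2.value > 0)
                 AND EXISTS (SELECT 1
                               FROM customer_totals a3
                              WHERE a3.site = a1.site
                                AND a3.key  = a1.key
                                AND a3.ref_date >= to_date('01/01/2006', 'dd/mm/yyyy')
                                AND a3.ref_date <= to_date('31/05/2007', 'dd/mm/yyyy')
                                AND a3.value > 0)
              ```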

              Given that you're using 10.2, I'm slightly surprised that the optimizer hasn't switched into anti-joins and semi-joins for the subqueries - are any of the correlation columns allowed to be null?


              If you're going to show an execution plan, it's best to use dbms_xplan to get a full plan that includes predicate information.


              Regards
              Jonathan Lewis
              http://jonathanlewis.wordpress.com
              http://www.jlcomp.demon.co.uk
              • 4. Re: Long db file sequential read wait
                108476
                Hi Luis,
                 it waited on "db file sequential read" event for 30 min before starting to do anything!
                It's probably doing the subqueries . . .

                In some cases, you can rewrite a correlated subquery with an outer join:

                http://www.dba-oracle.com/oracle_tips_subq_rewrite.htm

                Also, check out subquery un-nesting, that may help:

                http://www.dba-oracle.com/t_subquery_un_nesting.htm

                Hope this helps. . .

                Don Burleson
                Oracle Press author
                Author of “Oracle Tuning: The Definitive Reference”
                • 5. Re: Long db file sequential read wait
                  Jonathan Lewis
                  Ignore my previous comment about Oracle not switching into anti-joins and semi-joins. It has - as the execution plan clearly shows.

                  The critical change you need to test (based on my guess about the business target) is to make the anti-join (not-exists) happen before the semi-join (exists).

                  Another option that might be beneficial is to consider the impact of forcing Oracle out of nested loop joins into hash joins.
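                  One hedged way to test that in 10g is with the hash_sj and hash_aj hints, which are placed inside the EXISTS / NOT EXISTS subqueries to request hash semi-join and anti-join execution. Whether the optimizer honours them in a given case has to be verified with dbms_xplan:

                  ```sql
                  -- Sketch only: the remaining predicates are unchanged from
                  -- the original query and are elided here for brevity.
                  AND EXISTS (SELECT /*+ hash_sj */ 1
                                FROM customer_totals a3
                               WHERE a3.site = a1.site
                                 AND a3.key  = a1.key)     -- plus date and value filters
                  AND NOT EXISTS (SELECT /*+ hash_aj */ 1
                                    FROM customer_totals a2
                                   WHERE a2.site = a1.site
                                     AND a2.key  = a1.key) -- plus date and value filters
                  ```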

                  If forcing changes into the optimizer's choice of execution plan can't make the query run sufficiently quickly then you may have to consider restructuring the query completely - but you do have to be a little careful when doing manual rewrites as it is quite easy to introduce logical errors.

                  For example, the following may be a good starting point:
                  select  a.site, a.key
                  from    customer_totals a
                  where   a.ref_date >= to_date('01/01/2006', 'dd/mm/rrrr')
                  and     a.ref_date <= to_date('31/05/2007', 'dd/mm/rrrr')
                  and     a.value > 0
                  minus
                  select  a.site, a.key
                  from    customer_totals a
                  where   a.ref_date >= to_date('01/06/2007', 'dd/mm/rrrr')
                  and     a.value > 0
                  It gives you the (site, key) values for the customers you want to identify. Possibly putting this into an inline view and joining (maybe with a hash join) to the customers table will work well for your data.
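                  A hedged sketch of that inline-view approach (the use_hash hint and four-digit date formats are illustrative choices, not tested against your data):

                  ```sql
                  -- Sketch: the MINUS result as an inline view, hash-joined
                  -- back to customer to pick up user_no and user_name.
                  SELECT /*+ use_hash(c) */ c.user_no, c.user_name
                    FROM (
                          SELECT a.site, a.key
                            FROM customer_totals a
                           WHERE a.ref_date >= to_date('01/01/2006', 'dd/mm/yyyy')
                             AND a.ref_date <= to_date('31/05/2007', 'dd/mm/yyyy')
                             AND a.value > 0
                          MINUS
                          SELECT a.site, a.key
                            FROM customer_totals a
                           WHERE a.ref_date >= to_date('01/06/2007', 'dd/mm/yyyy')
                             AND a.value > 0
                         ) v,
                         customer c
                   WHERE c.site = v.site
                     AND c.key  = v.key
                     AND c.language_id = 'es'
                     AND c.created >= to_date('01/01/2006', 'dd/mm/yyyy')
                     AND c.created <  to_date('31/05/2007', 'dd/mm/yyyy') + 1
                  ```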

                  Just as an additional thought: dbms_xplan is easy to use - in many cases you can just do:

                  explain plan for {your statement};
                  select * from table(dbms_xplan.display);

                  When you do it, check what the FILTER is doing in the second line of the plan - it may be that your use of the 'rrrr' format for the year (especially in the case of the two-digit years) is making Oracle do more work than it really needs to.


                  Regards
                  Jonathan Lewis
                  http://jonathanlewis.wordpress.com
                  http://www.jlcomp.demon.co.uk
                  • 6. Re: Long db file sequential read wait
                    Luis Cabral
                    Hi all

                    Thanks for all replies. See below the dbms_xplan output:
                    -------------------------------------------------------------------------------------------------
                    | Id  | Operation                      | Name                      | Rows  | Bytes | Cost (%CPU)|
                    -------------------------------------------------------------------------------------------------
                    |   0 | SELECT STATEMENT               |                           |     1 |    99 |   816K  (1)|
                    |   1 |  FILTER                        |                           |       |       |            |
                    |   2 |   NESTED LOOPS ANTI            |                           |     1 |    99 |   816K  (1)|
                    |   3 |    NESTED LOOPS SEMI           |                           |     1 |    79 |   816K  (1)|
                    |   4 |     TABLE ACCESS BY INDEX ROWID| CUSTOMER                  | 28517 |  1643K|   748K  (1)|
                    |   5 |      INDEX RANGE SCAN          | IDX_CUSTOMER_CREATED      |   748K|       |  2549   (1)|
                    |   6 |     TABLE ACCESS BY INDEX ROWID| CUSTOMER_TOTALS           |     5 |   100 |     5   (0)|
                    |   7 |      INDEX RANGE SCAN          | CUSTOMER_TOTALS_PK        |     3 |       |     2   (0)|
                    |   8 |    TABLE ACCESS BY INDEX ROWID | CUSTOMER_TOTALS           |     2 |    40 |     3   (0)|
                    |   9 |     INDEX RANGE SCAN           | CUSTOMER_TOTALS_I1        |     9 |       |     2   (0)|
                    -------------------------------------------------------------------------------------------------
                     
                    Note
                    -----
                       - 'PLAN_TABLE' is old version
                    Jonathan, you are right. In this query I want to know the "lapsed customers" for a specific market. And, in fact, we didn't know if this was a large or small number, that is one of the reasons they asked me to run the query!

                    Anyway, this morning I ran exactly the same query... and got results immediately (no db file sequential read wait)! So it looks like, on Friday, Oracle was waiting for something and not actually running the query (I guess).

                    Though I got the first results almost immediately, the query was tremendously slow to retrieve all the rows, so I changed it to the following:
                    SELECT c.user_no, 
                           c.user_name, 
                           sum(case when t.ref_date >= to_date('01/01/2006', 'dd/mm/rrrr')
                                     AND t.ref_date <= to_date('31/05/07', 'dd/mm/rrrr') then 1 else 0 end) before,
                           sum(case when t.ref_date > to_date('01/06/2007', 'dd/mm/rrrr') then 1 else 0 end) after
                        FROM customer c,
                             customer_totals t
                        WHERE c.language_id = 'es'
                          AND c.created >= to_date('01/01/2006', 'dd/mm/rrrr')
                          AND c.created < to_date('31/05/07', 'dd/mm/rrrr') + 1
                          AND t.site = c.site
                          AND t.key = c.key
                    GROUP BY c.user_no, c.user_name
                    
                    -----------------------------------------------------------------------------------------------
                    | Id  | Operation            | Name                      | Rows  | Bytes |TempSpc| Cost (%CPU)|
                    -----------------------------------------------------------------------------------------------
                    |   0 | SELECT STATEMENT     |                           |   400K|    31M|       | 74309   (1)|
                    |   1 |  HASH GROUP BY       |                           |   400K|    31M|    71M| 74309   (1)|
                    |   2 |   FILTER             |                           |       |       |       |            |
                    |   3 |    NESTED LOOPS      |                           |   400K|    31M|       | 66619   (1)|
                    |   4 |     TABLE ACCESS FULL| CUSTOMER                  | 28517 |  1810K|       |  9512   (3)|
                    |   5 |     INDEX RANGE SCAN | CUSTOMER_TOTALS_PK        |    14 |   238 |       |     2   (0)|
                    -----------------------------------------------------------------------------------------------
                    Now it runs as fast as I expect a query to run, but it does not explain the 30-min wait I got on Friday... And it seems I can't reproduce the problem anymore.

                    So my question is: if I face the same situation (a query waiting forever on a wait event), where can I look for more details about what is going on?

                    Thanks,
                    Luis
                    • 7. Re: Long db file sequential read wait
                      546494
                      Hi,
                      Try looking in the Oracle alert.log file for last Friday.
                      Were any trace files generated around the time of the query run last Friday?

                      regards

                      Alan
                      • 8. Re: Long db file sequential read wait
                        Jonathan Lewis
                        Notice that the original (slow) plan showed a "table access full" on customer, and the newer (faster initial response) plan showed "table access by index rowid" based on an index named idx_customer_created.

                        The difference in (initial) performance may simply be there because the two plans accessed the customer rows in a different order - and the indexed access just happened to get a few hits early on, while the tablescan had to work through a lot of data before getting a hit.

                        Smart move to rewrite the query with the CASE expressions - but what you've got is currently not logically equivalent to the original. Most significantly, you've dropped the predicate 'value > 0' - which is why you've got a primary key index access as the only access into customer_totals, and don't do a table access on that table. This would have a big impact on the total number of physical reads, not to mention the effectiveness of the buffer cache, which would probably go a long way to explaining the improved performance.

                        One detail with dbms_xplan. You've got a warning about the plan_table being an old version. Since you're on 10.2, you can probably just issue 'drop table plan_table' to get access to the 10g public synonym. The old table may be something constantly re-created by (e.g.) an older version of TOAD, or it may be left over from an 8i upgrade. Just check that it's not really needed before you drop it, but you really need to use the latest plan_table structure to get the best information.

                        You still need to correct this: to_date('31/05/07', 'dd/mm/rrrr') should use a four-digit year (better still, use 'yyyy' rather than 'rrrr' as well, if possible).
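                        The corrected predicates would look like this:

                        ```sql
                        AND a1.created >= to_date('01/01/2006', 'dd/mm/yyyy')
                        AND a1.created <  to_date('31/05/2007', 'dd/mm/yyyy') + 1
                        ```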

                        Queries are rarely "stuck". The simplest quick check in 10.2 is to query v$session_wait_history for the session. This gives you a list of the last 10 waits. This would probably have shown Oracle constantly moving from block to block as your original query ran - and given you some idea of how slow the block reads were, and where the work was being done.
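                        A hedged sketch of such a check (the sid value 123 is a placeholder for the session you want to examine, and you need access to the v$ views):

                        ```sql
                        -- Last 10 waits for one session. For "db file sequential
                        -- read", p1 is the file number and p2 the block number, so
                        -- you can see Oracle moving from block to block.
                        SELECT seq#, event, p1, p2, p3, wait_time
                          FROM v$session_wait_history
                         WHERE sid = 123
                         ORDER BY seq#;
                        ```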

                        If you want to recreate the slow startup of last Friday, try running the query with a /*+ full(customer) */ hint - but replace "customer" in the hint with the alias that you've given the customer table in the query.
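                        Using the alias from the original query (a1), that hint would look like this:

                        ```sql
                        -- a1 is the alias given to customer in the original query
                        SELECT /*+ full(a1) */ a1.user_no, a1.user_name
                          FROM customer a1
                         WHERE a1.language_id = 'es'
                        -- (remaining predicates and subqueries as before)
                        ```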

                        Regards
                        Jonathan Lewis
                        http://jonathanlewis.wordpress.com
                        http://www.jlcomp.demon.co.uk