This discussion is archived
1 2 Previous Next 20 Replies Latest reply: Nov 20, 2012 1:16 PM by Roger Ford RSS

Tune this "contains" query in Oracle 11g EE 11.2.0.3.0

932383 Newbie
Currently Being Moderated
I have a query like below. It runs for ever. The size of this table is not very big, it has around 30 millions rows. The owner column is very selective, and I've added "owner' column in the "filter by" clause of the domain index against description. Do you have any recommendation how I can tune this query? Any help would be greatly appreciated.

select count(*) from items s where contains(s.description, '%111%' ) > 0 and s.owner = 1234;

Thanks.
  • 1. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    Barbara Boehmer Oracle ACE
    Currently Being Moderated
    You need to use SDATA to access the owner column through the index. You may also get better performance on your wildcard searches by using a wordlist with attributes as shown below. Please see the demonstration below.
    SCOTT@orcl_11gR2> create table items as
      2  select object_id as owner, object_name as description
      3  from   all_objects
      4  /
    
    Table created.
    
    SCOTT@orcl_11gR2> insert into items values (1234, 'A111B')
      2  /
    
    1 row created.
    
    SCOTT@orcl_11gR2> select count(*) from items
      2  /
    
      COUNT(*)
    ----------
         75048
    
    1 row selected.
    
    SCOTT@orcl_11gR2> begin
      2    ctx_ddl.create_preference('mywordlist', 'BASIC_WORDLIST');
      3    ctx_ddl.set_attribute('mywordlist','PREFIX_INDEX','TRUE');
      4    ctx_ddl.set_attribute('mywordlist','PREFIX_MIN_LENGTH', 1);
      5    ctx_ddl.set_attribute('mywordlist','SUBSTRING_INDEX', 'YES');
      6    ctx_ddl.set_attribute('mywordlist', 'wildcard_maxterms', 50000) ;
      7  end;
      8  /
    
    PL/SQL procedure successfully completed.
    
    SCOTT@orcl_11gR2> create index items_idx
      2  on items (description)
      3  indextype is ctxsys.context
      4  filter by owner
      5  parameters ('wordlist  mywordlist')
      6  /
    
    Index created.
    
    SCOTT@orcl_11gR2> set autotrace on explain
    SCOTT@orcl_11gR2> select count(*) from items
      2  where  contains (description, '%111% and (sdata (owner = 1234))') > 0
      3  /
    
      COUNT(*)
    ----------
             1
    
    1 row selected.
    
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1238254566
    
    ------------------------------------------------------------------------------
    | Id  | Operation        | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT |           |     1 |    29 |    12   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE  |           |     1 |    29 |            |          |
    |*  2 |   DOMAIN INDEX   | ITEMS_IDX |    39 |  1131 |     4   (0)| 00:00:01 |
    ------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("CTXSYS"."CONTAINS"("DESCRIPTION",'%111% and (sdata
                  (owner = 1234))')>0)
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
    
    SCOTT@orcl_11gR2>
  • 2. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    932383 Newbie
    Currently Being Moderated
    Barbara,

    Thanks. I followed your suggestion of using custom wordlist and recreated the index. There was an improvement, however the query is still very slow, it took about 6 minutes. It took almost the same amount of time even when there is no owner matching the condition.

    A couple of other interesting things I observed are:
    1. The variation (begin/end with) of this query runs pretty fast.
    select count(1) from items where contains(description, '111% and (sdata (owner = 1234))' ) > 0 and owner = 1234 ; -- less than 1 second.
    select count(1) from items where contains(description, '%111 and (sdata (owner = 1234))' ) > 0 and owner = 1234 ; -- about 7 seconds.
    2. The same query worked on on Oracle 10g 10.2.0.3.

    It seems like it is a new problem with 11.2.0.3.0. Are you aware of any such issue? Do you have any other suggestions?

    Here was the plan for the slow query which took about 6 minutes.
    SQL> set autotrace on explain;
    SQL> select count(1) from items where contains(description, '%111% and (sdata (owner = 1234))' ) > 0 and owner = 1234 ;

    COUNT(1)
    ----------
         0


    Execution Plan
    ----------------------------------------------------------

    -------------------------------------------------------------------------
    | Id | Operation     | Name      | Rows | Bytes | Cost (%CPU)|
    -------------------------------------------------------------------------
    | 0 | SELECT STATEMENT |          |     1 |     32 | 22846 (0)|
    | 1 | SORT AGGREGATE |          |     1 |     32 |          |
    | 2 | DOMAIN INDEX     | ITEM_TEXT_INDEX | 3043 |     | 22846 (0)|
    -------------------------------------------------------------------------

    Note
    -----
    - 'PLAN_TABLE' is old version

    Edited by: 929380 on Nov 14, 2012 8:58 PM

    Edited by: 929380 on Nov 14, 2012 9:05 PM

    Edited by: 929380 on Nov 14, 2012 9:06 PM
  • 3. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    Barbara Boehmer Oracle ACE
    Currently Being Moderated
    Get rid of the extra "and owner = 1234" at the end of the query. That just causes an extra filter. The owner should be accessed through sdata alone. Also, make sure you are using bind variables, as below. Queries with leading wildcards are always going to be slow and should be avoided if possible. Queries with both leading and trailing wildcards will be even slower. I am not aware of any problems with 11.2.0.3, but I am using 11.2.0.1. I see that your plan_table is outdated. You should run <your_oracle_home>\rdbms\admin\utlxplan.sql to get the new version that will allow you to see what is being accessed through an index and what is being filtered without an index as in the bottom portion of my previous post.

    variable g_desc varchar2(100)
    exec :g_desc := '%111%'
    variable g_own number
    exec :g_own := 1234
    select count(*) from items
    where contains (description, :g_desc || ' and (sdata (owner = ' || :g_own || '))') > 0
    /
  • 4. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    Roger Ford Expert
    Currently Being Moderated
    In addition to Barbara's comments:

    How many rows are returned by your count(*)?
    Have you used either the SUBSTRING_INDEX or PREFIX_INDEX options in your index?

    Can you try doing this instead:
    select * from 
    ( select /*+ FIRST_ROWS(10) */ from items where contains(description, '%111% and (sdata (owner = 1234))' ) > 0 )
    where rownum <= 10
    
    select ctx_query.count_hits( 'your-index-name', '%111%') from dual;
    And what happens if you combine your two "partial" queries like this:
    select count(1) from items where contains(description, '(%111 or 111%) and (sdata (owner = 1234)) > 0
    EDIT: Forget the second question - of course this isn't the same as %111% at all - it misses out, for example, A111B. So it's not surprising that your partial queries run a lot faster.

    Edited by: Roger Ford on Nov 15, 2012 1:36 AM
  • 5. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    932383 Newbie
    Currently Being Moderated
    Thank you for your prompt response.

    Prefix index and substring index are enabled. The count from the query is zero. Here is the plan. Could you help take a look to see if something goes wrong?

    P.S. Using binding variable didn't make any difference.
    SQL> select count(0) from items where contains(description, '%111% and (sdata (owner = 1234))' ) > 0 ;
    
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1172958428
    
    --------------------------------------------------------------------------------
    ----
    
    | Id  | Operation      | Name         | Rows  | Bytes | Cost (%CPU)| Time
       |
    
    --------------------------------------------------------------------------------
    ----
    
    |   0 | SELECT STATEMENT |             |      1 |     26 | 22846   (0)| 00:09:
    11 |
    
    |   1 |  SORT AGGREGATE  |             |      1 |     26 |          |
       |
    
    |*  2 |   DOMAIN INDEX      | ITEM_TEXT_INDEX |      1 |     26 | 22846   (0)| 00:09:
    11 |
    
    --------------------------------------------------------------------------------
    ----
    
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("CTXSYS"."CONTAINS"("DESCRIPTION",'%111% and (sdata (owner =
               1234))')>0)
    
    
    Statistics
    ----------------------------------------------------------
          83084  recursive calls
           0  db block gets
         239445  consistent gets
           0  physical reads
           2388  redo size
         525  bytes sent via SQL*Net to client
         524  bytes received via SQL*Net from client
           2  SQL*Net roundtrips to/from client
           0  sorts (memory)
           0  sorts (disk)
           1  rows processed
    Edited by: 929380 on Nov 15, 2012 8:58 AM
  • 6. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    Roger Ford Expert
    Currently Being Moderated
    On the face of things, that looks fine. It's doing all the work in the text index as it should.

    Can you tell us the output from
    select sum(token_count) from dr$item_text_index$i where token_text like '%111%';
    and also
    select count(distinct(token_text)) from dr$item_text_index$i where token_text like '%111%';
  • 7. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    932383 Newbie
    Currently Being Moderated
    Here it is.
    SQL> select sum(token_count) from dr$item_text_index$i where token_text like '%111%';
    
    SUM(TOKEN_COUNT)
    ----------------
           555251
    
    SQL> select count(distinct(token_text)) from dr$item_text_index$i where token_text like '%111%';
    
    COUNT(DISTINCT(TOKEN_TEXT))
    ---------------------------
                    72630
    
    SQL> 
  • 8. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    Barbara Boehmer Oracle ACE
    Currently Being Moderated
    You might check to see if your index is fragmented. If so, try either optimizing or rebuilding or dropping and recreating your index.

    You can run ctx_report.index_stats to determine the degree of fragmentation. Please see the section of the documentation using the link below.

    http://docs.oracle.com/cd/E11882_01/text.112/e24436/crptpkg.htm#i996935
  • 9. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    932383 Newbie
    Currently Being Moderated
    I just recreated it. It is still slow with query string as "%111%".

    I then ran another query with "1%", which has more tokens, but it returns immediately. So it doesn't seem that it's related to how many tokens match the query string. Is there any way that I can instruct the optimizer to filter by owner first and then by "%111%"?
    SQL> select count(distinct(token_text)) from dr$item_text_index$i where token_text like '1%';
    
    COUNT(DISTINCT(TOKEN_TEXT))
    ---------------------------
                  1238719
    
    
    SQL> select sum(token_count) from dr$item_text_index$i where token_text like '1%';
    
    SUM(TOKEN_COUNT)
    ----------------
         40612029
    
    SQL> 
    
    SQL> select count(1) from bc_item_spec_details where ( contains(description,  '1%' ) > 0 ) and owner = 1234 ;
    
      COUNT(1)
    ----------
          0
    
    SQL> 
  • 10. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    Barbara Boehmer Oracle ACE
    Currently Being Moderated
    929380 wrote:
    Is there any way that I can instruct the optimizer to filter by owner first and then by "%111%"?
    It may already be doing that. What is the result of the following?

    select count(*) from bc_item_spec_details where owner = 1234;
  • 11. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    Barbara Boehmer Oracle ACE
    Currently Being Moderated
    I just noticed a discrepancy in table names between items and bc_item_spec_details. Is this just because you were using items to provide a simplified example on the forums and the actual table name is bc_item_spec_details or are you confusing two different tables?
  • 12. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    932383 Newbie
    Currently Being Moderated
    It is zero.
    SQL> select count(*) from bc_item_spec_details where owner = 1234;
    
      COUNT(*)
    ----------
          0
  • 13. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    932383 Newbie
    Currently Being Moderated
    Sorry for the confusion. I just use items table to simplified things. Please consider them as same.
  • 14. Re: Tune this "contains" query in Oracle 11g EE 11.2.0.3.0
    Barbara Boehmer Oracle ACE
    Currently Being Moderated
    I don't know if this will make a difference or not, but what happens if you reverse the order like below?

    select count(*) from items
    where contains (description, '(sdata (owner = ' || :g_own || ')) and ' || :g_desc) > 0
    /
1 2 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points