This discussion is archived
1 2 Previous Next 26 Replies Latest reply: Jan 3, 2013 1:37 PM by 675093 RSS

CONTEXT index and search getting slower

675093 Newbie
Currently Being Moderated
Hi,

I'm having a situation that I don't know what to do anymore ...

What we have:
- We have Oracle 11.2.0.0.0 and context index on one table with 5 columns..
- We have a .NET application that is executing SELECT queries for 550 customers with bind variables for 6000 times a day...
- Some queries have more then 550 bound variables but there is maybe 1-2 like that big .... otherwise bellow 100
- we have set a JOB for index optimization for FULL mode, gather stats like this :
exec ctx_ddl.optimize_index('ORATEXT_ART_IDX','FULL');
exec DBMS_stats.gather_index_stats(ownname=>TEST,indname=>'ORATEXT_ART_IDX');
exec DBMS_stats.gather_table_stats('TEST','S_ARTICLE',cascade=>TRUE);
This is the code for index:
BEGIN
CTX_DDL.CREATE_PREFERENCE('S_ARTICLE_LEX','BASIC_LEXER');
CTX_DDL.SET_ATTRIBUTE('S_ARTICLE_LEX','SKIPJOINS','+&-');
CTX_DDL.CREATE_PREFERENCE('DATASTORE_S_ARTICLE','MULTI_COLUMN_DATASTORE');
CTX_DDL.SET_ATTRIBUTE(' DATASTORE_S_ARTICLE ','columns','TITLE,SUBTITLE,AUTHOR,SUMMARY,FULLTEXT'); 
END;

CREATE INDEX ORATEXT_ART_IDX ON S_ARTICLE(ORATEXT) INDEXTYPE IS CTXSYS.CONTEXT  
FILTER BY ID 
PARAMETERS (' LEXER S_ARTICLE_LEX STOPLIST CTXSYS.EMPTY_STOPLIST sync(ON COMMIT) DATASTORE DATASTORE_S_ARTICLE');
Scenario is like this :
When ever a new news article comes we :
- insert this ONE article into DB and save it so that the index syncs
- run 550 queries to find out if any of the queries has a match...
- query is of a form : SELECT * FROM ARTICLES WHERE ID=xxxx AND TYPE=xx AND (MEDIA=yy OR MEDIA=yy) AND (CONTAINS(a) OR CONTAINS(b) AND NOT CONTAINS(c) .........) etc ....
- so different types are used but always on this current article ...
- because query always starts with exactly specific ID (the last inserted one) we set FILTER BY ID when creating CONTEXT index ... so ID is always part of a query, but other TYPE,MEDIA,CATEGORY,WORDS are user defined and in user defined way...
-every article has from 10 - 3000 or more words .... like usual newspaper articles ...

So query is always created dynamically upon user criterias ...

Where is our problem ..... When we started the times were at 9-12sec for this search .... after 1 day we are at 30-40sec.....
If we do an index optimization every 3-5 hours then we are at 20-23sec ... but this is not a solution for this ... because times are getting bigger

Our computer is a new I7.. 8core 16GBRam Windows server 2008 ....

The CPU is running all the time between 97-100% when doing search ... and we saw that we have a lot of some sort of WAITs ...
we have increased SGA to 6Gbram... but this didn't help, increased cursors, we have swiched on/off Auto memory management ...

What ever we have done, nothing has helped ...

SQLArea shows that the most running sql is the oracle generated query for Dynamic sampling of the context index ...

Another question would be ... is there any better way to do the query search instead of using Context index ...

would normal search like INSTR be a better way to do the search ?

Thank you.
Kris
  • 1. Re: CONTEXT index and search getting slower
    Barbara Boehmer Oracle ACE
    Currently Being Moderated
    Theoretically, your query should be faster if you combine everything into one contains clause:

    select * from articles
    where contains (oratext, ((a or b) not c) and sdata (id = ''xxxx'')) > 0;

    Edited by: Barbara Boehmer on Jan 2, 2013 6:35 PM
  • 2. Re: CONTEXT index and search getting slower
    675093 Newbie
    Currently Being Moderated
    Hi,

    Yes I've read that, but this would be really hard to get , cause user can put in all sort of variations of criteria which are then mixed .... some are true, some should be false and not only keywords ...

    Looking at your query .... " sdata (id = ''xxxx''));" is this the right way how primary key should be written ? cause I just write "where id=123 AND ....."

    Thank you.
    Kris
  • 3. Re: CONTEXT index and search getting slower
    Barbara Boehmer Oracle ACE
    Currently Being Moderated
    I was just trying to add parentheses to group things logically, but your original query does not have them, so the following would be similar. If you want to reduce query time, it is important to get everything in one contains clause. When you use "filter by" during index creation, it automatically creates the sdata, so using sdata in the manner that I have in the query below causes it to use only one index hit for everything. You can include your other criteria after that.

    select * from articles
    where contains (oratext, (a or b and not c) and sdata (id = ''xxxx'')) > 0
    and ...;

    Edited by: Barbara Boehmer on Jan 2, 2013 6:34 PM
  • 4. Re: CONTEXT index and search getting slower
    Barbara Boehmer Oracle ACE
    Currently Being Moderated
    Please see the following example and compare the execution plans. The first query is similar to your original. The second query uses only one index hit. Hopefully, the example with legitimate syntax is clearer than the previous pseudocode.
    SCOTT@orcl_11gR2> create table s_article
      2    (id      varchar2(4),
      3       type      varchar2(4),
      4       media      varchar2(4),
      5       title      varchar2(5),
      6       subtitle varchar2(8),
      7       author      varchar2(6),
      8       summary  varchar2(7),
      9       fulltext varchar2(8),
     10       oratext  varchar2(1))
     11  /
    
    Table created.
    
    SCOTT@orcl_11gR2> insert into s_article values
      2    ('xxxx', 'xx', 'yy', 'a', 'b', 'd', 'e', 'f', null)
      3  /
    
    1 row created.
    
    SCOTT@orcl_11gR2> insert into s_article (id, type, media)
      2  select substr (table_name, 1, 4),
      3           substr (tablespace_name, 1, 4),
      4           substr (cluster_name, 1, 4)
      5  from   all_tables
      6  /
    
    3312 rows created.
    
    SCOTT@orcl_11gR2> BEGIN
      2    CTX_DDL.CREATE_PREFERENCE('S_ARTICLE_LEX','BASIC_LEXER');
      3    CTX_DDL.SET_ATTRIBUTE('S_ARTICLE_LEX','SKIPJOINS','+&-');
      4    CTX_DDL.CREATE_PREFERENCE('DATASTORE_S_ARTICLE','MULTI_COLUMN_DATASTORE');
      5    CTX_DDL.SET_ATTRIBUTE
      6        (' DATASTORE_S_ARTICLE ',
      7         'columns',
      8         'TITLE,SUBTITLE,AUTHOR,SUMMARY,FULLTEXT');
      9  END;
     10  /
    
    PL/SQL procedure successfully completed.
    
    SCOTT@orcl_11gR2> CREATE INDEX ORATEXT_ART_IDX
      2  ON S_ARTICLE(ORATEXT)
      3  INDEXTYPE IS CTXSYS.CONTEXT
      4  FILTER BY ID
      5  PARAMETERS
      6    ('LEXER S_ARTICLE_LEX
      7        STOPLIST CTXSYS.EMPTY_STOPLIST
      8        sync(ON COMMIT)
      9        DATASTORE DATASTORE_S_ARTICLE')
     10  /
    
    Index created.
    
    SCOTT@orcl_11gR2> set autotrace on explain
    SCOTT@orcl_11gR2> SELECT * FROM s_article
      2  WHERE ID='xxxx'
      3  AND TYPE='xx'
      4  AND (MEDIA='yy' OR MEDIA='yy')
      5  AND CONTAINS(oratext, 'a') > 0
      6  OR  CONTAINS(oratext, 'b') > 0
      7  AND NOT CONTAINS(oratext, 'c') > 0
      8  /
    
    ID   TYPE MEDI TITLE SUBTITLE AUTHOR SUMMARY FULLTEXT O
    ---- ---- ---- ----- -------- ------ ------- -------- -
    xxxx xx   yy   a     b        d      e       f
    
    1 row selected.
    
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2946228051
    
    -----------------------------------------------------------------------------------------------------
    | Id  | Operation                         | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                  |                 |     1 |    52 |    12   (0)| 00:00:01 |
    |*  1 |  TABLE ACCESS BY INDEX ROWID      | S_ARTICLE       |     1 |    52 |    12   (0)| 00:00:01 |
    |   2 |   BITMAP CONVERSION TO ROWIDS     |                 |       |       |            |          |
    |   3 |    BITMAP OR                      |                 |       |       |            |          |
    |   4 |     BITMAP CONVERSION FROM ROWIDS |                 |       |       |            |          |
    |   5 |      SORT ORDER BY                |                 |       |       |            |          |
    |*  6 |       DOMAIN INDEX                | ORATEXT_ART_IDX |       |       |     4   (0)| 00:00:01 |
    |   7 |     BITMAP AND                    |                 |       |       |            |          |
    |   8 |      BITMAP CONVERSION FROM ROWIDS|                 |       |       |            |          |
    |   9 |       SORT ORDER BY               |                 |       |       |            |          |
    |* 10 |        DOMAIN INDEX               | ORATEXT_ART_IDX |       |       |     4   (0)| 00:00:01 |
    |  11 |      BITMAP CONVERSION FROM ROWIDS|                 |       |       |            |          |
    |  12 |       SORT ORDER BY               |                 |       |       |            |          |
    |* 13 |        DOMAIN INDEX               | ORATEXT_ART_IDX |       |       |     4   (0)| 00:00:01 |
    -----------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - filter("ID"='xxxx' AND "TYPE"='xx' AND "MEDIA"='yy' AND
                  "CTXSYS"."CONTAINS"("ORATEXT",'a')>0 OR "CTXSYS"."CONTAINS"("ORATEXT",'b')>0 AND
                  "CTXSYS"."CONTAINS"("ORATEXT",'c')<=0)
       6 - access("CTXSYS"."CONTAINS"("ORATEXT",'a')>0)
           filter("ID"='xxxx')
      10 - access("CTXSYS"."CONTAINS"("ORATEXT",'b')>0 AND "CTXSYS"."CONTAINS"("ORATEXT",'c')<=0)
      13 - access("CTXSYS"."CONTAINS"("ORATEXT",'c')<=0)
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
    
    SCOTT@orcl_11gR2> select * from s_article
      2  where  contains
      3             (oratext,
      4              '((a or b) not c) and
      5            sdata (id = ''xxxx'')') > 0
      6  and    type = 'xx'
      7  and    media = 'yy'
      8  /
    
    ID   TYPE MEDI TITLE SUBTITLE AUTHOR SUMMARY FULLTEXT O
    ---- ---- ---- ----- -------- ------ ------- -------- -
    xxxx xx   yy   a     b        d      e       f
    
    1 row selected.
    
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1507989626
    
    -----------------------------------------------------------------------------------------------
    | Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |                 |     1 |    52 |     4   (0)| 00:00:01 |
    |*  1 |  TABLE ACCESS BY INDEX ROWID| S_ARTICLE       |     1 |    52 |     4   (0)| 00:00:01 |
    |*  2 |   DOMAIN INDEX              | ORATEXT_ART_IDX |       |       |     4   (0)| 00:00:01 |
    -----------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - filter("TYPE"='xx' AND "MEDIA"='yy')
       2 - access("CTXSYS"."CONTAINS"("ORATEXT",'((a or b) not c) and            sdata (id
                  = ''xxxx'')')>0)
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
    
    SCOTT@orcl_11gR2>
  • 5. Re: CONTEXT index and search getting slower
    675093 Newbie
    Currently Being Moderated
    Hi Barbara,

    Thank you very much for your answers ....

    Let me just ask you one more thing ..... Do you think there is any "better" solution to my problem .... what I mean is, would it be

    more efficient if I would do the search on a single article using some other stuff then Context index ? Like Regex, Instr ..... inside or outside Oracle ? ...

    I don't know, maybe someone would use some other technology to do the search because I always really need to do the search on one single article .... just maybe the context index is not the right choice here ? or Is it ?

    If context index is taking XX amount of time ,now that we have 3000 articles .... will this change when we have 1.000.000 articles ? And I'm doing the same search on one single article ...


    Cause I'm really scared what will happen in the future if there is 1.000.000 articles in DB .... today we had to stop the system, cause we had times over 100sec :((...

    Thank you very much for all your help.

    Kris
  • 6. Re: CONTEXT index and search getting slower
    Roger Ford Expert
    Currently Being Moderated
    Have you looked at using a CTXRULE index?

    It seems to me you're trying to run many queries against one document at a time - which is exactly the scenario that CTXRULE was designed for.

    The indextype is covered in the Technical Overview here:
    http://www.oracle.com/technetwork/database/enterprise-edition/default-1590763.html#rule
  • 7. Re: CONTEXT index and search getting slower
    675093 Newbie
    Currently Being Moderated
    Yes exactly ....I'll check this out, right now..

    Thank you.
    Kris
  • 8. Re: CONTEXT index and search getting slower
    675093 Newbie
    Currently Being Moderated
    Hi,

    I have created and tried CTXRULE in comparison to CONTEXT index .... here are the results ...

    For CTXRULE:
    select * from s_article where id=4572 
    and fk_media_code='sa' and fk_type=1 
    and contains(oratext,'(dobro or jutro)  not (otroke and imeti2) and near((zakaj,otroke),1)>0 and df and 6df and 5df and 43df and df5 and df45 and df345 and d4f  ')>0
    Plan hash value: 449592537
     
    ------------------------------------------------------------------------------------------------
    | Id  | Operation                   | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |                  |     1 |   558 |    32   (0)| 00:00:01 |
    |*  1 |  TABLE ACCESS BY INDEX ROWID| S_ARTICLE        |     1 |   558 |    32   (0)| 00:00:01 |
    |*  2 |   DOMAIN INDEX              | ORATEXT_ART_IDX2 |     1 |       |     0   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       1 - filter("FK_MEDIA_CODE"='sa' AND TO_NUMBER("FK_TYPE")=1 AND "ID"=4572)
       2 - access("CTXSYS"."MATCHES"("ORATEXT",'(dobro or jutro)  not (otroke and imeti2) 
                  and near((zakaj,otroke),1)>0 and df and 6df and 5df and 43df and df5 and df45 and df345 
                  and d4f  ')>0)
    For CONTEXT index the same query:
    select * from s_article where id=4572 
    and fk_media_code='sa' and fk_type=1 
    and contains(oratext,'(dobro or jutro)  not (otroke and imeti2) and near((zakaj,otroke),1)>0 and df and 6df and 5df and 43df and df5 and df45 and df345 and d4f  ')>0
    Plan hash value: 646002122
     
    ---------------------------------------------------------------------------------------------
    | Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |               |     1 |   558 |     2   (0)| 00:00:01 |
    |*  1 |  TABLE ACCESS BY INDEX ROWID| S_ARTICLE     |     1 |   558 |     2   (0)| 00:00:01 |
    |*  2 |   INDEX UNIQUE SCAN         | S_ARTICLE_IDX |     1 |       |     1   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       1 - filter("FK_MEDIA_CODE"='sa' AND TO_NUMBER("FK_TYPE")=1 AND 
                  "CTXSYS"."CONTAINS"("ORATEXT",'(dobro or jutro)  not (otroke and imeti2) and 
                  near((zakaj,otroke),1)>0 and df and 6df and 5df and 43df and df5 and df45 and df345 
                  and d4f  ')>0)
       2 - access("ID"=4572)
    So what I would say is that CPU cost is lower when using CONTEXT ... CTXRULE index was created with the same parameters as CONTEXT except FILTERBY

    But according to plan, CTXRULE does the filtering in step 1 and then the search for words where CONTEXT in first searching the whole index and then filtering in step 2 .... or am I wrong ?

    So according to this my logic would say that CTXRULE is better ?? Is it ?? :)

    Thank you,
    Kris
  • 9. Re: CONTEXT index and search getting slower
    Roger Ford Expert
    Currently Being Moderated
    That doesn't look right. The CTXRULE query should be using the MATCHES operator, not CONTAINS. And the MATCHES clause should be quite different from the CONTAINS clause.

    With a CTXRULE index you should be indexing the set of queries, then running the incoming document against those queries, so the overall query should like quite different.
  • 10. Re: CONTEXT index and search getting slower
    675093 Newbie
    Currently Being Moderated
    sorry my mistake ... yes i have used MATCHES instead of CONTAINS ... can be seen from explain plan ...

    Edited by: mackrispi on Jan 3, 2013 1:10 PM
  • 11. Re: CONTEXT index and search getting slower
    675093 Newbie
    Currently Being Moderated
    sorry i don't understand this :
    Roger Ford wrote:
    With a CTXRULE index you should be indexing the set of queries, then running the incoming document against those queries, so the overall query should like quite different.
    What do you mean indexing queries ? Select statements ? Could you hive me a short example on my demo data from above please ...

    Thank you.
    Kris
  • 12. Re: CONTEXT index and search getting slower
    675093 Newbie
    Currently Being Moderated
    I have checked Oracle docs ... for an example and I think I know what to change now ...

    Thank you.
    Kris
  • 13. Re: CONTEXT index and search getting slower
    Roger Ford Expert
    Currently Being Moderated
    Here's an example:
    set echo on
    
    drop table searches
    /
    create table searches( search_terms varchar2(2000), search_area varchar2(30), owner_name varchar2(30) )
    /
    
    insert into searches values( 'barack obama', 'US Politics', 'John' )
    /
    insert into searches values( 'washington', 'US Politics', 'John' )
    /
    insert into searches values( 'iraq or iran', 'Middle East', 'Peter' )
    /
    insert into searches values( 'finance or financial' , 'Economics', 'Mike' )
    /
    insert into searches values( 'NEAR( (financial, US) )', 'US Economics', 'Mike' )
    /
    
    create index search_index on searches( search_terms ) indextype is ctxsys.ctxrule
    /
    
    select search_area, owner_name 
    from searches 
    where matches(search_terms, 'Barack Obama yesterday announced that he flying to Iraq to discuss the financial status of US interests' ) > 0
    /
    
    select search_area, owner_name 
    from searches 
    where matches(search_terms, 'Yesterday in Washington nothing of interest happened.' ) > 0
    /
    The output from the queries looks like:
    SQL> select search_area, owner_name
      2  from searches
      3  where matches(search_terms, 'Barack Obama yesterday announced that he flying to Iraq to discuss the financial status of US interests' ) > 0
      4  /
    
    SEARCH_AREA                 OWNER_NAME
    ------------------------------ ------------------------------
    US Politics                 John
    Middle East                 Peter
    Economics                 Mike
    US Economics                 Mike
    
    SQL> 
    SQL> select search_area, owner_name
      2  from searches
      3  where matches(search_terms, 'Yesterday in Washington nothing of interest happened.' ) > 0
      4  /
    
    SEARCH_AREA                 OWNER_NAME
    ------------------------------ ------------------------------
    US Politics                 John
  • 14. Re: CONTEXT index and search getting slower
    675093 Newbie
    Currently Being Moderated
    Thank you A LOT .... you all saved my life :)

    Really thank you for all the help to you all.
    Kris
1 2 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points