Forum Stats

  • 3,825,924 Users
  • 2,260,580 Discussions
  • 7,896,737 Comments

Discussions

ctxsys.context

User_94EQ8
User_94EQ8 Member Posts: 11 Blue Ribbon
edited Jul 11, 2017 12:36PM in Text

Database Version 12.1.0.2.0

I created a MULTI_COLUMN_DATASTORE ctxsys.context on a very large table, ie:

ctx_ddl.create_preference('multi_store','MULTI_COLUMN_DATASTORE');

ctx_ddl.set_attribute('multi_store',

                      'columns',

                      'c1,c2,c3,c4'

                     );

CREATE INDEX "TEST"."CTX_SRCH_IDX" ON "TEST"."BIGTABLE" ("FIRSTNAME")

   INDEXTYPE IS "CTXSYS"."CONTEXT"  PARAMETERS ('DATASTORE multi_store section group ctxsys.auto_section_group sync (on commit)') ;

When I search, the results are blazing fast, ie:

  select  *

  from

    TEST.BIGTABLE t

  where contains (t.FIRSTNAME, 'kenneth and band') > 0

This query would search 50 million records in 1 second, and return 11 results as expected.  But whenever i add another limiting factor on another column (that has normal btree index, the 'filtered' result takes about 20 seconds, ie:

  select  *

  from

    TEST.BIGTABLE t

  where contains (t.FIRSTNAME, 'kenneth and band') > 0

and t.user = 'JOHNSMITH'

I understand that the btree index on 't.user' is slower, but why wouldn't oracle prioritize the CTXSYS index, then filter THOSE results?  It seems to trying to use the slower index first.  Any help?

Saubhik

Best Answer

  • Barbara Boehmer
    Barbara Boehmer Member Posts: 4,625 Gold Trophy
    edited Jul 10, 2017 5:51PM Answer ✓

    Ideally, the optimizer always chooses the best plan, but sometimes not.  It needs to have current statistics in order to do so.  Even then, it may not choose the best plan.  You may try an index hint for the index that you want it to use and see if that is sufficient to influence it.  When you have text data and structured data, a better solution is to create your index including FILTER BY the structured data, which then allows you to use SDATA to reference the structured data as part of the parameter to the CONTAINS operator, so that it can use just one index hit to get both text and structured data.  Please see the simplified demonstration below.

    [email protected]_12.1.0.2.0> select banner from v$version
      2  /

    BANNER
    --------------------------------------------------------------------------------
    Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
    PL/SQL Release 12.1.0.2.0 - Production
    CORE 12.1.0.2.0 Production
    TNS for 64-bit Windows: Version 12.1.0.2.0 - Production
    NLSRTL Version 12.1.0.2.0 - Production

    5 rows selected.

    [email protected]_12.1.0.2.0> create table bigtable
      2    (firstname  varchar2(15),
      3  c1    varchar2(15),
      4  c2    varchar2(15),
      5  the_user  varchar2(15))
      6  /

    Table created.

    [email protected]_12.1.0.2.0> insert into bigtable values ('Ken', 'Kenneth', 'band', 'JOHNSMITH')
      2  /

    1 row created.

    [email protected]_12.1.0.2.0> insert into bigtable
      2  select substr(owner,1,15), substr(object_name,1,15), substr(object_type,1,15), status
      3  from  all_objects
      4  /

    92363 rows created.

    [email protected]_12.1.0.2.0> create index btree_idx on bigtable (the_user)
      2  /

    Index created.

    [email protected]_12.1.0.2.0> begin
      2    ctx_ddl.create_preference('multi_store','MULTI_COLUMN_DATASTORE');
      3    ctx_ddl.set_attribute('multi_store', 'columns', 'c1,c2');
      4  end;
      5  /

    PL/SQL procedure successfully completed.

    [email protected]_12.1.0.2.0> CREATE INDEX "CTX_SRCH_IDX" ON "BIGTABLE" ("FIRSTNAME") INDEXTYPE IS "CTXSYS"."CONTEXT"
      2    FILTER BY the_user
      3    PARAMETERS
      4    ('DATASTORE  multi_store
      5      section group  ctxsys.auto_section_group
      6      sync    (on commit)')
      7  /

    Index created.

    [email protected]_12.1.0.2.0> exec dbms_stats.gather_table_stats (USER, 'BIGTABLE')

    PL/SQL procedure successfully completed.

    [email protected]_12.1.0.2.0> set autotrace on explain
    [email protected]_12.1.0.2.0> select  *
      2  from    BIGTABLE t
      3  WHERE  CONTAINS
      4          (t.firstname,
      5    'kenneth and band AND SDATA(the_user=''JOHNSMITH'')') > 0
      6  /

    FIRSTNAME      C1              C2              THE_USER
    --------------- --------------- --------------- ---------------
    Ken            Kenneth        band            JOHNSMITH

    1 row selected.


    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 211787951

    --------------------------------------------------------------------------------------------
    | Id  | Operation                  | Name        | Rows  | Bytes | Cost (%CPU)| Time    |
    --------------------------------------------------------------------------------------------
    |  0 | SELECT STATEMENT            |              |    1 |    37 |    1  (0)| 00:00:01 |
    |  1 |  TABLE ACCESS BY INDEX ROWID| BIGTABLE    |    1 |    37 |    1  (0)| 00:00:01 |
    |*  2 |  DOMAIN INDEX              | CTX_SRCH_IDX |      |      |    1  (0)| 00:00:01 |
    --------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

      2 - access("CTXSYS"."CONTAINS"("T"."FIRSTNAME",'kenneth and band AND
                  SDATA(the_user=''JOHNSMITH'')')>0)

    [email protected]_12.1.0.2.0>

    Saubhik

Answers

  • Barbara Boehmer
    Barbara Boehmer Member Posts: 4,625 Gold Trophy
    edited Jul 10, 2017 5:51PM Answer ✓

    Ideally, the optimizer always chooses the best plan, but sometimes not.  It needs to have current statistics in order to do so.  Even then, it may not choose the best plan.  You may try an index hint for the index that you want it to use and see if that is sufficient to influence it.  When you have text data and structured data, a better solution is to create your index including FILTER BY the structured data, which then allows you to use SDATA to reference the structured data as part of the parameter to the CONTAINS operator, so that it can use just one index hit to get both text and structured data.  Please see the simplified demonstration below.

    [email protected]_12.1.0.2.0> select banner from v$version
      2  /

    BANNER
    --------------------------------------------------------------------------------
    Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
    PL/SQL Release 12.1.0.2.0 - Production
    CORE 12.1.0.2.0 Production
    TNS for 64-bit Windows: Version 12.1.0.2.0 - Production
    NLSRTL Version 12.1.0.2.0 - Production

    5 rows selected.

    [email protected]_12.1.0.2.0> create table bigtable
      2    (firstname  varchar2(15),
      3  c1    varchar2(15),
      4  c2    varchar2(15),
      5  the_user  varchar2(15))
      6  /

    Table created.

    [email protected]_12.1.0.2.0> insert into bigtable values ('Ken', 'Kenneth', 'band', 'JOHNSMITH')
      2  /

    1 row created.

    [email protected]_12.1.0.2.0> insert into bigtable
      2  select substr(owner,1,15), substr(object_name,1,15), substr(object_type,1,15), status
      3  from  all_objects
      4  /

    92363 rows created.

    [email protected]_12.1.0.2.0> create index btree_idx on bigtable (the_user)
      2  /

    Index created.

    [email protected]_12.1.0.2.0> begin
      2    ctx_ddl.create_preference('multi_store','MULTI_COLUMN_DATASTORE');
      3    ctx_ddl.set_attribute('multi_store', 'columns', 'c1,c2');
      4  end;
      5  /

    PL/SQL procedure successfully completed.

    [email protected]_12.1.0.2.0> CREATE INDEX "CTX_SRCH_IDX" ON "BIGTABLE" ("FIRSTNAME") INDEXTYPE IS "CTXSYS"."CONTEXT"
      2    FILTER BY the_user
      3    PARAMETERS
      4    ('DATASTORE  multi_store
      5      section group  ctxsys.auto_section_group
      6      sync    (on commit)')
      7  /

    Index created.

    [email protected]_12.1.0.2.0> exec dbms_stats.gather_table_stats (USER, 'BIGTABLE')

    PL/SQL procedure successfully completed.

    [email protected]_12.1.0.2.0> set autotrace on explain
    [email protected]_12.1.0.2.0> select  *
      2  from    BIGTABLE t
      3  WHERE  CONTAINS
      4          (t.firstname,
      5    'kenneth and band AND SDATA(the_user=''JOHNSMITH'')') > 0
      6  /

    FIRSTNAME      C1              C2              THE_USER
    --------------- --------------- --------------- ---------------
    Ken            Kenneth        band            JOHNSMITH

    1 row selected.


    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 211787951

    --------------------------------------------------------------------------------------------
    | Id  | Operation                  | Name        | Rows  | Bytes | Cost (%CPU)| Time    |
    --------------------------------------------------------------------------------------------
    |  0 | SELECT STATEMENT            |              |    1 |    37 |    1  (0)| 00:00:01 |
    |  1 |  TABLE ACCESS BY INDEX ROWID| BIGTABLE    |    1 |    37 |    1  (0)| 00:00:01 |
    |*  2 |  DOMAIN INDEX              | CTX_SRCH_IDX |      |      |    1  (0)| 00:00:01 |
    --------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

      2 - access("CTXSYS"."CONTAINS"("T"."FIRSTNAME",'kenneth and band AND
                  SDATA(the_user=''JOHNSMITH'')')>0)

    [email protected]_12.1.0.2.0>

    Saubhik
  • Bud Light
    Bud Light Member Posts: 70 Blue Ribbon
    edited Jul 10, 2017 6:28PM

    Before you go around changing the context index, pay attention to the one line from Barbara's example:  Make sure your statistics are up to date.  Many performance issues are because of stale stats.

    Sadly, I'm on 12.2 and using Barbara's setup minus the FILTER piece, I cannot reproduce the results.  Everything I've tried uses the context index.

    I'm guessing that you want many more columns other than just one or two like the user column.  I'm not sure how feasible the SDATA solution would be if you had a few columns that needed this capability.  I can guess there is a noticeable size increase in the Text index.

    I do agree that the CBO does get it wrong some of the time.  When it does, a hint will almost always correct the behaviour.  Since I cannot reproduce the issue, I cannot provide a hint that works for me but I would start with the INDEX hint and provide the Text index.

This discussion has been closed.