6 Replies Latest reply: May 28, 2013 4:15 PM by thgoum RSS

    Oracle Text: CTXCAT

    thgoum
      Hi all, I have the following table

      CREATE TABLE SEM_TRAJS OF SEM_TRAJECTORY
      NESTED TABLE EPISODES STORE AS EPISODES_NESTEDTAB ...
      and the following types
      CREATE OR REPLACE
      type SEM_TRAJECTORY as object
      (
      -- Attributes
      sem_trajectory_tag varchar2(50),
      srid integer,
      episodes sem_episode_tab,
      o_id integer,
      semtraj_id integer,

      and CREATE OR REPLACE
      type EPISODE as object
      (
      -- Attributes
      defining_tag varchar2(4),
      episode_tag varchar2(50),
      activity_tag varchar2(50),
      ......
      )
      /

      Now i want to create a pattern matching query with Oracle Text so I can search in defining_tag, episode_tag, activity_tag ... How can I create a CTXCAT index and use it?



      Thanks in advance.
        • 1. Re: Oracle Text: CTXCAT
          Roger Ford-Oracle
          I don't think you'll be able to use CTXCAT for that. You'll probably need a CONTEXT index with a USER_DATASTORE.

          I'm not familiar enough with nested tables and objects to produce sample code to do this, so you'll need to figure out the datastore procedure yourself.

          You'll also need to figure out what column to put the actual index on - probably a dummy column in the SEM_TRAJS table.
          • 2. Re: Oracle Text: CTXCAT
            Barbara Boehmer
            You won't be able to create a Text index (ctxcat or context) on an object table directly, so I would create a materialized view, then create the index on the materialized view. The following is a fairly simple example, using a materialized view, a multi_column_datastore, and a context index. You could use a user_datastore with a procedure instead of the multi_column_datastore, but that would be a bit more complicated. You will need to deal with refreshing your materialized view and synchronizing your index and you could add sections if you want to search within individual attributes of the episodes. This is just a simple starter example. It looks like your problem is an interesting combination of Text, Objects, and Spatial. The data that I used is just for the simple demonstration and is not valid spatial data.
            SCOTT@orcl_11gR2> CREATE OR REPLACE TYPE episode AS OBJECT
              2    (defining_tag  VARCHAR2( 4),
              3       episode_tag   VARCHAR2(50),
              4       activity_tag  VARCHAR2(50));
              5  /
            
            Type created.
            
            SCOTT@orcl_11gR2> CREATE OR REPLACE TYPE sem_episode_tab AS TABLE OF episode
              2  /
            
            Type created.
            
            SCOTT@orcl_11gR2> CREATE OR REPLACE TYPE sem_trajectory AS OBJECT
              2    (sem_trajectory_tag  VARCHAR2(50),
              3       srid              INTEGER,
              4       episodes         sem_episode_tab,
              5       o_id              INTEGER,
              6       semtraj_id         INTEGER)
              7  /
            
            Type created.
            
            SCOTT@orcl_11gR2> SHOW ERRORS
            No errors.
            SCOTT@orcl_11gR2> CREATE TABLE sem_trajs OF sem_trajectory
              2  NESTED TABLE episodes STORE AS episodes_nestedtab
              3  /
            
            Table created.
            
            SCOTT@orcl_11gR2> INSERT INTO sem_trajs
              2  VALUES
              3    ('semtrajtag1',
              4       1,
              5       sem_episode_tab
              6         (episode ('dt1a', 'etag1a', 'atag1a'),
              7          episode ('dt1b', 'etag1b', 'atag1b')),
              8       1,
              9       1)
             10  /
            
            1 row created.
            
            SCOTT@orcl_11gR2> INSERT INTO sem_trajs
              2  VALUES
              3    ('semtrajtag2',
              4       2,
              5       sem_episode_tab
              6         (episode ('dt2a', 'etag2a', 'atag2a'),
              7          episode ('dt2b', 'etag2b', 'atag2b')),
              8       2,
              9       2)
             10  /
            
            1 row created.
            
            SCOTT@orcl_11gR2> CREATE MATERIALIZED VIEW episodes_mv
              2  AS
              3  SELECT st.sem_trajectory_tag, st.srid, st.o_id, semtraj_id,
              4           e.defining_tag, e.episode_tag, e.activity_tag,
              5           ' ' AS episodes
              6  FROM   sem_trajs st,
              7           TABLE (st.episodes) e
              8  /
            
            Materialized view created.
            
            SCOTT@orcl_11gR2> COLUMN sem_trajectory_tag FORMAT A18
            SCOTT@orcl_11gR2> COLUMN defining_tag         FORMAT A12
            SCOTT@orcl_11gR2> COLUMN episode_tag         FORMAT A11
            SCOTT@orcl_11gR2> COLUMN activity_tag         FORMAT A12
            SCOTT@orcl_11gR2> SELECT * FROM episodes_mv
              2  /
            
            SEM_TRAJECTORY_TAG       SRID       O_ID SEMTRAJ_ID DEFINING_TAG EPISODE_TAG ACTIVITY_TAG E
            ------------------ ---------- ---------- ---------- ------------ ----------- ------------ -
            semtrajtag1                 1          1          1 dt1a         etag1a      atag1a
            semtrajtag1                 1          1          1 dt1b         etag1b      atag1b
            semtrajtag2                 2          2          2 dt2a         etag2a      atag2a
            semtrajtag2                 2          2          2 dt2b         etag2b      atag2b
            
            4 rows selected.
            
            SCOTT@orcl_11gR2> BEGIN
              2    CTX_DDL.CREATE_PREFERENCE ('emv_mcds', 'MULTI_COLUMN_DATASTORE');
              3    CTX_DDL.SET_ATTRIBUTE ('emv_mcds', 'COLUMNS', 'defining_tag, episode_tag, activity_tag');
              4  END;
              5  /
            
            PL/SQL procedure successfully completed.
            
            SCOTT@orcl_11gR2> CREATE INDEX indexname
              2    ON episodes_mv (episodes)
              3    INDEXTYPE IS CTXSYS.CONTEXT
              4    PARAMETERS ('DATASTORE emv_mcds')
              5  /
            
            Index created.
            
            SCOTT@orcl_11gR2> SELECT * FROM episodes_mv
              2  WHERE  CONTAINS (episodes, 'dt1a') > 0
              3  /
            
            SEM_TRAJECTORY_TAG       SRID       O_ID SEMTRAJ_ID DEFINING_TAG EPISODE_TAG ACTIVITY_TAG E
            ------------------ ---------- ---------- ---------- ------------ ----------- ------------ -
            semtrajtag1                 1          1          1 dt1a         etag1a      atag1a
            
            1 row selected.
            
            SCOTT@orcl_11gR2> SELECT * FROM episodes_mv
              2  WHERE  CONTAINS (episodes, 'atag2b') > 0
              3  /
            
            SEM_TRAJECTORY_TAG       SRID       O_ID SEMTRAJ_ID DEFINING_TAG EPISODE_TAG ACTIVITY_TAG E
            ------------------ ---------- ---------- ---------- ------------ ----------- ------------ -
            semtrajtag2                 2          2          2 dt2b         etag2b      atag2b
            
            1 row selected.
            • 3. Re: Oracle Text: CTXCAT
              rp0428
              Can you expand on that?

              I see this
              ' ' AS episodes
              And then you create an index on that column, which appears to have no content.

              Could you provide more of a 25-words-or-less summary of what that does and why it helps with OPs problem? I get using the MV to expand the UDT so that the attributes could be indexed but that section above I don't understand.

              Also, any 'guesstimate' as to how your MV solution would compare to creating a DOMAIN index using data cartridge functionality to solve the indexing issue?
              • 4. Re: Oracle Text: CTXCAT
                Barbara Boehmer
                I could just as well have used ' ' as dummy and created the index on episodes_mv(dummy) and searched using contains(dummy ...). The index can be created on any character column. The usage of the multi_column_datastore determines what columns it actually searches. I generally prefer to use a more meaninful name than dummy, so that using contains(episodes ...) says more about what you are searching than contains(dummy ...). It may have caused some confusion because it was the same as one of your column names. It is a common practice to use such a separate column so that you could use that column for triggering updates and such, but you could just create the index on an existing column, like defining_tag, but then contains(defining_tag ...) tends to give the misimpression that you are only searching that one column, instead of all three columns in the multi_column_datastore. If you search for multi_column_datastore in the online Oracle Text Reference and Oracle Text Application Developer's Guide, you will find more information, syntax, and examples.

                Oracle Text indexes are domain indexes. When you create an Oracle Text index, it creates a set of domain index tables, including dr$<your_index_name>$i. I am not sure what other alternative you are trying to compare to.
                • 5. Re: Oracle Text: CTXCAT
                  rp0428
                  >
                  If you search for multi_column_datastore in the online Oracle Text Reference and Oracle Text Application Developer's Guide, you will find more information, syntax, and examples.
                  >
                  Thanks for expounding on that and for the reference for more info. I've never really had a need to get into the Text area being more a high-end ETL architecture and performance person.
                  >
                  Oracle Text indexes are domain indexes. When you create an Oracle Text index, it creates a set of domain index tables, including dr$<your_index_name>$i. I am not sure what other alternative you are trying to compare to.
                  >
                  No need to dwell on it but I was referring to user-defined indexes and methods as outlined in the Data Cartridge Dev Guide
                  http://docs.oracle.com/cd/B28359_01/appdev.111/b28425/ext_idx_frmwork.htm#BABDJAJG
                  >
                  When to Use Extensible Indexing
                  . . .
                  Indexing attributes of column objects

                  The built-in facilities cannot index column objects or the elements of a collection type.
                  . . .
                  >
                  It isn't hard to create your own 'index' that can dive into a UDT and access the attributes and index them. That's doable but not something that would make a lot of sense. Most cases I run across of people creating object tables or nested object types are usually of the 'wrong solution for the problem' variety. I'm probably a dinosaur but I still believe that simpler is better unless you have a demonstrated need for a more complext solution.
                  • 6. Re: Oracle Text: CTXCAT
                    thgoum
                    @Barbara Boehmer      Thank you very much for the explanation I will try it.