Discussions
Categories
- 17.9K All Categories
- 3.4K Industry Applications
- 3.3K Intelligent Advisor
- 62 Insurance
- 536.1K On-Premises Infrastructure
- 138.2K Analytics Software
- 38.6K Application Development Software
- 5.7K Cloud Platform
- 109.4K Database Software
- 17.5K Enterprise Manager
- 8.8K Hardware
- 71.1K Infrastructure Software
- 105.2K Integration
- 41.6K Security Software
Oracle Text index information
Hi,
I have an Oracle Text index that has been created as:
create index my_docs_doc_idx on my_docs(doc)
indextype is ctxsys.context;
So, the table is MY_DOCS, the indexed column is DOC.
But, when I query the views dba_tables I find tables with that names DR$MY_DOCS_DOC_IDX$K and DR$MY_DOCS_DOC_IDX$N.
that are (regarding the naming) related to the index MY_DOCS_DOC_IDX.
Furthetmore, when I query the table dba_indexes, I find for the above tables, indexes of the type SYS_IL, SYS_IOT.
Does someone know what means these tables and index names?
Does someone know I to get Oracle Text information: name, corresponding table, size, free space, allocated (used) space?
Thanks by advance for any tip, if you have some experience with it.
Kind Regards
Best Answer
-
The various sub-tables are used internally by Oracle Text. They can vary between releases, and depend on the options chosen for the index.
The basic tables are:
$I : the "token table". Lists all the tokens (words) found in the text with a binary postings list telling the system which document id and word position the token is to be found
$K : maps base table ROWID to internal document id (DOCID)
$R : maps internal DOCID to ROWID - same information as $K but in a different format. Which we use depends on whether we know DOCID and need ROWID, or vice versa,
$N : so called "negative row" table. Contains a list of deleted DOCID values which need to be cleaned up next time we run optimise_index.
Other common tables that may be seen:
$P : used by the SUBSTRING_INDEX option to contain rotated versions of tokens
$S : used by SDATA sections
These tables should never be modified by the user. The only table which is generally useful for the user is the $I table - doing a query such as
SELECT token_type, token_info FROM dr$myindex$i
Can let you see which words are indexed - quite useful on small indexes to see how words are tokenised if you use various index options.
For example here we can see that when we index "the cat sat on the mat", the words "the" and "on" are not indexed since they are default stop words:
SQL> create table test (text varchar2(60));Table created.SQL> insert into test values ('the cat sat on the mat');1 row created.SQL> create index testindex on test(text) indextype is ctxsys.context;Index created.SQL> select token_type, token_text from dr$testindex$i; 0 CAT 0 MAT 0 SATSQL>
Token_type is 0 for normal words. Various other indexed tokens like themes, stems and section information have different token_type values.
Answers
-
The various sub-tables are used internally by Oracle Text. They can vary between releases, and depend on the options chosen for the index.
The basic tables are:
$I : the "token table". Lists all the tokens (words) found in the text with a binary postings list telling the system which document id and word position the token is to be found
$K : maps base table ROWID to internal document id (DOCID)
$R : maps internal DOCID to ROWID - same information as $K but in a different format. Which we use depends on whether we know DOCID and need ROWID, or vice versa,
$N : so called "negative row" table. Contains a list of deleted DOCID values which need to be cleaned up next time we run optimise_index.
Other common tables that may be seen:
$P : used by the SUBSTRING_INDEX option to contain rotated versions of tokens
$S : used by SDATA sections
These tables should never be modified by the user. The only table which is generally useful for the user is the $I table - doing a query such as
SELECT token_type, token_info FROM dr$myindex$i
Can let you see which words are indexed - quite useful on small indexes to see how words are tokenised if you use various index options.
For example here we can see that when we index "the cat sat on the mat", the words "the" and "on" are not indexed since they are default stop words:
SQL> create table test (text varchar2(60));Table created.SQL> insert into test values ('the cat sat on the mat');1 row created.SQL> create index testindex on test(text) indextype is ctxsys.context;Index created.SQL> select token_type, token_text from dr$testindex$i; 0 CAT 0 MAT 0 SATSQL>
Token_type is 0 for normal words. Various other indexed tokens like themes, stems and section information have different token_type values.
-
Hi Roger,
Thanks for this interesting feedback.
I suspected the table $I,K had something to do with the implementation of the Text indexes.
So, you mean that th query "select token_type, token_text from dr$testindex$i;" only shows the non-default token indexed?
But, how can I get Oracle Text indexes information rearding: name, corresponding table, size, free space, allocated (used) space?
Thanks and Regards