Forum Stats

  • 3,815,817 Users
  • 2,259,093 Discussions
  • 7,893,255 Comments

Discussions

Query Relaxation Order

chris227
chris227 Member Posts: 3,517 Bronze Crown
edited Nov 8, 2019 9:21AM in Text

Hi,

In the 12.2 docs on oracle text in 4.1.15 Query Relaxation on oracle text the below is stated:

"Query hits are returned in this sequence with no duplication as long as the application needs results."

Usually the thumb of rule on ordering is always to use an order by (exception connect by e.g.).

Does the above quote mean that we don't have to use an order by since we will receive the rows implicitly in order of the searches defined?

And how will the order by defined for the row-outcome from the searches itselfs. Will it be on score?

And finally: will the DOMAIN_INDEX_SORT will have any effect without an order by?

Thanks in advance

Regards

Chris

Best Answer

  • Roger Ford-Oracle
    Roger Ford-Oracle Member Posts: 1,132 Employee
    edited Oct 31, 2019 4:32PM Answer ✓

    The results will be returned from the text layer to the kernel in sequence order, so all the results from the first <SEQ> will be returned before the results from the second <SEQ>.  I *think* that the rows will be sorted by score within the sequence, but I'm not absolutely sure of that.

    For a simple query, such as SELECT * FROM ... WHERE CONTAINS ... then the kernel will return the results to the calling API in the same order that they're returned by the text layer. However, if your query is more complex, perhaps with joins or other indexed columns then it's possible (though not common) that the rows could come back in a different order. If in doubt, add ORDER BY SCORE(n) DESC.

    DOMAIN_INDEX_SORT won't have any effect if you don't use ORDER BY SCORE(n) DESC, but it won't be needed either.  If you do use the order by, then the hint will prevent it from fetching all the hits back from the text layer in order to sort them.

    Hope that helps.

    Roger

    chris227

Answers

  • Roger Ford-Oracle
    Roger Ford-Oracle Member Posts: 1,132 Employee
    edited Oct 31, 2019 4:32PM Answer ✓

    The results will be returned from the text layer to the kernel in sequence order, so all the results from the first <SEQ> will be returned before the results from the second <SEQ>.  I *think* that the rows will be sorted by score within the sequence, but I'm not absolutely sure of that.

    For a simple query, such as SELECT * FROM ... WHERE CONTAINS ... then the kernel will return the results to the calling API in the same order that they're returned by the text layer. However, if your query is more complex, perhaps with joins or other indexed columns then it's possible (though not common) that the rows could come back in a different order. If in doubt, add ORDER BY SCORE(n) DESC.

    DOMAIN_INDEX_SORT won't have any effect if you don't use ORDER BY SCORE(n) DESC, but it won't be needed either.  If you do use the order by, then the hint will prevent it from fetching all the hits back from the text layer in order to sort them.

    Hope that helps.

    Roger

    chris227
  • chris227
    chris227 Member Posts: 3,517 Bronze Crown
    edited Nov 8, 2019 8:07AM

    Thank you for the detailed answer.

    In fact i was in doubt and still be, since from my results the order by score must not match the order of the query outcome.

    Furthermore the order in the partial searches of the progression seems even not to be in order of the score.

    ANNER-----------------------------------------------------------------------------Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit ProductionPL/SQL Release 12.2.0.1.0 - ProductionCORE    12.2.0.1.0      ProductionTNS for Solaris: Version 12.2.0.1.0 - ProductionNLSRTL Version 12.2.0.1.0 - Production>set define off>column cv format a15>create table test_table as  2  select column_value cv from table(sys.odcivarchar2list(  3     'bar & cafe'  4  ,  'cafe bar'  5  ,  'cafe restaurant bar'  6  ,  'bar'  7  ,  'cafe'  8  ,  'cafe with bar'  9  ))10  /Table created.>create index idx_test_table_ot on test_table(cv)  2  INDEXTYPE IS ctxsys.context parameters (  3  'stoplist ctxsys.empty_stoplist'  4  )  5  /Index created.>select  2  cv, score(1)  3  from test_table  4  where contains (cv,'  5  <query>  6    <textquery>  7      <progression>  8        <seq>{cafe} {bar}</seq>  9        <seq>{cafe},{bar}</seq>10      </progression>11    </textquery>12    <score datatype="FLOAT"/>13  </query>14  ', 1) > 015  /CV                SCORE(1)--------------- ----------cafe bar           51,6187bar & cafe         25,8094cafe restaurant    25,8094barbar                  ,8094cafe                 ,8094cafe with bar      25,80946 rows selected.>drop table test_table  2  /Table dropped.

    Regards Chris

  • Roger Ford-Oracle
    Roger Ford-Oracle Member Posts: 1,132 Employee
    edited Nov 8, 2019 8:15AM

    The overall scores are correct. In case you're not sure why, here's the explanation:


    Your first query is a phrase search {cafe} {bar} and will only match those two words in order with no intervening words. 'cafe with bar' is not a match for the first query.

    The second query is an ACCUM (the ',' operator), which matches any combination of 'cafe' and 'bar'. All the other rows match that. Within the ACCUM, the rows which have both terms score higher than the rows which only have one.

    Now your results are coming back in SEQ order, with '{cafe bar}' coming back before '{cafe}, {bar}'.  However, you have answered my point I wasn't sure about:

    > I *think* that the rows will be sorted by score within the sequence, but I'm not absolutely sure of that.

    Your results show I was not correct. The scores within a sequence are not sorted.  If you want that to be true, you WILL need to add "ORDER BY score(1) DESC" to the end of the query.

    I apologize for not testing my assumption!

  • chris227
    chris227 Member Posts: 3,517 Bronze Crown
    edited Nov 8, 2019 9:21AM
    Roger Ford-Oracle wrote:The overall scores are correct. In case you're not sure why, here's the explanation:

    Yes, my point was to show that the order within a sequence is not necessarily by score.

    But, to be honest, doesn't matter much, i guess, at last we are interested in the score, aren't we?

    Having said that, i wondered why the below shows "cafe bar" as second with some score same as some accum results.

    Now i got it, in this sequence order it is just not the result of the phrase search, it's a result of the accum search.

    So probably the sequence order of accum, phrase never makes sense, since the outcome of phrase is a subset of accum per definition.

    > set define off>create table test_table as  2  select column_value cv from table(sys.odcivarchar2list(  3     'bar & cafe'  4  ,  'cafe bar'  5  ,  'cafe restaurant bar'  6  ,  'bar'  7  ,  'cafe'  8  ,  'cafe with bar'  9  ))10  /Table created.>create index idx_test_table_ot on test_table(cv)  2  INDEXTYPE IS ctxsys.context parameters (  3  'stoplist ctxsys.empty_stoplist'  4  )  5  /Index created.>select  2  cv, score(1)  3  from test_table  4  where contains (cv,'  5  <query>  6    <textquery>  7      <progression>  8        <seq>{cafe},{bar}</seq>  9        <seq>{cafe} {bar}</seq>10      </progression>11    </textquery>12    <score datatype="FLOAT"/>13  </query>14  ', 1) > 015  /CV                SCORE(1)--------------- ----------bar & cafe         75,8094cafe bar           75,8094cafe restaurant    75,8094barbar                50,8094cafe               50,8094cafe with bar      75,80946 rows selected.>drop table test_table  2  /Table dropped.

    Thanks again! I appreciate your support.

    Regards Chris