
Remove 1000 limit on IN clause

Thorsten Kettner
Member Posts: 42 Red Ribbon
edited Mar 25, 2020 8:01AM in Database Ideas - Ideas

In Oracle we can only put up to 1000 values into an IN clause. I would like to see this restriction dropped.

Sometimes it happens that a business department asks me to "report all items not online for this list of 2500 item numbers" and I'll have to use an editor to split the numbers into chunks of 1000 and do

select * from items where status <> 'online' and ( itemno in ( <first 1000 item numbers> ) or itemno in ( <next 1000 item numbers> ) or itemno in ( <last 500 item numbers> ) );

I've never really understood why this limit even exists. Some people argue that it would be better to create a table with the numbers and use that, because an IN clause with many values is slow. What the heck? I am not supposed to even need to know about this. If Oracle's optimizer sees fit to work with some temporary table, let it do so. With SQL I am supposed to tell the DBMS what to do, not which method to apply to achieve it. And what the DBMS shall do is:

select * from items where status <> 'online' and itemno in ( <2500 item numbers> );
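As an aside to the workarounds debated in the comments below: the 1000-expression limit applies to literal IN lists, not to IN over a subquery, so the whole list can already be handled in one statement by binding it as a collection. A minimal sketch, assuming the predefined SYS.ODCINUMBERLIST collection type and placeholder item numbers:

declare
  -- SYS.ODCINUMBERLIST is a predefined collection type (a VARRAY of NUMBER)
  l_itemnos sys.odcinumberlist := sys.odcinumberlist();
begin
  -- placeholder values; in practice the 2500 item numbers from the list
  for i in 1 .. 2500 loop
    l_itemnos.extend;
    l_itemnos(l_itemnos.count) := i;
  end loop;

  -- one statement, no splitting into chunks of 1000
  for r in (select itemno
              from items
             where status <> 'online'
               and itemno in (select column_value from table(l_itemnos)))
  loop
    dbms_output.put_line(r.itemno);
  end loop;
end;
/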

12 votes


Comments

  • Billy Verreynne
    Software Engineer Member Posts: 28,816 Red Diamond

    This type of approach, like coding 1000+ literal values in a SQL statement, is (pardon me for being blunt), idiotic.

    It impacts on the amount of memory used by the Shared Pool, contributes to Shared Pool fragmentation, increases hard parsing, burns a load of CPU cycles, and degrades performance.

    No, no, and no. Technically there are far better ways to address the 1000+ values requirement from business.

  • nemecj
    Member Posts: 41 Blue Ribbon

    Billy Verreynne wrote: This type of approach, like coding 1000+ literal values in a SQL statement, is (pardon me for being blunt), idiotic. ... Technically there are far better ways to address the 1000+ values requirement from business.

    I agree with the intention, but I fully disagree with the argumentation. Everybody in the database business has IMHO hit this limitation while dealing with ad hoc scripts for reporting and/or fixing data. So it is a real problem.

    The "hard parse" argument makes no sense in this case, where the query is run only once. Further, not everyone wants to explore every technical alternative, such as setting up external tables with the ID list or defining temporary tables and feeding them with long insert scripts (see here; a sketch of the external-table variant follows this comment).

    The real killer argument for why this is a no-go request is the question: "what should the next limit be?"

    Is 2000 fine, or should it be 5K? 100K?

    Additionally, all major SQL generation tools already account for this in their Oracle dialects. E.g. if you see ... x in (?,?,...1000 times...,?) or x in (?,?,...,?) in V$SQL, it was most probably Hibernate at work.

    It will remain for us to live with it and use the same workarounds.
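    A minimal sketch of the external-table variant mentioned above, assuming a hypothetical directory object ext_ids_dir, an external table item_ids_ext, and a file item_ids.csv containing one item number per line:

    -- hypothetical directory object pointing at the folder holding the file
    create or replace directory ext_ids_dir as '/tmp/ids';

    -- external table over the exported ID list (hypothetical names)
    create table item_ids_ext (
      itemno number
    )
    organization external (
      type oracle_loader
      default directory ext_ids_dir
      access parameters (
        records delimited by newline
      )
      location ('item_ids.csv')
    );

    -- the ad hoc query then filters through the external table
    select i.*
      from items i
     where i.status <> 'online'
       and i.itemno in (select e.itemno from item_ids_ext e);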

  • Billy Verreynne
    Software Engineer Member Posts: 28,816 Red Diamond

    nemecj wrote: I agree with the intention, but I fully disagree with the argumentation. ... The "hard parse" argument makes no sense in this case, where the query is run only once. ... It will remain for us to live with it and use the same workarounds.

    The IN clause has a variable number of values. So even IF (very doubtful!!) the client does use bind variables, the number of bind variables will vary. This means hard parsing each and every time for the vast majority of parses.

    Never mind the fact that the client has to ship 100's (or even 1000's) of values for these bind variables in the IN clause across the network to the server, each time.

    Using the IN clause with 100's of values only shows a miserable failure on the part of the developer, as there are far better, and scalable, methods of addressing this requirement.

    Arguing that product Y supports more IN clause values is a very poor argument.

  • nemecj
    Member Posts: 41 Blue Ribbon

    Billy Verreynne wrote: The IN clause has a variable number of values. So even IF (very doubtful!!) the client does use bind variables, the number of bind variables will vary. ... Arguing that product Y supports more IN clause values is a very poor argument.

    Sorry for being misunderstood. My point was basically in accordance with your intention; the only difference was the choice of slightly more gentle and polite arguments for the rejection. But apparently I did not use such a gentle and polite formulation in my response, considering your reaction. Anyway, so as not to waste the valuable space of this forum and to provide some positive feedback, I'd propose that you post here one of those "far better, and scalable, methods" to solve a request from the boss demanding a report of the total revenue of the thousand and three top customers (their IDs are provided in the attached Excel file), of course ASAP and by EOB at the latest.

  • Billy Verreynne
    Software Engineer Member Posts: 28,816 Red Diamond

    nemecj wrote: ... I'd propose that you post here one of those "far better, and scalable, methods" to solve a request from the boss demanding a report of the total revenue of the thousand and three top customers (their IDs are provided in the attached Excel file), of course ASAP and by EOB at the latest.

    Two basic methods come to mind.

    Static:

    Standard 3NF lookup/reference tables for values, or lists of values, used in SQL against the base table(s).

    Dynamic:

    A GTT that can be populated on-the-fly by the application with the values of interest, and used in SQL against the base table(s).

    My arguing the issue is not to be taken personally, and should be seen as the expected type of response from an INTJ.
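    A minimal sketch of the dynamic (GTT) method described above, using a hypothetical table name items_filter_gtt; the DDL is run once, and each session then populates the table with the values of interest before running the report:

    -- one-time DDL: a session-private scratch table (hypothetical name)
    create global temporary table items_filter_gtt (
      itemno number primary key
    ) on commit delete rows;

    -- per run: populate with the values of interest (e.g. generated inserts,
    -- an INSERT ... SELECT, or an array insert from the application)
    insert into items_filter_gtt (itemno) values (1001);
    insert into items_filter_gtt (itemno) values (1002);
    -- ... and so on for the rest of the list

    -- then filter the base table through the GTT
    select i.*
      from items i
     where i.status <> 'online'
       and i.itemno in (select f.itemno from items_filter_gtt f);

    commit;  -- ON COMMIT DELETE ROWS empties the GTT again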

  • Sven W.
    Member Posts: 10,535 Gold Crown


    nemecj wrote: ... I'd propose that you post here one of those "far better, and scalable, methods" to solve a request from the boss demanding a report of the total revenue of the thousand and three top customers (their IDs are provided in the attached Excel file), of course ASAP and by EOB at the latest.

    Load the IDs into a temp table and use that temp table as a filter in the ad hoc query.
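    A minimal sketch of that workflow for nemecj's example, with hypothetical customer_ids_tmp and orders tables; the insert statements can be generated directly from the spreadsheet column:

    -- scratch table for the ad hoc run (hypothetical name; a GTT works too)
    create table customer_ids_tmp (customer_id number primary key);

    -- 1003 generated single-row inserts from the Excel column
    insert into customer_ids_tmp values (101);
    insert into customer_ids_tmp values (102);
    -- ...

    -- the ad hoc report, filtered through the temp table
    select o.customer_id, sum(o.amount) as total_revenue
      from orders o
     where o.customer_id in (select t.customer_id from customer_ids_tmp t)
     group by o.customer_id;

    drop table customer_ids_tmp;  -- clean up after the ad hoc run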