5 Replies Latest reply: Jan 3, 2013 5:16 PM by Stew Ashton RSS

    queries to get data in batches from a huge data table

    DIVI
      HI,

      Here is my problem, i need to create some files with my own format(let say 5000 recs each) from a huge data table (May contain 5 Million records). And i want this creation to be multi threaded.
      so how can i form queries efficiently to fetch records like 1..5000 and 5001..10000 and so on.

      I can form some thing like select * from table where rownum<5000 and not exists ( already fetched records) . but it is not the efficient one. . please suggest me the best way of forming the queries or any alternative approach to create files.

      Edited by: DIVI on Jan 3, 2013 7:39 AM
        • 1. Re: queries getting data as batches from a huge data table
          sb92075
          DIVI wrote:
          HI,

          Here is my problem, i need to create some files with my own format(let say 5000 recs each) from a huge data table (May contain 5 Million records). And i want this creation to be multi threaded.
          SQL is NOT multi threaded

          BTW 5 million row table is tiny!
          • 2. Re: queries getting data as batches from a huge data table
            DIVI
            Here my java program is multi threaded, so i need an efficient query to fetch data from 5001th record to 10000th rec and another query to fetch from 100001 to 15000 and so on. how can i write those queries with out introducing a new column which holds serial number.
            • 3. Re: queries to get data in batches from a huge data table
              rp0428
              >
              Here is my problem, i need to create some files with my own format(let say 5000 recs each) from a huge data table (May contain 5 Million records). And i want this creation to be multi threaded.
              so how can i form queries efficiently to fetch records like 1..5000 and 5001..10000 and so on.

              I can form some thing like select * from table where rownum<5000 and not exists ( already fetched records) . but it is not the efficient one. . please suggest me the best way of forming the queries or any alternative approach to create files.
              >
              See Tom Kyte's article 'On Top-n and Pagination Queries'
              http://www.oracle.com/technetwork/issue-archive/2007/07-jan/o17asktom-093877.html

              He discusses several ways to do what you are asking.
              • 4. Re: queries to get data in batches from a huge data table
                jihuyao
                If data in the table is static, loop the query like

                select empno from
                (select emp.*, rownum rid from emp)
                where rid beteen 5000*(i-1)+1 and 5000*i

                and perform multi-thread process in your choice.
                • 5. Re: queries to get data in batches from a huge data table
                  Stew Ashton
                  There are different ways of doing this. Some are efficient. Some are very efficient.

                  It all depends on your requirement.

                  Do you require exactly 5000 records per thread, or do you just want to divide the work into chunks? 5000 records from a 5 million row table would mean 1000 threads, which would be much slower than one thread! I would say max 2 threads per CPU, assuming you are all alone on the machine.

                  Do the records have to be sorted in any way?

                  Does the table have a primary key index or unique index, or indeed any index on a NOT NULL value?

                  By far the most efficient would be ROWID ranges.

                  Edited by: Stew Ashton on Jan 4, 2013 12:15 AM