8 Replies Latest reply on Jul 23, 2012 7:15 AM by Purvesh K

    Pulling out two records at a time

    Brian Tkatch
      I have a TABLE with multiple records in it, of which i want to pull out two records at a time (as if they were one record), of those records that meet the WHERE criteria. Order is not important. Basically, i have a TABLE of things that wither need processing or have been processed already. I can process two at a time. (In a related search, that i am not up to yet, i can do 20 at a time.) How would i go about doing that?

      My first thought was a CTE with ROW_NUMBER() RN, using CONNECT BY RN = PRIOR RN + 2, and have a sub-query (or joined TABLE) pull out the second record:

      WITH Moo (SELECT ... ROW_NUMBER RN FROM Table WHERE ...)
      SELECT ..., (SELECT ... FROM FROM Moo Moo2 WHERE Moo2.Id = Moo.Id + 1)
      FROM Moo CONNECT BY Id = PRIOR Id + 2;

      Or a join. Is there a better way to approach it?
        • 1. Re: Pulling out two records at a time
          sb92075
          Brian Tkatch wrote:
          I have a TABLE with multiple records in it, of which i want to pull out two records at a time (as if they were one record), of those records that meet the WHERE criteria. Order is not important. Basically, i have a TABLE of things that wither need processing or have been processed already. I can process two at a time. (In a related search, that i am not up to yet, i can do 20 at a time.) How would i go about doing that?

          My first thought was a CTE with ROW_NUMBER() RN, using CONNECT BY RN = PRIOR RN + 2, and have a sub-query (or joined TABLE) pull out the second record:

          WITH Moo (SELECT ... ROW_NUMBER RN FROM Table WHERE ...)
          SELECT ..., (SELECT ... FROM FROM Moo Moo2 WHERE Moo2.Id = Moo.Id + 1)
          FROM Moo CONNECT BY Id = PRIOR Id + 2;

          Or a join. Is there a better way to approach it?
          It would be helpful if you provided DDL (CREATE TABLE ...) for tables involved.
          It would be helpful if you provided DML (INSERT INTO ...) for test data.
          It would be helpful if you provided expected/desired results & a detailed explanation how & why the test data gets transformed or organized.
          • 2. Re: Pulling out two records at a time
            rp0428
            >
            Is there a better way to approach it?
            >
            That depends on how you intend to process the records.
            >
            I can process two at a time
            >
            What does that mean? Process how? Using a stored procedure? Using Java code?

            A procedure or Java code could just retrieve the records it wants to process whether that is one or twenty. Why do you need to select only two and then two more, etc?

            If you are using code then the simples thing is to define a cursor and just fetch the records you want to process.

            In a stored procedure you can do a BULK COLLECT with a LIMIT of two if you just want 2 records. You can use a P_LIMIT parameter that would control what the limit is.

            But the least efficient way is to use a simple query with a row-limiting clause like you are proposing because Oracle does not keep the cursor open between executions. Using a cursor in a procedure the cursor stays open. Same if you use Java code to retrieve a result set and then fetch it.
            • 3. Re: Pulling out two records at a time
              Brian Tkatch
              What does that mean? Process how? Using a stored procedure? Using Java code?
              Sorry 'bout that. I plan to use a FUNCTION inside a SQL statement.

              INSERT INTO Results .....
              SELECT My_Processsor(row1.a, row1.b, row2.a, row2.b)
              FROM
              WHERE....

              The process is ultimately for Amazon PA-API and returns an XMLType, which i hope to EXTRACT() (or similar) the data i need. It works well with one record at a time. But as i have a API limit, and i can do 2 at a time, i want to make use of that in an efficient manner.
              • 4. Re: Pulling out two records at a time
                rp0428
                >
                The process is ultimately for Amazon PA-API and returns an XMLType, which i hope to EXTRACT() (or similar) the data i need. It works well with one record at a time. But as i have a API limit, and i can do 2 at a time, i want to make use of that in an efficient manner.
                >
                Well the general use is to use SQL instead of PL/SQL but since you already need to use a function you might as well make that function as efficient as possible.

                One way would be use a PIPELINED function that can be used as a table.

                Here is example code. You could do two fetches to get two rows and produce one record type for output that combines the two rows.
                But you could also just as easily add the INSERT to the function itself.
                -- type to match emp record
                create or replace type emp_scalar_type as object
                  (EMPNO NUMBER(4) ,
                   ENAME VARCHAR2(10),
                   JOB VARCHAR2(9),
                   MGR NUMBER(4),
                   HIREDATE DATE,
                   SAL NUMBER(7, 2),
                   COMM NUMBER(7, 2),
                   DEPTNO NUMBER(2)
                  )
                  /
                 
                -- table of emp records
                create or replace type emp_table_type as table of emp_scalar_type
                  /
                
                -- pipelined function 
                
                 
                create or replace function get_emp( p_deptno in number )
                  return emp_table_type
                  PIPELINED
                  as
                   TYPE EmpCurTyp IS REF CURSOR RETURN emp%ROWTYPE;
                    emp_cv EmpCurTyp;
                    l_rec  emp%rowtype;
                  begin
                    open emp_cv for select * from emp where deptno = p_deptno;
                    loop
                      fetch emp_cv into l_rec;
                      exit when (emp_cv%notfound);
                      pipe row( emp_scalar_type( l_rec.empno, LOWER(l_rec.ename),
                          l_rec.job, l_rec.mgr, l_rec.hiredate, l_rec.sal, l_rec.comm, l_rec.deptno ) );
                    end loop;
                    return;
                  end;
                  /
                  
                select * from table(get_emp(20))
                1 person found this helpful
                • 5. Re: Pulling out two records at a time
                  Brian Tkatch
                  Wow, thanx. I've never actually written a PIPELINED FUNCTION before, so, i ought to go check that out.

                  I generally keep away from CURSORs, as i want the DB to handle the set processing for me, as opposed to a CURSOR which tells the DB how to do it's job. (And, Tom Kyt's mantra of "You should do it in a single SQL statement if at all possible") But that's just a rule of thumb. If the CURSOR is the most efficient, so be it. But as you point out, if i'm already going to use a CURSOR, i might as well do all the work in the CURSOR LOOP. Why bother switching contexts.
                  • 6. Re: Pulling out two records at a time
                    rp0428
                    Pretty much everything you just said is wrong.
                    >
                    I generally keep away from CURSORs, as i want the DB to handle the set processing for me, as opposed to a CURSOR which tells the DB how to do it's job.
                    >
                    If you do any DB queries then you are using cursors. You have the wrong idea about them. All result sets are obtained using a cursor. A cursor does NOT tell a DB how to do a job; that is what the query and execution plan do.
                    >
                    (And, Tom Kyt's mantra of "You should do it in a single SQL statement if at all possible") But that's just a rule of thumb.
                    >
                    No - that isn't a rule of thumb. That is exactly what you should do if possible which is what I said. But I also said
                    >
                    Well the general use is to use SQL instead of PL/SQL but since you already need to use a function you might as well make that function as efficient as possible.
                    >
                    If you are already using a function then you should get the most bang for the buck from it and make it as efficient as possible.
                    >
                    If the CURSOR is the most efficient, so be it. But as you point out, if i'm already going to use a CURSOR, i might as well do all the work in the CURSOR LOOP. Why bother switching contexts.
                    >
                    All queries are done using cursors. It isn't a question of whether they are more efficient; it is the only way possible. I think you are referring to a user creating an explicit cursor and performing explicit manipulation of that cursor. That isn't normally what you want to do.

                    A cursor loop is the slowest way to process results especially if all you are doing is row-by-row processing to perform essentially SQL-related work. PL/SQL should be used for the transactional part of the processing. In your case that transactional use is consolidating two records together for specialized processing.

                    As for context switches if you perform any PL/SQL you are going to have a context switch: functions, procedures, triggers. The function you already planned to use would have caused a context switch. What I am saying is that if you are going to perform a context switch then you want to maximize the amount of work (transactional work) done in the PL/SQL engine so that you minimize the context switching.

                    You still haven't explained exactly why 2 records at a time is any more ideal than 1 or if you could also process 50, 100, or 10000 at a time. Is the limit what needs to be passed somewhere else? What is the reason for a max of two? What exactly are you doing with the data to prepare it before you have to pass it on?

                    Bulk collect processing is more efficient, with fewer context switches, if you process 20, 50, or 100 records in the loop at a time.







                    See Chap 13 Using Pipelined and parallel Table Functions in the Data Cartridge Dev Guide

                    http://docs.oracle.com/cd/B28359_01/appdev.111/b28425/pipe_paral_tbl.htm
                    • 7. Re: Pulling out two records at a time
                      Brian Tkatch
                      Thank you, you speak well. Just this morning i was explaining to someone how CURSORs work in the DB, in regards to the "wrongness" of the Venn diagram when explaining how a join works. But, while i have the basics down (or so i believe) the application of it seems to always entail something i do not know.

                      Regardless of which is better, i have optimized a number of processes in Oracle (that is, SPs written by people in PL/SQL) by removing CURSORs and simply doing everything in one big statement. It has always been faster. Admittedly, i am not claiming it must be that way, but that is my experience. Hence, i always aim to avoid CURSORs, in a belief it enhances speed, clarity, and efficiency.
                      You still haven't explained exactly why 2 records at a time is any more ideal than 1
                      Amazon's PA-API has a limit (which can be raised) of 2000 batches per hour. Each batch may have two requests of the same type (mixing types is not allowed in the same batch). A search request may have one item (so the batch may have a total of two), a lookup may have ten items (so the batch may have total of twenty). My test case has 3691 items to be searched for, hence, i am hoping to make 1846 bath in one hour (at a rate of no more than one per second, per the license agreement).
                      What exactly are you doing with the data to prepare it before you have to pass it on?
                      The result is an XML document, of which i need just a few data (i think i need to learn XQuery). That would be an UPDATE, though i am toying with the idea of storing the XML document in another TABLE instead. If not for space constraints, i think the latter to be better.
                      • 8. Re: Pulling out two records at a time
                        Purvesh K
                        Brian Tkatch wrote:
                        Amazon's PA-API has a limit (which can be raised) of 2000 batches per hour. Each batch may have two requests of the same type (mixing types is not allowed in the same batch). A search request may have one item (so the batch may have a total of two), a lookup may have ten items (so the batch may have total of twenty). My test case has 3691 items to be searched for, hence, i am hoping to make 1846 bath in one hour (at a rate of no more than one per second, per the license agreement).
                        I think you are trying to implement Pagination query; If yes, then please visit Ask Tom: On Rownum and Limiting Results.

                        A sample query as explained by Tom:
                        select b.*
                          from (
                                  select a.*, rownum r
                                    from (
                                          select employee_id, first_name, last_name, department_id
                                            from hr.employees
                                          order by department_id
                                         ) a
                                   where rownum <= 20
                               ) b
                         where b.r >= 10;
                        Below is a Sample code that is similar to your requirement:
                        create or replace function getSumSalary (p_start_idx      in    number,
                                               p_end_idx        IN    number
                                              )
                        return number is
                        
                            l_ret_num     number;
                        begin
                          select sum(salary)
                            into l_ret_num
                            from (
                                    select a.*, rownum r
                                      from (
                                            select employee_id, first_name, last_name, department_id, salary
                                              from hr.employees
                                            order by department_id, employee_id
                                           ) a
                                     where rownum <= p_end_idx
                                 ) b
                           where b.r > p_start_idx;
                          
                          return l_ret_num;
                        end;
                        
                        
                        select getsumsalary((level - 1) * 2, level * 2)
                          from dual
                        connect by level <= 10;
                        Edited by: Purvesh K on Jul 23, 2012 12:45 PM