8 Replies Latest reply: Jan 22, 2013 11:08 PM by Manjusha Muraleedas RSS

    Why use collections?

    Manjusha Muraleedas
      Please help me for a better and right understanding of the following point

      from one PPT available in internet , I read a sentence under heading 'Why use collections?' please find it here

      Serve up complex datasets of information to non-PL/SQL host environments using table functions

      My understanding : This means complex datasets can be retrieved from oracle using only a single sql statement using the facility called Oracle Pipelined Table Functions |http://www.akadia.com/services/ora_pipe_functions.html]

      Any other facility is there in oracle to serveup complex datasets of information to non-PL/SQL host environments ?

      Edited by: Manjusha Muraleedas o, spelling mistake corrected.
        • 1. Re: Why use collections?
          BluShadow
          Ref Cursors are a good option.

          Depends what you mean by "complex"?

          SQL is perfectly capable of providing complex data, and if necessary it can supply XML data which can be a very complex structure. No need for collections.
          • 2. Re: Why use collections?
            Billy~Verreynne
            The best is to understand WHAT a collection is. And HOW it can be used.

            Knowing that, enables you to determine whether a situation facing you can be best resolved by a PL/SQL collection (aka an array).

            The biggest constraint using collections/arrays in PL/SQL is that the structure resides in private process memory (called PGA in Oracle speak).

            If that code is executed by 10 sessions, 10 sessions will need to allocate and increase their private process memory. If the collection is a small 1MB structure, it means an additional 1MB of server memory per process using that collection code.

            So this approach is not scalable.

            When using a PL/SQL collection via SQL, it is used as a bind variable. The SQL engine cannot directly read PL/SQL variables in private memory. The variable's value need to be copied to the SQL engine for use via a process called binding. Each and every time that variable is used in a SQL statement in PL/SQL.

            So this approach is not really performant and resource friendly.

            This expensive PL/SQL structure also needs to be compared with the SQL engine's version of temporary data storage - global temporary tables (GTTs). These consume a lot less dedicated server resources. Can be indexed. Scales better. Performs better.

            So collections/arrays are typically used in PL/SQL for bulk processing. When there is the need too fetch data from a cursor - bulk fetch it. When there is need to send data to a cursor, bulk bind it. But this is an exception as row-by-row process cannot compete with driving the processing of data using SQL only.

            Putting data of a SQL cursor into a collection to cache it for later use? Questionable.

            Putting data of a SQL cursor into a collection to cache it for an external application? Nonsensical.

            So why and when use collections? Only when that is the best solution for addressing the problem. Which is less often than what many developers seem to think.
            • 3. Re: Why use collections?
              Billy~Verreynne
              Manjusha Muraleedas wrote:

              Any other facility is there in oracle to serveup complex datasets of information to non-PL/SQL host environments ?
              Oracle RDBMS is a PL/SQL host environment. So I do not understand the question.

              As for serving up complex data structures. Using PL/SQL collections are usually the wrong answer to the problem - due to impact on server memory.

              PL/SQL code can be used, as an integral part of the SQL engine, to serve as a pipeline - where the SQL engine puts data into this pipe, the pipe transforms the data and creates a new structure, and then the pipe's output is send by the SQL engine to the client process.

              This PL/SQL feature is called a pipeline table function. It can be executed in parallel by the SQL engine. It does not output a collection using expensive PGA server memory. The "collection" is dealt with by the SQL engine, as it would deal with any other cursor data - in an efficient and scalable fashion.

              It would be very unusual to serve complex data structures to clients, via SQL, from a PL/SQL collection.
              • 4. Re: Why use collections?
                Manjusha Muraleedas
                My question is that , Why the statement 'Serve up complex datasets of information to non-PL/SQL host environments using table functions' came under heading 'why use collections?'. Just because there is a facility ? or any other really good news is there ?


                I understood, complex data structures( supose a structure with 3 columns by joing 5 tables) can be serve up to non-pl/sql envirormment like Java, using ref cursors or xml. Using collection to do the same is not a good option .


                Reason is(as explained by Billy Verreynne) :
                1. Collection consumes session specific memory (PGA)
                2.When using a PL/SQL collection via SQL, it is used as a bind variable. The SQL engine cannot directly read PL/SQL variables in private memory. The variable's value need to be copied to the SQL engine for use via a process called binding. Each and every time that variable is used in a SQL statement in PL/SQL.

                So this approach is not really performant and resource friendly.


                Now lets talk about pipelined table functions.
                " pipelined table functions can be executed in parallel by the SQL engine. It does not output a collection using expensive PGA server memory. The "collection" is dealt with by the SQL engine, as it would deal with any other cursor data - in an efficient and scalable fashion."

                So when we use pipeline table function,collections used in table functions won't use PGA, it will use system or instance level memory (SGA)?

                pipeline table functions achieves performance by parallel processing . so even if binding time and memory conseption is there , it will performs better than refcursors or xml?
                • 5. Re: Why use collections?
                  Billy~Verreynne
                  Manjusha Muraleedas wrote:
                  My question is that , Why the statement 'Serve up complex datasets of information to non-PL/SQL host environments using table functions' came under heading 'why use collections?'. Just because there is a facility ? or any other really good news is there ?
                  Well, that statement does not make sense. The best means to service clients with data from Oracle is via a ref cursor interface - enabling PL/SQL code (business logic, validation, security, etc) to be applied, and then this code to construct the SQL cursor. The code then returns that SQL cursor's reference handle for the client to use.

                  What can be "complex" for a client is to directly deal with SQL collection type of a complex object class. The client needs to understand the object and collection data types. This "problem" can be overcome by returning that collection as a cursor. The client then uses the known cursor interface to determine the structure and data types of the data returned.

                  Simple example. PL/SQL proc FunkyFoo() returns SQL collection TComplexCollection. The client developer struggles to make the following PL/SQL call work:
                  begin
                    :1 := FunkyFoo();
                  end;
                  As workaround, the developer changes the call to:
                  select * from TABLE(FunkyFoo)
                  However - the real problem here is the developer's inability to directly deal with that custom SQL data type in the client language used. Oracle has provided developers with a client interface to Oracle objects/custom SQL types since Oracle 8i.
                  So when we use pipeline table function,collections used in table functions won't use PGA, it will use system or instance level memory (SGA)?
                  A pipeline table uses (for outputting data via the pipe row() instruction) - not the PGA.

                  If you call a PL/SQL function that returns a million rows as a collection - all million rows need to be created in the PGA before the client gets the data.

                  If you call a PL/SQL pipeline function that returns a million rows as a collection - limited number of rows is created by the pipeline as the caller fetches the rows. The pipeline does not run, complete and return rows. It runs a bit, and returns rows, runs a bit, returns rows, until done. This is driven by the caller that does the fetching.
                  • 6. Re: Why use collections?
                    Stew Ashton
                    I looked at the presentation and here's how I read it (although you might be better off asking the author):
                    Manjusha Muraleedas wrote:
                    My question is that , Why the statement 'Serve up complex datasets of information to non-PL/SQL host environments using table functions' came under heading 'why use collections?'. Just because there is a facility ? or any other really good news is there ?
                    There is an unwritten assumption in the presentation: "complex" data is data that has to be manipulated in PL/SQL because SQL alone doesn't work.

                    Nowhere does the author say that a pure SQL solution should be changed into a PL/SQL solution.

                    Once you decide you have to use PL/SQL, it can be better to put your data into collections than into scalar variables.
                    I understood, complex data structures( supose a structure with 3 columns by joing 5 tables) can be serve up to non-pl/sql envirormment like Java, using ref cursors or xml. Using collection to do the same is not a good option .
                    Again, I think the author is using "complex" as shorthand for "needs PL/SQL in the first place". If the data doesn't have to be processed in PL/SQL, there is no need to use PL/SQL variables, either scalar or collections.
                    I would say "using ref cursors". Providing relational data in XML format should be the exception, not the rule. It has an additional performance cost.
                    Now lets talk about pipelined table functions.
                    ...
                    pipeline table functions achieves performance by parallel processing . so even if binding time and memory conseption is there , it will performs better than refcursors or xml?
                    Pipelined table functions will not perform better than ref cursors. There are two processes running, one for the SELECT and one for the function, so there will be more work on the server. With a ref cursor, there is one process just returning the data.

                    Parallel processing is not an issue. If you can do the job in plain SQL, you can do it in parallel in plain SQL.

                    Edited by: Stew Ashton on Jan 22, 2013 4:44 PM
                    • 7. Re: Why use collections?
                      Iordan Iotzov
                      In addition to the comprehensive analysis so far, there is one more consideration to be made:

                      If a PL/SQL collection (or any other structure for that matter) is to be used in a moderately to significantly complex SQL statements, you have to be aware how the CBO (cost-based optimizer) would handle it. That is, if there are 50K records in the collection, would the CBO know there are 50k records, or just use a blanket assumption? If the CBO does not know, at least approximately, how many records are behind those objects, it would have to guess. And any guess could be incorrect, leading to bad execution plans.

                      For instance, by default the Oracle CBO (10gR2) assumes that MEMBER OF would filter 95% of the records, regardless of how big the collection is.

                      Pipelines functions have similar problems - http://www.oracle-developer.net/display.php?id=429

                      The bottom line is, if you use SQL, make sure that the CBO is aware of the size and selectivity of all involved objects - tables, filters, PL/SQL collections, etc..

                      I’ll be giving a presentation about this and related issues (confidence of cardinality estimates) at Hotsos 2013.

                      Iordan Iotzov
                      http://iiotzov.wordpress.com/
                      • 8. Re: Why use collections?
                        Manjusha Muraleedas
                        Dear All,

                        Thanks alot for the anwers.I learned alot from your answers. Now I understood the concept clearly.

                        Thanks,
                        Manjusha