10 Replies Latest reply: Apr 2, 2013 1:46 PM by EdStevens

    Reg : Spool CSV data -

    ranit B
      Hi Experts,

      I'm pulling a prod table's data into a CSV file using the <tt>SPOOL</tt> command.
      But the file is getting too huge. The table has ~49K records and ~35K have been written so far... the file is already 190 MB and still growing.

      When I opened the CSV, I noticed that the next cell doesn't start immediately after the last column; there's a long gap of blank space.
      It seems the query couldn't generate a proper CSV.

      I'm doing something like below -
      set colsep ','
      set echo off newpage 0 space 0 pagesize 0 feed off head off trimspool on linesize 10000 
      spool c:\rb2\testing.csv
      
      SELECT 
          x.col1||','||
          x.col2||','||
          x.col3||','||
          x.col4||','||
          x.col5
      FROM 
          test_table x;
          
      spool off;
      Any idea why the size is getting so huge and how can it be avoided?
      Thanks!

      - Ranit
        • 1. Re: Reg : Spool CSV data -
          jeneesh
          • 2. Re: Reg : Spool CSV data -
            Karthick_Arp
            That's because you have set your LINESIZE to 10000.

            I would use DUMP_CSV. I have got it installed in production. It always helps.

            http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:68212348056
            • 3. Re: Reg : Spool CSV data -
              ranit B
              That's because you have set your LINESIZE to 10000.
              Thanks Karthick!!!
              Indeed this was the problem... I set <tt>LINESIZE = 2500</tt>.
              I'm aware this still has an issue, i.e. if my row's data length exceeds 2500, it'll get truncated... right??
              I would use DUMP_CSV. I have got it installed in production. It always helps.

              http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:68212348056
              Thanks again for this... just had a glance. Nice utility.
              • 4. Re: Reg : Spool CSV data -
                Karthick_Arp
                ranit B wrote:
                That's because you have set your LINESIZE to 10000.
                Thanks Karthick!!!
                Indeed this was the problem... I set <tt>LINESIZE = 2500</tt>.
                I'm aware this still has an issue, i.e. if my row's data length exceeds 2500, it'll get truncated... right??
                It will not be truncated but wrapped onto the next line. And when you read it as a CSV file, the data will be incorrect.


                e.g.

                Say you have the output 1,2,3. But when it gets wrapped it will become

                1,2,
                3

                Now it's 2 rows in your CSV file.
                SQL> set linesize 3
                SQL> 
                SQL> select ename from emp where rownum = 1;
                 
                ENA
                ---
                SMI
                TH
                 
                 
                SQL> set linesize 100
                SQL> 
                SQL> select ename from emp where rownum = 1;
                 
                ENAME
                ------
                SMITH
                 
                SQL> 
                • 5. Re: Reg : Spool CSV data -
                  ranit B
                  You rock Karthick...Thanks!!
                  • 6. Re: Reg : Spool CSV data -
                    EdStevens
                    ranit B wrote:
                    Hi Experts,

                    I'm pulling a prod table's data into a CSV file using the <tt>SPOOL</tt> command.
                    But the file is getting too huge. The table has ~49K records and ~35K have been written so far... the file is already 190 MB and still growing.

                    When I opened the CSV, I noticed that the next cell doesn't start immediately after the last column; there's a long gap of blank space.
                    It seems the query couldn't generate a proper CSV.

                    I'm doing something like below -
                    set colsep ','
                    set echo off newpage 0 space 0 pagesize 0 feed off head off trimspool on linesize 10000 
                    spool c:\rb2\testing.csv
                    
                    SELECT 
                    x.col1||','||
                    x.col2||','||
                    x.col3||','||
                    x.col4||','||
                    x.col5
                    FROM 
                    test_table x;
                    
                    spool off;
                    Why do you set colsep, but then structure your query with concatenation, inserting your own commas and creating (from Oracle's perspective) a single column of output?
                    And is there any chance any of your columns will have data that itself includes a comma?

                    Any idea why the size is getting so huge and how can it be avoided?
                    Thanks!

                    - Ranit
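Ed's observation suggests a simpler script: use either the manual concatenation or COLSEP, not both. A minimal sketch, keeping the concatenation (one real output column) and dropping COLSEP; the linesize of 300 is an assumed width, not taken from the thread, and should be sized to the longest real row:

```sql
-- Sketch: keep the hand-built concatenation (a single output column)
-- and drop COLSEP, which has nothing left to separate here.
set echo off newpage 0 space 0 pagesize 0 feedback off heading off
set trimspool on linesize 300  -- assumed width; size to the longest real row

spool c:\rb2\testing.csv

SELECT x.col1 ||','|| x.col2 ||','|| x.col3 ||','|| x.col4 ||','|| x.col5
FROM   test_table x;

spool off
```

With a linesize close to the real row width, far less padding has to be trimmed, which keeps the spool file near the size of the actual data.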
                    • 7. Re: Reg : Spool CSV data -
                      ranit B
                      Nice point, Ed; I thought about that too.
                      But mostly there won't be any input data with a comma in it.

                      Do you have any workaround for this?
                      • 8. Re: Reg : Spool CSV data -
                        EdStevens
                        ranit B wrote:
                        Nice point, Ed; I thought about that too.
                        But mostly there won't be any input data with a comma in it.
                        To paraphrase Yoda, "There is no 'mostly'. Either there is, or there is not."
                        Do you have any workaround for this?
                        - Use as a column separator a character that absolutely does not and never will appear in the data. Of course, by definition, a csv file uses a comma. So...

                        - Enclose the individual column values in double quotes, i.e.:
                        "fname","lname","city"
                        "Joe","Shmo","Akron"
                        "Beetle","Baily","Camp Swampy"
                        • 9. Re: Reg : Spool CSV data -
                          BluShadow
                          EdStevens wrote:
                          ranit B wrote:
                          Nice point, Ed; I thought about that too.
                          But mostly there won't be any input data with a comma in it.
                          To paraphrase Yoda, "There is no 'mostly'. Either there is, or there is not."
                          Do you have any workaround for this?
                          - Use as a column separator a character that absolutely does not and never will appear in the data. Of course, by definition, a csv file uses a comma.
                          By which definition?

                          CSV correctly stands for "Character Separated Values", though over time people have corrupted the acronym to mean "Comma Separated Values", which is not correct. The character separator can be whatever is required to uniquely separate the values in the data.
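If the file's consumer accepts a different separator, the thread's query only needs the delimiter swapped, e.g. a pipe (assuming '|' never occurs in the data):

```sql
-- Sketch: pipe-separated values instead of commas.
SELECT x.col1 ||'|'|| x.col2 ||'|'|| x.col3 ||'|'|| x.col4 ||'|'|| x.col5
FROM   test_table x;
```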
                          • 10. Re: Reg : Spool CSV data -
                            EdStevens
                            BluShadow wrote:
                            EdStevens wrote:
                            ranit B wrote:
                            Nice point, Ed; I thought about that too.
                            But mostly there won't be any input data with a comma in it.
                            To paraphrase Yoda, "There is no 'mostly'. Either there is, or there is not."
                            Do you have any workaround for this?
                            - Use as a column separator a character that absolutely does not and never will appear in the data. Of course, by definition, a csv file uses a comma.
                            By which definition?

                            CSV correctly stands for "Character Separated Values", though over time people have corrupted the acronym to mean "Comma Separated Values", which is not correct. The character separator can be whatever is required to uniquely separate the values in the data.
                            I stand corrected.