Splitting up a file using PL/SQL after having created it?

Lentoporo1895
Lentoporo1895 Member Posts: 25
edited Mar 8, 2018 5:48PM in SQL & PL/SQL

Hello dear experts,

I am working for a customer that exports data in order to later import it into their SAP environment. The export is quite complex and even uses the APEX tables. The export is done by several queries that construct a new query, which is eventually used to export the data and save it into a file. The export and the creation of the file are fine, and they also perform well.

The project leader has reported a performance problem during the import of the file. Apparently SAP has a problem importing a large number of records. The project leader has suggested implementing some kind of counter to reduce the file size. If the export had been done using PL/SQL, that might be one solution, but it is not. I do not want to touch the concept behind the export using SQL, because of the complexity of the matter.

What I would like to ask you is: is there a way of splitting up the file into several smaller files using PL/SQL? If yes, how?

Thank you very much for your help!

Hannu

Answers

  • GregV
    GregV Member Posts: 3,096 Gold Crown
    edited Mar 8, 2018 8:14AM

    Hi,

    Looks more like a SAP problem than an Oracle one. In Unix you can use the "split" command to split a file into several smaller files. If you want to do it in PL/SQL, you'll need to read the file and write its lines out into other files.
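
    A minimal sketch of that PL/SQL route, assuming the file lives in a directory object named DATA_DIR and is called export.csv (both names are assumptions, as is the 20,000-line piece size), splitting purely on a line count:

        DECLARE
          c_lines_per_file CONSTANT PLS_INTEGER := 20000;
          l_in    UTL_FILE.FILE_TYPE;
          l_out   UTL_FILE.FILE_TYPE;
          l_line  VARCHAR2(32767);
          l_count PLS_INTEGER := 0;
          l_part  PLS_INTEGER := 0;
        BEGIN
          l_in := UTL_FILE.FOPEN('DATA_DIR', 'export.csv', 'r', 32767);
          LOOP
            BEGIN
              UTL_FILE.GET_LINE(l_in, l_line);
            EXCEPTION
              WHEN NO_DATA_FOUND THEN EXIT;  -- end of the input file
            END;
            IF MOD(l_count, c_lines_per_file) = 0 THEN
              -- close the current piece and open the next one
              IF UTL_FILE.IS_OPEN(l_out) THEN
                UTL_FILE.FCLOSE(l_out);
              END IF;
              l_part := l_part + 1;
              l_out  := UTL_FILE.FOPEN('DATA_DIR', 'export_' || l_part || '.csv', 'w', 32767);
            END IF;
            UTL_FILE.PUT_LINE(l_out, l_line);
            l_count := l_count + 1;
          END LOOP;
          UTL_FILE.FCLOSE(l_in);
          IF UTL_FILE.IS_OPEN(l_out) THEN
            UTL_FILE.FCLOSE(l_out);
          END IF;
        END;
        /

    Note this cuts on raw line count only; it knows nothing about which records belong together.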

  • John_K
    John_K Member Posts: 2,498 Gold Trophy
    edited Mar 8, 2018 8:18AM

    You could... or you could just use something simple like the unix split command - that is going to be far quicker.

    SPLIT(1)                        User Commands                        SPLIT(1)

    NAME
           split - split a file into pieces

    SYNOPSIS
           split [OPTION]... [INPUT [PREFIX]]

    DESCRIPTION
           Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...;
           default size is 1000 lines, and default PREFIX is 'x'.  With no
           INPUT, or when INPUT is -, read standard input.

           Mandatory arguments to long options are mandatory for short
           options too.

           -a, --suffix-length=N
                  generate suffixes of length N (default 2)

           --additional-suffix=SUFFIX
                  append an additional SUFFIX to file names

           -b, --bytes=SIZE
                  put SIZE bytes per output file

           -C, --line-bytes=SIZE
                  put at most SIZE bytes of lines per output file

           -d, --numeric-suffixes[=FROM]
                  use numeric suffixes instead of alphabetic; FROM changes
                  the start value (default 0)

           -e, --elide-empty-files
                  do not generate empty output files with '-n'

           --filter=COMMAND
                  write to shell COMMAND; file name is $FILE

           -l, --lines=NUMBER
                  put NUMBER lines per output file

           -n, --number=CHUNKS
                  generate CHUNKS output files; see explanation below

           -u, --unbuffered
                  immediately copy input to output with '-n r/...'

           --verbose
                  print a diagnostic just before each output file is opened

           --help display this help and exit

           --version
                  output version information and exit

           SIZE is an integer and optional unit (example: 10M is
           10*1024*1024).  Units are K, M, G, T, P, E, Z, Y (powers of 1024)
           or KB, MB, ... (powers of 1000).

           CHUNKS may be:
           N       split into N files based on size of input
           K/N     output Kth of N to stdout
           l/N     split into N files without splitting lines
           l/K/N   output Kth of N to stdout without splitting lines
           r/N     like 'l' but use round robin distribution
           r/K/N   likewise but only output Kth of N to stdout

           GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
           Report split translation bugs to <http://translationproject.org/team/>

    AUTHOR
           Written by Torbjorn Granlund and Richard M. Stallman.

    COPYRIGHT
           Copyright © 2013 Free Software Foundation, Inc.  License GPLv3+:
           GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
           This is free software: you are free to change and redistribute it.
           There is NO WARRANTY, to the extent permitted by law.

    SEE ALSO
           The full documentation for split is maintained as a Texinfo
           manual.  If the info and split programs are properly installed at
           your site, the command

                  info coreutils 'split invocation'

           should give you access to the complete manual.

    GNU coreutils 8.22              September 2016                      SPLIT(1)
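
    For the record limit discussed further down, a naive line-count split would look like this (export.csv and the part_ prefix are assumed names):

        # at most 20,000 lines per piece: part_aa, part_ab, ...
        split -l 20000 export.csv part_

        # same, with numeric suffixes and a .csv extension
        split -l 20000 -d --additional-suffix=.csv export.csv part_

    As the man page shows, split cuts on bytes or physical lines only; it has no notion of which lines belong together.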
  • Paulzip
    Paulzip Member Posts: 8,756 Blue Diamond
    edited Mar 8, 2018 8:29AM
    1. What is the file type, CSV?
    2. Where would you split the file? On some record boundary like a chr(10)?

    I suspect SAP won't work if you arbitrarily split it based on, say, size.

  • Lentoporo1895
    Lentoporo1895 Member Posts: 25
    edited Mar 8, 2018 8:50AM

    Yes, the file type is CSV. The CSV file would be split according to a number of records. The structure is as follows: the first column of the file represents the 'master value' - a number that represents a person. This person can have several products, and each product can have up to 500 records. The total number of records in one single file has to be <= 20,000 records.
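
    If the exported data set is still queryable on the database side, any splitting scheme starts from the per-person record counts; something like this (export_source and master_value are hypothetical names):

        -- how many records each master value contributes;
        -- each of these groups must be packed whole into some file
        SELECT master_value,
               COUNT(*) AS rec_cnt
        FROM   export_source
        GROUP  BY master_value
        ORDER  BY master_value;

    The task is then to pack these counts into files of at most 20,000 records each without breaking a group apart.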

    I will discuss your ideas with my project team colleagues.

    Thank you very much!

  • Paulzip
    Paulzip Member Posts: 8,756 Blue Diamond
    edited Mar 8, 2018 9:00AM

    Personally I'd split the files at file generation time, rather than in a post-generation process, mainly because you need to know the relationship between records. It's a much more complicated process doing it post generation.

  • Lentoporo1895
    Lentoporo1895 Member Posts: 25
    edited Mar 8, 2018 9:11AM

    So would I, but that would mean adding PL/SQL to a very complex and complicated structure. Since I will not be working there much longer, there is no time for me to change it. An internal employee would be able to do it once he gets the time to grasp the concept behind it all.

  • Paulzip
    Paulzip Member Posts: 8,756 Blue Diamond
    edited Mar 8, 2018 9:26AM

    OK, if I were doing it as a post-process, I'd mount the files as external tables (one for each of master value, persons, products, etc.), adding a recnum column to each external table. Identify record groupings by joining the records, and write a simple routine using UTL_FILE to write each group's lines to files, based on the recnum lines identified.
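
    A sketch of this approach under assumed names: the generated CSV is mounted as external table export_ext in directory object DATA_DIR, loading each physical line whole (Paulzip's recnum column is left out here; this version keys the grouping on the master value in the first column alone), and a greedy loop starts a new file whenever the next person's group would push the current file past 20,000 records:

        -- Mount the generated file as an external table (all names assumed).
        CREATE TABLE export_ext (
          csv_line VARCHAR2(4000)
        )
        ORGANIZATION EXTERNAL (
          TYPE ORACLE_LOADER
          DEFAULT DIRECTORY data_dir
          ACCESS PARAMETERS (
            RECORDS DELIMITED BY NEWLINE
            FIELDS (csv_line POSITION(1:4000) CHAR(4000))
          )
          LOCATION ('export.csv')
        );

        -- Greedy packer: whole master-value groups, at most 20,000
        -- records per output file.
        DECLARE
          c_max_recs CONSTANT PLS_INTEGER := 20000;
          l_out      UTL_FILE.FILE_TYPE;
          l_in_file  PLS_INTEGER := 0;   -- records written to the current piece
          l_part     PLS_INTEGER := 0;
        BEGIN
          FOR grp IN (SELECT TO_NUMBER(SUBSTR(csv_line, 1, INSTR(csv_line, ',') - 1)) AS master_value,
                             COUNT(*) AS rec_cnt
                      FROM   export_ext
                      GROUP  BY TO_NUMBER(SUBSTR(csv_line, 1, INSTR(csv_line, ',') - 1))
                      ORDER  BY 1)
          LOOP
            -- start a new file when this whole group would not fit
            IF l_part = 0 OR l_in_file + grp.rec_cnt > c_max_recs THEN
              IF UTL_FILE.IS_OPEN(l_out) THEN
                UTL_FILE.FCLOSE(l_out);
              END IF;
              l_part    := l_part + 1;
              l_in_file := 0;
              l_out := UTL_FILE.FOPEN('DATA_DIR', 'sap_part_' || l_part || '.csv', 'w', 32767);
            END IF;
            -- copy every line belonging to this master value, unchanged
            FOR rec IN (SELECT csv_line
                        FROM   export_ext
                        WHERE  TO_NUMBER(SUBSTR(csv_line, 1, INSTR(csv_line, ',') - 1)) = grp.master_value)
            LOOP
              UTL_FILE.PUT_LINE(l_out, rec.csv_line);
            END LOOP;
            l_in_file := l_in_file + grp.rec_cnt;
          END LOOP;
          IF UTL_FILE.IS_OPEN(l_out) THEN
            UTL_FILE.FCLOSE(l_out);
          END IF;
        END;
        /

    Two caveats for anything beyond a sketch: the inner loop re-scans the external table once per master value (one ordered pass would be better on a large file), and a single group larger than the cap would still land in one oversized file and need its own handling.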

  • Lentoporo1895
    Lentoporo1895 Member Posts: 25
    edited Mar 8, 2018 9:42AM

    Thank you, Paul! That is an excellent idea (after having checked your blog, I must say that I am not surprised). I will get back to you on this next week.

  • mathguy
    mathguy Member Posts: 10,685 Blue Diamond
    edited Mar 8, 2018 10:35AM
    Paulzip wrote: "OK, if I were doing it as a post-process, I'd mount the files as external tables (one for each of master value, persons, products, etc.), adding a recnum column to each external table. Identify record groupings by joining the records, and write a simple routine using UTL_FILE to write each group's lines to files, based on the recnum lines identified."

    This is a much better solution than using UNIX utilities. There are many reasons for that; just to take an example, suppose we write a script to split at chr(10) (say, at every 20,000th such character). What happens if, as in a problem posted on this forum yesterday, some of the values in the tables are VARCHAR2 strings with embedded chr(10)? How is the UNIX utility going to know that (unless we write our own version of mounting the file as an external table)?
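
    A tiny shell demonstration of that failure mode, with made-up data:

        # one header plus ONE logical CSV record whose quoted value
        # contains an embedded chr(10)
        printf 'id,comment\n1,"line one\nline two"\n' > demo.csv

        # split counts physical lines, so the quoted value is torn apart
        split -l 2 demo.csv part_
        # part_aa ends mid-record; part_ab starts with the value's tail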

    As it happens, I am helping someone right now (as a volunteer) with a process that is facing the opposite problem. They must build tables from data (stored in tables) from an outside source. They can download the data as CSV files, but they are limited to X rows per day (!!!) - and I don't think I violate confidentiality if I disclose that the source is Oracle itself. To mention just one of the issues: however the routines were written, the incoming files don't have a newline at the end of the last line. I thought people did jail time for that, but whatever... Anyway, a pain in the neck. The group in charge of importing the data refuses to import many small files; they want everything concatenated into one big file. Imagine the fun of just adding a newline at the end of each file before concatenating...
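
    That last step can at least be scripted; a sketch in shell (file names assumed), appending a newline only to files whose last byte is not already one:

        for f in download_*.csv; do
          cat "$f"
          # $(tail -c 1 ...) is empty exactly when the file already
          # ends with a newline
          [ -n "$(tail -c 1 "$f")" ] && echo
        done > combined.csv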

    Paul: question for you (perhaps you know so I don't need to test). What happens if you try to mount a file as an external table, and the file does not have a newline after the last row? Will it still work OK?

  • Paulzip
    Paulzip Member Posts: 8,756 Blue Diamond
    edited Mar 8, 2018 10:40AM
    mathguy wrote: "Paul: question for you (perhaps you know so I don't need to test). What happens if you try to mount a file as an external table, and the file does not have a newline after the last row? Will it still work OK?"

    Yes, it'll still work.

This discussion has been closed.