Splitting up a file using PL/SQL after having created it?

Hello dear experts,
I am working for a customer that exports data in order to later import it into their SAP environment. The export is quite complex and even uses the APEX tables. It is done by several queries that construct a new query, which is then used to export the data and save it into a file. The export and the creation of the file are fine, and they also perform well.
The project leader has flagged a performance problem during the import of the file. Apparently SAP has a problem importing a large number of records. The project leader has suggested implementing some kind of counter to limit the file size. If the export had been done using PL/SQL, that might be one solution, but it is not. I do not want to touch the concept behind the SQL-based export because of the complexity of the matter.
What I would like to ask you is: is there a way of splitting up the file into several smaller files using PL/SQL? If yes, how?
Thank you very much for your help!
Hannu
Answers
-
Hi,
Looks more like a SAP problem than an Oracle one. In Unix you can use the "split" command to split a file into several smaller files. If you want to do it in PL/SQL, you'll need to read the file and write its contents out to the smaller files yourself.
-
You could... or you could just use something simple like the unix split command - that is going to be far quicker.
SPLIT(1)                          User Commands                          SPLIT(1)

NAME
       split - split a file into pieces

SYNOPSIS
       split [OPTION]... [INPUT [PREFIX]]

DESCRIPTION
       Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
       size is 1000 lines, and default PREFIX is 'x'. With no INPUT, or when
       INPUT is -, read standard input.

       Mandatory arguments to long options are mandatory for short options too.

       -a, --suffix-length=N          generate suffixes of length N (default 2)
       --additional-suffix=SUFFIX     append an additional SUFFIX to file names
       -b, --bytes=SIZE               put SIZE bytes per output file
       -C, --line-bytes=SIZE          put at most SIZE bytes of lines per output file
       -d, --numeric-suffixes[=FROM]  use numeric suffixes instead of alphabetic;
                                      FROM changes the start value (default 0)
       -e, --elide-empty-files        do not generate empty output files with '-n'
       --filter=COMMAND               write to shell COMMAND; file name is $FILE
       -l, --lines=NUMBER             put NUMBER lines per output file
       -n, --number=CHUNKS            generate CHUNKS output files; see explanation below
       -u, --unbuffered               immediately copy input to output with '-n r/...'
       --verbose                      print a diagnostic just before each output file is opened
       --help                         display this help and exit
       --version                      output version information and exit

       SIZE is an integer and optional unit (example: 10M is 10*1024*1024).
       Units are K, M, G, T, P, E, Z, Y (powers of 1024) or KB, MB, ...
       (powers of 1000).

       CHUNKS may be:
       N       split into N files based on size of input
       K/N     output Kth of N to stdout
       l/N     split into N files without splitting lines
       l/K/N   output Kth of N to stdout without splitting lines
       r/N     like 'l' but use round robin distribution
       r/K/N   likewise but only output Kth of N to stdout

       GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
       Report split translation bugs to <http://translationproject.org/team/>

AUTHOR
       Written by Torbjorn Granlund and Richard M. Stallman.

COPYRIGHT
       Copyright © 2013 Free Software Foundation, Inc. License GPLv3+: GNU GPL
       version 3 or later <http://gnu.org/licenses/gpl.html>. This is free
       software: you are free to change and redistribute it. There is NO
       WARRANTY, to the extent permitted by law.

SEE ALSO
       The full documentation for split is maintained as a Texinfo manual. If
       the info and split programs are properly installed at your site, the
       command info coreutils 'split invocation' should give you access to the
       complete manual.

GNU coreutils 8.22                 September 2016
-
- What is the file type, CSV?
- Where would you split the file? On some record boundary like a chr(10)?
I suspect SAP won't work if you arbitrarily split it based on, say, size.
-
Yes, the file type is CSV. The CSV file would be split according to a number of records. The structure is as follows: the first column of the file represents the 'master value' - a number that represents a person. This person can have several products, and each product can have up to 500 records. The total number of records in one single file has to be <= 20,000.
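(Not from the thread - just a hypothetical sanity check, assuming the exported rows are also queryable through some view, here called EXPORT_V, with a MASTER_VAL column. If any single person already owns more than 20,000 records on their own, no split on person boundaries can satisfy the rule.)

-- Hypothetical check: list master values that by themselves exceed the
-- per-file limit (EXPORT_V and MASTER_VAL are made-up names).
SELECT master_val, COUNT(*) AS recs
FROM   export_v
GROUP  BY master_val
HAVING COUNT(*) > 20000;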
I will communicate your ideas with my project team colleagues.
Thank you very much!
-
Personally I'd split the files at file generation time, rather than in a post-generation process, mainly because you need to know the relationship between records. Doing it post generation is a much more complicated process.
-
So would I, but that would mean adding PL/SQL to a very complex and complicated structure. Since I will not be working there for long, there is no time for me to change it. An internal employee would be able to do it once he gets the time to grasp the concept behind it all.
-
OK, if I was doing it as a post-process I'd mount the files as external tables (one for each of master value, persons, products etc.), adding a recnum column to each external table. Identify the record groupings by joining the records, and write a simple routine using UTL_FILE that writes each group's lines to files, based on the recnum values identified.
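(Not from the thread - just a rough, untested sketch of that approach, assuming a directory object EXP_DIR, the already generated file export.csv, and made-up object names such as EXPORT_LINES_EXT and EXPORT_PART_n. It keeps each whole CSV line in one column, uses the ORACLE_LOADER RECNUM field to preserve file order, and starts a new output file only at a master-value boundary.)

-- Hypothetical external table over the generated CSV; RECNUM exposes the
-- physical record number, and the whole line is kept in a single column.
CREATE TABLE export_lines_ext (
  rec_no   NUMBER,
  csv_line VARCHAR2(4000)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY exp_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS MISSING FIELD VALUES ARE NULL
    (
      rec_no   RECNUM,
      csv_line POSITION(1:4000) CHAR(4000)
    )
  )
  LOCATION ('export.csv')
)
REJECT LIMIT UNLIMITED;

-- Hypothetical split routine: never splits one master value across files and
-- keeps each file at or below 20,000 records (assuming no single master value
-- owns more than 20,000 records on its own).
DECLARE
  c_max_recs  CONSTANT PLS_INTEGER := 20000;
  v_out       UTL_FILE.FILE_TYPE;
  v_file_no   PLS_INTEGER := 0;
  v_cnt       PLS_INTEGER := 0;
  v_prev_mstr VARCHAR2(100);

  PROCEDURE open_next_file IS
  BEGIN
    IF UTL_FILE.IS_OPEN(v_out) THEN
      UTL_FILE.FCLOSE(v_out);
    END IF;
    v_file_no := v_file_no + 1;
    v_out := UTL_FILE.FOPEN('EXP_DIR', 'export_part_' || v_file_no || '.csv', 'w', 32767);
    v_cnt := 0;
  END;
BEGIN
  FOR r IN (
    SELECT csv_line,
           REGEXP_SUBSTR(csv_line, '^[^,]+')                              AS master_val,
           COUNT(*) OVER (PARTITION BY REGEXP_SUBSTR(csv_line, '^[^,]+')) AS group_size
    FROM   export_lines_ext
    ORDER  BY rec_no
  )
  LOOP
    -- Only consider opening a new file at a master-value boundary, so one
    -- person's records always stay together in the same output file.
    IF v_prev_mstr IS NULL
       OR (r.master_val <> v_prev_mstr AND v_cnt + r.group_size > c_max_recs)
    THEN
      open_next_file;
    END IF;
    UTL_FILE.PUT_LINE(v_out, r.csv_line);
    v_cnt       := v_cnt + 1;
    v_prev_mstr := r.master_val;
  END LOOP;
  IF UTL_FILE.IS_OPEN(v_out) THEN
    UTL_FILE.FCLOSE(v_out);
  END IF;
END;
/

A real version would have to match the actual column layout, directory names and file names, and cope with a quoted first column containing embedded commas, but the shape of the approach is the same.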
-
Thank you Paul! That is an excellent idea (after having checked your blog I must say that I am not surprised). I will get back to you on this (next week).
-
Paulzip wrote: OK, if I was doing it as a post-process I'd mount the files as external tables (one for each of master value, persons, products etc.), adding a recnum column to each external table. Identify the record groupings by joining the records, and write a simple routine using UTL_FILE that writes each group's lines to files, based on the recnum values identified.
This is a much better solution than using UNIX utilities. There are many reasons for that; just to take one example, suppose we write a script to split at chr(10) (say, at every 20,000th such character). What happens if, as in a problem posted on this forum yesterday, some of the values in the tables are VARCHAR2 strings with embedded chr(10)? How is the UNIX utility going to know that (unless we write our own version of mounting the file as an external table)?
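(Illustration only, not from the thread: one logical CSV record whose quoted value embeds a chr(10), so it spans two physical lines; any tool that cuts the file at every Nth newline can cut a record like this in half. All values are made up.)

-- One logical record, two physical lines: the quoted comment field contains
-- an embedded chr(10), so counting newlines mis-counts the records.
SELECT '4711,"Mustermann","comment line one' || CHR(10) ||
       'comment line two",500' AS one_logical_record
FROM   dual;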
As it happens, I am helping someone right now (as a volunteer) with a process that is facing the opposite problem. They must build tables from data (stored in tables) from an outside source. They can download the data as CSV files, but they are limited to X rows per day (!!!) - and I don't think I violate confidentiality if I disclose that the source is Oracle itself. To mention just one of the issues: however the routines were written, the incoming files don't have a newline at the end of the last line. I thought people did jail time for that, but whatever... Anyway, a pain in the neck. The group in charge of importing the data refuses to import many small files; they want everything concatenated into one big file. Imagine the fun of just adding a newline at the end of each file before concatenating...
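(Again not from the thread - a minimal, untested sketch of one way to do that concatenation in PL/SQL, assuming a directory object EXP_DIR and made-up file names. Copying line by line with UTL_FILE fixes the missing final newline as a side effect, because GET_LINE strips the line terminator and PUT_LINE always writes one.)

DECLARE
  TYPE t_names IS TABLE OF VARCHAR2(255);
  v_files t_names := t_names('part1.csv', 'part2.csv', 'part3.csv');  -- made-up names
  v_in    UTL_FILE.FILE_TYPE;
  v_out   UTL_FILE.FILE_TYPE;
  v_line  VARCHAR2(32767);
BEGIN
  v_out := UTL_FILE.FOPEN('EXP_DIR', 'combined.csv', 'w', 32767);
  FOR i IN 1 .. v_files.COUNT LOOP
    v_in := UTL_FILE.FOPEN('EXP_DIR', v_files(i), 'r', 32767);
    BEGIN
      LOOP
        UTL_FILE.GET_LINE(v_in, v_line);
        UTL_FILE.PUT_LINE(v_out, v_line);   -- always terminated with a newline
      END LOOP;
    EXCEPTION
      WHEN NO_DATA_FOUND THEN NULL;         -- end of this input file
    END;
    UTL_FILE.FCLOSE(v_in);
  END LOOP;
  UTL_FILE.FCLOSE(v_out);
END;
/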
Paul: question for you (perhaps you know so I don't need to test). What happens if you try to mount a file as an external table, and the file does not have a newline after the last row? Will it still work OK?
-
mathguy wrote:
Paulzip wrote: OK, if I was doing it as a post-process I'd mount the files as external tables (one for each of master value, persons, products etc.), adding a recnum column to each external table. Identify the record groupings by joining the records, and write a simple routine using UTL_FILE that writes each group's lines to files, based on the recnum values identified.
Paul: question for you (perhaps you know so I don't need to test). What happens if you try to mount a file as an external table, and the file does not have a newline after the last row? Will it still work OK?
Yes, it'll still work.