"~ Load million records from flat file to Oracle tables ~"
Could anyone please let me know what is the optimum solution to load this kind of huge volume of data into oracle tables, considering the memory access and other factors which option is best?
a. SQL Loader?
b. SQL Loader direct path access?
c. External Table?
d. Any other options!!!
Depends on some other factors also, but i would use external tables.
-where will flat files be located, on some remote machine or on the database server ?
-what db version are you using ?
-how often loading will be happening ?
-what operations are done on the loaded data ?
Interesting discussions about your doubt can be read here:
Thanks for your reply.
-where will flat files be located, on some remote machine or on the database server?
The flat file is there in the DB server... some oracle directory path which will be physically mapped to the physical location of the server.
-what db version are you using?
Oracle 10g Rel 2
-how often loading will be happening?
-what operations are done on the loaded data?
If it is an external table then, the selecting data would be requiring updating or inserting into some other tables.
Now, if I have million records in external table and from there I have to load data into other heap tables then how the memory will be used... Means will it used the buffer memory or OS memory or Oracle server memory?
If I have to load data from this external table to the other oracle table which will be the most efficient method to achieve this performance wise?
Edited by: user12854528 on 25-Mar-2010 08:05
Looking at your requirements, direct load using SQL*Loader seems to be appropriate. I do not have any performance figures of using impdb vs imp direct, but if impdp can outperform imp direct, then impdp would be the way to go.
How many millions of rows are expected ? The load should not take more than a few hours, if tuned appropriately (assuming sufficient hardware resources are available)
Thanks for your reply. I am expecting 1.5 to 2 millions of record in the flat file.
After seeing the replies I have summarised the below
"SQL *Loader direct Path"
Steps: 1. Load data directly into the staging table
2. Process the data as per the business requirements
Advantage: 1. Less number of steps involved
2. Direct path SQL*Loader is fast and efficient, It is even fast when indexes on the target table are not dropped before the load
Disadvantage: 1. As for disadvantages, direct path loading has implications for recovery akin to the NOLOGGING keyword. Also, indexes on the target table are unusable for a period during the load. This can impact users trying to access the target table while a load is in progress. Also, indexes can be left in an unusable state if the SQL*Loader sessions were to crash midway through the load.
Steps: 1. Load data into external table
2. From external table load the data into staging table through data pump utility
3. Process the data as per the business requirements
Advantage: 1. Data readily available once the source flat file is available in the server
Disadvantage: 1. More number of steps involved
From here it is viewable that in this case the SQL *Loader direct path method is the better choice between these two.
Could you please let me if this is correct?
if the file does not reside on remote server (and if you load large volumes and want speed and thus probably keep the data on local volume), then external tables will give you identical speed to SQL Loader.
I would highly recommend to forget SQL*LOADER forever and use external tables. The only advantage of SQL Loader is the ability to load data over network.
one more advantage of sqlloader prior 10g was loading LOB, this can be done in 10g using external tables as well
to be honest few millions of rows should be very quick task for external table and can be done in parallel
external tables don't have more steps, actually I believe external tables have less steps
SQLLOADER needs to execute unix command to load the data to staging, then possibly data manipulation. It needs control file (a file stored outside database) and the process what loads data needs access to unix
EXTERNAL TABLE use simple INSERT INTO ... SELECT ... FROM EXTERNAL TABLE and the data manipulation can be done in the same SQL clause (unless is very complex). All objects are stored in database, there is no need to access unix (unless you want to see log/bad file - which can be viewed using second external table)