2 Replies Latest reply on Oct 12, 2009 2:54 PM by 630782

    Bulk loading in


      I'm using bulk load to load about 200 million triples into one model. The data is split into about 60 files with around 3 million triples in each file. I have a script file which has
      host sqlldr ...FILE1;
      exec sem_apis.bulk_load_from_staging_table(...);
      host sqlldr ...FILE2;
      exec sem_apis.bulk_load_from_staging_table(...);
      for every file to load.
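
      The driver script described above might look like the following SQL*Plus sketch (the connect string, control file, model and staging-table names are placeholders, not taken from this thread):

      ```sql
      -- Hypothetical per-file driver script, run from SQL*Plus.
      -- stage.ctl, MY_MODEL, RDFUSER and STAGE_TAB are illustrative names.
      host sqlldr rdfuser/rdfpass control=stage.ctl data=f1.nt direct=true
      exec sem_apis.bulk_load_from_staging_table('MY_MODEL', 'RDFUSER', 'STAGE_TAB')
      host sqlldr rdfuser/rdfpass control=stage.ctl data=f2.nt direct=true
      exec sem_apis.bulk_load_from_staging_table('MY_MODEL', 'RDFUSER', 'STAGE_TAB')
      -- ... and so on for f3.nt through f60.nt
      ```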

      When I run the script from the command line, the time needed per file grows as more files are loaded. The first file took about 8 minutes to load, the second about 25 minutes, and so on; after completing 14 files, loading a single file now takes two and a half hours.

      Is an index rebuild causing this behavior? If so, is there any way to disable the index during bulk loading? If not, what other parameters can we adjust to speed up the bulk load?


        • 1. Re: Bulk loading in
          Bulk-append is slower than bulk-load because of incremental index maintenance, and the index that enforces the uniqueness constraint cannot be disabled. I'd suggest moving to a newer release and then installing patch 7600122 so you can use the enhanced bulk-append, which performs much better than in earlier releases.

          The best way to load 200 million rows would be to load into an empty RDF model via a single bulk-load. You can do it as follows (assuming the filenames are f1.nt through f60.nt):

          - Create a named pipe: mkfifo named_pipe.nt
          - Concatenate the files into it: cat f*.nt > named_pipe.nt

          In a different window:
          - Run sqlldr with named_pipe.nt as the data file to load all 200 million rows into a staging table (you could create the staging table with the COMPRESS option to keep its size down).
          - Then run: exec sem_apis.bulk_load_from_staging_table(...);
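
          The named-pipe steps can be sketched in shell as follows (the sqlldr arguments are placeholders; a `wc -l` reader stands in for SQL*Loader here just to show how the FIFO behaves):

          ```shell
          #!/bin/sh
          # Create the FIFO that SQL*Loader will read from.
          mkfifo named_pipe.nt

          # Writer: stream every part file into the pipe in the background.
          # cat blocks until a reader opens the other end of the FIFO.
          cat f*.nt > named_pipe.nt &

          # Reader: in the real run this is SQL*Loader, e.g. (illustrative):
          #   sqlldr rdfuser/rdfpass control=stage.ctl data=named_pipe.nt direct=true
          # Here we just count the lines flowing through the pipe instead.
          wc -l < named_pipe.nt

          wait                # reap the background cat
          rm named_pipe.nt    # remove the FIFO when done
          ```

          Because a FIFO never materializes the combined data on disk, this avoids needing another 200 million triples' worth of scratch space for the concatenated file.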

          (I'd also suggest using COMPRESS for the application table.)
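
          A compressed staging table might be created like this (the RDF$STC_* column layout is what sem_apis expects for staging tables, but check the documentation for your release):

          ```sql
          -- Illustrative compressed staging table for bulk_load_from_staging_table.
          CREATE TABLE stage_tab (
            RDF$STC_sub  VARCHAR2(4000) NOT NULL,
            RDF$STC_pred VARCHAR2(4000) NOT NULL,
            RDF$STC_obj  VARCHAR2(4000) NOT NULL
          ) COMPRESS;
          ```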
          • 2. Re: Bulk loading in
            Thanks for the help. I combined all the files and loaded them in a single batch. It took significantly less time to complete the load.