This discussion is archived
5 Replies Latest reply: Jul 12, 2013 5:04 AM by user8653607 RSS

sdo_geor.reproject  performance-problem

user8653607 Newbie
Currently Being Moderated

Hi,

 

actually I transform a georaster-Object from epsg-Code:31467 to egsg-Code:4647 and the transformation took really really long.

Is there a way to improve the performance of sdo_geor.reproject, maybe use multi-processor-processing?

 

details:

- importing 4100 rasterimages (rgb with 5000x5000 pixel) in Oracle an build the mosaic needs roud about 1 day

- reproject the mosaic runs now since 13 days and still is running

  • 1. Re: sdo_geor.reproject  performance-problem
    JeffreyXie Journeyer
    Currently Being Moderated

    looks like the size of your total data set is about 286GB. how are they overlapping each other? how are they blocked and interleaved in the database and are they compressed? are the source georasters sharing the same RDT table? are the RDTs in the same tablespace and how many datafiles in each tablespace if you are not using ASM? what is your server's hardware configuration? based on these info we should be able to give you some recommendation for performance improvement and even replicate your tests.

     

    Yes, you can leverage concurrent processing to reproject the mosaic in parallel to speed up - by some scripting in 11g and we can help with some sample scripts.

     

    But it would be better to upgrade to 12c if you could - The new mosaic (sdo_geor_agg.mosaicSubset, which supports reprojection too) and rectification functions are significantly more powerful and faster and support parallelism. You simply specify a single number for the DoP (degree of parallelism) parameter when you call the mosaic function.

     

    Best,

    Jeffrey Xie

  • 2. Re: sdo_geor.reproject  performance-problem
    user8653607 Newbie
    Currently Being Moderated

    Hello Jeffrey,

     

     

    yes the total size of our data is 286 GB. Most of the single images don't overlapp, they touch, the rest around 18% (imagesize: 6500x5200) overlap by 250x5000 pixel.

    The blocksize is (512,512,3) , interleaving type is BIP and they are not compressed. The source georaster for the mosaic are in the same RDT-Table.

    All RDTs are in the same tablespace and we use 1 File in the tablespace - bigfile tablespace - and not ASM.

    Our Hardware-Configuration is: 12 GB RAM with automatic memory management, 8 CPUs.

     

    We can not switch to Oracle 12c in the near future, actually we have to use 11.2.0.3. But we will keep it in mind to switch as soon as possible.

    Can you sent me some sample scripts for parallel processing sdo_geor.reproject?

     

    best regards,

    Bertram

  • 3. Re: sdo_geor.reproject  performance-problem
    JeffreyXie Journeyer
    Currently Being Moderated

    Bertram,

     

    thanks for the info. (1) make sure you've created a temporary tablespace with autoextend for the user. (2) always devote a single RDT table to large georaster object - use different RDT's for the mosaic and the reprojected mosaic. (3) when you mosaick, better use a larger block size than 512x512 for the output mosaic and let the mosaic function optimize the blocking size automatically and do the same for the reprojection output. for reprojection, the speed is certainly unexpected, it would be helpful to get an AWR report to see what bottle neck is. By default, the database will take a snapshot every hour, so you should be able to get the AWR report between the snapshots when the reprojection is running.

     

    not sure if you used securefile for the RDT lob storage. otherwise, make sure when you create the RDT's you use SecureFiles for the LOB storage, which is quite a few times faster than the default BasicFiles storage.

  • 4. Re: sdo_geor.reproject  performance-problem
    fechen Newbie
    Currently Being Moderated

    Hi,

     

    Here is the script for the concurrent/parallel reprojection. The idea is to take advantage of large RAM and multiple CPUs by concurrently using sdo_geor.reproject with crop_area to reproject the mosaicked image into tiles. Then use the sdo_geor.mosaic again to mosaic the reprojected tiles into one image. To avoid congestion on the IO, it would be helpful to create different tablespaces for read and write, and for each tablespace use multiple data files.

     

    This is the top level shell script to invoke the reprojection concurrently:

     

    <code>

     

    #!/bin/bash

     

    concurrency=$1

    out_srid=$2

     

    echo "=============================================="

    echo "Concurrency="${concurrency}

     

    # the dimension of the source image

    xdim=100870

    ydim=159697

     

    p0x=0

    p0y=0

    p1x=0

    p1y=0

    size=$((xdim/concurrency))

    echo size is ${size}

     

    start_id=1

    overlap=20

     

    date

    d1="$(date +%s)"

    while [ ${p1y} -lt ${ydim} ]

    do

      ((p1y+=size))

      p0x=0

      p1x=0

     

      while [ ${p1x} -lt ${xdim} ]

        do

          echo "id="${start_id}

     

          ((p1x+=size))

     

        sqlplus georaster/georaster @rpj.sql ${start_id} ${p0x} ${p0y} ${p1x} ${p1y} ${out_srid} > rpj_${concurrency}_${start_id}.log  &

     

          (( start_id++))

          p0x=$((p1x - overlap))

        done

     

      wait

      p0y=$((p1y - overlap))

     

    done

     

    date

    d2="$(date +%s)"

    d3="$(expr $d2 - $d1)"

    echo "Total time for concurrency of "$concurrency" is "$d3" seconds."

     

    </code>

     

    Here is the reprojection script rpj.sql invoked by the shell script:

     

    <code>

    -- reproject part of the source image into a reprojected tile

     

    define n=&1         -- id of the reprojected tile

    define r1=&2        -- row number of the upper left corner

    define c1=&3        -- column number of the upper left corner

    define r2=&4        -- row number of the lower right corner

    define c2=&5        -- column number of the lower right corner

    define SRID=&6      -- the target srid of the reprojection

     

    set timing on;

    declare

      gr1  sdo_georaster;

      gr2  sdo_georaster;

    begin

      select grobj into gr1 from gr_mosaic where id = 1;

      delete from gr_tiles_rpj where id=&n;

      insert into gr_tiles_rpj (id, description, grobj) values ( &n,

         '&r1 '||'&c1 ' ||'&r2 ' || '&c2', sdo_geor.init('gr_tiles_rpj_rdt1', &n)) 

           returning grobj into gr2;

      SDO_GEOR.reproject( gr1, 0, sdo_number_array(&r1, &c1, &r2, &c2), null,

                          'resampling=bilinear', null, &SRID, gr2);

      update gr_tiles_rpj set grobj = gr2 where id = &n;

      commit;

    end;

    /

     

    exit

     

    </code>

     

    The next step is to mosaick the reprojected tiles into one image. It should be similar to your previous mosaic script.

     

    Hope it helps.

     

    Fengting

  • 5. Re: sdo_geor.reproject  performance-problem
    user8653607 Newbie
    Currently Being Moderated

    Hi Jeffrey and Fengting,

     

    thanks for your tips and the script.

     

    we found some problems which decrease the reprojecting-performance

    - the OS of our DB-server swapped data during reprojecting, because of insufficient memory. Now we get a new server and hope it will become better.

    - we defined our own MDSYS.SDO_COORD_OPS  with a large ascii-file in MDSYS.SDO_COORD_OP_PARAM_VALS  this decreased the performance too

     

    yes, we use SecureFiles for storage the LOB.

     

    Now we try

    - parallel reprojecting after importing the rasterfiles and the build the mosaic afterwards

    - using different RDT for reproject, mosaic and generatePyramid

    - automatically optimize the blocking size for mosaic function

    - different tablespaces for read/write

     

    I will write in a few days when we will have a result

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points