This discussion is archived
5 Replies Latest reply: Apr 22, 2013 11:49 AM by user5716448 RSS

Performance when row movement enabled

user5716448 Newbie
Currently Being Moderated
Hi,

Using oracle 11.2.0.3 and considering using a partition key which volatile as this the best value to achieve partition pruning on as will be used in most queries.

Would then be date-range partitioned with hash sub partitions.

Considering using a given date and range-partitioning on this column - present on both fact and dimension table. However after the records inserted, this date has the potential to change - expected only affet max 5% of rows - load roughly 2 million rows into table daily so would need execute update to ensure data in correct partitions.


Know best if have a non-volatile partition key but just this one is best for our purposes - never used a volatile partition key before.

Anybody any experience on how bad from performance perspective is compared to say standard update of column which not part of partition key ?
Have some spare time at etl side and would make reporting quicker due to partition pruning but

Thanks
  • 1. Re: Performance when row movement enabled
    Iordan Iotzov Expert
    Currently Being Moderated
    The deterioration in update performance for cross-partition update will depend on
    ->size of the record /the bigger the record the worst cross-partition update time/
    ->the number of indexes /the more indexes the worst cross-partition update time/
    ->the number of updated columns with indexes /the more columns with indexes the smaller the difference between regular and cross-partition update time/

    Cross-partition update time could easily be 2-3 times slower that a regular update.
    You need to run a test in your system though.
    Also, you need to see how cross-partition update affect the clustering factors of your indexes.

    Iordan Iotzov
    http://iiotzov.wordpress.com/
  • 2. Re: Performance when row movement enabled
    rp0428 Guru
    Currently Being Moderated
    >
    Using oracle 11.2.0.3 and considering using a partition key which volatile as this the best value to achieve partition pruning on as will be used in most queries.

    Would then be date-range partitioned with hash sub partitions.

    Considering using a given date and range-partitioning on this column - present on both fact and dimension table. However after the records inserted, this date has the potential to change - expected only affet max 5% of rows - load roughly 2 million rows into table daily so would need execute update to ensure data in correct partitions.

    Know best if have a non-volatile partition key but just this one is best for our purposes - never used a volatile partition key before.
    >
    The data for a table is stored in a segment that represents the FINEST granularity available: table, partition, subpartition. Since you have a subpartitioned table that is where the data is physically stored.

    So if you change the partition key to a value that represents a different partition Oracle has to DELETE each row from the current partition and INSERT that row into the new partition.

    If ALL of the data in a partition/subpartition is being moved to a new, empty partition you can often use EXCHANGE PARTITION and avoid the DELETE/INSERT. That won't be applicable for your use case though.
    >
    Anybody any experience on how bad from performance perspective is compared to say standard update of column which not part of partition key ?
    >
    How bad? About as bad as it gets. A standard update is done 'in-place'; the row doesn't need to move. The amount of UNDO and REDO is minimal since only the old/new values of the one column needs to be logged.

    But update of the partition key is different. If the value is updated to a value that represents the same partition the update MIGHT be done in place. For range partitioning this is the case but I haven't tested hash partitioning/subpartitioning. I don't know if Oracle does it in-place for hash or not.

    Definitely for a key update representing a different partition you have the worst of both UNDO and REDO. The UNDO needs to have the entire existing row while the REDO has to have the entire new row; even though the row itself hasn't changed. You can run some simple tests that show the amount of UNDO and REDO that are generated for row movement.

    Summary - you want to avoid moving rows from one partition/subpartion to another whenever possible.

    Why does this date change? Are there any other columns that could be used to partition by?

    If you post a small sample of data and the type of query you need to use to get partition pruning maybe there are some other alternatives.
  • 3. Re: Performance when row movement enabled
    user5716448 Newbie
    Currently Being Moderated
    Thanks for the useful advice.


    query
    select sum(sales), product_name, customer_name, yrwk
    from sales,
    customers, 
    products
    where sales.product_id = products.dimension_key
    and sales.customer_id = customer.dimension_key
    and product.on_sale_yrwk between 201101 and 201104
    data would be say 2, bella, john smith, 291101 etc

    Sometimes will use in opertaor but often between operator in the predicates.
  • 4. Re: Performance when row movement enabled
    rp0428 Guru
    Currently Being Moderated
    >
    data would be say 2, bella, john smith, 291101 etc

    Sometimes will use in opertaor but often between operator in the predicates.
    >
    But that isn't what your thread is about. You are talking about moving data around and you didn't respond to these:
    >
    Why does this date change? Are there any other columns that could be used to partition by?

    If you post a small sample of data and the type of query you need to use to get partition pruning maybe there are some other alternatives.
    >
    Why are you changing the dates?
  • 5. Re: Performance when row movement enabled
    user5716448 Newbie
    Currently Being Moderated
    Hi,

    The dates have potential to change as we take feed of master data each day into warehouse and master data may update the product on sale_date whihc stored in the dimesnion and also the fact as decsiion made that this date best for querying upon.

    Thanks

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points