Forum Stats

  • 3,741,515 Users
  • 2,248,442 Discussions
  • 7,861,843 Comments

Discussions

Partitioning a table which is currently non partitioned

user12050819
user12050819 Member Posts: 4 Blue Ribbon
edited Apr 21, 2013 11:43PM in General Database Discussions
Hi,

I have stated my problem with the following example

today i have the following non partitioned table.

CREATE TABLE ITEM (
SEC_NBR NUMBER(10,0) NOT NULL,
ITEM_NAME VARCHAR2(400 BYTE) NOT NULL,
ITEM_NUMBER NUMBER(10,0) NOT NULL,
PURCHASE_TIME DATE NOT NULL,
CONSTRAINT "PK_ITEM" PRIMARY KEY ("SEC_NBR")
);

I have the following unique index

CREATE UNIQUE INDEX UK_ITEM ON ITEM (ITEM_NAME, ITEM_NUMBER, PURCHASE_TIME DESC);

SEC_NBR is generated with a oracle sequence during the record insert

Most of my queries on table are

QUERY1: select * from ITEM where SEC_NBR=?

and

QUERY2: select * from ITEM where ITEM_NAME=? and ITEM_NUMBER=? order by PURCHASE_TIME DESC;

Today things are working for the application due to the 2 indexes. I have tested it till about 30,000,000 records
Next release they want a system which is performent with 2,300,000,000 records. Not sure if indexes can really be that efficient.
Will it ?


Partitioning plan:

CREATE TABLE ITEM (
SEC_NBR NUMBER(10,0) NOT NULL,
ITEM_NAME VARCHAR2(400 BYTE) NOT NULL,
ITEM_NUMBER NUMBER(10,0) NOT NULL,
PURCHASE_TIME DATE NOT NULL,
EXPIRY_TIME DATE DEFAULT TO_DATE('31.12.9999 23:59:59', 'dd.mm.yyyy hh24:mi:ss'),
CONSTRAINT "PK_ITEM" PRIMARY KEY ("SEC_NBR")
)
PARTITION BY RANGE(EXPIRY_TIME)
(
PARTITION LATEST VALUES LESS THAN (MAXVALUE)
) ENABLE ROW MOVEMENT;

Modified local index
CREATE INDEX UK_ITEM ON ITEM (ITEM_NAME, ITEM_NUMBER, PURCHASE_TIME DESC) LOCAL;

Every day i will create a partition like this so that not expired data will be in LATEST partition and expired one will start moving to lower partitions

ALTER TABLE ITEM SPLIT PARTITION LATEST AT (TO_DATE('19.04.2013 23:59:59', 'dd.mm.yyyy hh24:mi:ss')) INTO (PARTITION PURGE_19042013, PARTITION LATEST);
ALTER TABLE ITEM SPLIT PARTITION LATEST AT (TO_DATE('20.04.2013 23:59:59', 'dd.mm.yyyy hh24:mi:ss')) INTO (PARTITION PURGE_20042013, PARTITION LATEST);
ALTER TABLE ITEM SPLIT PARTITION LATEST AT (TO_DATE('21.04.2013 23:59:59', 'dd.mm.yyyy hh24:mi:ss')) INTO (PARTITION PURGE_21042013, PARTITION LATEST);

When an item expires i receive a notification and I will update the EXPIRY_TIME to current time
update ITEM set EXPIRY_TIME=systimestamp where EXPIRY_TIME = TO_DATE('31.12.9999 23:59:59', 'dd.mm.yyyy hh24:mi:ss' and SEC_NBR=?

I can modify QUERY2 hoping it will work on local :
select * from ITEM where ITEM_NAME=? and ITEM_NUMBER=? and EXPIRY_TIME = TO_DATE('31.12.9999 23:59:59', 'dd.mm.yyyy hh24:mi:ss') order by PURCHASE_TIME DESC;

Will query 1 be OK since PK will be a global index and table will grow in size ? I am not able to modify the query 1 because other data is not available in that query flow.

Will partitioning benefit my use case ? Is it worth to partition my table to benefit just the Query2 so that it uses a local index and that partition will have very less data
Tagged:

Answers

  • Welcome to the forum!

    Whenever you post provide your 4 digit Oracle version.
    >
    Will query 1 be OK since PK will be a global index and table will grow in size ? I am not able to modify the query 1 because other data is not available in that query flow.

    Will partitioning benefit my use case ? Is it worth to partition my table to benefit just the Query2 so that it uses a local index and that partition will have very less data
    >
    No - partitioning won't provide any benefit. Why do you think it would or should?

    Neither of your queries include the DATE that you said you plan to partition on. So how can Oracle prune any partitions? It will have to look at all partitions to know if there is any data for your query.

    Partitioning generally only provides performance benefits if the queries and predicates used allow Oracle to 'prune' partitions; that is, eliminate some partitions from being scanned because it can determine that they CANNOT possibly have any data that the query needs.

    So if your only partition key column is EXPIRY_TIME then your queries need to include a value, or range of values, for EXPIRY_TIME or ALL partitions will need to be used.

    An extreme example for your table would be a LIST partition by ITEM_NUMBER. Then a query that uses 'WHERE ITEM_NUMBER = ?' would only need the partition for that one ITEM_NUMBER; the other partitions would not even be considered.
  • user12050819
    user12050819 Member Posts: 4 Blue Ribbon
    Modified query 2 has EXPIRY_TIME in the where clause.
  • Martin Preiss
    Martin Preiss Member Posts: 2,381 Gold Trophy
    Hi,
    user12050819 wrote:
    Today things are working for the application due to the 2 indexes. I have tested it till about 30,000,000 records
    Next release they want a system which is performent with 2,300,000,000 records. Not sure if indexes can really be that efficient.
    Will it ?
    it depends: your (by design: unique) primary key will always be efficient regardless of the number of rows (you have to read only the root block, a few branch blocks, a leaf block and a table block for every access - ignoring row chaing or row migration). The efficiency of the second index is determinded by its selectivity and the clustering factor: if all the data for a combination (ITEM_NAME, ITEM_NUMBER) would reside in a few adjactent block (I assume they do not in your case) then the index access could also be fast with lots of rows. But if you have to read a lot of rows from a lot of blocks then this operation will became quite expensive.

    Two side notes: for the PK of a table with some billion rows I would perhaps prefer something bigger than NUMBER(10,0). And I would use partition names with YYYYMMDD instead of DDMMYYYY to get a meaningfull order.
    Will query 1 be OK since PK will be a global index and table will grow in size ? I am not able to modify the query 1 because other data is not available in that query flow.

    Will partitioning benefit my use case ? Is it worth to partition my table to benefit just the Query2 so that it uses a local index and that partition will have very less data
    A global index makes it hard to do administrative tasks on your table (something like partition exchange). And the row movement could make DML operations expensive if you have to adjust many rows. Apart from this I think that for your queries the partitioning would work the way you expect: the PK queries should be as fast as before and the second query should do the partition pruning using the EXPIRY_TIME and access the partition with the local index.

    Regards

    Martin
  • Hemant K Chitale
    Hemant K Chitale Member Posts: 15,755 Blue Diamond
    Next release they want a system which is performent with 2,300,000,000 records. Not sure if indexes can really be that efficient. Will it ?
    Actually, I think that the two indexes (PK and UK) will still suffice. The splitting by EXPIRY_TIME introduces complications.


    Hemant K Chitale
  • >
    Modified query 2 has EXPIRY_TIME in the where clause.
    >
    But you haven't answered the basic question: how many rows does query 2 typically return?

    If only 10 rows are returned Oracle is going to use an index so you don't even need to partition.
    >
    Every day i will create a partition like this so that not expired data will be in LATEST partition and expired one will start moving to lower partitions
    >
    That is a TERRIBLE idea. Don't even think about it.
    >
    When an item expires i receive a notification and I will update the EXPIRY_TIME to current time
    >
    That is the second most TERRIBLE idea after the one above. Why would you want to keep moving data around like that if you don't need to?
    >
    Will partitioning benefit my use case ? Is it worth to partition my table to benefit just the Query2 so that it uses a local index and that partition will have very less data
    Until you provide the basic information about the basic row counts you won't know

    1. How many rows will be UNEXPIRED at any given time?
    2. How many rows will query 2 typically return?

    Even assuming you really needed to do that data movement you should create your partitioned table by:

    1. Creating the table using ENABLE ROW MOVEMENT

    2. Create your own 'max value' partition using a date like '01/01/4000' and make it the default date for the column

    3. Manually create your partitions - there is no need at all to keep splitting a MAXVALUE partition.

    4. Update expired data by changing the expiration date - Oracle will automatically move the row to the correct partition for you.

    You should not even think about using partitioning this way and moving your data around constantly until:

    1. You have determined that you even have a problem to begin with

    2. You have eliminated other possible solutions to your problem
This discussion has been closed.