This discussion is archived
13 Replies Latest reply: Apr 24, 2013 2:47 PM by eebbesen RSS

Alternative to sequence and timestamp: uniquely ordering records in time

eebbesen Newbie
Currently Being Moderated
Alternative to sequence and timestamp: uniquely ordering records in time

I'm required to uniquely identify when records are created in time. Sequence caching means I cannot use a sequence-based ID and batching inserts means that all records inserted in one batch will have the same timestamp value (even using TIMESTAMP(9)).

The best alternatives I've ideated so far
* creating an additional sequence used for each unique timestamp
* not batching inserts of the records to force a unique timestamp for each record

Here's my requirement: I have an API that allows users to supply a sequence as a marker and request all records since that time. For example, they request 1000 records with a marker of 7 and they'll get the next 1000 records in my table. For example, they request 1000 records with a marker of 7 and they'll get 1000 records from my table with an ID greater than 1007. As an example let's say the numerically greatest ID of the returned 1000 records is 2045 so we return 2045 as the marker Later the clients request 1000 records with a marker of 2045 expecting to get the next batch of 1000 and a new marker.

Pretty straightforward way to allow them to get all of the records in whatever size works for them without missing any. However, due to sequence caching across multiple Exadata nodes, at the time the client requests 1000 records with a marker of 1007, a record with an ID of 2020 may not have been created. Therefore, when they do the next request using the marker of 2045, they will have missed record 2020 forever. Using the ID to get the timestamp of the associated record solves this, but then I must make sure to always insert records into the table individually to guarantee unique timestamps.

Hopefully I just haven't hit on the correct terms to search for existing answers. I feel that this is a problem that should have been solved by some pattern(s) for years.

Thank you for your time.
  • 1. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    rp0428 Guru
    Currently Being Moderated
    >
    Alternative to sequence and timestamp: uniquely ordering records in time

    I'm required to uniquely identify when records are created in time. Sequence caching means I cannot use a sequence-based ID and batching inserts means that all records inserted in one batch will have the same timestamp value (even using TIMESTAMP(9)).

    The best alternatives I've ideated so far
    * creating an additional sequence used for each unique timestamp
    * not batching inserts of the records to force a unique timestamp for each record

    Here's my requirement: I have an API that allows users to supply a sequence as a marker and request all records since that time. For example, they request 1000 records with a marker of 7 and they'll get the next 1000 records in my table. For example, they request 1000 records with a marker of 7 and they'll get 1000 records from my table with an ID greater than 1007. As an example let's say the numerically greatest ID of the returned 1000 records is 2045 so we return 2045 as the marker Later the clients request 1000 records with a marker of 2045 expecting to get the next batch of 1000 and a new marker.

    Pretty straightforward way to allow them to get all of the records in whatever size works for them without missing any. However, due to sequence caching across multiple Exadata nodes, at the time the client requests 1000 records with a marker of 1007, a record with an ID of 2020 may not have been created. Therefore, when they do the next request using the marker of 2045, they will have missed record 2020 forever. Using the ID to get the timestamp of the associated record solves this, but then I must make sure to always insert records into the table individually to guarantee unique timestamps.

    Hopefully I just haven't hit on the correct terms to search for existing answers. I feel that this is a problem that should have been solved by some pattern(s) for years.
    >
    The ONLY way to ensure that records are SERIALIZED is to use a SERIALIZED process to insert the records.

    The issue has NOTHING to do with sequences or timestamps. It has EVERYTHING to do with the fact that you are using a multi-user (i.e. non-serialized) process.

    One session CANNOT see uncommitted data in other sessions. A session can ONLY use an INSERT statement to put data into a table but that INSERT statement is NOT transactional; there is no commit associated with it. That means that another session can not see the first sessions data that was 'inserted' until it is committed. So if the second session does an INSERT then how should that 'MAX MARKER' number be determined?

    You have to use a serialized process whereby ALL inserts are performed by the same session.

    Are you REALLY, REALLY sure you want to serialize the insert process for this table?

    And this isn't necessarily true at all
    >
    batching inserts means that all records inserted in one batch will have the same timestamp value (even using TIMESTAMP(9)).
    >
    That depends entirely on how you determine the timestamp value being used. With 9 digits of precision you could always just use a 'rownum-like' value for each row as the nano-seconds to make the values different.
  • 2. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    sb92075 Guru
    Currently Being Moderated
    eebbesen wrote:
    Alternative to sequence and timestamp: uniquely ordering records in time

    I'm required to uniquely identify when records are created in time. Sequence caching means I cannot use a sequence-based ID and batching inserts means that all records inserted in one batch will have the same timestamp value (even using TIMESTAMP(9)).
    userid column is included in records being inserted
    each user has their own SEQUENCE; which populates column in records being inserted.
  • 3. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    L-MachineGun Pro
    Currently Being Moderated
    You could try a combination of SYSTIMESTAMP() and ROWNUM which would serialize the batch within the timestamp(9),
    being aware also of the above 2 posts.
    :p
  • 4. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    JohnWatson Guru
    Currently Being Moderated
    You might be able to do something with an Advanced Queue. Have your batch insert process enqueue a message for each row, with the row's sequence number (the sequence can be CACHE NOORDER). The messages will become visible when the insert batch commits. Then to retrieve rows, don't go directly to the table: go to the queue, and read and dequeue messages a thousand at a time, starting with the oldest. AQ can ensure that each message is dequeued exactly once. Out-of-order inserts or gaps in the sequence or serializing the inserts won't be a problem, because the messages in the queue will be serialized. You can even have several processes dequeueing concurrently, and they will each get their own group of messages.
    Any good?
  • 5. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    eebbesen Newbie
    Currently Being Moderated
    Update -- I'm going to address the individual responses above. Just wanted to say thanks to everyone for quick replies and to apologize for my dense (and somewhat obtuse) description of the problem.

    Edited by: eebbesen on Apr 24, 2013 1:59 PM
  • 6. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    eebbesen Newbie
    Currently Being Moderated
    Thank you John.

    I may end up using this solution if the L-MachineGun approach does not work for me -- that one has a lower overhead/upfront cost since I just have to modify my query

    What I didn't mention above is that I need to have the process be repeatable -- clients may wish to retrieve the same records many times using the same or a different marker and/or fetch size. So what I'd like to do is have each record get the combination of identifiers I need during the first insert (no updates).

    I can see employing the queue as the mechanism by which I queue records for insert or for some other job that assigns a unique, sequential (gaps are fine) identifier to each row. I am concerned, however, that whatever process would do this would itself become a bottleneck, especially in the context of an Exadata or rack deployment.
  • 7. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    eebbesen Newbie
    Currently Being Moderated
    Thank you for your response rp0428.

    I am hoping to have my cake and eat it, too -- meaning I want to have the result of serialization without serializing :).

    In any case, I think that the L-MachineGun approach is what I'm going to try first and return to serialization (via a queue or other means) if that doesn't work for me.

    UPDATE: After reading and re-reading your suggestion, I'm kicking myself because I think the answer is indeed using ROWNUM -- Oracle's built in marker. I'm embarrassed to admit that this didn't come to me before. However, then I read [http://www.oracle.com/technetwork/issue-archive/2006/06-sep/o56asktom-086197.html] and it looks like that won't work for me consistently since it is not guaranteed to be constant over time and after migrations, etc.

    On the timestamp front, I did find that the timestamp (even to 9 units of precision) was the same for bulk/batch insert companions.

    Cheers!

    Edited by: eebbesen on Apr 24, 2013 2:14 PM

    Edited by: eebbesen on Apr 24, 2013 2:17 PM
  • 8. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    rp0428 Guru
    Currently Being Moderated
    You missed this part
    >
    That depends entirely on how you determine the timestamp value being used. With 9 digits of precision you could always just use a 'rownum-like' value for each row as the nano-seconds to make the values different.
    >
    That meant if you need a TIMESTAMP you could construct your own value by using a 'rownum-like' value as the nano-seconds part.

    So instead of inserting SYSTIMESTAMP (which, as you suggest, would use the same value for ALL rows) you would modify the nano-second part of SYSTIMESTAMP to make it different for each row. By computing a ROWNUM for each batch of records you would add that ROWNUM (as a nano-second value: 1 => 1 nanosecond, 2 => 2 nanoseconds, etc) to the SYSTIMESTAMP value.
  • 9. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    eebbesen Newbie
    Currently Being Moderated
    Thank you L-MachineGun.

    A co-worker of mine hit upon a similar suggestion: I will use the timestamp and existing ID. (Because of [http://www.oracle.com/technetwork/issue-archive/2006/06-sep/o56asktom-086197.html] I cannot rely on it). So it will work like this:
    1. Client requests n records using the record ID I gave them for the last request (or 0 if they want to start from the beginning) as the marker
    2. I use the ID to find the timestamp
    3. Select records that meet one of the following conditions
    a) have a timestamp greater than the record associated with the ID
    b) have the same timestamp as the record associated with the ID and have an ID greater than the passed in ID
  • 10. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    eebbesen Newbie
    Currently Being Moderated
    Another possible solution: use ROWID and hope that the table doesn't get repartitioned or somehow explicitly persist ROWID simply for marking purposes and go off of that.
  • 11. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    rp0428 Guru
    Currently Being Moderated
    >
    Another possible solution: use ROWID and hope that the table doesn't get repartitioned or somehow explicitly persist ROWID simply for marking purposes and go off of that.
    >
    Maybe your medication kicking in?

    ROWID has NOTHING to do with when a row was inserted into a table.

    The very next row you insert may wind up as the FIRST row in the FIRST block of the FIRST file of the segment.
  • 12. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    eebbesen Newbie
    Currently Being Moderated
    Thanks for the clarification!
  • 13. Re: Alternative to sequence and timestamp: uniquely ordering records in time
    eebbesen Newbie
    Currently Being Moderated
    I did see this part but wasn't sure how to implement it -- thank you for the details rp0428.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points