This discussion is archived
1 2 Previous Next 20 Replies Latest reply: Jan 9, 2012 11:42 AM by Charles Lamb RSS

Adding more nodes later

906230 Newbie
Currently Being Moderated
Hi All,

I would like to know what will we have to do if we want to add more nodes after Oracle NoSQL install and configuration.
What will we do on all clients and on all existing nodes?
Can you please explain all the steps?
Working with NoSQL Database, we have to deal with performance problems so we have to thing of such a situation where we need to add more nodes later.
Is there a maximum of nodes in a replication group?
Thanks for your response.
  • 1. Re: Adding more nodes later
    Charles Lamb Pro
    Currently Being Moderated
    Hello,

    Release 1 of NoSQL Database does not allow you to add nodes to the system. That capability will be available in some future release.

    The number of nodes in a replication group is uniform across all replication groups in the system. That number is set when you configure the system.

    Charles Lamb
  • 2. Re: Adding more nodes later
    906230 Newbie
    Currently Being Moderated
    Hi Charles,

    So why is it written that "the number of partitions is chosen to be significantly larger than the maximum number of storage nodes expected in the sotre"
    in http://www.oracle.com/technetwork/database/nosqldb/learnmore/nosql-database-498041.pdf page 8 next to picture "Figure 3: Request Processing".
    Thanks.
  • 3. Re: Adding more nodes later
    greybird Expert
    Currently Being Moderated
    I think there are two reasons:

    1) The ability to add storage nodes is coming in the future, and there is an assumption that many apps will eventually need this feature.

    2) The number of partitions is immutable in an existing store, so it cannot be changed even when upgrading to a new version of NoSQL DB.

    --mark                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
  • 4. Re: Adding more nodes later
    Charles Lamb Pro
    Currently Being Moderated
    What Mark says is correct. To add to that, when dynamic expansion is implemented, the unit of granularity will be the partition. By having a large number of partitions (i.e. smaller partitions rather than larger partitions), it will be easier to move data around when new nodes are added.

    Charles Lamb
  • 5. Re: Adding more nodes later
    801331 Newbie
    Currently Being Moderated
    It seems like a curious trade-off. The guide says, “Note that there is some overhead in configuring an excessively large number of partitions. That said, it does no harm to select a partition value that gives you plenty of room for growing your store.” Then it says a factor of 100 is not unreasonable. But there is no guidance on what the consequences are of choosing a very large number (say, ten million).

    What kind of overhead is involved (space, time, some other dimension I can’t think of)?

    joe
  • 6. Re: Adding more nodes later
    Charles Lamb Pro
    Currently Being Moderated
    user12620046 wrote:
    It seems like a curious trade-off. The guide says, “Note that there is some overhead in configuring an excessively large number of partitions. That said, it does no harm to select a partition value that gives you plenty of room for growing your store.” Then it says a factor of 100 is not unreasonable. But there is no guidance on what the consequences are of choosing a very large number (say, ten million).

    What kind of overhead is involved (space, time, some other dimension I can’t think of)?
    NoSQL Database is implemented using Berkeley DB Java Edition for the storage engine. Each partition in NoSQL Database is a JE Database. This means that there is more metadata to deal with. The JE cleaner maintains metadata per database and there are more entries in JE's directory mapping tree (basically mapping names and ids to actual Databases).

    Charles Lamb
  • 7. Re: Adding more nodes later
    906230 Newbie
    Currently Being Moderated
    Hi Charles,
    You said By having a large number of partitions (i.e. smaller partitions rather than larger partitions), it will be easier to move data around when new nodes are added
    Are Partition table and JE's directory mapping tree the same thing ?
    Does that mean each time a partition is added we will have redistributed all data ?
    Is Oracle using a consistent hashing ?
    If two customer have the same number of nodes. How can the number of partition make a difference between the two customers?
  • 8. Re: Adding more nodes later
    Charles Lamb Pro
    Currently Being Moderated
    user962305 wrote:
    Are Partition table and JE's directory mapping tree the same thing ?
    No. JE's directory mapping tree maps database names to databases in the JE Environment. The partition table maps partition ids to nodes and databases in each node. Each partition is represented as a JE Database.
    Does that mean each time a partition is added we will have redistributed all data ?
    No. Presently, there are no plans to allow adding partitions. That's why you should specify a relatively large number of partitions when you provision the NoSQL Database initially. There are plans to allow adding nodes to the system. When a new node is added, some, but not all, partitions will be redistributed to the new nodes.
    Is Oracle using a consistent hashing ?
    A variation of it.
    If two customer have the same number of nodes. How can the number of partition make a difference between the two customers?
    The one with more partitions will have finer granularity for rebalancing when more nodes are added.

    Charles Lamb
  • 9. Re: Adding more nodes later
    896774 Newbie
    Currently Being Moderated
    Hi Charles,
    The one with more partitions will have finer granularity for rebalancing when more nodes are added.
    I am not sure to understand since the two customers have the same number of nodes and replication group.
    1. I assumed the number of partitions is used in hash function
    2. I assumed the number of partition des not correspond to the number of replication group (the number of partition is higher than the number of replication group)
    3. Do you use the number of partitions as if it was the number of nodes to which data is distributed? And then used modulo to map each partition to a single replication group?
    4. How do you decide which data should be redistributed?
    Thanks.
  • 10. Re: Adding more nodes later
    Charles Lamb Pro
    Currently Being Moderated
    893771 wrote:
    Hi Charles,
    The one with more partitions will have finer granularity for rebalancing when more nodes are added.
    I am not sure to understand since the two customers have the same number of nodes and replication group.
    The number of partitions is specified at configuration/provisioning time for the system. Generally we recommend that it be >> the number of nodes. e.g. if you have 10 nodes and expect to ultimately grow to 100 nodes, then specify nParts as 1000 or 10,000. From your question, I assumed that the two customers had the same number of nodes but different numbers of partitions specified at provisioning time.
    1. I assumed the number of partitions is used in hash function
    Correct. md5(key) % nparts -> partition id.
    2. I assumed the number of partition des not correspond to the number of replication group (the number of partition is higher than the number of replication group)
    Correct. N Rep Groups = nNodes / Replication Factor.
    3. Do you use the number of partitions as if it was the number of nodes to which data is distributed? And then used modulo to map each partition to a single replication group?
    Once you have the partition id, that gets mapped to a replication group. The operation type along with other factors determines which node in the replication group handles the operation.
    4. How do you decide which data should be redistributed?
    We don't currently redistribute data. That is a work in progress. We will take some data from each of several nodes. e.g. if there are 10 rep groups and you add one more rep group, we'll populate the 11th rep group with 10% of the data from each of the other 10 rep groups.

    Charles Lamb
  • 11. Re: Adding more nodes later
    896774 Newbie
    Currently Being Moderated
    Hi Charles,
    Once you have the partition id, that gets mapped to a replication group.
    The number of partition and the number of replication groups are different.
    Can you confirm. partition id % N Rep Groups => rep group id
    The one with more partitions will have finer granularity for rebalancing when more nodes are added.
    I still don't understand why you said that. I agree with you one customer will have more partition than the other.
    But the two customers have the number of replication groups. So what will that change when you will have to redistribute data to a new replication group for example?

    Thanks.
  • 12. Re: Adding more nodes later
    Charles Lamb Pro
    Currently Being Moderated
    893771 wrote:
    Hi Charles,
    Once you have the partition id, that gets mapped to a replication group.
    The number of partition and the number of replication groups are different.
    Can you confirm. partition id % N Rep Groups => rep group id
    No. partition id -> RG ID is via a map. We have to be able change the mapping so a modulo function won't be sufficient.

    >
    The one with more partitions will have finer granularity for rebalancing when more nodes are added.
    I still don't understand why you said that. I agree with you one customer will have more partition than the other.
    But the two customers have the number of replication groups. So what will that change when you will have to redistribute data to a new replication group for example?
    The one with more partitions will allow transfer granularities to be finer grained. In the example I gave (10 RGs morphing to 11), the amount of data transferred would still be the same, but the transfer granularity would be different depending on whether there were more or less partitions.

    Charles Lamb
  • 13. Re: Adding more nodes later
    896774 Newbie
    Currently Being Moderated
    Charles,
    Excuse me to be sure I really understand.
    if you have 10 nodes and expect to ultimately grow to 100 nodes
    In the example I gave (10 RGs morphing to 11)
    Are you talking of nodes or replication groups? I suppose the number of partitions is only compared with the number of replication groups.
    So for me, the number of partitions is greater than the number of replication groups and not inevitably greater than the number of nodes.
    I think there no reason to compare the number of partitions with the number of nodes. Is that right?
    We have to be able change the mapping so a modulo function won't be sufficient.
    The one with more partitions will allow transfer granularities to be finer grained.
    I understand the following. When there are new replication group, all data remain to the same partition.
    Data is redistributed by partition. In your exemple for instance, you decide to move 10% of partitions in each replication group and add them to the new replication group.
    That's why the mapping change here. Will this change be done while of nodes including all client are online?

    Thanks.
  • 14. Re: Adding more nodes later
    Charles Lamb Pro
    Currently Being Moderated
    893771 wrote:
    if you have 10 nodes and expect to ultimately grow to 100 nodes
    In the example I gave (10 RGs morphing to 11)
    Are you talking of nodes or replication groups? I suppose the number of partitions is only compared with the number of replication groups.
    So for me, the number of partitions is greater than the number of replication groups and not inevitably greater than the number of nodes.
    I think there no reason to compare the number of partitions with the number of nodes. Is that right?
    Correct. Partitions are replicated on all nodes in an RG. You can not have fewer partitions than RGs. In general, you should have nParts >> nRGs.

    >
    We have to be able change the mapping so a modulo function won't be sufficient.
    The one with more partitions will allow transfer granularities to be finer grained.
    I understand the following. When there are new replication group, all data remain to the same partition.
    Data is redistributed by partition. In your exemple for instance, you decide to move 10% of partitions in each replication group and add them to the new replication group.
    That's why the mapping change here. Will this change be done while of nodes including all client are online?
    Correct.

    Charles Lamb
1 2 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points