8 Replies Latest reply: Jul 30, 2011 9:02 AM by 879169

    Strange situation with history & diff

    aschilling
      hello!

      on our customer's system we have a very strange situation.
      with the help of the history view we (almost 100%) reconstructed what has happened. it is (most probably) due to rollbacks (RollbackToSp) which completed successfully. the savepoints were temporary (not user created) and only needed for internal use. after the rollback they were deleted. when deleting, we didn't choose the compress option, thus history entries remained. first of all this led to our application finding a data state which it assumes to be an error (namely a U/U between parent and child, but not a conflict). the content of the diff view is logical and correct, but as we use the history view to actually find the last change (so that the user is not confused and only sees the one change that actually needs to be merged/refreshed) this now fails, as the content of the history view is not as expected.
      this however is not the main problem. we still could synchronize those changes between the workspaces once we found out which is the one that's newer.
      however, in another case (between two other workspaces) no synchronization happened. we found out that this was because OWM apparently picked the non-intuitive entry as the one that needs synchronization, i.e. in the parent the last change was an Insert and in the child it was an Update. when refreshing from parent to child it worked, in the other direction nothing happened (which is as expected if OWM decided that parent->child is the right direction).
      the question now is: how does OWM decide which direction is the right one? (or: which is the last change between two workspaces) because even with partial rollbacks in one workspace we can't figure out, how an Insert should be newer than an Update?
      we are a bit confused about what actually happens within the _LT tables and the history/diff views when you roll back to a savepoint (and possibly not the same savepoints in two different workspaces).
      we know that it's not as easy to understand from this description and currently we're working on a reproduction on our test system, but maybe you heard of such issues before and can clarify a bit on how a situation like that can occur?

      UPDATE: Question 2
      When we tried to fix the affected rows where the parent change (I) was treated as newer than the child change (U), we enforced a conflict by doing a fake edit (nothing changed content-wise) and resolved the conflict with KEEP_CHILD. Merging that worked, and the rows were removed from the DIFF and CONF views. Today we noticed that the data still was different, so the merge did not actually succeed.
      It seems quite clear to us, that something is messed up with the row versions. While the LT table shows the correct row with the correct data and a timestamp that matches the merging, the BASE view and the _HIST view show another row version which is older. The row version that would be correct seems to be something like an orphan. Do you know how something like that could happen? And is there a possibility to find row versions that are not in the version hierarchy anymore? We would need that to find all entries in a table that need fixing.

      Ultimate question resulting from all this is: what would be the best approach to get the history clean again? would it be feasible to drop the child workspaces that have the problem once we got the data clean again and re-create them again afterwards?

      kind regards,

      Andreas

      Edited by: aschilling on 17.01.2011 18:03
        • 1. Re: Strange situation with history & diff
          Ben Speckhard-Oracle
          Hi Andreas,

          I was not able to reproduce the situation that you described so it would be beneficial if you were able to exactly describe the use case. Was there a MergeWorkspace prior to the rollback, or maybe a refresh? Doing a Rollback after a Merge will complicate the conflict resolution. I was able with a combination of RollbackToSp and DeleteSavepoint to get a situation where the conf view had rows and the diff view didn't. But, this seems to be the opposite of what you have described.

          I am also not clear as to why the HIST view is needed.  The DIFF view would only show the latest change for any particular workspace. To clean the _HIST view, you could either use RemoveWorkspace or RollbackWorkspace.  You could also use RollbackTable and not specify a savepoint if you didn't need to rollback all of the tables.

          If you are able to develop a testcase that demonstrates your problem, I would be glad to take a look at it. If you do, also let me know exactly which version of OWM you are using.

          Regards,
          Ben
          • 2. Re: Strange situation with history & diff
            aschilling
            Hi Ben,

            we use the rollback to recover from an error that happens during merge or refresh, so that we don't end up with resolved conflicts that haven't been merged yet. (remember my post about conflict resolution and merge having to be done in two transactions: when the merge transaction rolls back due to some error, the conflicts still remain resolved because that happened in an earlier transaction. so we use a temporary savepoint to roll back to the state before conflict resolution.)
            so, to put it short, as we are near 100% sure that all that happened during a workspace synchronization: yes, there was a merge before the rollback.

            concerning the HIST view: yes, the DIFF view shows the latest change only, but if you e.g. have a standard case of an update in child and nothing was changed in parent (no conflict), the diff row for the parent still shows the latest change that ever happened in parent, which might have been some time ago. our application uses icon decorations to display to the user what change has happened on a row. if we would simply display the content of the DIFF view, we would see updates for both parent and child in that situation although just the change in child is the one that actually should be displayed. thus we use the HIST view to compare dates and find out, which row of the DIFF view is the one to display (only in case of a conflict it's both). actually we are quite sure now that we can do better by using the BASE view as it also provides the needed metadata, but all that would only fix the problem of correct display to the user, not that the whole constellation of data is somewhat weird.

            cleaning _HIST views by rolling back or removing is not an option right now (only, if things turn out really wrong), as our customer uses the workspaces for long-running (or better: forever-running) separated work-areas with synchronizations in between and wants to keep all the history for them.

            in our understanding the strange HIST situation most likely came from deleting the temporary savepoint without compress_view_wo_overwrite set to true, as the documentation states that history is not deleted in that case?

            concerning testcase: we are into that, although we currently need to fix the data that is different in two workspaces but not displayed in the _DIFF view. once that is done we try to reproduce the issue.
            OWM version is 10.2.0.3.

            the issue you described on your system (rows in CONF but not in DIFF): is that a known bug? and if so, is it fixed in 10.2.0.4 or 11g?

            thanks a lot so far and kind regards,

            Andreas
            • 3. Re: Strange situation with history & diff
              Ben Speckhard-Oracle
              Hi Andreas,

              Concerning the DIFF view: if I am understanding correctly, when you have an update in the child but no corresponding update in the parent (i.e. no conflict), then the row in the parent would be the same as the BASE row, but would have an NC value in the WM_CODE column. Is that not sufficient to differentiate between the two cases?

              When compress_view_wo_overwrite is set to false (the default), you are correct in that none of the history is deleted but rather all moved into a single savepoint/version. This would only affect tables with the VIEW_WO_OVERWRITE history option, as only those tables are able to store multiple versions of rows with the same primary key within the same version.

              The testcase that I mentioned in my previous message, I am still looking into. Rolling back over a merge complicates a number of things. As far as I know, the behavior is present in the latest version of OWM. Can't yet comment as to whether or not it's expected, but it doesn't seem to be directly applicable to your case.

              Regards,
              Ben
              • 4. Re: Strange situation with history & diff
                aschilling
                hi Ben!

                the _DIFF-thingie isn't any sophisticated stuff, I'll give a step-by-step example what we do (or what our users do)
                1) change object X in parent.
                1a) _DIFF shows U for parent and NC for child.
                2) refresh X to child. _DIFF clear, X is in sync.
                3) change X in child.
                3a) user plans to merge X from child to parent. _DIFF shows U for parent (from step (1)) and U for child (from (3)).

                at point (3a) the user now only wants to see the update from (3), but in the _DIFF we find also the old U for parent from step (1).
                in order to not confuse the user we use the _HIST view information to find the latest change for this row in parent and child so that we can determine that (in this case) only the U from child is relevant and should be displayed. only in the case of a conflict we show both updates, as both are needed by the user to decide for the correct conflict solution.
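
A minimal sketch of the display rule described above, assuming the application has already read the WM_CODE of each _DIFF row and the time of the latest change per workspace from the _HIST view. The names `DiffRow`, `rows_to_display` and `last_change`, and the dates, are hypothetical illustrations and not part of OWM or of the poster's application.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List

# WM_CODE values that represent an actual change (as opposed to NC / NE).
CHANGE_CODES = {"I", "U", "D"}


@dataclass(frozen=True)
class DiffRow:
    workspace: str  # e.g. "LIVE" (parent) or "WORK" (child)
    wm_code: str    # "I", "U", "D", "NC" or "NE"


def rows_to_display(
    diff_rows: List[DiffRow],
    last_change: Dict[str, datetime],
    in_conflict: bool,
) -> List[DiffRow]:
    """Pick the _DIFF rows that should get a change icon.

    In a conflict both sides are shown; otherwise only the side whose
    latest change (taken from the history view) is the newest one.
    """
    changed = [r for r in diff_rows if r.wm_code in CHANGE_CODES]
    if in_conflict or len(changed) < 2:
        return changed
    return [max(changed, key=lambda r: last_change[r.workspace])]


# Situation (3a): an old U in the parent from step (1), a new U in the child.
parent = DiffRow("LIVE", "U")
child = DiffRow("WORK", "U")
times = {"LIVE": datetime(2011, 1, 10), "WORK": datetime(2011, 1, 12)}

shown = rows_to_display([parent, child], times, in_conflict=False)
both = rows_to_display([parent, child], times, in_conflict=True)
```

Here `shown` contains only the child row, while `both` keeps the two rows for conflict resolution. The rule depends entirely on the history timestamps being trustworthy, which is exactly what broke in the situation described in this thread.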

                so, the problem we had comes into play when we retrieve information from the HIST view. we look for the change in a certain workspace where wm_retiretime is null. but for some reason the latest row that was visible to the user (also the row we found in the _BASE view) did not have a null wm_retiretime.
                but as said, we could fix that by modifying our application logic. the real problem was, that the OWM thought that an earlier change was the one to be synchronized. to stay with our example from above, it would assume that at point (3a) the change from (1) is the current one, not the one from (3). thus, performing mergetable() for X simply did nothing, whereas refreshtable() did. we then modified X in child again to force a conflict which we then resolved with keep='CHILD'. this succeeded, but later on we realized, that the data from child was not merged to parent and although the rows were different in the two workspaces there was no entry for them in the DIFF or CONF view.
                I hope this clarifies things a bit.

                but, to get on a bit: why we had this problem in the first place is because of our synchronization process.
                simplified it looks like this:
                1) user decides whether he wants to merge or refresh
                2) display diff information to the user that is relevant for the synchronization direction (i.e. do not display updates in parent when doing a merge from child)
                3) user selects which differences to merge/refresh and which to ignore. user also resolves conflicts (if any)
                4) user starts the actual synchronization.
                so, after (4) our internal (on OWM-level) process then is:
                1) if conflicts exist:
                1a) beginresolve()
                1b) resolve the rows according to the user's choice from (3) above
                1c) commit the transaction
                1d) commitresolve()
                2) perform mergetable/refreshtable for all relevant tables according to the user's choice from (3) above

                during (2) errors may occur. this would typically be UC or RIC violations we didn't handle before. usually it shouldn't happen, but you never know (and it showed that there can be rare situations where merging/refreshing fails and we can't easily determine the problem before). so, our idea was that in case of an error everything should be as it was before starting the whole process. thus we create temporary savepoints before starting the process (that is, in our OWM level process before (1)) to which we roll back if (2) fails (because (1) was an autonomous transaction that isn't rolled back like the one from (2) when an error occurs)
                during that rollback and the cleanup (delete temporary savepoints) something went wrong in a way that led to the problems we described. phew :-)
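
The control flow of the process described above can be sketched as follows. This only restates the steps from the post; it is not a recommendation, since rolling back over a merge is what complicated things in this thread. All method names on `ops` are hypothetical wrappers that an application might put around the OWM calls mentioned here (beginresolve, commitresolve, mergetable/refreshtable, RollbackToSp, DeleteSavepoint); `RecordingOps` is a stand-in that only logs the calls, and the savepoint name is made up.

```python
from typing import List, Optional


class RecordingOps:
    """Stand-in for the database layer; records calls instead of running them."""

    def __init__(self, fail_on: Optional[str] = None) -> None:
        self.log: List[str] = []
        self.fail_on = fail_on  # table name whose merge/refresh should fail

    def create_savepoint(self, name: str) -> None:
        self.log.append("create_savepoint")

    def begin_resolve(self) -> None:
        self.log.append("begin_resolve")

    def resolve_rows(self) -> None:
        self.log.append("resolve_rows")

    def commit(self) -> None:
        self.log.append("commit")

    def commit_resolve(self) -> None:
        self.log.append("commit_resolve")

    def sync_table(self, table: str) -> None:
        # Stands for mergetable or refreshtable, depending on direction.
        if table == self.fail_on:
            raise RuntimeError(f"constraint violation while syncing {table}")
        self.log.append(f"sync_table:{table}")

    def rollback(self) -> None:
        self.log.append("rollback")

    def rollback_to_savepoint(self, name: str) -> None:
        self.log.append("rollback_to_savepoint")

    def delete_savepoint(self, name: str) -> None:
        self.log.append("delete_savepoint")


def synchronize(ops, tables: List[str], has_conflicts: bool,
                savepoint: str = "TMP_SYNC") -> bool:
    """Run conflict resolution and merge/refresh; undo everything on failure."""
    ops.create_savepoint(savepoint)      # temporary savepoint before step (1)
    try:
        if has_conflicts:                # step (1): own transaction
            ops.begin_resolve()
            ops.resolve_rows()
            ops.commit()
            ops.commit_resolve()
        for table in tables:             # step (2): merge or refresh
            ops.sync_table(table)
        ops.commit()
        return True
    except Exception:
        ops.rollback()                   # undoes only the transaction of step (2)
        ops.rollback_to_savepoint(savepoint)  # also undoes the resolution
        return False
    finally:
        ops.delete_savepoint(savepoint)  # cleanup of the temporary savepoint


good = RecordingOps()
good_result = synchronize(good, ["DETAILS"], has_conflicts=True)

bad = RecordingOps(fail_on="DETAILS")
bad_result = synchronize(bad, ["DETAILS"], has_conflicts=False)
```

In the failing run the recorded sequence is create_savepoint, rollback, rollback_to_savepoint, delete_savepoint, which is the rollback-plus-cleanup path that the post suspects of leaving the history in an unexpected state.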

                currently we are into another possible solution: do not try to rollback the conflict resolution. it would make things easier and also, if the user resolved the conflicts in a certain way: well, that's what he actually wanted to do. and the transaction with the actual merge/refresh would be correctly rolled back on fail anyway, so nothing bad here.
                question now would be, how the data looks for resolved conflicts. we did some tests and came to a conclusion where we wanted to ask whether this conclusion is correct.
                a) user choice for conflict was keep='PARENT'
                -> data was immediately copied to child, thus conflict completely resolved and DIFF is clear. nothing to do here, case closed(?)
                b) user choice was keep='CHILD'
                -> conflict is marked resolved but not yet merged to parent.
                we noticed, that in this case the object still appears as DIFF but not as conflict. also the HIST view is modified and there is a new (or updated?) entry for the child workspace having the most current createtime. thus, our application would display it as "change in child" without conflict when opting for a merge the next time, so a user could simply merge it like any other object.

                are these two conclusions correct? because we might modify our process then and avoid tricky rollbacks over merges or refreshes. that wouldn't exactly help if the behaviour we saw (and you had in your testcase) is not expected, because then there's obviously still a bug left in OWM, but the process itself would be (more) safe for us.

                thanks a lot for the help and feedback so far,

                Andreas
                • 5. Re: Strange situation with history & diff
                  Ben Speckhard-Oracle
                  Hi,

                  I tested the example that you gave and for (3a) the _DIFF view showed 'U' for child and 'NC' for the parent, as expected.  Is there some step missing that is causing it to be a 'U' for you?  From the example and your further description, it really sounds like another update is occurring in LIVE, either by way of a dml or by a merge from a different workspace?  If you are sure that is not the case, it would be unexpected, but would really need a testcase to look into it, as I have been unable to reproduce the behavior.

                  The wm_retiretime column is only applicable for the current workspace. So, the latest row visible from a particular workspace can have a non-null wm_retiretime if it is coming from a parent workspace. For example, if you do the following:

                  1) Version Enable a table with a single row and create a child workspace of LIVE
                  2) Update the row in the LIVE workspace
                  3) Query for the row from the child workspace

                  There will be 2 versions of the row in LIVE. The original row which now has a non-null retiretime, and the new row which will not be seen by the child workspace since it was modified after the workspace creation. The row seen by the child workspace will be the original row and has a non-null retiretime even though it is the current version of the row from the perspective of the child. Is something similar to this happening in your case? This could happen due to an update within LIVE, a refresh of the child workspace, and a further update in LIVE due to a dml/merge prior to modifying the row from the child workspace.
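
The three-step example above can be illustrated with a toy model. This is a deliberately simplified sketch and does not reflect how OWM stores or resolves versions internally: times are plain integers, and `RowVersion` and `visible_from_child` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    workspace: str
    value: str
    createtime: int
    retiretime: Optional[int]  # None = still current in its own workspace


def visible_from_child(versions: List[RowVersion], child: str, parent: str,
                       branched_at: int) -> Optional[RowVersion]:
    """Return the row version a child workspace would see in this toy model."""
    # A change made in the child itself always wins.
    own = [v for v in versions if v.workspace == child and v.retiretime is None]
    if own:
        return own[-1]
    # Otherwise the child sees the parent version that was current
    # when the child workspace was created.
    for v in versions:
        was_current = v.retiretime is None or v.retiretime > branched_at
        if v.workspace == parent and v.createtime <= branched_at and was_current:
            return v
    return None


# 1) row exists in LIVE at time 1, child WORK is created at time 2
# 2) the row is updated in LIVE at time 3, which retires the original version
original = RowVersion("LIVE", "original", createtime=1, retiretime=3)
updated = RowVersion("LIVE", "updated", createtime=3, retiretime=None)

# 3) query from the child workspace
seen = visible_from_child([original, updated], child="WORK", parent="LIVE",
                          branched_at=2)
```

In this model `seen` is the original version, and its retiretime is not null even though it is the current row from the child's point of view. A filter on a null retire time alone would therefore miss it.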

                  Your description of the PARENT and CHILD cases is accurate. To accommodate the syncing of the rows in the CHILD case, we make a physical copy of the row after creating a new implicit savepoint during BeginResolve. This explains the additional row you are seeing in the HIST view.  If the workspace resolution session is rolled back with DBMS_WM.RollbackResolve, we are internally rolling back to the savepoint that we created.

                  If you ever resolve in favor of the BASE, it works similarly to CHILD, in that the row would be copied to the CHILD, but would not be applied to the PARENT until the workspace is merged. Also note that in the case of BASE and CHILD, it would take only a single update in the parent to cause a conflict, while in the PARENT case, it would take a further update in both the parent and child workspace to create a conflict.
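
The three resolution choices discussed in this exchange can be summarized in a small lookup table. This is only a restatement of the posts above as data; the `Outcome` type and its field names are hypothetical, and the `pending_merge` flag for BASE is inferred from "works similarly to CHILD".

```python
from typing import Dict, NamedTuple


class Outcome(NamedTuple):
    child_after_resolve: str   # what the child holds once the conflict is resolved
    pending_merge: bool        # True if the parent changes only on a later merge
    new_conflict_needs: str    # which side(s) must change again to re-conflict


RESOLVE_OUTCOMES: Dict[str, Outcome] = {
    "PARENT": Outcome("parent row copied to child", False, "parent and child"),
    "CHILD": Outcome("child row kept, physical copy made", True, "parent only"),
    "BASE": Outcome("base row copied to child", True, "parent only"),
}


def needs_merge_after_resolve(keep: str) -> bool:
    """True if the row still has to be merged to the parent after resolving."""
    return RESOLVE_OUTCOMES[keep.upper()].pending_merge


pending = sorted(k for k in RESOLVE_OUTCOMES if needs_merge_after_resolve(k))
```

Here `pending` lists BASE and CHILD, the two choices that leave work for a later merge. An application that stops rolling back the conflict resolution would have to expect such rows to show up as ordinary changes in the child the next time a merge is offered.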

                  The behavior you are describing where the merge/refresh is synchronizing the wrong row is not what should be occurring, but again would really need a testcase. A further update in the parent might explain the behavior.

                  Regards,
                  Ben
                  • 6. Re: Strange situation with history & diff
                    aschilling
                    Hi again!

                    some time passed by, most of the problems still remain.
                    while the main problem (the one that led to those problems described in this thread) is now filed as an SR, some questions of this thread are still open.
                    so we thought we try to clarify on the issue first, that we need to use the HIST view to decide which entry in the DIFF view is the current one.
                    the workspaces we have are LIVE and WORK as a child of LIVE. tables are all versioned with VIEW_WO_OVERWRITE.
                    here's a script:
                    -- insert a new row in the child workspace WORK
                    execute dbms_wm.gotoworkspace('WORK');
                    insert into details(det_id, det_detail) values(100, 'MY_DETAIL');
                    commit;

                    -- compare LIVE and WORK
                    execute dbms_wm.setdiffversions('LIVE', 'WORK');

                    select * from details_diff;
                    -- [1]

                    -- merge the new row from WORK into LIVE
                    execute dbms_wm.mergetable('WORK', 'DETAILS', 'det_id=100', false, false, false);
                    commit;

                    -- [2]

                    -- update the merged row in LIVE
                    execute dbms_wm.gotoworkspace('LIVE');
                    update details set det_detail='MY_MODIFIED_DETAIL' where det_id=100;
                    commit;

                    -- back in WORK, compare LIVE and WORK again
                    execute dbms_wm.gotoworkspace('WORK');
                    execute dbms_wm.setdiffversions('LIVE', 'WORK');

                    select * from details_diff;
                    -- [3]
                    at position [1] the select gives us:
                    ID    DET_DETAIL   WM_DIFFVER    WM_CODE
                    100                DiffBase        NE
                    100                LIVE, LATEST    NE
                    100   MY_DETAIL    WORK, LATEST    I
                    now we merge that, everything clean at [2].
                    after that we modify the row again in LIVE and go to workspace WORK in order to refresh the change to that workspace.
                    the select at [3] gives us:
                    ID    DET_DETAIL          WM_DIFFVER    WM_CODE
                    100                       DiffBase        NE
                    100   MY_MODIFIED_DETAIL  LIVE, LATEST    U
                    100   MY_DETAIL           WORK, LATEST    I
                    so, at this point we do not know which of the two lower lines actually is the one that triggered the entry to appear in the _DIFF view. is it the "I" or is it the "U"?
                    thus we use the hist view to find out which change actually was the latest. actually with the way our application works an I and a U would always mean that the U is the latest change, but we can't generally assume that.
                    if we extend the example to perform some more ping-pong updates between the two workspaces (which actually is what happens at our customer every day), then every time we ask for the diff the entries would have a U for both workspaces.
                    so, are we doing anything fundamentally wrong? we never observed that a change went to "NC" after synchronization. I hope that example clarifies what we meant in the previous posts.

                    kind regards,

                    Andreas
                    • 7. Re: Strange situation with history & diff
                      Ben Speckhard-Oracle
                      Hi Andreas,

                      Everything you are doing is supported. A row would never go to NC after it has been synchronized. The NC is only used when no DML changes have been performed in the workspace or savepoint since the base/common version. However, a row can become NC after a refresh. The default option for RefreshWorkspace (copy_data=>FALSE) updates the parent version of the child workspace to be the latest version of the parent workspace. This is done for performance benefits but also changes the set of potential base rows. If you do not want this behavior, you would need to use copy_data=>TRUE.

                      Regards,
                      Ben
                      • 8. Re: Strange situation with history & diff
                        879169
                        I have exactly the same kind of behaviour described by Andreas: double "U" in the _DIFF view, even if there is only one change causing the difference. Any hint?