5 Replies Latest reply: Jan 18, 2013 3:28 AM by Nicosa-Oracle RSS

    efficient intersection query

    695392
      Hi,

      I have 2 huge tables, Bookmark and ExistingURLS, both of which have a nodeid column, the nodeids being effectively all my valid urls in the system.

      The ExistingURLS table is not maintained absolutely uptodate, but reasonably so. At nightly intervals, I first update ExistingURLS with deltas, removing and adding appropriate columns for nodeid's that are still extant.

      Then, I want to make a difference query between the Bookmarks and the ExistingURLS, where the bookmarks are all the nodeids that each user has bookmarked.

      I want to get all the existing bookmark rows which have a nodeid NOT in the ExistingURLS table at that moment. I could do this in two queries and create a list, but that would be way to big. Is there one query that can return the row ids of the bookmark rows whose nodeid is no longer valid, obviously taking no query params? What would the indices be on each table for this query?

      Andy
        • 1. Re: efficient intersection query
          sb92075
          user9990110 wrote:
          Hi,

          I have 2 huge tables, Bookmark and ExistingURLS, both of which have a nodeid column, the nodeids being effectively all my valid urls in the system.

          The ExistingURLS table is not maintained absolutely uptodate, but reasonably so. At nightly intervals, I first update ExistingURLS with deltas, removing and adding appropriate columns for nodeid's that are still extant.

          Then, I want to make a difference query between the Bookmarks and the ExistingURLS, where the bookmarks are all the nodeids that each user has bookmarked.

          I want to get all the existing bookmark rows which have a nodeid NOT in the ExistingURLS table at that moment. I could do this in two queries and create a list, but that would be way to big. Is there one query that can return the row ids of the bookmark rows whose nodeid is no longer valid, obviously taking no query params? What would the indices be on each table for this query?

          Andy
          How do I ask a question on the forums?
          SQL and PL/SQL FAQ


          select nodeid from bookmark
          minus
          select nodeid from existingurls;
          • 2. Re: efficient intersection query
            Nicosa-Oracle
            Hi,

            I'm not sure to totally get it, but I would opt for a NOT EXISTS +(on huge tables that would be executed as an anti-join which might certainly be the best in your case)+.

            Why don't you give us a simple test case so that we can propose solution you can easily test ?
            • 3. Re: efficient intersection query
              rp0428
              >
              The ExistingURLS table is not maintained absolutely uptodate, but reasonably so. At nightly intervals, I first update ExistingURLS with deltas, removing and adding appropriate columns for nodeid's that are still extant.
              >
              Is the 'Existing URLS' table ONLY modified by this one nightly batch process? How many records in the bookmark table have urls that are not in the URL table? A lot or a few?

              If URLS is only updated by the batch process then add an 'EXISTS_IN_URS_TABLE' column to the bookmarks table (yes, this will denormalize it, assuming it is normalized) and use your nightly delta process to update the column.
              >
              Then, I want to make a difference query between the Bookmarks and the ExistingURLS, where the bookmarks are all the nodeids that each user has bookmarked.

              I want to get all the existing bookmark rows which have a nodeid NOT in the ExistingURLS table at that moment.
              >
              With a functional index on the new 'EXISTS_IN_URS_TABLE' column you can find the 'NO' values rather easily. Then you can maintain the new flag column using the same 'delta' process you use to update the URLS table.

              For an OLTP system you shouldn't use such a denormalized solution but for batch updated OLP tables it is pretty common.
              • 4. Re: efficient intersection query
                695392
                Yes, I don't know about anti-join using not exists. That seems like the best answer.
                • 5. Re: efficient intersection query
                  Nicosa-Oracle
                  user9990110 wrote:
                  Yes, I don't know about anti-join using not exists. That seems like the best answer.
                  Give us some data that mimic yours so that we can provide a solution.