Forum Stats

  • 3,784,135 Users
  • 2,254,897 Discussions


Finding all possible combinations of numbers to reach a given sum



  • mathguy
    mathguy Member Posts: 10,229 Blue Diamond

    On a hike yesterday morning, I had a couple of ideas for improving the dynamic programming solution. I just had a chance to write the code and test it: now on your original problem I get the answer in 17 seconds.

    The DP approach is to fill out a "memo" (usually an array, one- or multi-dimensional). In the SUBSET SUM problem, the array is indexed by the individual summands (1 to N if there are N summands, 66 in your sample data) and by the sums - not just the target sum, but all positive integers between 1 and target_sum. The "memo" is filled out for the first summand (one row), then for the second summand (second row) etc. At each step, the "memo" will hold one value for each "sum" between 1 and target_sum; the value is not null if that "sum" can be achieved using the first "k" summands (when we are on row "k"). The algorithm fills the table one row at a time, and since row "k" only needs values from row "k-1", I wrote the first solution (a few days ago) using only two rows - implemented as associative arrays - and alternating them.

    The new idea (I didn't check the literature yet, I assume if I had I may have seen it already) is to fill each row in reverse order, from target_sum down to 1 rather than in ascending order. If I do it that way, I can do the computations IN PLACE and use a single array to hold a single row (instead of two such arrays), updating each value in place instead of alternating arrays.

    I also changed the collection used - I am using varrays instead of associative arrays. I believe this makes a bigger difference than the change in algorithm.

    I added a few more optimizations (which didn't help for your sample data, but may help in other situations):

    • When populating the summands array, import only summands that aren't strictly greater than the target sum.
    • For each summand, treat target_sum separately. Exit the outer loop (and, really, the entire program) as soon as the memo-ized value for target_sum is not null (we already found the answer, so stop everything and return).
    • If we are at the last summand already, then only check if target_sum can be achieved - it makes no sense to continue to check the smaller "sums" since they aren't going to be used any further.

    This makes the code a bit longer and messier, but not by much, and the optimization may prove helpful in some cases.

    create or replace function ssdp(target_sum number)
      return varchar2
      type summand_array is array(200) of number;
      type memo          is array(1e7) of varchar2(4000);
      s    summand_array := summand_array();
      m    memo          := memo();
      select val bulk collect into s from t where val <= target_sum;
      for i in 1 .. s.count loop
          if s(i) < target_sum and m(target_sum - s(i)) is not null then
            m(target_sum) := m(target_sum - s(i)) || '+' || to_char(s(i));
            exit loop_over_summands;
          elsif s(i) = target_sum then
            m(target_sum) := to_char(target_sum);
            exit loop_over_summands;
          elsif i = s.count then
            exit loop_over_summands;
          end if;
          for j in reverse s(i) + 1 .. target_sum - 1 loop
            if m(j) is null and m(j - s(i)) is not null
              then m(j) := m(j-s(i)) || '+' || to_char(s(i));
            end if;
          end loop;
            if m(s(i)) is null then
              m(s(i)) := to_char(s(i));
            end if;
      end loop loop_over_summands;
      return m(target_sum);
  • User_KI8KP
    User_KI8KP Member Posts: 11 Employee

    Thanks much math guy, you are super awesome.

    This new function yields me the matching payment amounts with the target amount within no time but it only list out only one combinations. Is there any way we can list out all possible combinations of payment records matching the target amount. ?

  • mathguy
    mathguy Member Posts: 10,229 Blue Diamond

    I think I said this several times before; I will say it one more time. Unless the number of summands is very small (say no more than 20, although even that is stretching it), you can't have a solution for the general problem, where you ask for ALL the different ways to sum up to the target. The time complexity goes back to POWER(2, N) even if you use the DP (dynamic programming) approach for part of the problem. The issue is that the number of distinct solutions may again be much greater than all the operations performed by the DP solution. Not doable.

    You also did not explain what kind of real-life problem leads to this; but that is just out of curiosity, knowing the answer (or not knowing it) won't change what I said above, and in a few other replies.

    If you just want to see "how it can be done" (so you can see yourself that you can get all the solutions in small cases, but testing it on larger data will just take forever), I can write something - but you will see on your own sample data that it will never return anything.