6 Replies Latest reply: Feb 6, 2013 4:00 PM by John Spencer RSS

    Automating Procedures to run when Source Table is Updated

    spalato76
      Hi All,

      I have two tables that get refreshed irregularly based on a Source Table in our company repository. Let's just say that I don't want to use views in this case for a variety of reasons that I won't get into. The Source Table is based on a Shipping Date column, called "SHIPPING_DT". In this example, lets name of the tables:

      Source Table: TABLE_SRC
      My Tables: TABLE_A, TABLE_B

      Historically when the source table gets updated it gets updated with a month of data. We get email notifications send to us when the table is updated and when this happens I manually run a procedure, called update_table_a, and update the table with the new month of data from TABLE_SRC. My procedure has one input parameter which is the minimum start date of loading from the source table. Updating Table_B uses another procedure, called update_table_b, which also has a minimum start date parameter just like the procedure for updating table A but I need to grab elements from Table A, do calculations on the data while loading the info into Table B so Table B is dependant on Table A and the procedure to update Table B can only start running after the procedure to update Table A has finished. So here's the series of events:

      1. January 2013 Data is loaded into TABLE_SRC.
      2. Email Notification gets sent out
      3. Table A and Table B are updated to December 2012.
      4. I execute Procedure "update_table_a" to update Table A with January 2013 data. I execute as following: execute update_table_a('01-jan-2013');
      5. After Table A is updated, I execute Procedure "update_table_b" to update Table B with January 2013 data. I execute as following: execute update_table_b('01-jan-2013');

      So obviously there is too much manual work required. What I want is for this to be done automatically by having a schedule job that checks every week perhaps whether the maximum shipping date in the source table is greater than the maximum date in Table A and B and if it is then it immediately triggers procedure "update_table_a" to execute and after it's finished executing then it goes to the next procedure, "update_table_b", to start executing. What I also need is to capture as a variable to use as the start date in my procedure. I was thinking that I use something like Table A's maximum shipping date+1 so in the example above the start data for the procedures would be "December 31 2012+1", which ends up being January 1st 2013.

      Anyway, any help would be appreciated, I realize the description above is very vague but if I can get the syntax correct on how to automate this then I can easily incorporate my existing code and I'm looking for expertise in the matter.

      Thanks kindly,
      Ed
        • 1. Re: Automating Procedures to run when Source Table is Updated
          John Spencer
          I would create a table to store the maximun date already loaded. Then create a procedure, something along the lines of:
          procedure my_proc is
             l_last_dt date;
             l_this_dt date
          begin
             select last_date_loaded into l_last_dt
             from control_table;
          
             select max(ship_date) into l_this_dt
             from table_src;
          
             if l_this_dt > l_last_dt then
                begin
                   update_table_a(l_last_dt + 1);
                exception
                   when others then
                      <do some loging and notification in autonomous transactions>
                      raise;
                end;
                begin
                   update_table_b(l_last_dt + 1);
                exception
                   when others then
                      <do some loging and notification in autonomous transactions>
                      raise;
                end;
                update control_tavle
                set last_date_loaded = l_this_date;
                commit;
             end if;
          end;
          Then schedule it with dbms_scheduler.

          John
          • 2. Re: Automating Procedures to run when Source Table is Updated
            spalato76
            Thanks John, I'll give that a try.
            • 3. Re: Automating Procedures to run when Source Table is Updated
              rp0428
              Thanks for the detailed explanation.

              One of the best examples of 'focus on the problem first, not the solution' I have seen in the forums in some time.

              The design process you have described is exactly the way you want to begin.

              1. Identify all of the sources of data (source table, tables a and b)

              2. Determine the dependencies between the data sources (table a depends on source data date, table b depends on table a)

              3. Document each simple step that needs to be executed to produce the desired result.

              4. Determine the parameters that are needed to drive the process stream

              The one critical piece you did NOT mention is the part that deals with restart and recovery: that is, what needs to happen if the process fails at any point?

              1. What if you can't get access to the source data? What kind of alert or logging is needed? When should another attempt be tried to start the process? 5 minutes? one day? never - until manual intervention?

              2. Do tables A and B need to be backed up before the process can begin so that they can be restored if the process fails? Should that backup and/or restore be done as part of the process? Or will that be handled manually if a failure occurs?

              Once you have done those steps it is easier to see what the possibilities are and to work with each of those steps independently since you have already defined the interfaces for each step.

              At a high level one approach might be:

              1. Create a CONTROL table for the process. This table maintains information about the state of the process (DISABLED, IDLE, RUNNING, ERROR, SUCCESS), the time it began, end time, etc. You should be able to tell at a glance if the process CAN BE run (you might want to disable it), is running, how long it has been running, etc.

              2.Create a DBMS_SCHEDULER job that executes a main control procedure for the process. Schedule the job to run as often as you need: weekly, monthlyl, etc.

              3. Create a control procedure that 'controls' the execution of the process. This procedure would have multiple steps:

              a. check the control tablel to make sure the process is actually allowed to run (e..g has not been disabled) and make sure it is not already running (status - RUNNING).

              b. check that the prerequisites have been met; e.g. that the source table has new data to be loaded.

              c. update the control table to indicate that the process has begun

              d. backup tables a and b (if desired)

              e. truncate tables a and b

              f. load table a with data

              g. load table b with data

              h. update the control table to show the results of running the process

              Just trying to give a simple example. This example does NOT attempt to show if all of those steps are simply coded into one procedure or if each of those steps is a separate procedure/function.

              The focus is defining WHAT has to happen; not the specifics of HOW it should happen. Get the WHAT defined and the order of the steps and then do a walk-through with your teammates or management to make sure all of the bases are covered and that you haven't missed anything.

              Once that is done you can then build each of those pieces as a simple standalong piece of functionality. A forward-looking approach might allow for the possibility that there could be a 3rd table involved later. So maybe you want to create a simple function/procedure that can truncate a table and for step 'e' you call the same procedure twice, once for each table.

              The code that loads the two tables is likely to be different for each table. The simple modular approach is to create a procedure for each table and then in the control procedure call each of those procedures one after the other checking for any error that gets returned.

              I try to ALWAYS use a modular approach during development and testing. The focus is getting the WHAT working properly and in the right order. It is generally pretty easy to combine multiple components that work and consolidate them if you want later.

              It is generally hard to get a jumble of code working when everything is thrown together.

              You might also consider creating a summary table for each of your data tables (a and b). That summary table could have the last update date, the record count and any other summary information that might be useful.
              • 4. Re: Automating Procedures to run when Source Table is Updated
                spalato76
                John,

                Quick question for you. In the sql code you wrote, does it take into account the fact that update_b has to start after update_a has completed? What I'm trying to ask is that normally in an if condition such as above, are procedures executed in order/sequentially or does Oracle execute them both at the same time?

                Thanks
                • 5. Re: Automating Procedures to run when Source Table is Updated
                  rp0428
                  >
                  Quick question for you. In the sql code you wrote, does it take into account the fact that update_b has to start after update_a has completed? What I'm trying to ask is that normally in an if condition such as above, are procedures executed in order/sequentially or does Oracle execute them both at the same time?
                  >
                  PL/SQL code executes serially. There is no parallel processing of the code. Each statement executes until it completes successfully or raises an exception.

                  If an exception is raised control is transferred to the exception block. If no such block exists control is transferred up the calling chain to the first exception block.

                  See the Exception Handling section of the PL/SQL Language doc
                  http://docs.oracle.com/cd/E11882_01/appdev.112/e25519/errors.htm#i3336
                  >
                  When an exception is raised in the executable part of the block, the executable part stops and control transfers to the exception-handling part. If ex_name_1 was raised, then statements_1 run. If either ex_name_2 or ex_name_3 was raised, then statements_2 run. If any other exception was raised, then statements_3 run.

                  After an exception handler runs, control transfers to the next statement of the enclosing block. If there is no enclosing block, then:

                  •If the exception handler is in a subprogram, then control returns to the invoker, at the statement after the invocation.

                  •If the exception handler is in an anonymous block, then control transfers to the host environment (for example, SQL*Plus)

                  If an exception is raised in a block that has no exception handler for it, then the exception propagates. That is, the exception reproduces itself in successive enclosing blocks until a block has a handler for it or there is no enclosing block (for more information, see "Exception Propagation"). If there is no handler for the exception, then PL/SQL returns an unhandled exception error to the invoker or host environment, which determines the outcome (for more information, see "Unhandled Exceptions").
                  • 6. Re: Automating Procedures to run when Source Table is Updated
                    John Spencer
                    As rp0428 said, PL/SQL always executes serially. As an additional comment, by re-raising the error in the begin/exception/end block around the update_table_a call, it will prevent the call to update_table_b from ever executing. You just need to make sure that update_table_a is not swallowing exceptions that ti cannot handle. That is, without the raise in my exception block, it would appear to the rest of the procedure that everything weas fine, and it would go on to make the call to update_table_b, which I gather you do not want to happen unless update_table_a is successful.

                    John