11 Replies Latest reply: Jan 26, 2010 11:27 PM by OraDBA02 RSS

    TAF On 11g DataGuard with FSFO

    OraDBA02
      Hi All,

      Today, i have created TAF for my 11.1.0.7 Oracle database running on RHEL 5.4 64 bit.
      I have configured my databases with FSFO enabled. (MaxAvailibility with LGWR SYNC AFFIRM transport)
      I created TAF with following entries:

      1) Created and enabled a dedicated service in primary :
      I have local_listener parameter set in my primary and standby databases.
      begin
       dbms_service.create_service('taf_test','taf_test');
      end;
      /
      begin
       DBMS_SERVICE.START_SERVICE('taf_test');
      end;
      /
      
      begin
       dbms_service.modify_service
       ('taf_test',
       FAILOVER_METHOD => 'BASIC',
       FAILOVER_TYPE => 'SELECT',
       FAILOVER_RETRIES => 200,
       FAILOVER_DELAY => 1);
      end;
      /
      2) Made sure that listener is listening to above created service:
      lsnrctl services l_payee1fe_dg_001
      LSNRCTL for Linux: Version 11.1.0.7.0 - Production on 21-JAN-2010 09:52:27
      Copyright (c) 1991, 2008, Oracle.  All rights reserved.
      
      Services Summary...
      Service "taf_test.us" has 1 instance(s).
        Instance "payee1fe", status READY, has 1 handler(s) for this service...
          Handler(s):
            "DEDICATED" established:13 refused:0 state:ready
               LOCAL SERVER
      3) TNS Entry :
      taf_test.us=
       (DESCRIPTION=
              (SDU= 32767)
              (ENABLE=BROKEN)
              (ADDRESS_LIST=
                      (FAILOVER=ON)
                      (LOAD_BALANCE=YES)
                      (ADDRESS=(PROTOCOL=TCP)(HOST=payee1fe-orasvr.db.us.com)(PORT=58003))
                      (ADDRESS=(PROTOCOL=TCP)(HOST=payee1fe-orasvr.db.us.com)(PORT=58003))
              )
              (CONNECT_DATA=
                      (SERVICE_NAME = taf_test.us)
              )
        )
      4)Trigger for setting service on appropriate primary database :
      create trigger taf_test after startup on database
      declare
       v_role varchar(30);
      begin
       select database_role into v_role from v$database;
       if v_role = 'PRIMARY' then
       DBMS_SERVICE.START_SERVICE('taf_test');
       else
       DBMS_SERVICE.STOP_SERVICE('taf_test');
       end if;
      end;
      /
      5) Ran a big SELECT on primary and after 5 sec, kill primary's pmon to do fail-over.
      6) After sometime, i saw that SELECT was hanged for some time and once, DGMGRL has open new primary (org standby), startup trigger got fired and SELECT started fetching rows from the point where it hanged.
      In that way, my TAF was working properly as expected.

      My question is :
      How come new Primary got the session state of SELECT which originated on old primary and then failed-over to new primary ? I confimed that,SELECT was NOT re-executed on new primary and started fetching rows from the row where it hanged at the time of fail-over.
      Since both the databases have different cache and controlfiles, i want to understand how TAF works on dataguard.
      In RAC,there is always a GRD through which session state can be co-ordinated between different instances. But this is not the case in DG.

      I did not find anything in PRIMARY's alert log. Though STANDBY alert log was containing below statement:
      ALTER SYSTEM SET service_names='taf_test' SCOPE=MEMORY SID='payee1fe';
      Can anyone shed light on internal working of TAF in DG.

      Regards,
      Bhavik Desai
        • 1. Re: TAF On 11g DataGuard with FSFO
          OraDBA02
          Can anyone update this thread please ?

          Regards,
          Bhavik Desai
          • 2. Re: TAF On 11g DataGuard with FSFO
            Uwehesse-Oracle
            The user process on the client side gets the cursor position of the result set upto the rows already displayed, together with the select statement. After the failover, the select run again on the new primary, suppressing all the rows already displayed until the cursor position is reached.

            Kind regards
            Uwe

            http://uhesse.wordpress.com
            • 3. Re: TAF On 11g DataGuard with FSFO
              OraDBA02
              Thanks Uwe Hesse.

              Is there any dictionary view where all such informatiion (session state) is kept. I read in one of the MAA document on OTN saying that REG$ dictionary maintains all client information which gets propated to standby as and when new entry is registered with it. However, count(*) from this view always gave me 0 records.

              Which process (component) sends this cursor position information along with SELECT to clients? Is this FAN ?
              You pointed out that SELECT is re-executed on new primary.In this way, will it get appropriate SCN information (from where it should re-execute SELECT) from new primary's control file?

              Can you correct me if i say, SELECT fail-over is only applicable with DG broker managed FSFO and when there is no archival gap between primary and standby?

              Regards,
              Bhavik Desai
              • 4. Re: TAF On 11g DataGuard with FSFO
                Uwehesse-Oracle
                I am not aware of a Dictionary view that tells you that.
                The TAF feature in Data Guard is not depending on Fast-Start Failover or even the Data Guard Broker for that matter. Remember that the user process (the client front end) is still the same as before the failover and has the same information as before. The reference SCN, determined in the EXECUTE phase of the SELECT belongs to it.

                Should you failover to a time from before that SCN (because of archive gaps or missing redo on the standby in general), the SELECT should fail. Must admit that I didn't tested that, though - I always tested in Maximum Availability mode with zero-data-loss.

                Kind regards
                Uwe

                http://uhesse.wordpress.com
                • 5. Re: TAF On 11g DataGuard with FSFO
                  OraDBA02
                  Hi Uwe Hesse,

                  Thanks for your reply.
                  May i also ask you to confirm whether this TAF works for FSFO enabled Active Data Gaurd as well?

                  I believe since UNDO TBS is not shared beetween primary and standby, DML fail-over is not possible. Is that correct?

                  Regards,
                  Bhavik Desai
                  • 6. Re: TAF On 11g DataGuard with FSFO
                    Uwehesse-Oracle
                    >
                    May i also ask you to confirm whether this TAF works for FSFO enabled Active Data Gaurd as well?

                    I believe since UNDO TBS is not shared beetween primary and standby, DML fail-over is not possible. Is that correct?
                    >

                    1) In theory, TAF should also work with Active Data Guard (FSFO or not). Incidentally I have a VM with an 11.1.0.6 Oracle Data Guard configuration on Oracle Enterprise Linux prepared. With that configuration, TAF to a Physical Standby with Real-Time Query running, did NOT work. I got an error from the SELECT, started on the primary before the failover and had to rerun it. I did not need to reconnect, though. This may be a bug of 11.1.0.6.

                    2) If you are in an open transaction while the primary resp. your current instance crashes, TAF is never transparent; you will always get an error (Transaction must rollback), even with RAC. Same is true for Data Guard, of course.

                    Kind regards
                    Uwe

                    http://uhesse.wordpress.com
                    • 7. Re: TAF On 11g DataGuard with FSFO
                      OraDBA02
                      Great...Thanks a lot Sir...

                      As i drafted earlier, i m using non-default listener ports. I want all applications which are going to use this TAF service TAF_TEST, to be directed to application listener. I have two application and two dataguard liseteners.
                      Application listenrs are running on ports 58001,58002 and DG listeners are runniong on 58003, 58004 ports.
                      My local_listeners are pointing to DG listeners.I thinik this is the way DB broker enabled FSFO works. I may be wrong here.

                      However, my above TAF test gave expected results while relocating service taf_test (port 58003) to new primary and re-executing the SELECT there. To keep application connections separte, i want to register this service to application listeners running on respective primary database.

                      May i ask you to guide me how should i register TAF_TEST to application listeners instead on dg ones.
                      My environment has restriction that local_listener should always point to DG listeners.

                      Regards,
                      Bhavik Desai
                      • 8. Re: TAF On 11g DataGuard with FSFO
                        Uwehesse-Oracle
                        >
                        I have two application and two dataguard liseteners.
                        ...
                        To keep application connections separte, i want to register this service to application listeners running on respective primary database.
                        >

                        Why? I don't see the point behind that setup & requirement. Can you provide an Oracle Corp. source that recommends such a setup? A service gets published to the listener by PMON (automatically to a listener, running on port 1521, else where LOCAL_LISTENER points to). I am not aware of an alternative method to register a service elsewhere - and I doubt the need to do so.

                        Kind regards
                        Uwe

                        http://uhesse.wordpress.com
                        • 9. Re: TAF On 11g DataGuard with FSFO
                          OraDBA02
                          Hi Uwe Hesse...

                          Alright...Got your point. I will check with my engineering team for having such set-up.

                          After SELECT fail-over, i noticed (and later confirmed from Oracle documents) that sesssion properties will be lost. I checked it with Multi block read count and index cashing parameters and their new values are lost when SELECT is executed on new primary.
                          I belive, callback can be used here to retain/reset original session properties to new primary.

                          Can you shed some light here as how to achieve such callbacks for OCI (I am using SQLPLUS) ?

                          Many thanks beforehand. :-)

                          Regards,
                          Bhavik Desai
                          • 10. Re: TAF On 11g DataGuard with FSFO
                            OraDBA02
                            Hi,

                            Just now i tested,SELECT FAIL-OVER for SELECT with PARALLEL (4) hint. I got below msg and SELECT did not executed on new primary:
                            ERROR:
                            ORA-25401: can not continue fetches
                            4350 rows selected.
                            However, i observed that the session is failed-over to new primary. After getting above msg in SQLPLUS window, when i saw number of slaves given to my new sessions, i got :
                            SQL> select *From v$pq_tqstat;
                            
                            DFO_NUMBER      TQ_ID SERVER_TYP   NUM_ROWS      BYTES  OPEN_TIME AVG_LATENCY      WAITS   TIMEOUTS PROCES   INSTANCE
                            ---------- ---------- ---------- ---------- ---------- ---------- ----------- ---------- ---------- ------ ----------
                                     1          0 Producer        19800      65296          0           0          3          0 P002            1
                                     1          0 Producer        19800      65296          0           0          4          1 P001            1
                                     1          0 Producer        19800      65294          0           0          3          0 P000            1
                                     1          0 Producer        19800      65296          0           0          3          0 P003            1
                                     1          0 Consumer         4351      65296          0           0          8          0 QC              1
                            Does it mean that, SELECT fail-over is only supported for serialized SELECTs ? Or there is other alternative to achive PARALLEL SELECT fail-over ?

                            Regards,
                            Bhavik Desai

                            Edited by: BhavikDe on Jan 24, 2010 11:30 PM
                            • 11. Re: TAF On 11g DataGuard with FSFO
                              OraDBA02
                              Hi Guys,

                              Can any one update this thread please?

                              Regards,
                              Bhavik Desai