10 Replies Latest reply: Jun 13, 2012 8:34 PM by 942815 RSS

    Application stops responding on executing query,when the connection is down

    789390
      Hi all,

      I have a big issue. I searched a lot in internet, but could't find anything, that answered my question.
      The problem is:

      When I try to establish connection (using createConnection) and the connection is successful, everything's fine. execute, executeUpdate and executeQuery work correct. But when something happens and the connection to the server is down (for example, when i drop the traffic to the server using a firewall rule), and then try to call execute*, everything just stops working. I mean, no exception is thrown, no error response in any form is received, nothing. The program just stops working..

      The environment is created this way:
      environment = oracle::occi::Environment::createEnvironment( sCharset, sNCharset, Environment::THREADED_MUTEXED );

      and the connection this way:
      connection = m_ptrOracleEnvironment->environment->createConnection( sUser, sPassword, sConnectionString );


      For example:
      void MyFunction( /* ... */ )
      throw( /* ... */ )
      {
           /* ... */
           try
           {
                m_ptrOracleStatement->execute();
           }
           catch( SQLException& sqlExcp )
           {
                /* ... */
           }
           catch( ... )
           {
                /* ... */
           }
           /* ... */
      }
      This doesn't work, when the connection is down..

      Is this a bug or I did something wrong?

      Thanks a lot in advance,
      Best Regards,
      Kiril Kirov

      Edited by: Kiril Kirov on Sep 3, 2010 10:44 AM
        • 1. Re: Problem executing query, when the connection is lost
          Mark Williams-Oracle
          Hi Kiril,

          I don't think this will help much, but I just did a few tests using OCCI 11.2.0.1 64-bit Windows to a 9.2.0.8 database on 32-bit Windows. If I pulled the network cable or used a firewall to block the communication after the connection was established the result was the same: after approximately 20 seconds there was a timeout with the following exception SQLException message:

          3113: ORA-03113: end-of-file on communication channel
          Process ID: 0
          Session ID: 9 Serial number: 15

          I'm just creating a "plain" environment as so:

          env = Environment::createEnvironment(Environment::DEFAULT);

          I was also issuing a "select user from dual" query and invoking executeQuery. However, I'm not sure those differences are enough to account for the behavior you are seeing. What versions are you using? Also, are there any parameters specified in the sqlnet.ora/tnsnames.ora files on the client and server which might impact timeouts? I'm not aware of any off the top of my head that would cause this behavior.

          Regards,

          Mark
          • 2. Re: Problem executing query, when the connection is lost
            789390
            Hi Mark,

            Well, this is strange :/ My database is ( on a RHEL 4 machine ):

            Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
            With the Partitioning, OLAP and Oracle Data Mining options
            JServer Release 9.2.0.4.0 - Production

            Also, I tested with two different versions of OCCI - 10.2.0.3-1.i386 on a RHEL 4 machine and 11.2.0.1.0-1.i386 on a RHEL 5.3 machine. It's the same on both places.
            I changed the mode to Environment::DEFAULT (just to test), but it's still the same.

            Could you tell me where do you set this timer? Or it's set by default? Because everything I found about OCCI timers was:
            "OCCI does not provide any other explicit way to set timeouts for connections."
            except the setTimeOut( .. ) method in ConnectionPool class, but I use just Connection ( not ConnectionPool). Maybe this could be a problem ?

            Thanks a lot,
            Kiril Kirov

            Edited by: 786387 on Aug 6, 2010 3:51 AM
            • 3. Re: Problem executing query, when the connection is lost
              Mark Williams-Oracle
              Hi Kiril,

              I did not set any timers or anything else that would be related to timeout values. I used mostly default installs for the client and the server pieces. I will try to take a look again later today and see if anything looks like a possibility, but I'm not sure what at thist stage. Yes, I agree, this looks strange.

              Regards,

              Mark
              • 4. Re: Problem executing query, when the connection is lost
                789390
                Hello Mark,

                Interesting problem, yeah.. Still thanks for your help!

                Best Regards,
                Kiril Kirov
                • 5. Re: Problem executing query, when the connection is lost
                  789390
                  Hello Mark,

                  I found (while researching for different OCCI things), these 2 issues:
                  Re: Non Blocking Call with OCCI

                  and

                  Re: Can you set a timeout on a sql

                  The first one is closer to my problem. For example, please take a look at these posts:
                  ---------------------------------------------
                  <i>Hello,

                  I have a cursor in my plsql program - the sql query hangs due to a problem with the target database (this is not the error). In my program I have a sequence of procedure calls and this is one of hte procedure. <b>Since the sql hangs - the whole program hangs.</b>

                  <b>What I would like to do is to attempt to run the sql for a certain amount of time - if it does not return, then exit out of the cursor and continue with the rest of the program. Can this be done ?</b>

                  Thanks
                  Dipti</i>
                  ---------------------------------------------

                  and especially this one:
                  ---------------------------------------------
                  <i>Hi,Guys

                  I wrote an application server(daemon process) to talk with oracle server continuous which used oracle9 OCCI lib, each 5 min it executes the procedure on the DB server.

                  Now I have come cross a problem:

                  <b>If the network is blocked, app server will blocked at occi call and would never pass, and no exception was catched </b>
                  for e.g.
                  1. Oracle server reboot without shutdown oracle process
                  2. udp broadcast message storm blocked the connection between app server and oracle DB.

                  I consider maybe it's because OCCI using the blocking mode of connection that caused this problem.

                  How can I interrupt the blocking call when call timeout?</i>
                  ---------------------------------------------

                  At least I found other people with this problem.

                  Thanks,
                  Kiril


                  EDIT: Also, do you think, that some of these can help we in this situation (when the OCCI is already blocked) :
                  Connection::terminateStatement()
                  ConnectionPool::terminateConnection()
                  Environment::terminateConnection()
                  Environment::terminateConnectionPool()
                  Environment::terminateEnvironment()
                  or something like
                  http://www.orafaq.com/forum/mv/msg/52133/214502/0/#msg_214502
                  can help ?

                  Thanks again (:

                  Edited by: Kiril Kirov on Oct 6, 2010 12:04 PM
                  • 6. Re: Problem executing query, when the connection is lost
                    Mark Williams-Oracle
                    Hi Kiril,

                    Well, I managed to reproduce the situation you describe (at least I think so!) by doing an "ifdown eth0" on a database server whilst an OCCI call is in progress.

                    If the OCCI calls are blocked, I'm not sure how much help the OCCI terminate* calls would be. They would block as well I suspect - I've not tested that though.

                    I did test issuing an OCIBreak from another thread and that does work to a degree.

                    To test I simply executed dbms_lock.sleep(10) to sleep for 10 seconds from the main thread. During that time in a second thread I issue an OCIBreak call after sleeping for 3 seconds in that thread (to give me time for the ifdown).

                    What I mean by "to a degree" is when the interface on the server is down the OCIBreak call results in capturing the following in the main thread:

                    ORA-12152: TNS:unable to send break message

                    Of course, if I allow the break to be sent whilst communication is still fully established I get:

                    ORA-01013: user requested cancel of current operation

                    The tests I did were very quick and dirty tests, and I'm not sure if you can reproduce the same in your environment - we seem to not be able to produce the same results!

                    Anyway, let us know how it goes if you decide to test using OCIBreak from a different thread. I would certainly like to know if you get the same results as I do in my tests.

                    Regards,

                    Mark
                    • 7. Re: Problem executing query, when the connection is lost
                      789390
                      Hi Mark,

                      Hmmm, it's very strange.. All I do is - start my application (while everything is OK with the connection). It says connected - OK. I drop the traffic to the DB, using iptables (linux) and I send a SELECT statement.. and it freezes. Nothing happens. I have another thread - timer, which on expire calls _Exit() :/ This is the way I "deal" with this problem. Anyway, in the other thread, calling terminate* - is't useful at all, you're right. But the interesting thing is, that calling
                      OCIError* pOCIError = NULL;
                      int nResult = OCIBreak( pConnection->getOCIServer(), pOCIError );
                      does not help at all ): It doesn't throw, either :/ It's interesting, that it's actually executed - returns -2 (I didn't find out what does this mean.. ), but the library is still frozen :/

                      Thanks,
                      Kiril

                      EDIT: I destroyed manually the connection and the environment (at least I tried.. apparently not successful, bacause -> ) and to create new connection and environment, but it's not working too. So, It seems that I'll leave this _Exit() .. It's not that bad, just wanted to report this (:

                      Edited by: Kiril Kirov on Oct 11, 2010 8:32 PM
                      • 8. Re: Problem executing query, when the connection is lost
                        789390
                        Hi Mark,

                        Problem solved! Using lsof, I get the FD of the opened socket and shutdown it. Then the library is unblocked and the right exception was thrown.

                        It appears that when the traffic is dropped (using iptables,) or when the server is down, the library does not understand this. But then the socket is closed by another thread, it sees that and everything continues working perfectly.

                        Best Regards,
                        Kiril Kirov
                        • 9. Re: Problem executing query, when the connection is lost
                          789390
                          Actually, closing the socket is not working. The socket must be shutdown.
                          • 10. Re: Problem executing query, when the connection is lost
                            942815
                            Hi, Kiril

                            Can you specify the detail for the solution. How did you manage this in the run time? Thank you~