I have a big issue. I searched a lot in internet, but could't find anything, that answered my question.
The problem is:
When I try to establish connection (using createConnection) and the connection is successful, everything's fine. execute, executeUpdate and executeQuery work correct. But when something happens and the connection to the server is down (for example, when i drop the traffic to the server using a firewall rule), and then try to call execute*, everything just stops working. I mean, no exception is thrown, no error response in any form is received, nothing. The program just stops working..
The environment is created this way: environment = oracle::occi::Environment::createEnvironment( sCharset, sNCharset, Environment::THREADED_MUTEXED );
and the connection this way: connection = m_ptrOracleEnvironment->environment->createConnection( sUser, sPassword, sConnectionString );
void MyFunction( /* ... */ )
throw( /* ... */ )
/* ... */
catch( SQLException& sqlExcp )
/* ... */
catch( ... )
/* ... */
/* ... */
This doesn't work, when the connection is down..
Is this a bug or I did something wrong?
Thanks a lot in advance,
Edited by: Kiril Kirov on Sep 3, 2010 10:44 AM
I don't think this will help much, but I just did a few tests using OCCI 18.104.22.168 64-bit Windows to a 22.214.171.124 database on 32-bit Windows. If I pulled the network cable or used a firewall to block the communication after the connection was established the result was the same: after approximately 20 seconds there was a timeout with the following exception SQLException message:
3113: ORA-03113: end-of-file on communication channel
Process ID: 0
Session ID: 9 Serial number: 15
I'm just creating a "plain" environment as so:
env = Environment::createEnvironment(Environment::DEFAULT);
I was also issuing a "select user from dual" query and invoking executeQuery. However, I'm not sure those differences are enough to account for the behavior you are seeing. What versions are you using? Also, are there any parameters specified in the sqlnet.ora/tnsnames.ora files on the client and server which might impact timeouts? I'm not aware of any off the top of my head that would cause this behavior.
Well, this is strange :/ My database is ( on a RHEL 4 machine ):
Oracle9i Enterprise Edition Release 126.96.36.199.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 188.8.131.52.0 - Production
Also, I tested with two different versions of OCCI - 10.2.0.3-1.i386 on a RHEL 4 machine and 184.108.40.206.0-1.i386 on a RHEL 5.3 machine. It's the same on both places.
I changed the mode to Environment::DEFAULT (just to test), but it's still the same.
Could you tell me where do you set this timer? Or it's set by default? Because everything I found about OCCI timers was:
"OCCI does not provide any other explicit way to set timeouts for connections."
except the setTimeOut( .. ) method in ConnectionPool class, but I use just Connection ( not ConnectionPool). Maybe this could be a problem ?
Thanks a lot,
Edited by: 786387 on Aug 6, 2010 3:51 AM
I did not set any timers or anything else that would be related to timeout values. I used mostly default installs for the client and the server pieces. I will try to take a look again later today and see if anything looks like a possibility, but I'm not sure what at thist stage. Yes, I agree, this looks strange.
I found (while researching for different OCCI things), these 2 issues: Re: Non Blocking Call with OCCI
and Re: Can you set a timeout on a sql
The first one is closer to my problem. For example, please take a look at these posts:
I have a cursor in my plsql program - the sql query hangs due to a problem with the target database (this is not the error). In my program I have a sequence of procedure calls and this is one of hte procedure. <b>Since the sql hangs - the whole program hangs.</b>
<b>What I would like to do is to attempt to run the sql for a certain amount of time - if it does not return, then exit out of the cursor and continue with the rest of the program. Can this be done ?</b>
and especially this one:
I wrote an application server(daemon process) to talk with oracle server continuous which used oracle9 OCCI lib, each 5 min it executes the procedure on the DB server.
Now I have come cross a problem:
<b>If the network is blocked, app server will blocked at occi call and would never pass, and no exception was catched </b>
1. Oracle server reboot without shutdown oracle process
2. udp broadcast message storm blocked the connection between app server and oracle DB.
I consider maybe it's because OCCI using the blocking mode of connection that caused this problem.
How can I interrupt the blocking call when call timeout?</i>
At least I found other people with this problem.
EDIT: Also, do you think, that some of these can help we in this situation (when the OCCI is already blocked) :
or something like
can help ?
Thanks again (:
Edited by: Kiril Kirov on Oct 6, 2010 12:04 PM
Well, I managed to reproduce the situation you describe (at least I think so!) by doing an "ifdown eth0" on a database server whilst an OCCI call is in progress.
If the OCCI calls are blocked, I'm not sure how much help the OCCI terminate* calls would be. They would block as well I suspect - I've not tested that though.
I did test issuing an OCIBreak from another thread and that does work to a degree.
To test I simply executed dbms_lock.sleep(10) to sleep for 10 seconds from the main thread. During that time in a second thread I issue an OCIBreak call after sleeping for 3 seconds in that thread (to give me time for the ifdown).
What I mean by "to a degree" is when the interface on the server is down the OCIBreak call results in capturing the following in the main thread:
ORA-12152: TNS:unable to send break message
Of course, if I allow the break to be sent whilst communication is still fully established I get:
ORA-01013: user requested cancel of current operation
The tests I did were very quick and dirty tests, and I'm not sure if you can reproduce the same in your environment - we seem to not be able to produce the same results!
Anyway, let us know how it goes if you decide to test using OCIBreak from a different thread. I would certainly like to know if you get the same results as I do in my tests.
Hmmm, it's very strange.. All I do is - start my application (while everything is OK with the connection). It says connected - OK. I drop the traffic to the DB, using iptables (linux) and I send a SELECT statement.. and it freezes. Nothing happens. I have another thread - timer, which on expire calls _Exit() :/ This is the way I "deal" with this problem. Anyway, in the other thread, calling terminate* - is't useful at all, you're right. But the interesting thing is, that calling
OCIError* pOCIError = NULL;
int nResult = OCIBreak( pConnection->getOCIServer(), pOCIError );
does not help at all ): It doesn't throw, either :/ It's interesting, that it's actually executed - returns -2 (I didn't find out what does this mean.. ), but the library is still frozen :/
EDIT: I destroyed manually the connection and the environment (at least I tried.. apparently not successful, bacause -> ) and to create new connection and environment, but it's not working too. So, It seems that I'll leave this _Exit() .. It's not that bad, just wanted to report this (:
Edited by: Kiril Kirov on Oct 11, 2010 8:32 PM
Problem solved! Using lsof, I get the FD of the opened socket and shutdown it. Then the library is unblocked and the right exception was thrown.
It appears that when the traffic is dropped (using iptables,) or when the server is down, the library does not understand this. But then the socket is closed by another thread, it sees that and everything continues working perfectly.