Skip to Main Content

ODP.NET

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

OracleDataAdapter, no command timeout and fill method hang.

User_7NXUZJun 9 2021

I have a VB program using .NET Framework 3.5 and ODP.NET managed driver (Oracle.DataAccess.dll 2.112.3.0) that connects to a remote server (pooling is default) to query data.
I have assigned CommandTimeout to 5 seconds and tried to query data that takes about 10 seconds to finish. But OracleDataAdapter does not throw SQL time out error when connect to a remote server. Meanwhile OracleDataAdapter do throw SQL time out error when connect to a local database. It looks like the CommandTimeout did not work to a remote server. How to make CommandTimeout work to a remote server?
Another problem is program stuck in the fill method when internet connection lost during execution of method. Setting the CommandTimeout to 60 seconds did not work. Why CommandTimeout no working when connection lost during query data?
Included code:
Dim connection As OracleConnection = New OracleConnection(connectionstring)
Dim oracleDataAdapter As New OracleDataAdapter(connection.CreateCommand())
Dim dataset as New DataSet
Try
oracleDataAdapter.SelectCommand.AddToStatementCache = False
oracleDataAdapter.SelectCommand.NotificationAutoEnlist = False
oracleDataAdapter.SelectCommand.CommandTimeout = 60
oracleDataAdapter.SelectCommand.CommandText = str_SQL
Dim int_Count As Integer = oracleDataAdapter.Fill(dataset, table)
Return int_Count
Catch
Throw
Finally
Dataset.Dispose()
oracleDataAdapter.Dispose()
End Try

Comments

Alex Keh-Oracle

The top thing I would recommend is to migrate to the latest ODP.NET release. You can find unmanaged ODP.NET 19.10 xcopy version on OTN. You are using the 11.2.0.3 version, which is about 10 years old.
With respect to your command timeout issue, cancelling command execution on a remote DB server requires ensuring that your network isn't inadvertently preventing the cancel message from propagating. Some routers or firewalls may drop this network activity. You can learn more about configuring ODP.NET command timeout based on your network topology here.
For your second situation, the command timeout is not designed for situation in which you lose a connection to the Oracle DB.

User_7NXUZ

Alex Keh-Oracle Thanks for the answer.
For command timeout issue, there is no any lose connection to the Oracle DB. In this condition, commandTimeout no working to a remote server. So how to make CommandTimeout work to a remote server if there is no any lose connection?

Alex Keh-Oracle

You can learn more about configuring ODP.NET command timeout based on your network topology here. That's the most common reason I see Command Timeout failing.

user1698609

Hi @AlexKeh-Oracle,
User_7NXUZ is my colleague. Let me explain a little bit more on the situation that we encountered.
We tested the SELECT and UPDATE sql statement with CommandTimeout=5 in local database XE, it worked fine. Oracle exception "user request cancel.... " was thrown after 5 seconds.
But SELECT sql statement CommandTimeout doesn't work after switching the database connection point to the remote database resided in the same network.
The UPDATE sql statement CommandTimeout still working fine in the remote database.

Alex Keh-Oracle

Which ODP.NET version and which DB version are you working with in the remote DB situation?
If the SELECT statement is executed within five seconds or so, then there won't be a cancellation. CommandTimeout is more like a minimum amount of time for the SQL statement to execute, not a maximum. Moreover, it only measures the time it takes to execute the SELECT. It does not include the time it takes to return results to the client.

user1698609

We are using Oracle.DataAccess.dll 2.112.3.0. The DB is XE 10 for both local and remote DB.
Understand the CommandTimeout will not trigger exactly the when it reach the time set (for example 5sec), but I would expect it will thrown anytime after 5sec as the SELECT query taking 10+ sec to complete.

Alex Keh-Oracle

That does seem to be taking a long time for the command timeout to occur. A DB trace could tell us more exactly why the command timeout doesn't trigger. However, it would be fruitful to upgrade to the latest DB and ODP.NET releases to ensure you are not hitting a bug that's already been fixed. The Oracle stack you're using is 10 or so years old.

user1698609

I did an upgrade the Oracle Client to 19, the database server maintain at 11g.
We're trying to simulate oracle client and db server communication packet loss by unplugging the network cable at the app server during executing of the query and plug it back few second later. In some cases, the program just hang at the line of data fetching infinitely (as highlighted below).
image.pngFew minutes later, the program try to break the query by invoking OracleCommand.Cancel but nothing happened (I would expect ORA-01013 or other exception will be thrown). Also tried to close the connection but it hang at the Connection.Close() infinitely.
We did same test by pointing database to XE 18c. We can't simulate the above hang issue but rather ORA 03113 error was thrown when unplugging and plunging back of network cable during executing query (usually it happened within 60 second). This is still ok as the program able to continue instead of hanging and error handling can do a retry and resume gracefully.

Alex Keh-Oracle

For network outage recovery, Oracle Application Continuity (AC) is needed to recover the session.
Application Continuity (0 Bytes)AC actually solves a larger class of HA recovery scenarios than just losing the network, which is why it's part of RAC, RAC One Node, and Active Data Guard, but not part of XE. Unmanaged ODP.NET supports AC.
ODP.NET command cancellation works if no software nor hardware outage occurs.

user1698609

We developed program (in .NET) to try to reproduce the hang issue. The program just keep fetching data from database. In normal execution, the query will take ~4 seconds to fetch ~140000 records. The hang able to simulate 4 out of 15 times (~30%).
Below the steps to reproduce the hang
Run the application. The app just keep querying data from the database server. The function as show below.
Observe the query execution, at ~ 2nd or 3rd second (based on the log display on the screen), unplug the network cable
Plug back the network cable at ~15th second
The program occasionally just hang at the line oracleDataAdapter.Fill(dataset)
image.pngWe tested against different version of unmanaged and managed Oracle Client 11, 12 and 19 and database server 11 XE and 18 XE, hang issue able to reproduced.
Is this the behavior or the Oracle Client? Any suggestion to break the execution? Or any suggestion to handle this behavior?

Alex Keh-Oracle

This is expected behavior when you lose network connectivity for too long. You get errors, such as ORA-12571 after connectivity is restored. At some point in time, one component in the connection chain will consider the connection lost and initiate resource recovery, such as Oracle client, DB, the OS, or network router. At that point, the connection is not recoverable.
Oracle's solution to this problem is Application Continuity (AC), which recovers the in-flight DB session via a new connection. To the end user, it's as if nothing happened. Oracle takes care of the hard work in making this all look transparent by reconnecting and re-establishing the session and transaction states.
AC requires using Oracle DB with RAC, RAC One Node, or Active Data Guard and unmanaged ODP.NET. Managed ODP.NET and ODP.NET Core support is planned for next year.

user1698609

Based on your statement, can we summarize that any loss of the network packages (can be as short as milliseconds or as long as few seconds) may possibly result in behavior that the app with oracle client loss the session with DB server and result in app hang at oracleDataAdapter.Fill (in my example)?
The resolution is to enable the AC by adding Application Continuity=true to the connection string. (as refer to https://medium.com/oracledevs/oracle-net-application-continuity-getting-started-34e7045e863). Is that correct? The AC only available in oracle client 12.2 and above.
Other than of enable AC, is there any other way to break app hang at oracleDataAdapter.Fill ? Can Command.Cancel help to break?
If it is not breakable, I'm thinking of handling this behavior by implementing a threading in .NET app to monitor the execution time of racleDataAdapter.Fill. Should it executes longer than expected (for example 60 second), the thread will just abort and continue with trying to re-establish a new connection. Is this the better way to handle this behavior?

Alex Keh-Oracle

I've personally never lost the connection if I pull the plug for a second or two. It usually takes longer. That's just my experience. I would expect the time limit depends on the component in your network path with the lowest time threshold tolerance before it considers the connection/session lost.
On the ODP.NET side, yes, you just need to set Application Continuity to true, which is the default setting in the newest versions anyway. The DBA then has to configure AC on the DB server side. I do recommend using the newest ODP.NET version you can. Recovering sessions and state is a very hard problem to solve generically. With each new release, we add more usage cases that work well in AC.
In your use case, the only way I can think of to break the command is to have PL/SQL do it. If the client loses connectivity, there's no way for the client to tell the DB to cancel the command. If the timer is running on the DB side, then there's no network reliance.
AC isn't the only solution. In the past, customers have built their own custom solutions to recover from a discrete set of situations. Using AC is just easier since it covers more situations and eliminates the need to build it oneself.
The ODP.NET command cancel will still work in scenarios when the network connection still works.

user1698609

Many thanks for your info, I'll then get DBA to check on the RAC setup.
Aside to the network, is there any other possibility that may causing the issue as i described? Meaning the program execution stop at oracleDataAdapter.Fill infinitely?

Alex Keh-Oracle

If the command cancellation works fine when the network remains connected, then the root cause is the network loss or some consequence of it.
With that said, I do not have any DB nor network diagnostics to tell me exactly what failed. This is my best guess based on past experience.

user1698609

In my case, the command cancellation doesn't work. Program try to cancel but nothing happen, execution still stopped at oracleDataAdapter.Fill.

Alex Keh-Oracle

Command cancellation could have a bug, either on the client or server side. It's hard to say what the problem is without traces on both sides to see whether the cancel message is propagates to the server and how the server deals with it.
Part of the complication is that both the ODP.NET and DB versions you are using are very old (11.2). There have been a lot of bug fixes in the last decade that may address the specific situation you hit.

1 - 17

Post Details

Added on Jun 9 2021
17 comments
1,974 views