Forum Stats

  • 3,780,774 Users
  • 2,254,438 Discussions
  • 7,879,447 Comments

Discussions

OracleDataAdapter, no command timeout and fill method hang.

2»

Answers

  • Alex Keh-Oracle
    Alex Keh-Oracle Posts: 3,009 Employee

    This is expected behavior when you lose network connectivity for too long. You get errors, such as ORA-12571 after connectivity is restored. At some point in time, one component in the connection chain will consider the connection lost and initiate resource recovery, such as Oracle client, DB, the OS, or network router. At that point, the connection is not recoverable.

    Oracle's solution to this problem is Application Continuity (AC), which recovers the in-flight DB session via a new connection. To the end user, it's as if nothing happened. Oracle takes care of the hard work in making this all look transparent by reconnecting and re-establishing the session and transaction states.

    AC requires using Oracle DB with RAC, RAC One Node, or Active Data Guard and unmanaged ODP.NET. Managed ODP.NET and ODP.NET Core support is planned for next year.

  • user1698609
    user1698609 Member Posts: 7 Green Ribbon

    Based on your statement, can we summarize that any loss of the network packages (can be as short as milliseconds or as long as few seconds) may possibly result in behavior that the app with oracle client loss the session with DB server and result in app hang at oracleDataAdapter.Fill (in my example)?

    The resolution is to enable the AC by adding Application Continuity=true to the connection string. (as refer to https://medium.com/oracledevs/oracle-net-application-continuity-getting-started-34e7045e863). Is that correct? The AC only available in oracle client 12.2 and above.

    Other than of enable AC, is there any other way to break app hang at oracleDataAdapter.Fill ? Can Command.Cancel help to break?

    If it is not breakable, I'm thinking of handling this behavior by implementing a threading in .NET app to monitor the execution time of racleDataAdapter.Fill. Should it executes longer than expected (for example 60 second), the thread will just abort and continue with trying to re-establish a new connection. Is this the better way to handle this behavior?

  • Alex Keh-Oracle
    Alex Keh-Oracle Posts: 3,009 Employee

    I've personally never lost the connection if I pull the plug for a second or two. It usually takes longer. That's just my experience. I would expect the time limit depends on the component in your network path with the lowest time threshold tolerance before it considers the connection/session lost.

    On the ODP.NET side, yes, you just need to set Application Continuity to true, which is the default setting in the newest versions anyway. The DBA then has to configure AC on the DB server side. I do recommend using the newest ODP.NET version you can. Recovering sessions and state is a very hard problem to solve generically. With each new release, we add more usage cases that work well in AC.

    In your use case, the only way I can think of to break the command is to have PL/SQL do it. If the client loses connectivity, there's no way for the client to tell the DB to cancel the command. If the timer is running on the DB side, then there's no network reliance.

    AC isn't the only solution. In the past, customers have built their own custom solutions to recover from a discrete set of situations. Using AC is just easier since it covers more situations and eliminates the need to build it oneself.

    The ODP.NET command cancel will still work in scenarios when the network connection still works.

  • user1698609
    user1698609 Member Posts: 7 Green Ribbon

    Many thanks for your info, I'll then get DBA to check on the RAC setup.

    Aside to the network, is there any other possibility that may causing the issue as i described? Meaning the program execution stop at oracleDataAdapter.Fill infinitely?

  • Alex Keh-Oracle
    Alex Keh-Oracle Posts: 3,009 Employee

    If the command cancellation works fine when the network remains connected, then the root cause is the network loss or some consequence of it.

    With that said, I do not have any DB nor network diagnostics to tell me exactly what failed. This is my best guess based on past experience.

  • user1698609
    user1698609 Member Posts: 7 Green Ribbon

    In my case, the command cancellation doesn't work. Program try to cancel but nothing happen, execution still stopped at oracleDataAdapter.Fill.

  • Alex Keh-Oracle
    Alex Keh-Oracle Posts: 3,009 Employee

    Command cancellation could have a bug, either on the client or server side. It's hard to say what the problem is without traces on both sides to see whether the cancel message is propagates to the server and how the server deals with it.

    Part of the complication is that both the ODP.NET and DB versions you are using are very old (11.2). There have been a lot of bug fixes in the last decade that may address the specific situation you hit.