Forum Stats

  • 3,768,916 Users
  • 2,252,873 Discussions
  • 7,874,795 Comments

Discussions

Heavy TCP traffic after CQN registration using cx_Oracle

user6055956
user6055956 Member Posts: 4
edited May 7, 2016 5:16PM in Python

Hello all,

I am using the sample code provided with cx_Oracle 5.2.1 to register a Continuous Query Notification to an Oracle 11.2.0.4 database ("C:\Python34\cx_Oracle-doc\samples\DatabaseChangeNotification.py").

The registration works fine and I can see the registration by querying "SELECT * FROM USER_CHANGE_NOTIFICATION_REGS"

However, once the registration is in place and before any changes are made to the database (I am currently the only one who runs queries & modifies this particular sandbox database), the oracle server sends a continuous & never-ending deluge of packets to my client. None of these connections results in my callback being triggered.

From wireshark I see that the DB initiates a connection, then sends 66 bytes of data (packet in bold below). The client then disconnects and the connection is successfully closed. (see below for example traffic). Once this connection is closed, the process starts again, this time coming from the next port number on the server (port 29549 in this example). The 66 bytes are slightly different with each connection, but never contains a table name or other recognizable text.

These connections are continuously created and destroyed always with consecutive source port numbers that seem to loop once the available port range is exhausted. I have left the python shell open for up to 3 hours and the traffic continues non-stop. The original CQN subscription was registered with a short timeout (e.g. 300 sec) and even though the subscription expires and no longer shows up in USER_CHANGE_NOTIFICATION_REGS, the incoming tcp traffic continues unabated.

Any ideas as to what is going on here?

Thanks!

No.    Time                Source                     Destination           Protocol   Length   Info                                                   
29548  0.000000000    <remoteDB IP>          10.1.4.136            TCP      74     29548→50000 [SYN] Seq=0 Win=5840 Len=0
50000  0.000057000    10.1.4.136            <remoteDB IP>          TCP      74     50000→29548 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0
29548  0.014081000    <remoteDB IP>          10.1.4.136            TCP      66     29548→50000 [ACK] Seq=1 Ack=1 Win=5888 Len=0
29548  0.014267000    <remoteDB IP>          10.1.4.136            TCP      140    29548→50000 [PSH, ACK] Seq=1 Ack=1 Win=5888 Len=74
50000  0.014287000    10.1.4.136            <remoteDB IP>          TCP      66     50000→29548 [ACK] Seq=1 Ack=75 Win=29056 Len=0
50000  0.014335000    10.1.4.136            <remoteDB IP>          TCP      66     50000→29548 [FIN, ACK] Seq=1 Ack=75 Win=29056 Len=0
29548  0.028260000    <remoteDB IP>          10.1.4.136            TCP      66     29548→50000 [FIN, ACK] Seq=75 Ack=2 Win=5888 Len=0
50000  0.028312000    10.1.4.136            <remoteDB IP>          TCP      66     50000→29548 [ACK] Seq=2 Ack=76


Update: after 3.5 hours the python callback was called (once & without me making any database changes) and gave the following output:

Message type: 5
Message database name:
Message tables:
Traceback (most recent call last):
  File "/media/pdrive/Python Projects/play.py", line 8, in callback
    for table in message.tables:
TypeError: 'NoneType' object is not iterable


Best Answer

  • Anthony Tuininga-Oracle
    Anthony Tuininga-Oracle Member Posts: 37 Employee
    edited May 6, 2016 4:31PM Accepted Answer

    Hi,

    Yes, you can remove the old subscription registrations, but not as easily as you'd think!

    You can retrieve REGID from the view *_CHANGE_NOTIFICATION_REGS (where * is replaced by DBA, ALL or USER as appropriate) and then attempt to call dbms_cq_notification(REGID) but that will fail because it wasn't registered with PL/SQL.

    There is a workaround, though, that appears to work for both 11.2 and 12.1 anyway. Execute this SQL:

    revoke change notification from <user>;

    Where <user> is the user owning the registration. That will automatically deregister all of them for you. :-)

    Anthony

Answers

  • Anthony Tuininga-Oracle
    Anthony Tuininga-Oracle Member Posts: 37 Employee
    edited May 4, 2016 11:06AM

    Hi,

    I just ran the same sample code using Python 3.4 and cx_Oracle 5.2.1 on x64 Linux, both with 11.2 and 12.1. I did not see the same behaviour as you are describing. I have had one other report of change notification not behaving, though, and it turned out to be due to a networking issue that, once resolved, also alleviated the problem they were experiencing. It also may be an OS specific configuration issue.

    The Python code is accessing the OCI directly. In order to rule out any issues with the Python/C code in cx_Oracle, are you able to build and test the C sample found here?

    http://docs.oracle.com/database/121/ADFNS/adfns_cqn.htm#BAJHEGDB

    Anthony

    user6055956
  • user6055956
    user6055956 Member Posts: 4
    edited May 6, 2016 1:55PM

    Anthony,

    Thanks for trying out that code. After further testing myself, I discovered that my issue only happens after I have been debugging my Flask server which is creating the subscriptions. If my server crashes before my code is able to explicitly call "del subscription" (this happens a lot in development!), then that orphaned subscription stays registered in the database. Next time the flask server is started it registers a new subscription to listen for the same events (since you can't simply re-attach to an existing subscription). It appears that Oracle cannot handle having multiple subscriptions for the same queries sending data to the same client and it starts emitting a never-ending stream of garbage. The only solution I have found is to close everything down and wait 1 hour for the DB to reset its internals.

    My question to Anthony is therefore as follows:is there any way I can delete old subscription registrations from the database using cx_Oracle if I don't have the original subscription object? That way I can just remove all of my old & orphaned subscriptions every time my server starts and I can avoid this issue.

    Thanks!

  • Anthony Tuininga-Oracle
    Anthony Tuininga-Oracle Member Posts: 37 Employee
    edited May 6, 2016 4:31PM Accepted Answer

    Hi,

    Yes, you can remove the old subscription registrations, but not as easily as you'd think!

    You can retrieve REGID from the view *_CHANGE_NOTIFICATION_REGS (where * is replaced by DBA, ALL or USER as appropriate) and then attempt to call dbms_cq_notification(REGID) but that will fail because it wasn't registered with PL/SQL.

    There is a workaround, though, that appears to work for both 11.2 and 12.1 anyway. Execute this SQL:

    revoke change notification from <user>;

    Where <user> is the user owning the registration. That will automatically deregister all of them for you. :-)

    Anthony

  • user6055956
    user6055956 Member Posts: 4
    edited May 6, 2016 5:29PM

    Thanks Anthony. I don't have the level of privilege to revoke or grant on user accounts, so I will have to have to ask Mordac to revoke and then re-grant change notification each time. This will probably take longer than the 1 hour it takes for the oracle DB to settle itself. I am therefore committed to 1-hour debug cycles. Still faster than the physical sciences I suppose!

    Thanks again for all your help Anthony.

  • Anthony Tuininga-Oracle
    Anthony Tuininga-Oracle Member Posts: 37 Employee
    edited May 6, 2016 5:32PM

    You're welcome. You can set the timeout to be much lower in order to have it age out more quickly -- particularly during your debugging phase -- and then adjust it higher when you have ironed out the bugs. :-)

  • user6055956
    user6055956 Member Posts: 4
    edited May 7, 2016 5:16PM

    Setting the timeout to be lower was something I've already tried without success. Unfortunately, if the Oracle DB stops receiving responses from the client (due to a client crash), it somehow gets stuck in a crazy state. Once the registration timeout counter reaches zero, Oracle automatically resets the timer to a Huge number -- something in the billions(!) and so the orphaned registration persists well beyond the original timeout: this smells of a unsigned integer bug.  If you stop all communications with the database for about an hour, these registrations will finally disappear (even though the new timeout counter is nowhere near 0 by that point) and you can start debugging again.

    Other behaviors I've noticed from the Oracle notification system is that sometimes it won't correctly close TCP connections when the client legitimately disconnects. The client will send FIN and the DB will send ACK, but never FIN. Then if you later create a new notification registration to the DB, Oracle will try to re-use the old connection without first re-establishing a new TCP session.  The client correctly responds with RST (as it should), but the database ignores the RST and simply keeps re-sending the same packets over and over.

    My overall impression is that Oracle's notification system is still very young and extremely buggy. I don't think there are enough people using this functionality for the Oracle dev teams to spend the necessary time transforming it in to a robust system.

This discussion has been closed.