13 Replies Latest reply on Dec 5, 2017 2:05 AM by SteveF-Oracle

    OAC - Connection to database is failing

    Mark.Thompson

      Folks, I need some ideas.  This is a problem that just showed up in our OAC environment within the past 2 days.  Prior to that, everything worked well.

       

      I have 8 subject areas in my repository.  Seven of them work fine.  Any time I try to execute any query against the other one, I get this message:

       

      Error Codes: OPR4ONWY:U9IM8TAC:U9IM8TAC:U9IM8TAC:U9IM8TAC:U9IM8TAC:OI2DL65P

      State: HY000. Code: 10058. [NQODBC] [SQL_STATE: HY000] [nQSError: 10058] A general error has occurred. (HY000)

       

      State: HY000. Code: 43113. [nQSError: 43113] Message returned from OBIS. (HY000)

       

      State: HY000. Code: 43119. [nQSError: 43119] Query Failed: (HY000)

       

      State: HY000. Code: 17001. [nQSError: 17001] Oracle Error code: 1012, message: ORA-01012: not logged on Process ID: 8536 Session ID: 314 Serial number: 53019 at OCI call OCIStmtExecute. (HY000)

       

      State: HY000. Code: 17010. [nQSError: 17010] SQL statement preparation failed. (HY000)

       

      • The subject area passes a consistency check.
      • If I right-click ... View Data in the physical layer of the repository, I see data, so I know the connection pool credentials are correct.

       

      I do not know if this is related or not, but this additional problem has shown up along with the first one: None of my session variables are getting set when I sign in.  However, I test the initialization block while in the repository, the SQL executes correctly and I see what I expect.

       

      And one more bit of fun: I have even rolled back to a repository that was active 3 weeks ago.  And the same failures are appearing.  Until 2 days ago, everything was running well.  Rolling back to a repository that worked 3 weeks ago should get rid of whatever is wrong, UNLESS something is goofy in the application itself.

        • 1. Re: OAC - Connection to database is failing
          Charles M

          Hi -

           

          Do you have access to the database alert log? Maybe take a look at that to see if it reveals some addtional info ...

           

          Regards,

           

          Charles

          • 2. Re: OAC - Connection to database is failing
            Mark.Thompson

            At the present time, I do not have access to the server so that I can see the log files.

            • 3. Re: OAC - Connection to database is failing
              Mark.Thompson

              More information to add:

               

              As I mentioned, we initially suspected something wrong with some changes in the repository in the affected subject area, so we uploaded the previous repository.  That did not fix the problem.  Next, we did a complete restoration from an OAC snapshot that had been created the previous day.  After the snapshot restore, the problem still did not go away.  So, at this point we are pretty sure the problem is not in the repository.

               

              Remember I said that the trouble was only occurring in one subject area?  Yesterday, it appeared (at least initially) that when I duplicated that Business Model and the Subject Area things started working properly.  The failure to connect messages went away.  I could create new analyses from the new subject area.  After a little more work in the repository, I was able to replace the old objects with the new objects, and all of our existing analyses and dashboard worked – no more "not logged on" messages. Progress!

               

              But the problem was not fully resolved.  Even after discovering the “fix” mentioned above, our session variables were not getting set when a user signed in to OAC.  Thinking that the same "create a new one" theory might be useful, I deleted and recreated the initialization blocks and their variables. No good.  The session variables STILL have no values.  It doesn’t appear to me that the SQL to populate the variable is running.  In fact, when I added a brand new session variable initialization block, its associated session variable was not getting set either. 

               

              OK, that's the status and situation on the OBIEE side.  Now comes the plot twist: Yesterday (although I am not insisting that it is a cause-and-effect relationship), our DBAs began killing Oracle database sessions that:

              • Are for the user shown in the connection pool;
              • Are over 24 hours old;
              • Are running from the server servernamehere;
              • Are running this process: nqsserver@servernamehere.

               

              So now I need some thoughts from anyone who has a deeper understanding of the "guts" of OAC/OBIEE.  If you have any words of wisdom to impart, I would love to hear them.  Is there any way that killing database sessions more than 24 hours old could cause the trouble that we are seeing here?  Is there any other theory you can suggest that could explain the behavior I have described? 

               

              Last night, we restarted the BI Server service.  The system never came back up.  I am working to get access to the server so that I can review the log files.  But in the meanwhile, any suggestions and ideas are welcome.

              • 4. Re: OAC - Connection to database is failing
                SteveF-Oracle

                I would tend to agree with Charles, to check the database side.  ORA-01012, tends to lend itself more to something occurring on database side (max process or process limit hit, improper shutdown, file permission change, memory issue, etc.).  Possibly, intermittent firewall.  I cannot explain fully the single SA, but perhaps sending more connections (and then max process hit?).

                 

                Check both your data source and RCU

                 

                Why did the DBA's start killing sessions? Did they notice a limit hit?

                 

                Other comments welcomed.

                • 5. Re: OAC - Connection to database is failing
                  FPonte

                  Hi Mark.

                   

                  This looks like is in the database side. Something triggers that unstable behaviour.

                   

                  I had a similar issue some time ago with Oracle Data Integrator opening lots of sessions exceeding the Max Number of Processes. Other processes started failing to open connections.

                   

                  Have a look at this post from Burleson that may help a little more. Maybe the DBAs already reached at it.

                  http://www.dba-oracle.com/t_ora_01012_not_logged_on.htm

                   

                  Your DBAs are able to tell what happened right before the issue started? DB Patching? Database Restart for some other reason? Any heavy load consuming/opening lots of sessions?

                   

                  Cheers.

                   

                  Fernando

                  • 6. Re: OAC - Connection to database is failing
                    Mark.Thompson

                    Steve, you included 'improper shutdown' as one of the possible issues.  I think that is exactly what we have here.  Our DBA killed database sessions that were started by the program nqsserver@servername more than 24 hours earlier.  It was at that point that the ORA-01012 errors began.  (Why did the DBA start killing sessions?  Answer: We were approaching our limit of 300 connections.)  Once we were able to restart the BI Server service, the "orphaned" connections went away, and our OAC system began to work normally again.

                     

                    Today, there are 6 sessions that match those same conditions - launched more than 24 hours ago by nqsserver@servername.  We hesitate to kill them, because based on our experience from last week, NQSServer still thinks they are alive and well, and we anticipate that killing them will cause the same trouble.

                     

                    What could cause an OBIEE session to NOT disconnect from the database?  My understanding has always been that once a connection reaches the time limit imposed within the connection pool, the connection is terminated.  None of our connection pools has a timeout longer than 5 minutes.  And we don't have any variable initialization blocks that run more frequently than once per day, so it's not like an init block could be pinging the same connection over and over again.  Something is causing OBIEE connections to NOT disconnect from the database after the appointed connection timeout.  Today there are 6 such "hung" sessions.  Tomorrow there will probably be several more.  Eventually, we will have enough of them that we will have to again take some sort of step to clean them up.  (Although, now that I think of it, we COULD just restart the BI Server service each night at some ridiculous hour like 3:00 a.m., which would not answer the "why", but could eliminate the problem.)

                     

                    What could cause an OBIEE session to NOT disconnect from the database once its connection reaches the time limit imposed within the connection pool?

                    • 7. Re: OAC - Connection to database is failing
                      Venkata Rachuri

                      Hi Mark,

                       

                      See this last reply in this thread. I think database hit the max processes limit

                       

                      First you need to shutdown OBIEE.

                      Second some sessions and rebounce the db.

                       

                      ORA-01012: not logged on

                       

                      Thanks

                      Ve

                      • 8. Re: OAC - Connection to database is failing
                        Mark.Thompson

                        Thank you.  If I may ask you to read the thread entries, you will find that we have already crossed all of those bridges.

                        • 9. Re: OAC - Connection to database is failing
                          Venkata Rachuri

                          Hi Mark,

                          what are your struck thread parameters max time

                           

                          • 10. Re: OAC - Connection to database is failing
                            SteveF-Oracle

                            Hi Mark,

                             

                            I have observed issues with on-premise with firewalls causing some of that type of behavior.  It could apply to the cloud as well.

                             

                            I would put these parameters in the sqlnet.ora that the OBI server is using and bounce it, then monitor.

                             

                            DIAG_SIGHANDLER_ENABLED=FALSE

                            DISABLE_OOB=ON

                            BREAK_POLL_SKIP=10000

                             

                            Next, can check stuck thread timer, but I believe those are pre-tuned on OAC.

                            • 11. Re: OAC - Connection to database is failing
                              Mark.Thompson

                              Because this is OAC, I don't think we have direct access to those parameters.  In any event, we would have had no reason to change them from the defaults, and changing those defaults is not part of our standard setup.  So I suspect that the answer to your question is "exactly what is shown here".  Is there a way for us to view those parameters in our OAC environment?

                              • 12. Re: OAC - Connection to database is failing
                                Venkata Rachuri

                                Don't OAC guys help you out ? because this is infrastructure  and maintance issue ? just curious

                                 

                                 

                                 

                                Thanks

                                Venkat

                                • 13. Re: OAC - Connection to database is failing
                                  SteveF-Oracle

                                  You have full and direct access to your OAC system running on public compute via SSH key. (This is different than BICS)

                                   

                                  see: Managing SSH Access

                                   

                                   

                                   

                                  This MOS document also describes how:

                                   

                                  How To Collect Diagnostic Files and Logs in Oracle Analytics Cloud (OAC) (Doc ID 2236000.1)

                                  The section... To collect configuration and diagnostic logs to review a runtime problem

                                   

                                   

                                   

                                  If you need assistance, then you should log a service request with Oracle Support.