Oracle Analytics Cloud and Server


OAS 6.4 stuck in signing in, after moving to exadata


Answers

  • Mostafa Morsy-Oracle Rank 6 - Analytics Lead

    @paamikumar

    From the logs I can see that your nqsserver was crashing until May 6, and that some queries were cancelled after running for 48 minutes. You also mentioned that the issue started after you switched to the Exadata DB. Putting these facts together, I conclude that your nqsserver may be hanging from time to time, which leads to this issue. I recommend the following to narrow it down:

    1- When the issue happens, run the following query on the DB side:


    select count(*),program from gv$session where LAST_CALL_ET > 600 and status = 'INACTIVE' group by program;

    If this returns a large number, I recommend editing the NQSConfig.INI file. You can control the query time through the following two variables:

    DEFAULT_DB_MAX_EXEC_TIME = 600;

    MAX_LOGICAL_QUERY_EXEC_TIME = 600;

    NQSServer needs to be restarted for the changes to take effect.

    The values above can be adjusted based on your estimate of how long the longest query runs.
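
After editing the file and before restarting, it can help to confirm that both settings are present as active (uncommented) lines. A minimal sketch, assuming a POSIX shell; `show_query_limits` is a hypothetical helper name, and the path to NQSConfig.INI depends on the installation, so the caller supplies it:

```shell
# Hypothetical helper: print the two time-limit settings from an NQSConfig.INI file.
# The file location varies by installation, so it is passed in as an argument.
show_query_limits() {
  config_file="$1"
  # Match only lines that start with the parameter name, so lines
  # commented out with '#' are not reported.
  grep -E '^[[:space:]]*(DEFAULT_DB_MAX_EXEC_TIME|MAX_LOGICAL_QUERY_EXEC_TIME)[[:space:]]*=' "$config_file"
}
```

Usage would look like `show_query_limits /path/to/NQSConfig.INI`; if a parameter is missing from the output, it is either absent or commented out in that file.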

  • paamikumar Rank 2 - Community Beginner

    We have been able to pinpoint the issue. Sessions are created and not cleared. When we open the RPD online, there are more than 2000 sessions. Once we clear the sessions, we are able to log in.

    The issue is still continuing. Any suggestions on what we need to tweak?

    Thanks,

    Paami.

  • [Deleted User] Rank 7 - Analytics Coach

    You need to figure out where those sessions come from. A sane implementation doesn't break even when 2000 users are logged on and I doubt you have 2k concurrent users who are causing this.

    What is keeping those connections open? Where do they even start their lifecycle? What queries are being forced where that cause this?

  • paamikumar Rank 2 - Community Beginner

    Hi Christian,

    When we open the RPD online, we see these user sessions with Last Used = Never.

    Even after we purged these sessions, we had to restart obis1 and obips1.

    It is just not making any sense to us either.

    Attached is a screenshot.

    Thanks,

    Paami

  • [Deleted User] Rank 7 - Analytics Coach

    Those are cache entries. Not sessions.

  • paamikumar Rank 2 - Community Beginner

    Hi Christian,

    We see these cache entries. When we run the command netstat -a | grep -i established | wc -l, the number of sessions increases to more than 2000.

    Thanks,

    Paami.

  • I checked the Service Request.
    There are a lot of "moving parts" for investigation. I see discussions on OAM SSO, Load Runner tests, etc.
    There appear to be some outstanding action plans that should be completed.

    1. Update to the latest patched client tools
    2. Collect a clean diagnostic dump after the issue reproduces, to help narrow down the issue and check for any crashes, etc.
    3. Questions about agent schedule/backlog
    4. RPD configuration for Exadata Oracle DB source needs some updates.
    5. Checking for initialization block failures, hangs
    6. Checking ulimits.
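
For the ulimits item, one way to look at the limit that actually applies to a running process (rather than to the login shell) is to read its `/proc` entry. A minimal sketch, assuming a Linux host; both function names are hypothetical helpers, and the PID has to be looked up separately:

```shell
# Print the soft "Max open files" limit from /proc/<pid>/limits content on stdin.
# The line has the form: "Max open files  <soft>  <hard>  files".
soft_open_files_limit() {
  awk '/^Max open files/ { print $4 }'
}

# Count the file descriptors currently open by a process (Linux /proc layout).
open_fd_count() {
  ls "/proc/$1/fd" | wc -l
}
```

For example, `soft_open_files_limit < /proc/<pid>/limits` followed by `open_fd_count <pid>` shows how close the process is to its limit. Reading another user's `/proc/<pid>/fd` usually requires running as that user or as root.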

    Previously, a 'broken pipe' was reported, which I see in the logs attached here. That can indicate that the nqsserver crashed (as Mostafa mentioned), but there is no .out file attached here to correlate that.

    Next time you restart, note the PIDs of obipsX and obisX. When the issue occurs, check whether the PIDs have changed (that is an easy, clear indication of a crash). In addition, there should be crash reports generated in the component logs directory.
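
The PID comparison described above can be sketched as two small shell helpers. This assumes a Linux host with `pgrep` available; the function names and the snapshot file paths are hypothetical, and the process-name patterns in the usage comments (`nqsserver` for obisX, `sawserver` for obipsX) are assumptions that should be confirmed with `ps` on your host:

```shell
# Print the PIDs of processes whose command line matches a pattern, sorted.
snapshot_pids() {
  pattern="$1"
  pgrep -f "$pattern" | sort -n
}

# Compare two snapshot files; prints "changed" or "unchanged".
compare_pid_snapshots() {
  if cmp -s "$1" "$2"; then
    echo "unchanged"
  else
    echo "changed"
  fi
}

# Example flow (paths are placeholders):
#   snapshot_pids nqsserver > /tmp/obis_pids.before   # right after the restart
#   snapshot_pids nqsserver > /tmp/obis_pids.after    # when the sign-in hang occurs
#   compare_pid_snapshots /tmp/obis_pids.before /tmp/obis_pids.after
```

A result of "changed" suggests the process was restarted in between, which is worth correlating with crash reports in the component logs directory.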

    Once the issue is narrowed, the thread can be updated; otherwise, it may be guess after guess here without proper data to perform root cause analysis.

    The service request is also 24*7, which can be helpful in some cases but is not efficient if there is no traction on the issue, or if new lines of thinking are not followed through. I would suggest you align it with someone in your timezone. The assigned person has a team they can collaborate with, but this really needs a thorough, methodical review.

    BTW, your netstat command should be filtered by the nqsserver port (default: 9514, but you can check with the status.sh -v command).
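
A filtered count along those lines might look like the sketch below. It assumes the Linux `netstat -an` column layout (protocol, Recv-Q, Send-Q, local address, foreign address, state); `count_established_on_port` is a hypothetical helper name, and 9514 is only the default port mentioned above:

```shell
# Count ESTABLISHED TCP connections whose local or foreign address ends in :<port>.
# Reads `netstat -an` style output on stdin.
count_established_on_port() {
  awk -v suffix=":$1" '
    $6 == "ESTABLISHED" &&
    (substr($4, length($4) - length(suffix) + 1) == suffix ||
     substr($5, length($5) - length(suffix) + 1) == suffix) { n++ }
    END { print n + 0 }'
}
```

Usage would be `netstat -an | count_established_on_port 9514`. Using `-n` keeps ports numeric, and matching the exact `:<port>` suffix avoids counting unrelated ports such as 19514.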

  • KhaderBelgoud-Oracle Rank 4 - Community Specialist

    @paamikumar In the obis diagnostic log, I can see init block failures caused by invalid SQL statements. Please validate the failed queries and then recheck the sign-in issue. Also make sure you have selected the correct database type when designing the RPD (refer to Doc ID 2965721.1).

  • There is an SR open, all the above has been suggested already.
    There is progress being made, all will be summarized post solution.

This discussion has been closed.