9 Replies. Latest reply: Jan 27, 2013 10:15 PM by 901518

Listener Suddenly Stops (Oracle DB 10g R2 RAC on VMWare)

901518 Newbie
Hello,

I am quite new to Oracle. We have a 2-node RAC 10.2.0.4 cluster configured on RHEL 4.5 in a virtual environment using VMware.

When the system starts up, all services are running fine, as seen in crs_stat -t:


[oracle@bsspbbi2 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....I1.inst application    ONLINE    ONLINE    bsspbbi1
ora....I2.inst application    ONLINE    ONLINE    bsspbbi2
ora.BSSPBBI.db application    ONLINE    ONLINE    bsspbbi1
ora....SM1.asm application    ONLINE    ONLINE    bsspbbi1
ora....I1.lsnr application    ONLINE    ONLINE    bsspbbi1
ora....bi1.gsd application    ONLINE    ONLINE    bsspbbi1
ora....bi1.ons application    ONLINE    ONLINE    bsspbbi1
ora....bi1.vip application    ONLINE    ONLINE    bsspbbi1
ora....SM2.asm application    ONLINE    ONLINE    bsspbbi2
ora....I2.lsnr application    ONLINE    ONLINE    bsspbbi2
ora....bi2.gsd application    ONLINE    ONLINE    bsspbbi2
ora....bi2.ons application    ONLINE    ONLINE    bsspbbi2
ora....bi2.vip application    ONLINE    ONLINE    bsspbbi2


But after some time, the listener suddenly stops (sometimes the listener on node 1, sometimes the one on node 2):

[oracle@bsspbbi1 admin]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....I1.inst application    ONLINE    ONLINE    bsspbbi1
ora....I2.inst application    ONLINE    ONLINE    bsspbbi2
ora.BSSPBBI.db application    ONLINE    ONLINE    bsspbbi1
ora....SM1.asm application    ONLINE    ONLINE    bsspbbi1
ora....I1.lsnr application    ONLINE    ONLINE    bsspbbi1
ora....bi1.gsd application    ONLINE    ONLINE    bsspbbi1
ora....bi1.ons application    ONLINE    ONLINE    bsspbbi1
ora....bi1.vip application    ONLINE    ONLINE    bsspbbi1
ora....SM2.asm application    ONLINE    ONLINE    bsspbbi2
ora....I2.lsnr application    ONLINE    OFFLINE
ora....bi2.gsd application    ONLINE    ONLINE    bsspbbi2
ora....bi2.ons application    ONLINE    ONLINE    bsspbbi2
ora....bi2.vip application    ONLINE    ONLINE    bsspbbi1

I have to start the listener again manually by issuing the following commands on whichever node has the stopped listener:

lsnrctl start
srvctl start nodeapps -n bsspbbi1

After that, the status is back to ONLINE and we can again successfully connect to the RAC, through node 1 and/or node 2.

Is this related to the network? The 2 VMs share one physical LAN, meaning the public, private, and virtual IPs of both nodes all go over a single physical LAN.
Both VMs run on the same VMware host.

I hope to receive your feedback.

Thanks.

-Chris
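
The comparison between the two crs_stat -t listings above can be done mechanically. The following is a minimal Python sketch, assuming the output has been captured as text and that the columns are whitespace-separated as shown in the post. The function name `find_mismatched` is hypothetical and is not part of any Oracle tool.

```python
def find_mismatched(crs_stat_output):
    """Return (name, target, state) for resources whose State differs from Target."""
    mismatched = []
    for line in crs_stat_output.splitlines():
        fields = line.split()
        # Resource rows start with "ora." and carry at least
        # name, type, target and state; the host column may be empty.
        if len(fields) < 4 or not fields[0].startswith("ora."):
            continue
        name, _type, target, state = fields[:4]
        if target != state:
            mismatched.append((name, target, state))
    return mismatched


# Rows copied from the second listing in the post.
sample = """\
Name Type Target State Host
------------------------------------------------------------
ora....SM2.asm application ONLINE ONLINE bsspbbi2
ora....I2.lsnr application ONLINE OFFLINE
ora....bi2.vip application ONLINE ONLINE bsspbbi1
"""

for name, target, state in find_mismatched(sample):
    print(f"{name}: target={target} state={state}")
```

Run against the second listing, this reports only the node 2 listener resource. It does not notice that the node 2 VIP is now hosted on bsspbbi1, because the truncated resource names do not carry the full node name.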
  • 1. Re: Listener Suddenly Stops (Oracle DB 10g R2 RAC on VMWare)
    asahide Expert
    Hi,

    First, please post your listener log file.

    Regards,
  • 2. Re: Listener Suddenly Stops (Oracle DB 10g R2 RAC on VMWare)
    901518 Newbie
    Hello,

    Here's the listener.log:

    24-JAN-2013 13:31:13 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:31:13 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:31:47 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:32:13 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:32:13 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:33:05 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:33:13 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:33:13 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:33:56 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:34:13 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:34:13 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:34:52 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=bsspbbi1.etpi.com.ph)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)(SERVICE=LISTENER_BSSPBBI1)(VERSION=169870336)) * status * 0
    24-JAN-2013 13:35:16 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:35:16 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:35:17 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:36:13 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:36:13 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:36:14 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:37:16 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:37:16 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:37:35 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:38:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:38:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:38:47 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:39:19 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:39:19 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:39:44 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=BSSPBBI.etpi.com.ph)(CID=(PROGRAM=imp)(HOST=bsspbbi2.etpi.com.ph)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.216.36)(PORT=37743)) * establish * BSSPBBI.etpi.com.ph * 0
    24-JAN-2013 13:39:47 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:40:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:40:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:40:56 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:41:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:41:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:41:47 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:42:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:42:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:43:05 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:43:22 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:43:22 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:43:56 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:44:19 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:44:19 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:44:53 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=bsspbbi1.etpi.com.ph)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)(SERVICE=LISTENER_BSSPBBI1)(VERSION=169870336)) * status * 0
    24-JAN-2013 13:45:14 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:45:22 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:45:22 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:46:19 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:46:19 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:46:21 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:47:33 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:47:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:47:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:48:33 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:48:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:48:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:49:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:49:34 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:49:48 * service_update * BSSPBBI2 * 0
    24-JAN-2013 13:50:32 * node_down * bsspbbi1 * 0
    24-JAN-2013 13:50:32 * service_down * +ASM1 * 0
    24-JAN-2013 13:50:32 * service_down * BSSPBBI1 * 0
    24-JAN-2013 13:50:32 * service_down * BSSPBBI1 * 0
    24-JAN-2013 13:50:32 * node_down * bsspbbi1 * 0
    24-JAN-2013 13:50:32 * service_register * BSSPBBI1 * 0
    24-JAN-2013 13:50:32 * service_update * BSSPBBI1 * 0
    24-JAN-2013 13:50:32 * service_register * +ASM1 * 0
    No longer listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1)))
    No longer listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.216.45)(PORT=1521)))
    No longer listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.216.35)(PORT=1521)))
    Listener completed notification to CRS on stop
    24-JAN-2013 13:50:32 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=bsspbbi1.etpi.com.ph)(USER=oracle))(COMMAND=stop)(ARGUMENTS=64)(SERVICE=LISTENER_BSSPBBI1)(VERSION=169870336)) * stop * 0
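
Most of the log above is routine service_update traffic, so the few lines that matter are easy to miss. The following is a minimal Python sketch for pulling them out, assuming each event line starts with a timestamp and uses " * " as the field separator, as in the excerpt. The function name `shutdown_events` and the choice of event names to watch are assumptions made for illustration.

```python
import re

# Events taken from the excerpt that indicate something went away.
WATCHED = {"node_down", "service_down", "stop"}
TIMESTAMP = re.compile(r"^\d{2}-[A-Z]{3}-\d{4} \d{2}:\d{2}:\d{2}$")


def shutdown_events(log_text):
    """Return (timestamp, event) pairs for watched events, in log order."""
    events = []
    for raw in log_text.splitlines():
        fields = [field.strip() for field in raw.strip().split(" * ")]
        # Skip free-text lines such as "No longer listening on: ...".
        if len(fields) < 2 or not TIMESTAMP.match(fields[0]):
            continue
        # The event name stands alone as one field; the CONNECT_DATA
        # field is never equal to a bare event name.
        for field in fields[1:]:
            if field in WATCHED:
                events.append((fields[0], field))
                break
    return events


# Lines copied from the listener.log posted above.
sample = """\
24-JAN-2013 13:49:48 * service_update * BSSPBBI2 * 0
24-JAN-2013 13:50:32 * node_down * bsspbbi1 * 0
24-JAN-2013 13:50:32 * service_down * +ASM1 * 0
24-JAN-2013 13:50:32 * service_register * BSSPBBI1 * 0
No longer listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1)))
24-JAN-2013 13:50:32 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=bsspbbi1.etpi.com.ph)(USER=oracle))(COMMAND=stop)(ARGUMENTS=64)(SERVICE=LISTENER_BSSPBBI1)(VERSION=169870336)) * stop * 0
"""

for timestamp, event in shutdown_events(sample):
    print(timestamp, event)
```

The sketch only filters lines. It does not explain why the stop command was issued, which is what the clusterware logs are for.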
  • 3. Re: Listener Suddenly Stops (Oracle DB 10g R2 RAC on VMWare)
    asahide Expert
    Hi,
    24-JAN-2013 13:50:32 * node_down * bsspbbi1 * 0
    ..
    No longer listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1)))
    No longer listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.216.45)(PORT=1521)))
    No longer listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.216.35)(PORT=1521)))
    Listener completed notification to CRS on stop
    24-JAN-2013 13:50:32 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=bsspbbi1.etpi.com.ph)(USER=oracle))(COMMAND=stop)(ARGUMENTS=64)(SERVICE=LISTENER_BSSPBBI1)(VERSION=169870336)) * stop * 0
    It looks like the listener was stopped by CRS because of the "node_down" event for bsspbbi1.
    But node bsspbbi1 isn't actually down, right?

    Maybe the heartbeat was temporarily interrupted.
    (You should check the CRS log if you want to know the cause.)

    Regards,
  • 4. Re: Listener Suddenly Stops (Oracle DB 10g R2 RAC on VMWare)
    901518 Newbie
    Thank you very much asahide0,

    Yes, the node was never down.

    And yes, maybe the heartbeat was interrupted. I can see entries like this in the CRS log:

    2013-01-23 15:09:15.137
    [cssd(8469)]CRS-1610:node bsspbbi2 (2) at 90% heartbeat fatal, eviction in 0.150 seconds

    Given that, can I assume the cause is network traffic, since all the LANs use a single physical network card?
    Is there anything we can adjust in Oracle? Is there a timeout setting that can be configured? Or do we need to change the hardware?

    Thanks a lot.
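
To see how often these heartbeat warnings occur and which node they concern, the CRS log can be scanned for CRS-1610 entries. The following is a minimal Python sketch, assuming the two-line layout shown in the post: a timestamp line followed by the message line. The function name `heartbeat_warnings` is hypothetical.

```python
import re

TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}$")
CRS_1610 = re.compile(
    r"CRS-1610:node (\S+) \((\d+)\) at (\d+)% heartbeat fatal, "
    r"eviction in ([\d.]+) seconds"
)


def heartbeat_warnings(log_text):
    """Return (timestamp, node, percent, seconds_to_eviction) per CRS-1610 entry."""
    warnings = []
    last_timestamp = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if TIMESTAMP.match(line):
            # Remember the timestamp; the message follows on the next line.
            last_timestamp = line
            continue
        match = CRS_1610.search(line)
        if match:
            node, _node_number, percent, seconds = match.groups()
            warnings.append((last_timestamp, node, int(percent), float(seconds)))
    return warnings


# The entry quoted in the post.
sample = """\
2013-01-23 15:09:15.137
[cssd(8469)]CRS-1610:node bsspbbi2 (2) at 90% heartbeat fatal, eviction in 0.150 seconds
"""

for warning in heartbeat_warnings(sample):
    print(warning)
```

Comparing these timestamps with the node_down and stop entries in listener.log would show whether the listener stops line up with missed heartbeats.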
  • 5. Re: Listener Suddenly Stops (Oracle DB 10g R2 RAC on VMWare)
    asahide Expert
    Hi,
    901518 wrote: "With that, can I assume that the cause is the network traffic, since all LANs use 1 physical network card?"
    Maybe.

    901518 wrote: "Is there anything we can adjust in Oracle? Is there a timeout setting that can be configured? Or do we need to adjust the hardware already?"
    Is this site helpful for you?
    <<http://salaic-dbaoracle.blogspot.jp/2009/04/oracle-rac-q.html>>

    Regards,
  • 6. Re: Listener Suddenly Stops (Oracle DB 10g R2 RAC on VMWare)
    901518 Newbie
    Thanks! :)

    I will do what is suggested in your link and will update this thread if that solves my issue.

    Follow-up question:
    What will be the effect if we increase the misscount parameter? Will that greatly impact the performance?
    Thanks again.

    Edited by: 898515 on Jan 24, 2013 5:55 PM
  • 7. Re: Listener Suddenly Stops (Oracle DB 10g R2 RAC on VMWare)
    asahide Expert
    Hi,
    901518 wrote: "What will be the effect if we increase the misscount parameter? Will that greatly impact the performance?"
    ==
    Increasing misscount will prolong the time to take corrective action in the event of network failure or other anomalies effecting the availability of a node in the cluster. This directly effects cluster availability.
    ==

    Quoted from this note:
    <<https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=294430.1>>

    Regards,
  • 8. Re: Listener Suddenly Stops (Oracle DB 10g R2 RAC on VMWare)
    user583843 Newbie
    Hi,

    You need to add this line to the VM's configuration file (Red Hat Enterprise Linux 4.vmx) so that the disk does not hang:
    reslck.timeout="1200"

    Maybe your disk I/O is heavy and the node was rebooted because of that. Try the setting above.
  • 9. Re: Listener Suddenly Stops (Oracle DB 10g R2 RAC on VMWare)
    901518 Newbie
    user583843 wrote:
    Hi,

    You need to add this line to the VM's configuration file (Red Hat Enterprise Linux 4.vmx) so that the disk does not hang:
    reslck.timeout="1200"

    Maybe your disk I/O is heavy and the node was rebooted because of that. Try the setting above.

    Thanks for this. Can you explain it further? Are you saying that the network might not be the problem, and that it might be disk I/O instead?
    We have already added a network cable, so each VM is now assigned to 1 vSwitch that has 1 physical network cable connected, but we still encounter the same error.
    After we added the network cable, we successfully imported 2 user schemas and their data. But we are now experiencing the same issue again.

    Thanks.
