
    Reboot issue with OCFS2 partition on OEL5U2

    plabrevo-Oracle
      I have configured a virtual disk that is shared between multiple virtual machines, all running OEL5 U2.

      There is apparently no issue with OCFS2 itself: the stack starts and the file system runs fine.

      However, if I list this partition in /etc/fstab, I run into problems during reboot. What I see on the VM console is:

      fsck.ocfs2: invalid option -- a
      Usage: fsck.ocfs2 [ -fGnuvVy ] [ -b superblock block ]
             [ -B block size ] [ -r num ] device

      Critical flags for emergency repair:
      . . .

      Less critical flags:
      . . .

      *** An error occurred during the file system check.
      *** Dropping you into a shell; the system will reboot
      *** when you leave the shell.
      Give root password for maintenance
      (or type Control-D to continue):

      I have tried the _netdev option in /etc/fstab, but without much success.

      The weird thing is that this occurs even when there has been no activity on the file system, even when it contains no files at all. The only workaround I have found is to mount the partition from /etc/rc.local instead of /etc/fstab.
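
      For reference, the rc.local workaround is just a plain mount command; the device and mount point below are the same ones used in the fstab entry further down:

           # appended to /etc/rc.local, which runs after all init scripts
           # (so the network and the o2cb cluster stack are already up)
           mount -t ocfs2 /dev/hdb1 /shared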

      Is this a bug in OEL5, in OCFS2, or a misunderstanding on my part?

      Thanks

      Pascal

      Edited by: plabrevo on Oct 3, 2008 1:06 AM
        • 1. Re: Reboot issue with OCFS2 partition on OEL5U2
          Tommyreynolds-Oracle
          Please post your current:

           1. /etc/fstab
           2. /etc/ocfs2/cluster.conf
           3. /etc/sysconfig/o2cb

          Thanks!
          • 2. Re: Reboot issue with OCFS2 partition on OEL5U2
            plabrevo-Oracle
            The content of these files is as follows:

            /etc/fstab
            ...
            /dev/hdb1 /shared ocfs2 _netdev,defaults 0 3


            /etc/ocfs2/cluster.conf
            node:
                ip_port = 7777
                ip_address = 192.168.1.171
                number = 0
                name = vis1204a-dbserver1
                cluster = ocfs2

            node:
                ip_port = 7777
                ip_address = 192.168.1.173
                number = 1
                name = vis1204a-dbserver2
                cluster = ocfs2

            node:
                ip_port = 7777
                ip_address = 192.168.1.175
                number = 2
                name = vis1204a-dbserver3
                cluster = ocfs2

            node:
                ip_port = 7777
                ip_address = 192.168.1.177
                number = 3
                name = vis1204a-appserver1
                cluster = ocfs2

            node:
                ip_port = 7777
                ip_address = 192.168.1.178
                number = 4
                name = vis1204a-appserver2
                cluster = ocfs2

            node:
                ip_port = 7777
                ip_address = 192.168.1.179
                number = 5
                name = vis1204a-appserver3
                cluster = ocfs2

            node:
                ip_port = 7777
                ip_address = 192.168.1.181
                number = 6
                name = vis1204b-dbserver1
                cluster = ocfs2

            node:
                ip_port = 7777
                ip_address = 192.168.1.183
                number = 7
                name = vis1204b-dbserver2
                cluster = ocfs2

            node:
                ip_port = 7777
                ip_address = 192.168.1.185
                number = 8
                name = vis1204b-dbserver3
                cluster = ocfs2

            node:
                ip_port = 7777
                ip_address = 192.168.1.187
                number = 9
                name = vis1204b-appserver1
                cluster = ocfs2

            node:
                ip_port = 7777
                ip_address = 192.168.1.188
                number = 10
                name = vis1204b-appserver2
                cluster = ocfs2

            node:
                ip_port = 7777
                ip_address = 192.168.1.189
                number = 11
                name = vis1204b-appserver3
                cluster = ocfs2

            node:
                ip_port = 7777
                ip_address = 192.168.1.191
                number = 12
                name = vis1204c-single
                cluster = ocfs2

            cluster:
                node_count = 13
                name = ocfs2


            /etc/sysconfig/o2cb
            #
            # This is a configuration file for automatic startup of the O2CB
            # driver. It is generated by running /etc/init.d/o2cb configure.
            # On Debian based systems the preferred method is running
            # 'dpkg-reconfigure ocfs2-tools'.
            #

            # O2CB_ENABLED: 'true' means to load the driver on boot.
            O2CB_ENABLED=true

            # O2CB_STACK: The name of the cluster stack backing O2CB.
            O2CB_STACK=o2cb

            # O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
            O2CB_BOOTCLUSTER=ocfs2

            # O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
            O2CB_HEARTBEAT_THRESHOLD=

            # O2CB_IDLE_TIMEOUT_MS: Time in ms before a network connection is considered dead.
            O2CB_IDLE_TIMEOUT_MS=

            # O2CB_KEEPALIVE_DELAY_MS: Max time in ms before a keepalive packet is sent
            O2CB_KEEPALIVE_DELAY_MS=

            # O2CB_RECONNECT_DELAY_MS: Min time in ms between connection attempts
            O2CB_RECONNECT_DELAY_MS=
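
            For completeness, the stack can be confirmed as enabled for boot with the standard EL5 commands:

                 chkconfig --list | grep -E 'o2cb|ocfs2'   # both services should be "on" for runlevels 3 and 5
                 service o2cb status                       # reports whether the o2cb cluster is online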
            • 3. Re: Reboot issue with OCFS2 partition on OEL5U2
              Tommyreynolds-Oracle
              Change your "/etc/fstab" like this:

                   /dev/hdb1 /shared ocfs2 _netdev 0 3

              because "defaults" is a placeholder, not a set of options. This used to be spelled "-", but that was really hard to read.
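
              A quick way to test the edited entry without rebooting, assuming the volume is currently unmounted, is to let mount read it straight from fstab:

                   mount /shared            # picks up the device, type, and options from /etc/fstab
                   mount | grep /shared     # confirm it mounted with the expected options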

              Does this help?
              • 4. Re: Reboot issue with OCFS2 partition on OEL5U2
                plabrevo-Oracle
                Nope, same issue with _netdev alone.

                In both cases, _netdev is properly interpreted, since the failure occurs after the network services are started.
                • 5. Re: Reboot issue with OCFS2 partition on OEL5U2
                  506787
                  Could you check whether there are any messages in /var/log/messages related to this problem? Perhaps you updated your system and kernel and forgot to install the ocfs2 kernel module for that specific kernel version? The messages file should reveal it if so. In other words: just a hunch.
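
                  A few standard EL5 commands that should confirm or rule that out:

                       uname -r                          # running kernel version
                       rpm -qa | grep -i ocfs2           # installed ocfs2 packages, including the kernel-module rpm
                       modinfo ocfs2                     # fails if no ocfs2 module matches the running kernel
                       grep -i ocfs2 /var/log/messages   # any ocfs2-related log entries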
                  • 6. Re: Reboot issue with OCFS2 partition on OEL5U2
                    user541629

                     I had the same problem, and I fixed mine by changing the following line:

                     /dev/hdb1 /shared ocfs2 _netdev 0 3

                     to:

                     /dev/hdb1 /shared ocfs2 _netdev 0 0

                     Changing it to "0 0" worked for me.
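
                     For context, the last two fstab fields are the dump flag and the fsck pass number; a non-zero sixth field tells the boot scripts to fsck the device, and 0 disables that check:

                          # device    mount point  type   options  dump  fsck pass
                          /dev/hdb1   /shared      ocfs2  _netdev  0     0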

                    • 7. Re: Reboot issue with OCFS2 partition on OEL5U2
                      Dude!

                      Checking a cluster file system during reboot is a bad idea, because it might already be mounted on another node. Running fsck on any mounted file system can result in data loss.

                      A similar problem also exists with GFS: https://bugzilla.redhat.com/show_bug.cgi?id=732921

                      /dev/hdb1 /shared ocfs2 _netdev 0 3

                      should instead read:

                      /dev/hdb1 /shared ocfs2 _netdev 0 0
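
                      If a manual check is ever needed, a safer sequence with the standard ocfs2-tools commands is to first verify that no node in the cluster has the volume mounted:

                           mounted.ocfs2 -f /dev/hdb1   # full detect: lists cluster nodes that have it mounted
                           fsck.ocfs2 -fy /dev/hdb1     # only run this once no node has it mounted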