32 Replies Latest reply: Oct 30, 2012 12:17 PM by 971499

    ACFS volume shows up as disabled after reboot or crs stop / start.

    user9308584
      Hello,

      I'm running on a two-node RAC cluster OEL6 using Enterprise Edition 11.2.0.3. All my PSUs are up to date.

      I've created the ACFS filesystem following the steps carefully (or so I thought) only to have it vanish after a reboot. Support had me reinstall the ACFS libraries and that brought me up to the point where the underlying volume was in a DISABLED state. After enabling and mounting the filesystem, all was good. I then tried stopping the cluster stack on one node with 'crsctl stop crs' and starting it again. The volume came up as DISABLED.

      ASMCMD> volinfo -a
      Diskgroup Name: FS_ORCL_CFS0195

      Volume Name: INSRACD_VG
      Volume Device: /dev/asm/insracd_vg-107
      State: DISABLED
      Size (MB): 5120000
      Resize Unit (MB): 32
      Redundancy: UNPROT
      Stripe Columns: 4
      Stripe Width (K): 128
      Usage: ACFS
      Mountpath: /insracd

      oracle@lmmk87:/fisc/oracle> /sbin/acfsutil registry
      Mount Object:
      Device: /dev/asm/insracd_vg-107
      Mount Point: /insracd
      Disk Group: FS_ORCL_CFS0195
      Volume: INSRACD_VG
      Options: none
      Nodes: lmmk87,lmmk88
      oracle@lmmk87:/fisc/oracle>

      This required an umount of the filesystem, an ENABLE from asmcmd and then a mount to get it back.
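For reference, that manual sequence (umount, enable the volume, mount) can be sketched as a shell function. This is only a sketch under assumptions: `recover_acfs`, `run` and the `DRY_RUN` switch are made-up names for illustration, the arguments are the values from the `volinfo` output above, and it assumes `asmcmd` is on the PATH with the ASM environment set and that the caller has the privileges needed for `umount` and `mount`.

```shell
#!/bin/sh
# Sketch: re-enable a DISABLED ADVM volume and remount its ACFS file system.
# recover_acfs, run and DRY_RUN are hypothetical names, not Oracle tooling.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: recover_acfs <diskgroup> <volume> <device> <mountpoint>
recover_acfs() {
    dg=$1 vol=$2 dev=$3 mnt=$4
    run umount "$mnt" || return 1                       # drop the stale mount
    run asmcmd volenable -G "$dg" "$vol" || return 1    # DISABLED -> ENABLED
    run mount -t acfs "$dev" "$mnt"                     # remount the file system
}
```

Setting `DRY_RUN=1` and calling `recover_acfs FS_ORCL_CFS0195 INSRACD_VG /dev/asm/insracd_vg-107 /insracd` only prints the three commands, which is a safe way to review them first.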

      How can I automate this so that the volumes return as enabled and the filesystem exists without intervention? I thought the fact that I registered them ...

      /sbin/acfsutil registry -a -f -n lmmk87,lmmk88 /dev/asm/insracd_vg-107 /insracd

      What would take care of this situation? I looked into using the srvctl add filesystem, but that seems more geared toward when using ACFS for a shared ORACLE_HOME and creating a CRS dependency -- we're not doing that. I'm appreciative of this forum and I hope I targeted this question in the correct place. I'll try to add some ASM log information on this.

      Thanks,

      Malcolm.
        • 1. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
          Levi Pereira
          Hi,
          Please post the relevant parts of the ASM alert.log from startup here.

          Please format the output by placing the formatting tag at the beginning and end of it.
          • 2. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
            user9308584
            I may have found something else. In the ASM logs I'm seeing this ...

            ORA-15032: not all alterations performed
            ORA-15027: active use of diskgroup "FS_ORCL_CFS0195" precludes its dismount

            SQL> ALTER DISKGROUP FS_ORCL_CFS0195 DISMOUNT FORCE /* asm agent *//* {1:15916:2754} */
            Thu Sep 20 11:31:59 2012
            NOTE: client +ASM1:asmvol deregistered
            NOTE: cache dismounting (not clean) group 4/0x58836407 (FS_ORCL_CFS0195)
            NOTE: messaging CKPT to quiesce pins Unix process pid: 5877, image: oracle@lmmk87 (TNS V1-V3)
            Thu Sep 20 11:32:01 2012
            NOTE: halting all I/Os to diskgroup 4 (FS_ORCL_CFS0195)
            Thu Sep 20 11:32:01 2012
            NOTE: LGWR doing non-clean dismount of group 4 (FS_ORCL_CFS0195)
            NOTE: LGWR sync ABA=8.74 last written ABA 8.74
            Thu Sep 20 11:32:01 2012


            I had to shut down the cluster using crsctl stop crs -f, but I later noticed that to dismount the ACFS filesystem afterwards, I had to run fuser -c /insracd to find and kill the users still on the FS. I'm not sure whether making sure that ACFS is free and clear of all users prior to stopping the cluster is a normal manual step to perform.
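A non-destructive version of that check, run before stopping the stack, might look like the sketch below. The function names and the `FUSER` override are hypothetical; it assumes the Linux (psmisc) `fuser`, which prints the PIDs on stdout and everything else on stderr.

```shell
#!/bin/sh
# Sketch: report which PIDs still hold a file system open.
# fs_pids, fs_is_busy and FUSER are hypothetical names for illustration.

# usage: fs_pids <mountpoint>   -> prints one PID per line
fs_pids() {
    # keep only the numeric tokens that fuser writes to stdout
    "${FUSER:-fuser}" -c "$1" 2>/dev/null | tr -s ' \t' '\n\n' | grep -E '^[0-9]+$'
}

# usage: fs_is_busy <mountpoint>   -> exit status 0 if any process uses it
fs_is_busy() {
    [ -n "$(fs_pids "$1")" ]
}
```

For example, `fs_is_busy /insracd && echo "still in use"` could be run before `crsctl stop crs` to decide whether the filesystem can be unmounted cleanly.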

            Malcolm.

            • 3. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
              Levi Pereira
              Oracle ACFS File Systems Must be Manually Dismounted Prior to Upgrade, Deinstallation, or Direct Shutdown of Oracle Clusterware or Oracle ASM

              http://docs.oracle.com/cd/E11882_01/readmes.112/e22488/toc.htm#sthref382
              • 4. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
                user9308584
                Hi,

                I have not been doing that. I will test it out. Thank you. I guess there's no way that this could be automated in a shutdown init sequence. During a shutdown of a server, users are logged off, but the order of the FS dismount would have to coincide. I can't imagine unix admins are handed instructions to manually kick off the users and dismount acfs prior to clusterware coming down. We do provide orderly steps, but I never heard of this with the filesystem. It must be automated somehow.

                Thanks again,

                Malcolm.
                • 5. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
                  Levi Pereira
                  I'm very unhappy that this task has to be performed manually. When we need to reboot a server, crsctl should handle all of these tasks automatically, as the OS does.

                  But we can automate this by creating a script.

                  If you start writing this script, I can help you finish it. We can create a resource on Clusterware to do it before ACFS tries to dismount the filesystem.

                  So, what we need to do is check for and clear all processes running on that filesystem. The script will handle this and return 0 to CRS, after which CRS can make a clean shutdown.
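A rough skeleton of such a script is sketched below. Clusterware calls an action script with a single argument (start, stop, check or clean) and treats exit status 0 as success. Everything else here is an assumption for illustration: the names `acfs_guard` and `release_fs`, the marker file, and the idea that the resource exists only so that its stop action runs the cleanup.

```shell
#!/bin/sh
# Sketch of a Clusterware action script body.
# acfs_guard, release_fs and STATE_FILE are hypothetical names.

STATE_FILE=${STATE_FILE:-/tmp/acfs_guard.online}   # assumed marker file

# Placeholder: replace with the real "free the file system" logic.
release_fs() {
    return 0
}

# usage: acfs_guard start|stop|check|clean
acfs_guard() {
    case "$1" in
        start)
            touch "$STATE_FILE"            # nothing to start, just mark ONLINE
            ;;
        stop|clean)
            release_fs || return 1         # clear users before ACFS goes down
            rm -f "$STATE_FILE"
            ;;
        check)
            [ -f "$STATE_FILE" ]           # 0 = ONLINE, non-zero = OFFLINE
            ;;
        *)
            echo "usage: start|stop|check|clean" >&2
            return 2
            ;;
    esac
}
```

In a real action script the last line would call `acfs_guard "$1"` and exit with its status. Registering it would go through `crsctl add resource` with an `ACTION_SCRIPT` attribute. The dependency settings needed to order it ahead of the ACFS dismount are not shown, because nothing in this thread establishes them.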
                  • 6. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
                    user9308584
                    Exactly. It seems ludicrous that we're left to manage this, but I'm not finding anything cut and dried. A check using fuser -c /filesystem can tell which users need to be killed before the dismount. "We can create a resource on Clusterware to do it before ACFS tries to dismount the filesystem." -- this is the part I need to research. I've never done this; some of these things are handled by the shutdown sequences, but if it can be added to CRS, I'm all for it.
                    • 7. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
                      user9308584
                      Is this it ...

                      As an extension of CRS, ASCRS must be installed within each CRS home on each node of the cluster and configured separately before it can be used.

                      With the ASCRS command line tool ascrsctl, you can give ASCRS control of various middleware components. Once a component is controlled by CRS, its runtime state is closely monitored and CRS takes the proper actions if the component fails. With ascrsctl, you can create a CRS resource, and once a resource is created, you can perform start, stop, update, switch, status, and delete operations on it.
                      • 8. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
                        Levi Pereira
                        ACFS Technical Overview and Deployment Guide

                        Un-mounting filesystems involves a typical OS umount command. Before unmounting the filesystem, ensure that it is not in use.
                        This may involve stopping dependent databases, jobs , etc. In-use filesystems cannot be unmounted. There are various methods to show open file references of a filesystem.

                        * Linux/Unix lsof command - Shows open file descriptors for the filesystem.
                        * ASMCMD lsof - Lists open files in an ASM instance by database or disk group.
                        * Unix/Linux fuser command - Displays the PIDs of processes using the specified filesystems.

                        Any users or processes listed should be logged off or killed (kill -9).

                        http://www.oracle.com/technetwork/products/cloud-storage/acfs-technical-overview-514457.pdf
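The "log off or kill, then unmount" step from that excerpt could be sketched as follows. This is a sketch only, not Oracle's procedure. It assumes the Linux (psmisc) `fuser`, whose `-k` option signals every process using the filesystem and which exits with status 0 when it found at least one. `free_and_umount`, `FUSER`, `UMOUNT` and `GRACE` are hypothetical names. It asks politely with TERM first and uses KILL only for whatever is left.

```shell
#!/bin/sh
# Sketch: signal processes on a file system, then unmount it.
# free_and_umount, FUSER, UMOUNT and GRACE are hypothetical names.
# Caution: fuser -k signals EVERY process using the file system,
# including the calling shell if its working directory is on it.

# usage: free_and_umount <mountpoint>
free_and_umount() {
    mnt=$1
    # exit status 0 means at least one process was found and sent TERM
    if "${FUSER:-fuser}" -k -TERM -c "$mnt" >/dev/null 2>&1; then
        sleep "${GRACE:-10}"                                     # time to exit
        "${FUSER:-fuser}" -k -KILL -c "$mnt" >/dev/null 2>&1     # force the rest
    fi
    "${UMOUNT:-umount}" "$mnt"
}
```

The function returns the exit status of `umount`, so a caller can tell whether the filesystem was actually released.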


                        If you create this script and it works, just post the script here and I will help you with "Create a resource on Clusterware".
                        • 9. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
                          Allan-Oracle
                          The ACFS registry should handle enabling the volume, mounting the file system, and dismounting the volume during shutdown. If there are active processes on the file system, it will try and kill them. Some processes, such as NFS exports, cannot be identified or do not respond to the kill request. These processes can prevent the unmount of the ACFS file system during stack shutdown.

                          When the stack is started up again, the file system will show as OFFLINE in the 'acfsutil info fs' output. These file systems will need to be unmounted and remounted. The ACFS registry resource will attempt to automate this operation for you, but certain conditions, such as processes with open files on the file system, can prevent this.

                          If the ACFS registry resource is not mounting the file system automatically on stack start, the first place to look is the CRS alert log - look for messages with 'ACFS-XXXX' as the message id. These messages should show the following information:
                          1) mounting the file system
                          2) unmounting the file system
                          3) processes using the file system
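Pulling those messages out of the alert log can be done with a plain `grep`. The helper below is a small sketch. The name `acfs_messages` is made up, and the log path is passed in rather than hard-coded, since the location depends on the Grid home (on 11.2 it is usually assumed to be under `$GRID_HOME/log/<hostname>/`).

```shell
#!/bin/sh
# Sketch: show the most recent ACFS-nnnn messages from a Clusterware alert log.
# acfs_messages is a hypothetical helper name.

# usage: acfs_messages <alert-log-file> [max-lines]
acfs_messages() {
    grep -E 'ACFS-[0-9]+' "$1" | tail -n "${2:-50}"
}
```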

                          Another thing to check is that your ACFS registry resource is online. You can do that by taking a look at 'crsctl stat res ora.registry.acfs -t'. It should be ONLINE on all nodes.
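That status check can be scripted by piping the `crsctl` output through a small filter. The sketch below assumes only that the tabular output prints a state keyword such as ONLINE or OFFLINE for each node. `registry_all_online` is a hypothetical name, and the keyword matching is deliberately crude.

```shell
#!/bin/sh
# Sketch: succeed only if the resource status read from stdin shows ONLINE
# and no OFFLINE, INTERMEDIATE or UNKNOWN entries.
# registry_all_online is a hypothetical helper name.

# usage: crsctl stat res ora.registry.acfs -t | registry_all_online
registry_all_online() {
    out=$(cat)
    # at least one ONLINE entry must be present
    printf '%s\n' "$out" | grep -q 'ONLINE' || return 1
    # any unhealthy state on any node fails the check
    if printf '%s\n' "$out" | grep -Eq 'OFFLINE|INTERMEDIATE|UNKNOWN'; then
        return 1
    fi
    return 0
}
```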
                          • 10. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
                            Levi Pereira
                            Allan,

                            Oracle ACFS File Systems Must be Manually Dismounted Prior to Direct Shutdown of Oracle Clusterware or Oracle ASM.

                            That means CRS does not handle dismounting the ACFS volume during shutdown.

                            • 11. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
                              user9308584
                              Hi Allan,

                              "Another thing to check is that your ACFS registry resource is online. You can do that by taking a look at 'crsctl stat res ora.registry.acfs -t'. It should be ONLINE on all nodes."

                              Would this imply that CRS is managing the filesystem as if it had been added using 'srvctl add filesystem'? That's not the case with mine. If doing this would resolve the issue, I'd give it a try.

                              Malcolm.
                              • 12. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
                                Allan-Oracle
                                This statement is incorrect.

                                What it means is that CRS will try and unmount the volume. If it cannot, it will try and kill processes.

                                Best practice however is to manually do this. This is consistent with shutting down the system.
                                • 13. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
                                  Allan-Oracle
                                  Malcolm -
                                  If you add a file system to the registry, the ora.registry.acfs resource manages the file system.

                                  This is different than the single file system resource.

                                  They both achieve the same end - enable the volume, mount the file system, unmount the volume and disable the volume.

                                  You can use a single file system resource for general purpose file systems.

                                  If your ora.registry.acfs resource is not online, then it is not managing the file systems - just like any other CRS resource or the single file system resource.
                                  • 14. Re: ACFS volume shows up as disabled after reboot or crs stop / start.
                                    Levi Pereira
                                    Allan wrote:
                                    This statement is incorrect.

                                    What it means is that CRS will try and unmount the volume. If it cannot, it will try and kill processes.

                                    Best practice however is to manually do this. This is consistent with shutting down the system.

                                    Please read the link:

                                    2.2.16 Oracle ACFS File Systems Must be Manually Dismounted Prior to Upgrade, Deinstallation, or Direct Shutdown of Oracle Clusterware or Oracle ASM
                                    http://docs.oracle.com/cd/E11882_01/readmes.112/e22488/toc.htm#sthref382

                                    The word MUST means MANDATORY, therefore it's not a best practice, and Oracle doesn't try to kill any processes that have files open on ACFS. It only tries to unmount; if it cannot do that, it will hang or force a shutdown of ASM.

                                    The error below was posted in this thread. This is what happens when ACFS has files held open by other processes during CRS shutdown:
                                    I may have found something else. In the ASM logs I'm seeing this ...
                                    
                                    ORA-15032: not all alterations performed
                                    ORA-15027: active use of diskgroup "FS_ORCL_CFS0195" precludes its dismount
                                    
                                    SQL> ALTER DISKGROUP FS_ORCL_CFS0195 DISMOUNT FORCE /* asm agent *//* {1:15916:2754} */
                                    Thu Sep 20 11:31:59 2012
                                    NOTE: client +ASM1:asmvol deregistered
                                    NOTE: cache dismounting (not clean) group 4/0x58836407 (FS_ORCL_CFS0195)

                                    It is highly recommended to un-mount any ACFS filesystems first before ASM instance is shutdown. A forced shutdown or failure of ASM instance with a mounted ACFS filesystem will result in I/O failures and dangling file handles; i.e., the ACFS filesystem user data and metadata that was written at the time of the termination may not be flushed to storage before ASM storage is fenced off. Thus a forced shutdown of ASM will result in the ACFS filesystem with an offline error state. In the event that a file system enters into an offline error state, the ACFS Mount Registry action routines attempt to recover the file system and return it to an on-line state by un-mounting and re-mounting the filesystem.

                                    So, if you don't ensure that your ACFS filesystem is dismounted cleanly, you risk corrupting it.


                                    http://www.oracle.com/technetwork/products/cloud-storage/acfs-technical-overview-514457.pdf

                                    Edited by: Levi Pereira on Sep 22, 2012 1:57 PM