23 Replies Latest reply on Apr 1, 2010 6:53 PM by lrp1

    ORA-27102 SVR4 Error: 12: Not enough space

    lrp
      Our Oracle 11.1.0.7 database is running on Solaris 10 / Sunfire hardware. Unfortunately, we've seen the database crash several times, corrupting a datafile in the process. The key alert log messages are below (I've blanked out some info for security purposes):
      KCF: write/open error block=0xbc28 online=1
      file=6 /***************.dbf
      error=27063 txt: 'SVR4 Error: 12: Not enough space
      Additional information: -1
      Additional information: 8192'
      
      KCF: write/open error block=0x1571 online=1
      file=57 *************.dbf
      error=27063 txt: 'SVR4 Error: 12: Not enough space
      Additional information: -1
      Additional information: 8192'
      Automatic datafile offline due to write error on
      file 57: ***********.dbf
      I've already done a forum search for the error codes but couldn't find any other post that matched my exact situation/code. The key coincidence here is that the number 8192 matches a value in my ulimit (stack size). Below is what I would currently see as my oracle user:
      ulimit -a
      core file size (blocks, -c) unlimited
      data seg size (kbytes, -d) unlimited
      file size (blocks, -f) unlimited
      open files (-n) 256
      pipe size (512 bytes, -p) 10
      stack size (kbytes, -s) 8192  <-------------
      cpu time (seconds, -t) unlimited
      max user processes (-u) 29995
      virtual memory (kbytes, -v) unlimited
      Oracle's documentation doesn't give me much information on what ORA-27063 means. Does anybody else know how Oracle reports "Error: 12" codes coming from the OS, or what "Additional information: -1" and "Additional information: 8192" mean? I need this for troubleshooting so we can measure and adjust the correct resource, instead of blindly increasing stack size or other resource parameters on the OS.
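
The `ulimit -a` output above shows the limits of the shell that ran it, which are not necessarily the limits of an already running process. As a minimal sketch (not part of the original post), the same values can be read from inside a process with Python's standard `resource` module. The `LIMITS` selection and the `fmt` and `show_limits` helpers are illustrative names, and Python is used here only for illustration.

```python
import resource

# Limits chosen to mirror rows of the `ulimit -a` output above (illustrative selection).
LIMITS = {
    "stack size (bytes)": resource.RLIMIT_STACK,
    "open files": resource.RLIMIT_NOFILE,
    "data seg size (bytes)": resource.RLIMIT_DATA,
}


def fmt(value):
    """Render RLIM_INFINITY as 'unlimited', the way ulimit does."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def show_limits():
    """Return {label: (soft, hard)} for the current process."""
    return {
        label: tuple(fmt(v) for v in resource.getrlimit(which))
        for label, which in LIMITS.items()
    }


if __name__ == "__main__":
    for label, (soft, hard) in show_limits().items():
        print(f"{label:24s} soft={soft:>12s} hard={hard:>12s}")
```

Note that `getrlimit` reports the stack limit in bytes, while `ulimit -s` reports kbytes, so a stack limit of 8192 kbytes appears here as 8388608.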
        • 1. Re: ORA-27102 SVR4 Error: 12: Not enough space
          sb92075
          27063, 00000, "number of bytes read/written is incorrect"
          // *Cause:  the number of bytes read/written as returned by aiowait
          // does not match the original number, additional information
          // indicates both these numbers
          // *Action: check errno

          : 'SVR4 Error: 12: Not enough space
          Is the volume filling up & running out of free disk space?
          • 2. Re: ORA-27102 SVR4 Error: 12: Not enough space
            lrp
            Good thoughts, but we had ruled that out early: Filesystem's not the issue, the disk had plenty of space. Metalink itself (472813.1) points to "the Unix error number (ERRNO) 12 during a Unix write()/open() system call, and this Unix error indicates a lack of *process memory* rather than physical disk space."

            - /var/adm/messages has no memory- or disk-related messages around the time of failure.
            - SAN administrator saw nothing in their logs at the time of failure

            We had already tried raising the SGA and raising shared memory in the Solaris project, but it feels like we're fishing for an answer by blindly raising parameters when we don't know which OS limit Oracle reached. The key numbers I'm looking for are those in the 'additional information' section. Oracle's knowledge base has nothing that I can use so far.
            • 3. Re: ORA-27102 SVR4 Error: 12: Not enough space
              sb92075
              http://www.lmgtfy.com/?q=oracle+SVR4+Error:+12:+Not+enough+space
              • 4. Re: ORA-27102 SVR4 Error: 12: Not enough space
                lrp
                Thanks! We'd definitely been checking Google for information for the past couple of weeks before coming to the Oracle Forums. In fact, there are several blogs and [lazydba/experts-exchange links|http://www.lazydba.com/oracle/0__125336.html] on the subject which pointed us in the right direction and were the basis for us enlarging the shared memory kernel parameters to start.

                At this point, the question is more about how to interpret what Oracle reports, since there wasn't any publicly available information on the format of the error code.

                Much like how P1, P2, and P3 in v$session_wait mean different things depending on the wait event, I would guess that the "Additional information" tokens after the "Error: 12" code mean different things. That much is evident from my searches, where it appears that:
                Error code 12 => memory related things
                Error code 11 => resource is temporarily unavailable for whatever reason
                Error code 5 => disk/IO issue

                .. so drilling down further, Error code 12's two additional information items must mean something:
                -1 => some return code?
                8192 => the number at which it failed? Some bit-wise address?

                At no point does the stack get mentioned in our diagnostic text, which is why I'm asking the larger Oracle community.
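
The numbers after "SVR4 Error:" are Unix errno values (the Metalink note quoted earlier calls 12 "the Unix error number (ERRNO)"), so they can be looked up directly rather than inferred from search results. A minimal sketch using Python's standard `errno` and `os` modules follows. The `describe` helper is a hypothetical name, and both the symbolic names and the message text depend on the platform: Solaris words errno 12 as "Not enough space", while other systems word it differently.

```python
import errno
import os


def describe(code):
    """Return (symbolic name, OS message) for an errno value on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # The three codes listed in the post above.
    for code in (12, 11, 5):
        name, message = describe(code)
        print(f"Error code {code} => {name}: {message}")
```

Running this on the affected host shows how that OS itself names and describes each code.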
                • 5. Re: ORA-27102 SVR4 Error: 12: Not enough space
                  sb92075
                  Are file=6 & file=57 on same volume?

                  What is storage architecture containing Oracle's dbf files?
                  What flavor of file system supports Oracle's dbf files?
                  • 6. Re: ORA-27102 SVR4 Error: 12: Not enough space
                    lrp
                    To elaborate, the filesystem is UFS-based, all configured in RAID-6 [(striped disks with dual parity)|http://en.wikipedia.org/wiki/Redundant_array_of_independent_disks] (including the redo logs and archive logs). The storage underneath the filesystem is a Hitachi 9985v SAN, meaning the physical disks themselves were grouped into logical partitions and then divvied up into filesystems. Files #6 & 57 are on different filesystems, but all of them had ample space at the time.
                    • 7. Re: ORA-27102 SVR4 Error: 12: Not enough space
                      sb92075
                      Since errors are being logged into alertSID.log file, you know when this problem occurs.
                      When these errors occur, is it during HEAVY I/O activity?

                      I am not making any accusation, just making idle observation.
                      I have never used (or seen) UFS under Oracle.
                      An ever so slight possibility is a file system bug is inflicting the damage.

                      From my experience, the root cause is outside Oracle at the OS or a similar layer.
                      Oracle is just too dumb to lie about errors & it is detecting a SNAFU in the underlying system.

                      Good Luck with your Gremlin hunt.
                      • 8. Re: ORA-27102 SVR4 Error: 12: Not enough space
                        lrp
                        I am entirely with you on this.
                        -- if it was exclusively an oracle memory error, we would have seen an ORA-4031 indicating exactly which pool was compromised.
                        -- if it was a dbwr error, we would have seen a more descriptive alert saying 'disk full' or 'crc did not match'
                        -- etc.

                        The only IO activity we did catch was the fact that RMAN archive log backups were running at the time. Our I/O usage charts for those filesystems (and solaris sar/vmstat/iostat counts) did not spike during those times.

                        Regarding the filesystem setup, our vote was to use ZFS, but this was a decision made above our heads. Unfortunately, without a real error to show the SAN administrator, we have no effective evidence to support the claim. Getting proper diagnostic information was the point of this forum post. Notably, Metalink's own article +(22080.1 - An Introduction to Error Message Articles)+ pretty much admits we need to look further than just the error codes:

                        Please note that the steps included under these additional headings
                        is generally very terse. Information may be incomplete and may use
                        abbreviations which are not self explanatory. It is also possible that
                        there may be references to articles which are not visible to customers.
                        These additional notes are intended to help give pointers and are NOT
                        intended to be a complete explanation of the error.

                        More complete descriptions may be included in separate problem/solution or
                        bulletin documents.
                        • 9. Re: ORA-27102 SVR4 Error: 12: Not enough space
                          orafad
                          ORA-27063 mentions "Cause: the number of bytes read/written as returned by aiowait does not match the original number, additional information indicates both these numbers"
                          So the OS returned -1 ("error state") while Oracle expected 8192 bytes (block size, probably).

                          About the ENOMEM (error code 12) - it might be a good idea to check shared mem settings and system memory overall.
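
Following that reading of ORA-27063, the two "Additional information" values can be pulled out of an alert log entry and labeled. This is a sketch only: the regular expression is fitted to the excerpt quoted in this thread, `parse_kcf_error` is a hypothetical name, and treating the first value as the byte count returned and the second as the byte count requested is the interpretation suggested above, not a documented format.

```python
import re

# Pattern fitted to the alert log excerpt quoted earlier in this thread.
KCF_ERROR = re.compile(
    r"error=(?P<ora>\d+) txt: 'SVR4 Error: (?P<errno>\d+): (?P<text>[^\n]*)\n"
    r"\s*Additional information: (?P<first>-?\d+)\n"
    r"\s*Additional information: (?P<second>-?\d+)'"
)


def parse_kcf_error(entry):
    """Return the fields of one KCF write/open error, or None if no match."""
    match = KCF_ERROR.search(entry)
    if match is None:
        return None
    return {
        "ora_error": int(match.group("ora")),
        "os_errno": int(match.group("errno")),
        "os_text": match.group("text"),
        # Assumed meaning, per the ORA-27063 cause text discussed above.
        "bytes_returned": int(match.group("first")),
        "bytes_requested": int(match.group("second")),
    }


SAMPLE = (
    "KCF: write/open error block=0xbc28 online=1\n"
    "error=27063 txt: 'SVR4 Error: 12: Not enough space\n"
    "Additional information: -1\n"
    "Additional information: 8192'\n"
)

if __name__ == "__main__":
    print(parse_kcf_error(SAMPLE))
```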
                          • 10. Re: ORA-27102 SVR4 Error: 12: Not enough space
                            sb92075
                            When these errors occur, is it during HEAVY I/O activity?
                            or just randomly across 24 hours?
                            • 11. Re: ORA-27102 SVR4 Error: 12: Not enough space
                              lrp
                              To SB: This symptom has shown up across our dev and production databases every 2 months or so. Judging from the alert log, it has happened close to times when an archive/backup coincides with a recompile, auto-stats gather, or snapshot. So "load" would appear to cause it, but nothing I would consider heavy I/O activity.

                              orafad: The Metalink note for ORA-27063 does mention this, and I take it to be a rather generic description. The real culprit has to be the codes behind the 27063, which is what I'm trying to get to the bottom of:
                              SVR4 = a header error indicating some OS 'thing'
                              Error 12 = ???? Solaris memory error. In Google searches this appears to show up for most 'capacity-related' things: file descriptors, semaphores, swap, and shared memory.
                              Additional information = -1 error state, you're most likely correct.
                              Additional information: 8192 = could mean anything, and that's what I need to find out from either Metalink support (in progress) or Solaris's knowledge base.

                              Update: I apologize, orafad -- you mentioned ENOMEM, which I missed. Where did you find this? I'd love to have an additional resource to look up the error code. Our memory_target/memory_max_target were set at 5GB/8GB, with shared memory set at 20GB and overall physical memory at 32GB. Our sysadmin logs showed no memory usage or swap errors, leading me to believe it was not a general 'out of memory' error so much as a kernel resource setting (semaphores, or a per-process limit of some sort).

                              Edited by: lrp on Jun 1, 2009 5:50 PM

                              Edited by: lrp on Jun 1, 2009 5:53 PM
                              • 12. Re: ORA-27102 SVR4 Error: 12: Not enough space
                                sb92075
                                This symptom has showed up across our dev
                                If I were in your shoes, I'd do what I could to change the underlying file system.
                                If problem still happens on different fs, then file system flavor can be ruled out as possible root cause.

                                Again, I am fairly certain Oracle is the victim & not the culprit.
                                Proving who or what is to blame will be a battle.

                                Happy Hunting!
                                • 13. Re: ORA-27102 SVR4 Error: 12: Not enough space
                                  lrp
                                  Because this happens so infrequently, I need a way to measure what is happening (be it stack or file descriptors). I've already got scripts tallying those resources per process on a 5-minute interval, so I'm hoping to be prepared for the next occurrence (could be next week, could be a month from now). Unfortunately, it's going to be rather hard to make a case for moving to another filesystem, since we would have no real way of identifying whether the experiment was successful.
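
A per-process tally of this kind might look like the sketch below. It is not the poster's script: it assumes a `/proc/<pid>/fd` directory is available (Solaris and Linux both provide one) and that the caller has permission to read it, and `count_open_fds` and `sample` are hypothetical names. The 300-second sleep mirrors the 5-minute interval mentioned above.

```python
import os
import time


def count_open_fds(pid):
    """Count entries in /proc/<pid>/fd, or return None if it cannot be read."""
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return None


def sample(pids):
    """Return one timestamped reading: {pid: open fd count or None}."""
    return {
        "time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "fds": {pid: count_open_fds(pid) for pid in pids},
    }


if __name__ == "__main__":
    pids = [os.getpid()]  # placeholder; real use would list the Oracle process IDs
    while True:
        print(sample(pids))
        time.sleep(300)  # 5-minute interval
```

Comparing each count against the `open files` limit of 256 from the `ulimit -a` output would show how close a process gets to that limit before a failure.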

                                  Thanks for your time in this.
                                  • 14. Re: ORA-27102 SVR4 Error: 12: Not enough space
                                    sb92075
                                    Are there any additional clues in OS messages or dmesg logs?