9 Replies Latest reply: Feb 14, 2014 2:54 PM by cooldog

    Showstopper bug in latest UEKR3 kernel - can't create LVM snapshot of ext4 fs

    cooldog

      cross-posted to gain a wider audience, because this is urgent

      -----------------------------------------------------------------------------------------------

      When my backup system attempts to create an LVM snapshot of an ext4 fs, it fails, and the log is full of kernel warnings.  Some of my servers actually reboot themselves.  I've had to turn off backups for the moment.

       

      It seems that this is a known problem, and a patch for it already exists.  See: Re: warning in ext4_journal_start_sb on filesystem freeze (Linux Ext4)

       

      I gave this info to a receptionist at Oracle on 12/30, and she seemed to understand that it needed to get to the Oracle Linux product group, but so far no one has contacted me for details, or to let me know that they're working on this.

       

      It seems to me that this is a 100% critical bug, that *ought* to have gotten an immediate reaction.  Does anyone here know how to get Oracle's attention?

        • 1. Re: Showstopper bug in latest UEKR3 kernel - can't create LVM snapshot of ext4 fs
          rukbat

          1009163 wrote:

          cross-posted to gain a wider audience, because this is urgent

          Moderator Comment and Action:

          There is no "urgent" in the forums.   These are user-to-user discussion spaces and NOT a way to get to company technical support.   You will get a response if someone has knowledge of the topic and then chooses to be a registered participant in the forum web site, then decides to read your post and respond.

           

          The usage of the term "urgent" implies that your issue is critical -- so critical that everyone else must stop what they are doing and help you.   No one is paid to do that.   Everyone else on these forums has their own job to do, and it's not to sit around waiting for your posts.

           

          Your other post, some few minutes earlier

          https://community.oracle.com/thread/2617417

          has been locked because you placed it where it doesn't belong.

          This new post is appropriately placed because it is now in the UEK3 discussion space.

           

          I gave this info to a receptionist at Oracle on 12/30, and she seemed to understand that it needed to get to the Oracle Linux product group, but so far no one has contacted me for details, or to let me know that they're working on this.

           

          It seems to me that this is a 100% critical bug, that *ought* to have gotten an immediate reaction.  Does anyone here know how to get Oracle's attention?

          ... open a support request ticket with the company.  However that usually means you'll need service contract support credentials with them.   If you already have done this then you already have a SR ticket number and can go back and make direct contact with whichever support department has ownership of that SR ticket.  If there is no SR ticket number, then no one is working on your behalf yet.

           

          If this is truly something that needs to be worked on, then no one in these user-to-user forums will be able to fix it for you.

          • 2. Re: Showstopper bug in latest UEKR3 kernel - can't create LVM snapshot of ext4 fs
            cooldog

            rukbat,

             

            In this instance, as in several similar cases over the years, I am providing a *service* to Oracle, by letting them know about something that absolutely, positively, will be having a very negative effect upon their paying customers. 

             

            Many years ago, when I did something similar for Sybase (after our support contract had expired), the product manager and principal developer worked with me to resolve their bug, and they were grateful for the information and help that I provided them, because it was in *their* interest to find and fix the bug.

             

            I don't expect any help from forum members, other than perhaps one who may have an "in" with the OL product group and can bring this to their attention.  It truly is urgent, because anyone using the UEKR3 kernel and trying to make LVM snapshots is not only NOT getting the snapshots they think they are, they are also likely risking filesystem corruption. I highly doubt that this is an unusual scenario.  If that's not urgent, I don't know what is.  Not so much for me, because I know about the problem now and can mitigate the risks, but for the hundreds or thousands of other OL6.5 users.

             

            If Oracle is too big or too hidebound to have a channel for "open-source users", i.e. non-paying users, to help ORACLE by reporting serious bugs in their releases, then I'll try to find ways to get that info to them ... just like this.

             

            So I find your attitude puzzling, at best.

             

            Paul

            • 3. Re: Showstopper bug in latest UEKR3 kernel - can't create LVM snapshot of ext4 fs
              rukbat

              Cooldog (Paul),

              Firstly, note that I'm just an end-user as you are.  I am not a company employee, but am just someone that got shanghaied to take on a community moderator role on top of being a normal forum participant.   I have access to these OTN forums only.   I do not even have an edelivery access account, let alone any MOS credentials.

               

              Let me ramble a bit...

               

              This forum site that you posted to is set aside for end users to hash things out between themselves.

              Glance at its URL --> COMMUNITY-dot-oracle-dot-com.  This is the freebie site.

              My perception is that the company finds value in not having to answer the questions here.   Postings (mostly) get resolved without any effort by employee staff, thus incurring no expense.

               

              While company employees may glance at the OTN forums, there is no expectation that they must participate.  This specific subforum may have a better chance than most of being examined simply because it is for the beta status of the OS.   If someone in Oracle's OS development notices your inquiry, they might act on it or not.   There's no promise that they'll get around to finding a fix, nor any timeline for prioritizing one.

               

              There is another forum site for people with MOS credentials:

              COMMUNITIES-dot-Oracle-dot-Com

              I've been led to believe that company employees actively participate there.

              [Lastly, there is a company-internal forum.  It's employee-only]

               

              As for hoping that the company will offer a place for the Community to give free input, I think you are wishing for more than any of us is going to get.    Unlike with many or even most enterprises, there is very little anyone can get for free from Oracle.   Examples?  Access to software patches and hardware drivers requires paid service contract privileges.  They're not free.

               

              I've rambled a bit too long.   Let me close by saying that I do hope your issue gets resolved. 

              • 4. Re: Showstopper bug in latest UEKR3 kernel - can't create LVM snapshot of ext4 fs
                Catch_22

                What rukbat outlined is in my opinion not a negative personal attitude, but the result of forum experience and good advice. Sorry to say, just by reading your posting, what about your own attitude? Who was the receptionist at Oracle? Do you mean the nice lady at the entrance to the Oracle building?

                • 5. Re: Showstopper bug in latest UEKR3 kernel - can't create LVM snapshot of ext4 fs
                  cooldog

                  The receptionist at Oracle was a person that you'll eventually get if you call their paid support line, and don't have your ID number.

                   

                  As I said, she sounded as if she understood what I was trying to convey, and understood as well why it was a good idea that she do so.

                   

                  Leaving a post on this forum was simply another effort to get the word to the proper folks, either by hook or by crook, and I can see absolutely nothing negative about the attempt.

                   

                  At the moment, I'm trying to modify the specfile to build a patched kernel, but the learning curve is rather steep what with all the macros and all the stuff done to make a 3.8 kernel "look like" a 2.6 kernel.

                  • 6. Re: Showstopper bug in latest UEKR3 kernel - can't create LVM snapshot of ext4 fs
                    cooldog

                    Installing kernel version 3.12.6-3.12.y.20131224.ol6 from the playground/latest repo seems to take care of the problem for the moment.  I have to do more testing to see if it causes any other problems.

                    • 7. Re: Showstopper bug in latest UEKR3 kernel - can't create LVM snapshot of ext4 fs
                      cooldog

                      Well, unfortunately that new kernel DOES cause a problem.  Now guestmount fails.  <sigh>

                      • 8. Re: Showstopper bug in latest UEKR3 kernel - can't create LVM snapshot of ext4 fs
                        Matt C

                        Hi @cooldog,

                         

                        I hit this same LVM2 snapshot kernel oops on several Oracle Linux 6.5 servers running UEK R3 kernel version 3.8.13-16.3.1.  I have Linux Premier Support so I opened a Service Request.  Oracle Support got back to me with the following notes.

                         

                        "

                         

                        Hello Matt,

                         

                        Bug 17487738 : EXT4: STRESS TESTING WITH SUSPEND/RESUME FS ACCESS CAUSES FS ERRORS

                         

                        This bug is fixed in kernel version: 3.8.13-18. This kernel will be available quite soon for download.
                        You may upgrade the kernel once it's available.

                         

                        ~Siju

                         

                        "

                         

                        Update

                         

                        Dear Matt,

                         

                        Latest available UEK3 kernel version 'kernel-uek-3.8.13-26.el6uek.x86_64' incorporates the required bugfix.

                         

                        [root@server1 tmp]# rpm -q --changelog -p kernel-uek-3.8.13-26.el6uek.x86_64.rpm | grep -i 17487738
                        warning: kernel-uek-3.8.13-26.el6uek.x86_64.rpm: Header V3 RSA/SHA256 signature: NOKEY, key ID ec551f03
                        - fs: protect write with sb_start/end_write in generic_file_write_iter (Guangyu Sun) [Orabug: 17487738] <<<<<<========================================

                         

                        You can download the UEK3 kernel from ULN or from public-yum repo.

                         

                        http://public-yum.oracle.com/repo/OracleLinux/OL6/UEKR3/latest/x86_64/getPackage/kernel-uek-firmware-3.8.13-26.el6uek.noarch.rpm
                        http://public-yum.oracle.com/repo/OracleLinux/OL6/UEKR3/latest/x86_64/getPackage/kernel-uek-3.8.13-26.el6uek.x86_64.rpm

                         

                        Hope this helps!

                         

                        ~Siju

                         

                         

                        Subscribe to the Oracle Linux el-errata mailing list.

                         

                        The latest kernel-uek-3.8.13-26.el6uek.x86_64 version fixed the problem.
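
For anyone wanting to check whether a host is still on an affected kernel, the following is a minimal sketch that compares a kernel release string against 3.8.13-26, the UEK R3 build reported in this thread to carry the fix for Orabug 17487738. It assumes GNU `sort -V` is available; the function name `kernel_has_fix` and the variable `FIXED` are made up for illustration, and the comparison is only meaningful for UEK R3 (3.8.13-based) kernels.

```shell
#!/bin/bash
# Hypothetical helper, not an Oracle tool: succeeds if the given kernel
# release is at least 3.8.13-26, the build this thread reports as fixed.
FIXED="3.8.13-26"

kernel_has_fix() {
    # Strip the distribution suffix, e.g. ".el6uek.x86_64".
    local release="${1%%.el*}"
    # sort -V orders version strings; if FIXED sorts first (or the two are
    # equal), the release is at or past the fixed build.
    [ "$(printf '%s\n%s\n' "$FIXED" "$release" | sort -V | head -n 1)" = "$FIXED" ]
}

# Check the kernel that is currently running.
if kernel_has_fix "$(uname -r)"; then
    echo "running kernel is at or past $FIXED"
else
    echo "running kernel predates $FIXED"
fi
```

A version comparison is only a rough guide; the `rpm -q --changelog` check quoted above remains the direct way to confirm that a particular package mentions the Orabug number.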

                         

                        - Matt