
Problem with centos2ol.sh migration script

Microlinux
Microlinux Member Posts: 33 Green Ribbon

Hi,

Two weeks ago I got badly bitten by the centos2ol.sh migration script. After running it successfully on about half a dozen of my servers, I decided to migrate my main production server. Unfortunately all I got after the initial reboot was this:

ssh: connect to host sandbox port 22: No route to host

Since the machine is in a datacenter 800 km from where I live, all I could do was set up a new server from scratch and then restore all my backups manually. This was a bare-metal installation, so I spent four long days getting everything back to normal.

I just tested the migration script on a local CentOS installation, and got the same problem. Ran the script, rebooted, and... no net. Since now I have physical access, I can confirm only the loopback interface appears.

So whatever the developers were doing these last two or three weeks, something bad was introduced in the migration script.

Cheers,

Niki

PS: in France we have the saying "Le mieux est l'ennemi du bien" ("The best is the enemy of the good").

PPS: how do you display a code listing in this forum? Are there special tags to use?

Answers

  • Microlinux
    Microlinux Member Posts: 33 Green Ribbon

    I investigated this a bit more and I think I found the answer. My local sandbox machine is an old ThinkCentre PC. Booting on the UEK kernel gives me no network interface except 'lo'. But when I boot the original Red Hat kernel, I see my enp3s0 interface OK. Now when I consider the fact that my production server was also some older hardware, it looks like the UEK kernel has some trouble with some network cards.

    I'll stick to the pragmatic solution by avoiding the UEK kernel in the future and reverting everything to the Red Hat kernel.
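
    For anyone wanting to do the same, reverting to the Red Hat kernel could look roughly like the sketch below. It is only a sketch under stated assumptions: `list_rhck_kernels` is a hypothetical helper name, and it assumes UEK kernel images have `uek` in their file name (for example `vmlinuz-<version>.el7uek.x86_64`). The `grubby` options shown in the comments exist, but the exact kernel path is a placeholder to fill in.

    ```shell
    #!/bin/bash
    # List kernel images in a boot directory that are NOT UEK builds.
    # Assumption: UEK images contain "uek" in their file name.
    list_rhck_kernels() {
        local bootdir="${1:-/boot}"
        local k
        for k in "$bootdir"/vmlinuz-*; do
            [ -e "$k" ] || continue          # glob matched nothing
            case "$k" in
                *uek*|*rescue*) continue ;;  # skip UEK and rescue images
            esac
            printf '%s\n' "$k"
        done
    }

    # Then, as root, pick one of the listed images and make it the default:
    #   grubby --set-default /boot/vmlinuz-<version>
    #   grubby --default-kernel    # verify the new default
    ```

    Listing the candidates first, instead of picking one automatically, keeps the choice of kernel version with the admin.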

  • Microlinux
    Microlinux Member Posts: 33 Green Ribbon

    Some more information.

    Tried a vanilla installation on this machine.

    Booting the UEK kernel results in a kernel panic on bootup.

    Booting the RHCK kernel works fine.

    As far as I understand, it's possible to stick with the RHCK kernel from the start using the x86_64-boot.iso. This boots the RHCK kernel and also defaults to the same kernel after the initial reboot.

    Q: is there a way to default to RHCK (and not install kernel-uek) using the OracleLinux-R7-U9-Server-x86_64-dvd.iso installation medium? Or do I have to use the x86_64-boot.iso to achieve this?

    Cheers,

    Niki

  • Avi Miller-Oracle
    Avi Miller-Oracle Senior Solution Architect, Oracle Cloud Infrastructure Developer Adoption Melbourne, Australia Posts: 4,785 Employee

    The forum software allows you to format something as a code block by clicking the paragraph symbol that appears outside the text box, then the quote icon, and then changing it to a code block. Possibly the least intuitive mechanism I've ever seen, in my opinion.

    As for the other options: we added a flag to centos2ol.sh to skip installing or enabling the UEK, so that should help any future migrations. However, I'm genuinely surprised by this report, because we tend to keep more older devices enabled than the RHCK. It would be great if you could let me know which network cards are not being seen by the UEK, so I can follow up internally.

    I haven't tested this yet, but I suspect you could probably just disable the UEK repo/installation source in the default Anaconda installer so that the UEK is not installed. It may also be a package group. Otherwise, creating a kickstart file is probably the best way to create a repeatable standard install for your needs: https://docs.oracle.com/en/operating-systems/oracle-linux/7/install/ol7-install-options.html#ol7-install-kickstart

  • Microlinux
    Microlinux Member Posts: 33 Green Ribbon

    Unfortunately I don't have any access to the root server anymore, because it got decommissioned two days ago.

    Here's the fairly standard NIC on my local sandbox Thinkcentre PC:

    $ lspci | grep -i eth 
    03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
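
    To report a card precisely, the numeric PCI vendor and device IDs are more useful than the description string. A minimal sketch, assuming `lspci -nn` is available (it prints the IDs in square brackets); `ethernet_pci_ids` is a hypothetical helper name, not part of any tool:

    ```shell
    #!/bin/bash
    # Read `lspci -nn` output on stdin and print the vendor:device ID
    # of every Ethernet controller, one per line.
    ethernet_pci_ids() {
        grep -i 'ethernet controller' \
            | grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
            | tr -d '[]'
    }

    # Usage on a live system:
    #   lspci -nn | ethernet_pci_ids
    # To see which kernel driver (if any) is bound to the card:
    #   lspci -k -s 03:00.0
    ```

    The class code that `lspci -nn` also prints (a single four-digit value in brackets) does not match the `vendor:device` pattern, so only the device ID is returned.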
    

    And here's what happens.

    1. Booting any version of the Red Hat kernel succeeds, and there don't seem to be any problems.
    2. Booting UEK 5.4.17-2011.6.2 results in a kernel panic on startup.
    3. Booting UEK 5.4.17-2036.104.5 gets me to a prompt, but I have no net.

    The last attempt gives me a "realtek.ko not loaded" error message on boot.

    One other thing. I know you guys copy RHEL feature by feature and bug by bug. But one thing that's really annoying is the missing default URL in the network installer. Every time I have to configure it, I have to look up the correct URL, which happens to be this:

    http://yum.oracle.com/repo/OracleLinux/OL7/latest/x86_64
    

    Cheers,

    Niki

  • Avi Miller-Oracle
    Avi Miller-Oracle Senior Solution Architect, Oracle Cloud Infrastructure Developer Adoption Melbourne, Australia Posts: 4,785 Employee

    I think @User_GSQTY is correct about needing to add the Realtek driver to initramfs.
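
    A rough sketch of how one might check for and add the driver, not a verified fix for this machine. The module names are assumptions: `r8169` is the in-kernel driver normally used for RTL8111/8168 cards, and `realtek` is taken from the "realtek.ko not loaded" message above. `missing_modules` is a hypothetical helper; `lsinitrd` and `dracut --add-drivers` are existing dracut tools and options.

    ```shell
    #!/bin/bash
    # Read an initramfs file listing on stdin (as printed by `lsinitrd <image>`)
    # and print each module name given as an argument that is NOT in the listing.
    missing_modules() {
        local listing mod
        listing=$(cat)
        for mod in "$@"; do
            # Module files may be compressed: r8169.ko, r8169.ko.xz, ...
            if ! printf '%s\n' "$listing" \
                | grep -Eq "/${mod}"'\.ko(\.[a-z0-9]+)?$'; then
                printf '%s\n' "$mod"
            fi
        done
    }

    # Usage on a live system (as root), with kver set to the UEK release:
    #   kver=<uek kernel release>
    #   lsinitrd "/boot/initramfs-${kver}.img" | missing_modules r8169 realtek
    #   dracut --force --add-drivers "r8169 realtek" \
    #       "/boot/initramfs-${kver}.img" "$kver"
    ```

    Checking the listing first shows whether the initramfs is really missing the modules before it gets rebuilt.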

    As for the network installer issue: I've logged an enhancement for OL8 because it annoys me too. I'm hoping it'll get added to the next OL8 release, but as it's unlikely there will be another OL7 release, this will not be added to that version. We do not rebuild ISOs once they are released for various legal/escrow reasons.

  • Microlinux
    Microlinux Member Posts: 33 Green Ribbon

    Since this Realtek card is a very common piece of equipment, I wonder how the OL 7.9 installer could have passed QA. This is a showstopping bug which made me spend four days in hell. Fifty websites and all emails for ten domains all gone.

    My "luck" was that a few days before the OVH datacenter in Strasbourg had gone down in flames, so in light of this (if I may say so) my clients showed some understanding since I could at least restore everything from my backup server (which, by the way, already runs OL7).

    After this nerve-racking experience you'll understand why I'd rather stick with RHCK.

  • Avi Miller-Oracle
    Avi Miller-Oracle Senior Solution Architect, Oracle Cloud Infrastructure Developer Adoption Melbourne, Australia Posts: 4,785 Employee
    edited March 23

    This is the first report we've had of a Realtek not working immediately after an install. We've only had reports of this happening after an upgrade and we couldn't reproduce that in-house.

    We do thousands of hours of QA for each release, so whenever anything manages to get past our QA team, it becomes the subject of some serious investigating.

    I'll forward the PCI details for the card you're using to the QA team.

  • Microlinux
    Microlinux Member Posts: 33 Green Ribbon
    edited March 23

    My production server crash happened on March 7th 2021, after migrating an up-to-date CentOS 7 system to Oracle Linux using the centos2ol.sh script. I'll see if I can find out which NIC was used on that machine.

    By the way, please know this: I'm rather a fatalistic kind of guy (as a climber you should be). Things happen even to the best of us. Remember the boothole bug last summer? I had to cancel a week of holidays because of this. As an admin you have to develop a pretty thick skin for this kind of stuff.

    You all know that precise moment when you take a peek at your terminal and say to yourself: "Oh %&@# my week is ruined." 😀

    One thing I appreciate about OL: how you guys deal with problems.
