8 Replies Latest reply: Aug 20, 2013 12:18 AM by WadhahDAOUEHI

802.3ad Bonding issues with OEL 6.1

msimm29 Newbie

We are trying to get 802.3ad (Mode 4) bonding functional on our Oracle database servers running OEL 6.1.

 

Everything appears to come up cleanly, but we cannot ping in or out when the mode is set to 4. If we change the mode to 0 and reboot, everything works fine.

 

The following is our config.

 

/etc/modprobe.d/bond.conf

alias bond0 bonding

 

/etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0

ONBOOT=yes

BROADCAST=10.41.5.255

IPADDR=10.41.5.88

NETMASK=255.255.255.0

GATEWAY=10.41.5.3

DNS1=172.28.210.50

DNS2=172.28.179.50

USERCTL=no

BONDING_OPTS="mode=4 miimon=100 updelay=40000"

ifcfg-em1

DEVICE=em1

ONBOOT=yes

BOOTPROTO=none

USERCTL=no

MASTER=bond0

SLAVE=yes

ifcfg-p3p1

DEVICE=p3p1

ONBOOT=yes

BOOTPROTO=none

USERCTL=no

MASTER=bond0

SLAVE=yes

 

Here is the bonding section of /var/log/messages

Aug  8 10:23:17 lxsmgcm15003c kernel: Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Aug  8 10:23:17 lxsmgcm15003c kernel: bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.

Aug  8 10:23:17 lxsmgcm15003c kernel: Loading kernel module for a network device with CAP_SYS_MODULE (deprecated).  Use CAP_NET_ADMIN and alias netdev-bond0 instead

Aug  8 10:23:17 lxsmgcm15003c kernel: bonding: bond0: setting mode to 802.3ad (4).

Aug  8 10:23:17 lxsmgcm15003c kernel: bonding: bond0: Setting MII monitoring interval to 100.

Aug  8 10:23:17 lxsmgcm15003c kernel: bonding: bond0: Setting up delay to 40000.

Aug  8 10:23:17 lxsmgcm15003c kernel: ADDRCONF(NETDEV_UP): bond0: link is not ready

Aug  8 10:23:17 lxsmgcm15003c kernel: bonding: bond0: Adding slave em1.

Aug  8 10:23:17 lxsmgcm15003c kernel: bnx2 0000:01:00.0: em1: using MSIX

Aug  8 10:23:17 lxsmgcm15003c kernel: bonding: bond0: enslaving em1 as a backup interface with a down link.

Aug  8 10:23:17 lxsmgcm15003c kernel: bonding: bond0: Adding slave p3p1.

Aug  8 10:23:17 lxsmgcm15003c kernel: bnx2 0000:07:00.0: p3p1: using MSIX

Aug  8 10:23:17 lxsmgcm15003c kernel: bonding: bond0: enslaving p3p1 as a backup interface with a down link.

Aug  8 10:23:17 lxsmgcm15003c kernel: bnx2 0000:01:00.1: em2: using MSIX

Aug  8 10:23:17 lxsmgcm15003c kernel: ADDRCONF(NETDEV_UP): em2: link is not ready

Aug  8 10:23:17 lxsmgcm15003c kernel: bnx2 0000:01:00.0: em1: NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON

Aug  8 10:23:17 lxsmgcm15003c kernel: bonding: bond0: link status up for interface em1, enabling it in 0 ms.

Aug  8 10:23:17 lxsmgcm15003c kernel: bonding: bond0: link status definitely up for interface em1.

Aug  8 10:23:17 lxsmgcm15003c kernel: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready

Aug  8 10:23:17 lxsmgcm15003c kernel: bnx2 0000:07:00.0: p3p1: NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON

Aug  8 10:23:17 lxsmgcm15003c kernel: bonding: bond0: link status up for interface p3p1, enabling it in 40000 ms.

Aug  8 10:23:17 lxsmgcm15003c kernel: bnx2 0000:01:00.1: em2: NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON

Aug  8 10:23:17 lxsmgcm15003c kernel: ADDRCONF(NETDEV_CHANGE): em2: link becomes ready

Aug  8 10:23:17 lxsmgcm15003c avahi-daemon[3074]: Registering new address record for fe80::862b:2bff:fe5a:6e49 on bond0.*.

Aug  8 10:23:17 lxsmgcm15003c avahi-daemon[3074]: Registering new address record for 10.41.5.88 on bond0.IPv4.

 

Aug  8 10:23:53 lxsmgcm15003c kernel: bonding: bond0: link status definitely up for interface p3p1.

 

 

/proc/net/bonding/bond0

Ethernet Channel bonding Driver: v3.6.0 (September 26, 2009)

 

Bonding Mode: IEEE 802.3ad Dynamic link aggregation

Transmit Hash Policy: layer2 (0)

MII Status: up

MII Polling Interval (ms): 100

Up Delay (ms): 40000

Down Delay (ms): 0

 

802.3ad info

LACP rate: slow

Aggregator selection policy (ad_select): stable

Active Aggregator Info:

            Aggregator ID: 1

            Number of ports:1

            Actor Key: 17

            Partner Key: 1

            Partner Mac Address: 00:00:00:00:00:00

 

Slave Interface: em1

MII Status: up

Link Failure Count: 0

Permanent HW addr: 84:2b:2b:xx:xx:xx

Aggregator ID: 1

Slave queue ID: 0

 

Slave Interface: p3p1

MII Status: up

Link Failure Count: 0

Permanent HW addr: 00:10:18:xx:xx:xx

Aggregator ID: 2

Slave queue ID: 0
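For readers following along, two details in the status output above stand out: the partner MAC address is all zeros, and the two slaves sit in different aggregators (1 and 2) with only one port in the active aggregator. Both usually mean the bond has not received LACPDUs from the switch. The following is a minimal Python sketch, not something from the original poster, that flags these two conditions in the text of /proc/net/bonding/bond0. The function names and the trimmed SAMPLE string are illustrative assumptions.

```python
import re

# Trimmed sample in the format shown above (illustrative, not the full file).
SAMPLE = """\
Active Aggregator Info:
            Aggregator ID: 1
            Partner Mac Address: 00:00:00:00:00:00

Slave Interface: em1
Aggregator ID: 1

Slave Interface: p3p1
Aggregator ID: 2
"""


def parse_bond_status(text):
    """Return (partner_mac, {slave_name: aggregator_id}) from bond status text."""
    partner = re.search(r"Partner Mac Address:\s*([0-9A-Fa-f:]+)", text)
    slaves = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Slave Interface:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Aggregator ID:") and current is not None:
            # Only aggregator IDs that follow a "Slave Interface" line are recorded.
            slaves[current] = int(line.split(":", 1)[1])
    return (partner.group(1) if partner else None), slaves


def lacp_problems(text):
    """List human-readable hints that LACP negotiation has not succeeded."""
    partner_mac, slaves = parse_bond_status(text)
    problems = []
    if partner_mac is None or set(partner_mac.replace(":", "")) <= {"0"}:
        problems.append("no LACP partner seen (partner MAC is all zeros)")
    if len(set(slaves.values())) > 1:
        detail = ", ".join(f"{name}={agg}" for name, agg in sorted(slaves.items()))
        problems.append("slaves are split across aggregators: " + detail)
    return problems


if __name__ == "__main__":
    for problem in lacp_problems(SAMPLE):
        print(problem)
```

On a live system the same check could be run against the real file, for example `lacp_problems(open("/proc/net/bonding/bond0").read())`.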

 

 

I asked the networking guy to verify that these ports are indeed set up for EtherChannel, and he provided this in response:

 

c#sho etherchannel summary

Flags:  D - down        P - bundled in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator

        M - not in use, minimum links not met
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port

 

 

Number of channel-groups in use: 9

Number of aggregators:           9

 

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
2      Po2(SU)          -        Te3/3(P)    Te3/4(P)
11     Po11(SU)         -        Gi1/1(P)    Gi2/1(P)
12     Po12(SU)         -        Gi1/2(P)    Gi2/2(P)
13     Po13(SU)         -        Gi5/1(P)    Gi6/1(P)
14     Po14(SU)         -        Gi5/2(P)    Gi6/2(P)
15     Po15(SU)         -        Gi1/3(P)    Gi2/3(P)
16     Po16(SU)         -        Gi1/4(P)    Gi2/4(P)
17     Po17(SU)         -        Gi5/3(P)    Gi6/3(P)
18     Po18(SD)         -        Gi5/4(D)    Gi6/4(D)

 

em1 and p3p1 go to card 1 and card 2 port 4, so they are etherchanneled.
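A note for readers: in the summary above, every row shows "-" in the Protocol column. On Cisco switches that column normally reads LACP or PAgP when a negotiation protocol is in use, so "-" suggests the channel groups are statically configured. The Python sketch below is an illustration only (the function names and the two-row SAMPLE are assumptions, and it does not identify which group the server is actually cabled to). It parses rows in this format and lists the groups that are not negotiating LACP.

```python
import re

# Two rows copied from the summary above, used as sample input.
SAMPLE = """\
16     Po16(SU)         -        Gi1/4(P)    Gi2/4(P)
18     Po18(SD)         -        Gi5/4(D)    Gi6/4(D)
"""

# group number, port-channel name, port-channel flags, protocol, remaining ports
ROW = re.compile(r"^\s*(\d+)\s+(Po\d+)\((\w+)\)\s+(\S+)\s+(.*)$")
PORT = re.compile(r"([^\s(]+)\((\w+)\)")


def parse_summary(text):
    """Parse 'show etherchannel summary' rows into a list of dicts."""
    groups = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # skip headers, flag legend, and blank lines
        group, name, flags, protocol, ports = match.groups()
        groups.append({
            "group": int(group),
            "port_channel": name,
            "flags": flags,
            "protocol": protocol,
            "ports": PORT.findall(ports),  # list of (port, flag) tuples
        })
    return groups


def negotiates_lacp(group):
    """True only when the Protocol column explicitly says LACP."""
    return group["protocol"].upper() == "LACP"


if __name__ == "__main__":
    for g in parse_summary(SAMPLE):
        if not negotiates_lacp(g):
            print(g["port_channel"], "protocol:", g["protocol"], "ports:", g["ports"])
```

Linux bonding mode 4 expects the switch side to speak LACP, so a static channel group on the switch would be consistent with the symptoms described above.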

 

If someone has any ideas, we are grasping at straws here.

 

Thanks,

Matt

  • 1. Re: 802.3ad Bonding issues with OEL 6.1
    Dude! Guru

    What is updelay=40000 supposed to accomplish? Do you have LACP enabled on your switch?

  • 2. Re: 802.3ad Bonding issues with OEL 6.1
    msimm29 Newbie

    We did some testing back when we were running mode 0. When I pulled one network cable, waited a bit, and plugged it back in, we would get a bunch of ping timeouts for about 30 seconds. It was determined that the bond was putting the network card back into service before the card had finished initializing. Adding this updelay fixed it and makes plugging the cable back in seamless.
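As background on the option being discussed: the kernel's bonding documentation states that updelay only applies when miimon link monitoring is enabled, and that it should be a multiple of miimon, otherwise it is rounded down to the nearest multiple. The Python sketch below illustrates that rule for a BONDING_OPTS string like the one in this thread. The helper names are assumptions made for the example, not part of any tool.

```python
def parse_bonding_opts(opts):
    """Split a BONDING_OPTS string such as 'mode=4 miimon=100' into a dict."""
    return dict(item.split("=", 1) for item in opts.split())


def effective_updelay(opts):
    """Return the updelay (ms) the driver would apply, per the documented rule.

    Assumption: without miimon the delay has no effect, and otherwise it is
    rounded down to a multiple of miimon, as bonding.txt describes.
    """
    parsed = parse_bonding_opts(opts)
    miimon = int(parsed.get("miimon", 0))
    updelay = int(parsed.get("updelay", 0))
    if miimon <= 0:
        return 0
    return (updelay // miimon) * miimon


if __name__ == "__main__":
    # The options used in this thread.
    print(effective_updelay("mode=4 miimon=100 updelay=40000"))
    # A value that is not a multiple of miimon gets rounded down.
    print(effective_updelay("mode=4 miimon=100 updelay=40050"))
```

With the settings from the original post, 40000 is already a multiple of 100, which matches the "enabling it in 40000 ms" message for p3p1 in the log.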

     

    I sent an email to network support asking if the port group is in LACP or Static Persistence. I have a feeling it's not in LACP because I don't see it listed under Protocol in his etherchannel summary.

  • 3. Re: 802.3ad Bonding issues with OEL 6.1
    Dude! Guru

    "It was determined that the bond was putting the network card back into service before the card had finished initializing."

    How was this determined? It sounds very strange. How is it possible that a NIC is UP before it finished initializing and negotiating the physical layer?

     

    Perhaps you need to disable spanning tree or enable PortFast or the equivalent. PortFast minimizes the time it takes for the server or workstation to come online.

  • 4. Re: 802.3ad Bonding issues with OEL 6.1
    WadhahDAOUEHI Journeyer

    Hi,

    To use 802.3ad (mode 4) bonding under Linux, you must use a switch that supports 802.3ad. If you don't have such a switch, you can use mode 6 under Linux, which has the same characteristics as mode 4.


    I hope this can help you

    Best Regards



  • 5. Re: 802.3ad Bonding issues with OEL 6.1
    Dude! Guru

    Many people use mode 6 (ALB) because it is an easy and compatible method for fault tolerance and load balancing. However, load balancing under mode 6 is achieved through ARP negotiation, which is certainly not the same as mode 4 and does not provide the same performance. A single connection under mode 6 only uses one NIC, whereas under mode 4 multiple devices share one physical address. Mode 4 requires a managed switch with LACP (802.3ad) support, and all NICs must be plugged into the same switch.
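One related detail: the status output earlier in the thread shows "Transmit Hash Policy: layer2 (0)". The kernel's bonding documentation for drivers of that era describes this policy as (source MAC XOR destination MAC) modulo slave count. The Python sketch below applies that formula to the whole address, which is a simplification and not the kernel's exact code. The MAC addresses are made-up, locally administered examples.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace(":", ""), 16)


def layer2_slave_index(src_mac, dst_mac, slave_count):
    """Pick an outgoing slave as (src MAC XOR dst MAC) modulo slave count."""
    if slave_count < 1:
        raise ValueError("slave_count must be at least 1")
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % slave_count


if __name__ == "__main__":
    bond_mac = "02:00:00:00:00:01"  # hypothetical bond address
    for peer in ("02:00:00:00:00:0a", "02:00:00:00:00:0b"):
        # The same MAC pair always maps to the same slave index.
        print(peer, "->", layer2_slave_index(bond_mac, peer, 2))
```

Because the result depends only on the two MAC addresses, all traffic from the bond to one given peer leaves through the same slave under this policy. Mode 4 spreads load across different peers rather than speeding up a single conversation.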

  • 6. Re: 802.3ad Bonding issues with OEL 6.1
    WadhahDAOUEHI Journeyer

    Hi,

    Thank you for the clarification. As far as I know, though, mode 6 requires that the slave NIC driver support changing the MAC address while the device is in use, so the slave NIC driver should be one of these drivers: e100, e1000, tg3, bnx2, b44, forcedeth.

     

    Best Regards.

  • 7. Re: 802.3ad Bonding issues with OEL 6.1
    Dude! Guru

    Are you sure? The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves. (http://wiki.centos.org/TipsAndTricks/BondingInterfaces)

  • 8. Re: 802.3ad Bonding issues with OEL 6.1
    WadhahDAOUEHI Journeyer

    Hi,

    Regarding the requirement that the slave NIC driver support changing the MAC address while the device is in use, I found this in the book "Linux Solutions de Haute Disponibilité" by Sébastien ROHAUT. The book is in French.

    Mode 6, or balance-alb: Adaptive Load Balancing. This mode builds on mode 5 (balance-tlb) and adds a receive load balancing (RLB) mode for IPv4 traffic. The ARP tables are modified so that ARP sees all the network interfaces as a single one, for both incoming and outgoing traffic. This is modified on the fly on each slave interface by the bonding driver. This mode requires no special switch configuration. It is the only such mode that offers a change in bandwidth, load balancing, and fault tolerance all at once. However, the drivers of the slave cards must support changing their MAC address once the port is open, which is not always the case.

     

    To recap, the active-backup, balance-tlb, and balance-alb modes do not require a special switch. The last mode is the most interesting, but it requires drivers that support it (for example the e100, e1000, tg3, bnx2, b44, and forcedeth drivers).

     

    If you understand French, that's good; if not, use Google Translate.

     

    I hope this can help you

    Best Regards
