3 Replies Latest reply: Jul 26, 2011 5:17 AM by 800381

    serious problems with udp on solaris 10

    864157
      Hi
      I am writing code that sends data over UDP, using the standard socket API (sendto). I send packets of 60k at a rate of 4MBps per open socket (there may be more than one socket) on a 1Gb network.
      When I start a second socket, performance drops on both sockets, as if something had overflowed, and almost nothing comes out of the machine. I used snoop to watch the traffic and saw that most of the sent data was dropped (there are big gaps between the fragment IDs sent out of the machine).

      When working with one socket I cannot exceed a rate of 7MByte, at which point I see the same degradation as when I open a second socket.

      What can cause this effect?
      Is there a limit on how many open sockets a process can hold? (The open sockets are held by the same process but in different threads.)
      Why does one socket influence another socket?

      I have tried setting udp_max_buf to 4Mb and udp_recv_hiwat and udp_xmit_hiwat to 65536, but it didn't help. Why?

      i'd appreciate any help
      thanks,
      alon
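To put the "big gaps between the fragment ids" in context, here is a minimal sketch of the fragmentation arithmetic for one large datagram. It assumes "60k" means 60 * 1024 bytes of payload, a 1500-byte Ethernet MTU, and a 20-byte IPv4 header without options; the function name is hypothetical and not taken from the poster's code.

```c
#include <assert.h>
#include <stddef.h>

enum { IPV4_HEADER = 20, UDP_HEADER = 8 };

/* Number of IPv4 fragments needed to carry one UDP datagram holding
 * `payload` bytes of application data over a link with the given MTU.
 * Assumption: 20-byte IPv4 header, no IP options. */
size_t ipv4_fragments(size_t payload, size_t mtu)
{
    assert(mtu > IPV4_HEADER + 8);

    /* Every fragment except the last carries a multiple of 8 bytes. */
    size_t per_fragment = ((mtu - IPV4_HEADER) / 8) * 8;

    /* The 8-byte UDP header is part of the data being fragmented. */
    size_t total = payload + UDP_HEADER;

    return (total + per_fragment - 1) / per_fragment;
}
```

Under those assumptions, ipv4_fragments(60 * 1024, 1500) comes to 42 fragments per sendto call. Losing any one of them makes the receiver discard the whole datagram, so a small amount of fragment loss can look like a near-total loss of throughput.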
        • 1. Re: serious problems with udp on solaris 10
          877394
          Your "7 MB/sec" caught my attention because I'm seeing something similar. I have a network throughput performance problem.

          The best throughput I'm seeing between 2 T2000s on a gigabit network is 6-7 MB/sec. This is between two Intel e1000g interfaces and a Cisco switch running IOS 12.2.xx.

          Two T2000s (not Enterprise T2000s) with fresh installations of Solaris 10 update 9. Recommended Security patch cluster from 5-20-2011. Current firmware.

          The command used is:

          $ scp -p 10_Recommended.zip host2:/export/home/user

          Traffic proceeds out the public interface (e1000g0) to the other system and into e1000g0 there, but the throughput is consistently only 6-7 MB/sec.

          Link status looks ok:

          host1:/root # kstat e1000g:0 | egrep link
          link_asmpause 1
          link_autoneg 1
          link_duplex 2
          link_pause 1
          link_state 1
          link_up 1
          link_speed 1000

          No collisions, crc errors, ierrors or oerrors. The network cabling is cat 6. Switch admins have confirmed the ports have autonegotiated to 1000 full duplex.

          What's frustrating is that the e1000g man page states that forcing 1000 Full is not supported. Normally I'd leave it to auto-negotiate, but forcing it is useful for troubleshooting. I plan on running a cable directly between two e1000g interfaces to rule out the switch, but I suspect I'm going to see the same thing.

          Thoughts include a network tuning rc3.d script, but I'm not certain that ndd functionality works with this Intel chipset.

          I need some Oracle Solaris networking insight here. Any help is appreciated.

          John
          • 2. Re: serious problems with udp on solaris 10
            877394
            Ok, this is my own oversight and obvious in retrospect.

            The throughput limitation I saw of 6 MB/sec with scp on gigabit has more to do with the protocol than with gigabit link negotiation, i.e., ssh is slower than other forms of network copying.

            Changing the cipher used with scp makes a difference, e.g.

            scp -c arcfour sourcefile idname@destinationhost:/directory

            Choosing the arcfour cipher almost doubled the average throughput to 11 MB/sec.

            I was under the impression that FTP transfers were equally slow, but that wasn't the case. FTP shows 70-80 MB/sec, which is normal for gigabit.

            - John
            • 3. Re: serious problems with udp on solaris 10
              800381
              64K for a UDP buffer is too small. WAAAY too small:

              http://download.oracle.com/docs/cd/E18752_01/html/817-0404/chapter4-57.html
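As a rough illustration of the buffer advice above, a sender can ask for a larger per-socket send buffer with SO_SNDBUF and read back what the kernel actually granted. This is a minimal sketch, not the original poster's code: the function name is hypothetical, and whether an oversized request is rejected or silently clamped depends on the platform and on the system-wide cap (udp_max_buf on Solaris, as I understand it).

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Request a send buffer of `bytes` on socket `fd`.
 * Returns the size the kernel reports afterwards, or -1 on error. */
int set_udp_sndbuf(int fd, int bytes)
{
    int granted = 0;
    socklen_t len = sizeof granted;

    assert(bytes > 0);

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof bytes) < 0)
        return -1;

    /* Read the value back: the kernel may round or clamp the request. */
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &granted, &len) < 0)
        return -1;

    return granted;
}
```

Calling it once per socket right after socket(), and logging the returned value, shows whether the buffer in effect is anywhere near the size of the 60k datagrams being queued.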