We are currently experiencing packet loss issues with Coherence 3.4.2 during the datagram test.
Packet loss statistics are as follows:
Rx from publisher: /10.96.67.169:9999
packet size: 1468
throughput: 66 MB/sec
received: 7004519 of 8252377
success rate: 0.848788
out of order: 0
avg offset: 0
avg gap size: 2
avg gap time: 0ms
avg ack time: -1.0E-6ms; acks 0
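As a quick sanity check on these figures, here is a minimal sketch that recomputes the loss from the "received ... of ..." line. The function name and the "expected" label are my own, and it assumes the second number is the count of packets the receiver expected to see.

```python
# Figures copied from the datagram test output above.
RECEIVED = 7004519
EXPECTED = 8252377   # assumption: the "of N" value is the expected packet count
PACKET_SIZE = 1468   # bytes, from the "packet size" line


def loss_summary(received, expected, packet_size):
    """Derive loss figures from the received/expected packet counts."""
    lost = expected - received
    success_rate = received / expected
    return {
        "lost_packets": lost,
        "success_rate": success_rate,
        "loss_rate": 1.0 - success_rate,
        "lost_bytes": lost * packet_size,
    }


summary = loss_summary(RECEIVED, EXPECTED, PACKET_SIZE)
print("lost packets: %d" % summary["lost_packets"])
print("success rate: %.6f" % summary["success_rate"])
print("loss rate:    %.2f%%" % (summary["loss_rate"] * 100))
```

The recomputed success rate agrees with the reported 0.848788, so roughly 15% of the datagrams never arrived.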
The Coherence implementation is running on a Xen VM.
We see this happen for both Fully Virtual and Paravirtualized Guests.
This problem does not happen on physical hardware.
Here is the general sequence that we tried:
1. After finding the problem with Coherence, we tried to reproduce it on 2 HVM Xen systems and did not see the problem there.
a. These boxes were HVM guests.
b. They were running kernel 2.6.18-164 and Red Hat 5.4.
c. These guests were running on Dom-0 kernel of 2.6.18-164.2.1el5xen
2. We had 2 paravirtualized machines on the same Dom-0 as above, but they were running Red Hat 5.2. We ran the same test there and still ran into the problem.
3. We upgraded the paravirtualized machines to Red Hat 5.4 with the latest patch revision, and the problem was still present.
4. After this, our research suggested that disabling the ipv6 module would fix the problem. After disabling the IPv6 module, we ran some more tests between pl1rap704-beta and pl1rap706-beta. Performance improved, but there was still packet loss.
5. We converted 2 paravirtualized guests to HVM guests (pl1rap704-beta and pl1rap705-beta) and ran the tests; the problem was still there.
6. Upgrade pl1rap704-beta and pl1rap705-beta to Red Hat release 5.4 and the latest kernel revision, and see if the problem is still there.
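For anyone repeating step 4, it may be worth confirming on each guest that the ipv6 module is really unloaded before rerunning the test. The following is only a sketch: the helper names are made up, the sample text is illustrative rather than captured from our guests, and it assumes IPv6 is built as a loadable module (if it is compiled into the kernel it will not show up in /proc/modules at all).

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Each non-empty line starts with the module name, followed by its
    size, reference count, and other fields.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def ipv6_module_loaded(proc_modules_text):
    """True if a module named 'ipv6' appears in the given text."""
    return "ipv6" in loaded_modules(proc_modules_text)


# Illustrative sample only, not real output from pl1rap704-beta.
sample = (
    "ipv6 435361 20 - Live 0xffffffff88500000\n"
    "ext3 168913 2 - Live 0xffffffff88400000\n"
)
print(ipv6_module_loaded(sample))

# On a real guest, read the live module list instead:
#     with open("/proc/modules") as f:
#         print(ipv6_module_loaded(f.read()))
```

If this still reports the module as loaded after the change, the disable step did not take effect on that guest and the test results from it are not comparable.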
We haven't tried this on Oracle VM, but we think that would be the next step to see if the problem persists there, although Oracle support indicates that Coherence is not officially supported on Oracle VM.
We still see the packet loss and wonder whether anyone has encountered this issue before and has a solution to it.