Java version: 1.6.0_37-b06
OS: Solaris 5.10 (Generic_142900-02)
Sporadically, our application hits an IOException ("Bad file number") when invoking the NIO DevPollSelector.select(). Below is the exception stack:
java.io.IOException: Bad file number
at sun.nio.ch.DevPollArrayWrapper.poll0(Native Method)
"Bad file number" appears to be errno=EBADF returned by the ioctl() system call that the DevPoll selector uses. The ioctl man page documents EBADF for the case where the 1st arg, or the fd set passed as the 3rd arg, does not refer to a valid fd#.
We used a custom javaagent to dump the Java heap and to collect the output of lsof/pfiles, and verified that all FDs referenced by the Selector in the heap are valid and visible in the pfiles/lsof output.
Is this a known Java bug, similar to this Java bug on Linux? Are there any workarounds (such as temporarily ignoring the IOException) that can be implemented in the application?
Any help is appreciated.
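To make the workaround question concrete, here is a minimal sketch of what "ignoring the IOException temporarily" could look like. It is untested against this failure. The class name TolerantSelect, the retry limit, and the assumption that the error is transient and safe to retry are ours and are not documented JDK behavior. It is written to compile on Java 6.

```java
import java.io.IOException;
import java.nio.channels.Selector;

public class TolerantSelect {
    // Hypothetical limit; chosen arbitrarily for illustration.
    static final int MAX_RETRIES = 3;

    /**
     * Calls selector.select(timeoutMillis) and retries a bounded number of
     * times if it throws IOException. Assumes the failure is transient.
     * Rethrows the last IOException once the retries are exhausted.
     */
    static int select(Selector selector, long timeoutMillis) throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
            try {
                return selector.select(timeoutMillis);
            } catch (IOException e) {
                // Remember the failure and try again; each new select()
                // starts a fresh selection operation.
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        try {
            // No channels are registered, so this times out and returns 0.
            System.out.println("ready keys: " + select(selector, 10L));
        } finally {
            selector.close();
        }
    }
}
```

The retry is bounded so that a persistent failure still surfaces to the caller instead of spinning forever. Whether retrying is actually safe here is exactly what we are asking.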
The only other 'hit' on the Internet related to this issue was from a 'Grizzly build tests' log.
One interesting observation came out of the heap & pfiles analysis and from checking the OpenJDK code (used only as a reference implementation; we are aware that it need not match the Sun JDK):
One of the application connections (SocketChannel) listed in pfiles has been closed by the application.
> The corresponding fd# is a "unix socket" and not the original TCP socket, which means that SocketChannelImpl.implCloseSelectableChannel() has been invoked and has carried out the dup() call
> The SelectionKey object for this SocketChannel is present in the Selector.cancelledKeys set, so Selector.cancel(SelectionKey) has also been invoked
> This SelectionKey object is also a part of DevPollSelectorImpl.keys which the select()/ioctl() is polling on
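The key bookkeeping described above can be reproduced in isolation. The sketch below is only an illustration (the class name CancelledKeyDemo is made up, and it uses an unconnected SocketChannel rather than a real connection). It relies on the documented Selector contract that a cancelled key stays in the key set until the next selection operation. It does not reproduce the EBADF itself.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

public class CancelledKeyDemo {
    /**
     * Returns { key valid after close,
     *           key in selector.keys() after close,
     *           key in selector.keys() after the next selectNow() }.
     */
    static boolean[] observe() throws IOException {
        Selector selector = Selector.open();
        try {
            SocketChannel channel = SocketChannel.open();
            channel.configureBlocking(false);
            SelectionKey key = channel.register(selector, SelectionKey.OP_READ);

            // Closing the channel cancels its key, but the key is only
            // removed from the key set by the next selection operation.
            channel.close();
            boolean validAfterClose = key.isValid();
            boolean inKeysAfterClose = selector.keys().contains(key);

            // A selection operation processes the cancelled-key set.
            selector.selectNow();
            boolean inKeysAfterSelect = selector.keys().contains(key);

            return new boolean[] { validAfterClose, inKeysAfterClose, inKeysAfterSelect };
        } finally {
            selector.close();
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] r = observe();
        System.out.println("valid after close:      " + r[0]);
        System.out.println("in keys after close:    " + r[1]);
        System.out.println("in keys after selectNow: " + r[2]);
    }
}
```

This matches the state we saw in the heap dump: a closed channel whose key is cancelled but still part of the selector's key set, waiting for the next selection operation to deregister it.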
We suspect that the IOException might be due to some race condition when ioctl() is polling the fd set just around the time when dup() is invoked on the SocketChannel FD to change it to a unix socket.
Any comments on this observation are also welcome.