Problem in Java socket reconnection

843790, Nov 26 2007 (edited Sep 15 2008)
While writing to the socket, if an IOException occurs I try to reconnect to the destination. During reconnection I do the following:

1. After the connection drops for the first time, I close the connection:
connObj.close();
connObj = null; (for safety, to make sure that the connection object is completely destroyed and is now a candidate for GC.)

2. Create a new connection object:
connObj = new Socket();
connObj.connect(new InetSocketAddress(host, port));
connObj.setReuseAddress(true);
If connecting to the host and port fails again, I wait for 5 minutes and then try to reconnect, with a totally new connection object.

The problem is that it takes a long time to connect, on average 60 minutes.

Can we reduce this time? Is there any other setting in Java sockets?

Comments

EJP
While writing to the socket if IOException occurred
Hold it right there. Which IOException did you get when writing to the socket? Some of them just mean application protocol errors ...
I am closing the connection
connObj.close();
OK.
connObj = null; (for safety purpose trying to make sure that connection object is completely destroyed and candidate of GC now.)
This is completely pointless.
2. Create a new connection object
connObj = new Socket(); 
connObj.connect(new InetSocketAddress(host, port));
connObj.setReuseAddress(true);
Too late. You should do setReuseAddress() before connect(), but in this case there's no reason to do it at all as you're not trying to reuse a local port number, which is mostly a dumb thing to do anyway ...
But here the problem is it is taking long time to connect on an average 60 minutes.
60 minutes? It should time out after about 75 seconds. Are you sure?
And which IOException do you get when this reconnection fails? This is sounding like a major network problem, not a Java sockets programming problem.
Can we reduce this time? Is there any other setting in java sockets???
Yes. java.net.Socket.connect(SocketAddress endpoint, int timeout);
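A minimal sketch of the call named above, for readers who want to see it in context. The class name TimedConnect, the helper open() and the 5-second value are illustrative assumptions, not part of the thread. The timeout is in milliseconds, and a timeout of zero means an infinite timeout.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class TimedConnect {
    /** Opens a socket to host:port, giving up after timeoutMillis (hypothetical helper). */
    static Socket open(String host, int port, int timeoutMillis) throws IOException {
        Socket socket = new Socket();               // created unconnected
        try {
            // Throws SocketTimeoutException if the timeout expires before the connect completes.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return socket;
        } catch (IOException e) {
            socket.close();                         // do not leak the failed socket
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo against a throwaway local listener on a system-chosen port.
        try (ServerSocket server = new ServerSocket(0);
             Socket client = open("127.0.0.1", server.getLocalPort(), 5_000)) {
            System.out.println("connected: " + client.isConnected());
        }
    }
}
```

No setReuseAddress() call is needed here, because the client never binds to a specific local port.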
843790
I am getting the following exception

Exception message: Connection reset by peer: socket write error
The exception trace is:
java.net.SocketOutputStream.socketWrite0(Native Method)
java.net.SocketOutputStream.socketWrite(Unknown Source)
java.net.SocketOutputStream.write(Unknown Source)

Actually, I try to write 3 times even after an exception occurs while writing; if that fails, I try to reconnect. I am trying to confirm that the socket is really disconnected, since that may be why I cannot write data to the socket.

After starting the application I create the connection, and after successfully connecting to the destination I call setReuseAddress(true). Is it only for local ports?

Sometimes it takes no time to reconnect and sometimes it takes about 45-60 minutes.
EJP
(a) Your application is misbehaving. Whatever is at the other end of this client is closing the socket while you are still writing data to it. That causes the connection resets. This is what you primarily need to fix. The receiver should read everything that's being sent until it gets the EOS condition.

(b) Your network or your local TCP implementation is misbehaving if it really takes 45-60 minutes to reconnect. Are you sure it is minutes not seconds? I'm finding it very hard to believe this, unless it includes application-level retries that you're not telling us about.
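For point (a), a minimal sketch of a receiver that keeps reading until end of stream before it closes its own end. The class and method names are hypothetical, and the real server's protocol is unknown, so this only illustrates the "read until EOS" idea.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;

final class DrainingReceiver {
    /** Reads until end of stream (read() returns -1) and returns the byte count. */
    static long drain(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;   // a real receiver would process buffer[0..n) here
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            // Sender: write a small message, then close to signal end of stream.
            try (Socket sender = new Socket("127.0.0.1", server.getLocalPort())) {
                sender.getOutputStream().write("hello".getBytes("US-ASCII"));
            }
            // Receiver: read everything up to EOS, and only then close.
            try (Socket receiver = server.accept()) {
                System.out.println("received bytes: " + drain(receiver.getInputStream()));
            }
        }
    }
}
```

A receiver that closes while unread data is still arriving is what typically produces "Connection reset by peer" at the writer.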
843790
I introduced the 3 write attempts a few days ago, but this reconnection problem has been there for a long time. Removing that code makes no difference to the reconnection problem, although the retries may increase its intensity.

Can we shut down the output and input, set SO_LINGER to false at our end, and then close the socket, so that the closing will be smooth?

connObj.shutdownInput();
connObj.shutdownOutput();
connObj.setSoLinger(false, 1);
connObj.close();

Yes, it really is about 45-60 minutes. I keep trying to connect every 100 seconds.
EJP
Can we shut down the output and input, set SO_LINGER to false at our end, and then close the socket, so that the closing will be smooth?
What makes you think the closing isn't smooth now? You are in danger of losing data if you mess with setSoLinger. Don't do this.
Yes, it really is about 45-60 minutes. I keep trying to connect every 100 seconds.
In other words it is less than 100 seconds not counting your retries, in other words most probably 75 seconds like I said before. TCP/IP already does several retries of its own. Possibly you are driving it mad by putting all that in your own retry loop. When your reconnections fail, what exception do you get? And what can you tell us about the server you're trying to reconnect to? Is it very busy, for example?

And you still need to fix (a) above. I suspect you have major coding problems in both the client and the server.
843790
After closing the socket with

connObj.close()

if we check

connObj.isConnected() and connObj.isBound(), both return TRUE.

How can this happen?

After closing at least isConnected() should return FALSE.
After starting application I am creating the connection,
What is the longest time that might elapse after you first create that connection and when you actually try to send something?

What is the longest time that might elapse between when you send one message and you try to send another?
EJP
After closing at least isConnected() should return FALSE.
That's not what it's specified to do. From the Javadoc:

Returns: true if the socket successfully connected to a server.

Not 'if the socket is currently connected to a server'.

isConnected() only ever returns false between new Socket() with no arguments and a successful Socket.connect(). In other words all it does is tell you whether you have successfully called connect() yet in the case where you created it unconnected. In all other cases, e.g. new Socket(host, port), or socket = serverSocket.accept(), it returns true forever.

A similar thing applies to isClosed(): it only tells you whether you have closed it. Similarly for isInputShutdown(), isOutputShutdown(), ...

None of these methods tells you anything about the state of the connection, because they can't. There is no such thing in TCP.
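The behaviour described above can be observed with a small loopback test. This is a sketch only; the class name SocketStateDemo and the helper stateAfterClose() are made up for illustration.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

final class SocketStateDemo {
    /** Returns {isConnected, isBound, isClosed} as observed after close(). */
    static boolean[] stateAfterClose() throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            Socket client = new Socket("127.0.0.1", server.getLocalPort());
            client.close();
            // These flags record what this Socket object has done (connected, bound, closed).
            // They say nothing about whether the peer or the network is still there.
            return new boolean[] { client.isConnected(), client.isBound(), client.isClosed() };
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] s = stateAfterClose();
        System.out.println("isConnected=" + s[0] + " isBound=" + s[1] + " isClosed=" + s[2]);
    }
}
```

In practice, the only way to learn that a connection has gone is to read or write on it and handle the resulting end of stream or IOException.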
843790
What is the longest time that might elapse after you first create that connection and when you actually try to send something?
What is the longest time that might elapse between when you send one message and you try to send another?
Actually it depends on the triggering event, but in general it is sometimes almost immediate (1-2 minutes) and sometimes after 2-3 hours.

Currently, on reconnection I first close the output stream, then close the connection object, wait for 2 minutes, and try to connect again with a new connection object.

Its connecting successfully in 2 minutes of time but need to monitor for more time.

If any one has more information please let me know.

Thanks in advance.
843790
Failed again :( Sometimes it takes 45-60 minutes to reconnect. In this scenario, if we restart the entire application it connects.

Please advise.
EJP
What is the longest time that might elapse after you first create that connection and when you actually try to send something?
What is the longest time that might elapse between when you send one message and you try to send another?
Actually it depends on the triggering event, but in general observation some time immediately (1-2 minutes) or some times after 2-3 hrs.
You haven't answered both questions here, but in general a socket connection can't be relied on to last for over two hours with no activity if there are intermediate routers - some of them like to, or are configured to, drop idle connections. Try calling Socket.setKeepAlive(true), which triggers a packet exchange every two hours, or better still see if you can add a 'ping' transaction to your application protocol and have it executed say every 5 minutes.
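A rough sketch of both suggestions: enabling TCP keepalive, and sending an application-level ping on a timer. The one-byte PING value and the class name Heartbeat are assumptions for illustration; a real ping must be something the server's protocol actually understands, and writes from the timer thread must be coordinated with the application's own writes.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class Heartbeat {
    static final int PING = 0x00;   // hypothetical one-byte ping message

    /** Sends one ping; an IOException here means the connection is unusable. */
    static void sendPing(Socket socket) throws IOException {
        OutputStream out = socket.getOutputStream();
        out.write(PING);
        out.flush();
    }

    /** Pings at a fixed period (e.g. 5, TimeUnit.MINUTES) until a write fails. */
    static ScheduledExecutorService start(Socket socket, long period, TimeUnit unit) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(() -> {
            try {
                sendPing(socket);
            } catch (IOException e) {
                timer.shutdown();   // connection is dead; the caller should reconnect
            }
        }, period, period, unit);
        return timer;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("127.0.0.1", server.getLocalPort());
             Socket peer = server.accept()) {
            client.setKeepAlive(true);   // OS-level keepalive probes on an idle connection
            sendPing(client);
            System.out.println("peer read: " + peer.getInputStream().read());
        }
    }
}
```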
Currently in reconnection I am first closing the output stream and then closing connection Object waiting for 2 minutes
Why? This is just literally a waste of time. This is one reason your reconnection attempts add up to over an hour. Don't do that. The server should be ready to accept a new connection immediately. If it isn't there is something seriously wrong with it and with the TCP/IP stack it is running on.
Its connecting successfully in 2 minutes of time but need to monitor for more time.
I do not understand this statement.
843790
Sorry for replying so late...

Que 1: What is the longest time that might elapse after you first create that connection and when you actually try to send something?
Que 2: What is the longest time that might elapse between when you send one message and you try to send another?
--> It may be less than or more than 2 hrs.

Connection dropping is a normal scenario when the socket is idle for 2+ hours. But after that, reconnecting is important and must happen. Until now I have been monitoring the reconnection. As I explained, I wait for 2 minutes before reconnecting, and it does reconnect successfully. BUT I found that, very rarely, it is not able to reconnect for over 30 minutes.

One more thing: I am not able to telnet to that IP and port while the connection is down.

Should I not wait for 2 minutes? I decided to wait for 2 minutes because I saw with the netstat command that the IP and port were in the TIME_WAIT state. In general, after 2 minutes the socket will be removed (closed) from the network.

Shall I keep trying without waiting? How will the socket work in that case? Will these frequent attempts create any problems?

Please let me know.

Thanks
Tushar.
EJP
Que 1: What is the longest time that might elapse after you first create that connection and when you actually try to send something?
You haven't answered that at all.
Que 2: What is the longest time that might elapse between when you send one message and you try to send another?
--> It may be less than or more than 2 hrs.
That doesn't answer the question. What's the longest time that might elapse? If it's hours rather than minutes you need to reconsider your strategy, as firewalls will often drop connections that have been idle that long, and TCP keepalive will drop them after two hours too if it's enabled on either end. On these time-scales the overhead of reconnecting should be minimal.
BUT I found its not able to reconnect for over 30 min VERY RARELY.
You need to show me some code here. This could only happen if you are doing retries with a large sleep interval. By itself, a socket connection attempt will time out after about 75 seconds.
And one more thing is, I am not able to telnet that IP and port during this time when the connection is down.
Then there must be something wrong with the server or the network. Do you have access to the server code? Is it multithreaded? Does it keep recreating ServerSockets for example, instead of running a dedicated thread which does nothing except accept from an existing ServerSocket?
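For reference, a minimal sketch of the server structure described above: one ServerSocket created once, one thread that does nothing but accept, and per-connection work handed to other threads. The server in this thread is not available, so everything here (class name, the one-byte GREETING, the handler) is a hypothetical illustration.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class AcceptLoop implements Runnable, AutoCloseable {
    static final int GREETING = 0x2B;   // hypothetical one-byte greeting

    private final ServerSocket serverSocket;
    private final ExecutorService workers = Executors.newCachedThreadPool();

    AcceptLoop(int port) throws IOException {
        this.serverSocket = new ServerSocket(port);   // created once, never recreated
    }

    int port() {
        return serverSocket.getLocalPort();
    }

    @Override
    public void run() {
        while (!serverSocket.isClosed()) {
            try {
                Socket client = serverSocket.accept();   // this thread only accepts
                workers.execute(() -> handle(client));   // the work happens elsewhere
            } catch (IOException e) {
                if (!serverSocket.isClosed()) {
                    e.printStackTrace();                 // log and keep accepting
                }
            }
        }
    }

    private static void handle(Socket client) {
        try (Socket c = client) {
            c.getOutputStream().write(GREETING);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    @Override
    public void close() throws IOException {
        serverSocket.close();   // makes a blocked accept() throw, which ends run()
        workers.shutdown();
    }

    public static void main(String[] args) throws IOException {
        try (AcceptLoop loop = new AcceptLoop(0)) {
            new Thread(loop, "acceptor").start();
            try (Socket s = new Socket("127.0.0.1", loop.port())) {
                System.out.println("greeting byte: " + s.getInputStream().read());
            }
        }
    }
}
```

With this shape, a slow or failed client handler cannot stop the port from listening.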
Should not I wait for 2 min?
No, but see below.
In general after 2 min the socket will be removed (Closed) from the network.
You have a confusion here between sockets, ports, and connections. The socket is already closed: that's how you get into the TIME_WAIT state. The connection, i.e. the 5-tuple {TCP, remote host, remote port, local host, local port}, will be removed from your local host after two minutes. *Not at all* the same thing. There is no need to wait for two minutes unless you get a BindException. Possibly the local port will be tied up for two minutes too, but that shouldn't matter as you shouldn't be specifying a local port, so the system will automatically give you a dynamic port number.
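To illustrate the point about dynamic local ports: when the client does not bind, each new Socket gets a local port chosen by the system, so an old connection sitting in TIME_WAIT does not block a new one. A small sketch, with hypothetical names:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

final class EphemeralPortDemo {
    /** Opens two client sockets to the same local server and returns their local ports. */
    static int[] localPorts(int serverPort) throws IOException {
        // Neither socket is bound explicitly, so the system picks both local ports.
        try (Socket a = new Socket("127.0.0.1", serverPort);
             Socket b = new Socket("127.0.0.1", serverPort)) {
            return new int[] { a.getLocalPort(), b.getLocalPort() };
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            int[] ports = localPorts(server.getLocalPort());
            System.out.println("local ports: " + ports[0] + " and " + ports[1]);
        }
    }
}
```

The two connections share the remote host and port but differ in local port, which is what makes each 5-tuple unique.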
Shall I keep on trying without waiting?
See above. In general you should retry after a small interval which doubles on every failure and is reset on every success, and it should initially be 5-10 seconds, no more.
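The retry policy just described could look roughly like this. The 5-second initial delay comes from the advice above; the 5-minute cap, the class name and the method names are assumptions added for the sketch.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class BackoffReconnect {
    static final long INITIAL_DELAY_MS = 5_000;    // "5-10 seconds, no more"
    static final long MAX_DELAY_MS = 300_000;      // assumption: cap the doubling at 5 minutes

    /** Doubles the delay after a failure, up to the cap. */
    static long nextDelay(long currentMs) {
        return Math.min(currentMs * 2, MAX_DELAY_MS);
    }

    /** Retries until a connection is made; the delay starts fresh on every call. */
    static Socket connectWithBackoff(String host, int port, int connectTimeoutMs)
            throws InterruptedException {
        long delay = INITIAL_DELAY_MS;
        while (true) {
            Socket socket = new Socket();   // always a new Socket per attempt
            try {
                socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
                return socket;              // success: the next outage starts at INITIAL_DELAY_MS
            } catch (IOException e) {
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // nothing useful to do if closing a failed socket also fails
                }
                System.err.println("connect failed: " + e);   // class name and message
                Thread.sleep(delay);
                delay = nextDelay(delay);
            }
        }
    }
}
```

There is no sleep before the first attempt, so a server that is already listening is reached immediately.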

And if you're still getting 'connection reset' it still means you have a bug in your application for the reasons I gave earlier in the thread.
843790
As soon as I start my application the first connection is made, but I can't say when it will send something. From my observation, if the elapsed time is less than 2 hours the messages are sent without any problem (unless there is some other problem). But if the time exceeds 2 hours the send fails: it tries to write the bytes and gets an IOException, and then we start trying to reconnect.
There is no need to wait for two minutes unless you get a BindException
I am not getting BindException.

About telnet not connecting: can we say the client port is not listening?
What could be the reason for telnet not working?
843790
It seems to me that your problem is actually a server problem rather than a client problem.

Evidence of this is the fact that using telnet you can't establish a connection on your address/port!
And one more thing is, I am not able to telnet that IP and port during this time when the connection is down.
Who coded the server? Is it a multiplexed/threaded server (using select, etc.)? If so, then you should still be able to connect, and the cause would be socket leaks on your server. But if you tell me that your server is not multiplexed/threaded, then this would explain things.

If you coded the server, you could try adding log traces to see where it gets bogged down.

Hope this helps!
EJP
Clearly your server is enabling TCP keepalive, hence the 2-hour dropout.

Two further questions:

1. What exception do you get when the connection drops out?
2. What exception do you get when the connection fails?

Show the complete exception text in both cases.
843790
The server-side code belongs to our client. I can't access it.

1. What exception do you get when the connection drops out?
--> When sending data to the server after a gap of 2+ hours, I get the following exception, as I am not able to write data to the SocketOutputStream.

java.net.SocketOutputStream.socketWrite0(Native Method)
java.net.SocketOutputStream.socketWrite(Unknown Source)
java.net.SocketOutputStream.write(Unknown Source)

After this I am trying to reconnect... And if reconnection failed I am getting

java.net.PlainSocketImpl.socketConnect(Native Method)
java.net.PlainSocketImpl.doConnect(Unknown Source)
java.net.PlainSocketImpl.connectToAddress(Unknown Source)
java.net.PlainSocketImpl.connect(Unknown Source)
java.net.SocksSocketImpl.connect(Unknown Source)
java.net.Socket.connect(Unknown Source)
java.net.Socket.connect(Unknown Source)

this exception.

Can you please clarify one doubt?
--> I always try to reconnect with a new socket object. If it does not connect, what happens to the socket object in the network? Will it be blocked somewhere as a failed session? Do so many attempts to reconnect to a server affect the network?
As far as I know, a socket has to go through 7 layers to connect to a server. Is the socket I created blocking something in this flow?

My questions may be dumb... :-) But I have these doubts.

I didn't understand the 2nd question.
2. What exception do you get when the connection fails?

Do you mean when reconnection fails? If yes, I already mentioned it above (the 2nd exception).
EJP
These are just stack traces and they are useless on their own. You need to show the exception class and the exception text. You also need to show some code. This has been asked for several times.
Can you please clarify one doubt?
--> I am trying to reconnect always with the new socket object. If not reconnected, what happens to the socket object in the network
There is no 'socket object in the network'.
2. What exception do you get when the connection fails?
Do you mean reconnection fails?
There is no such thing as 'reconnection' in TCP. There is the creation of a new Socket. What exception do you get when you do this?

And I don't mean just the stack trace, thanks.
You haven't presented nearly enough evidence or information about this problem and that's one reason why you still have it after two months.
843790
	if (socketObj != null) {
		try {
			// Close the output stream.
			socketObj.getOutputStream().close();
		} catch (Exception e) {
			e.printStackTrace();
		}
		try {
			// Close the client socket.
			socketObj.close();
		} catch (Exception e) {
			e.printStackTrace();
		}
		try {
			// Rest for 2 minutes.
			Thread.sleep(DELAY);  // 2 min.
		} catch (InterruptedException iee) {
			iee.printStackTrace();
		}
	}

	while (true) {
		try {
			// Create a new socket object.
			socketObj = new Socket();

			// Try to connect to host, port.
			socketObj.connect(new InetSocketAddress(host, port));

			// Reuse the address when trying to reconnect.
			socketObj.setReuseAddress(true);

			break;
		} catch (IOException e) {
			// I am getting the stack trace that I showed you here.
			e.printStackTrace();
			try {
				Thread.sleep(DELAY);
			} catch (InterruptedException iee) {
				iee.printStackTrace();
			}
		} catch (Exception e) {
			e.printStackTrace();
			try {
				Thread.sleep(DELAY);
			} catch (InterruptedException iee) {
				iee.printStackTrace();
			}
		}
	}
This is the code for reconnection. I am not getting any specific exception text from printStackTrace(). The exception is caught in the IOException catch block.

Please let me know if you need more information.
EJP
Please let me know if you need more information.
OK. I need more information. I still need the name of the exception class and the message that went with it, i.e. the first line of the exception trace. I already asked for that.
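For anyone unsure what is being asked for, a small sketch that extracts exactly that first line, the exception class plus its message. The helper name describe() is made up; Throwable.printStackTrace() normally prints the same information as the first line of its output.

```java
import java.net.ConnectException;

final class ExceptionReport {
    /** Formats the exception class name and message, i.e. the first line of a stack trace. */
    static String describe(Throwable t) {
        return t.getClass().getName() + ": " + t.getMessage();
    }

    public static void main(String[] args) {
        // Example exception constructed by hand, not taken from a real failure.
        System.out.println(describe(new ConnectException("Connection refused: connect")));
    }
}
```

Logging this line inside each catch block shows whether a failure was a refusal, a timeout, or a reset.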

BTW this code still shows numerous bugs that I pointed out in November, as well as the pointless two-minute sleep that is a large part of your problem.
843790
IOException:
Connection refused: connect
java.net.PlainSocketImpl.socketConnect(Native Method)
java.net.PlainSocketImpl.doConnect(Unknown Source)
java.net.PlainSocketImpl.connectToAddress(Unknown Source)
java.net.PlainSocketImpl.connect(Unknown Source)
java.net.SocksSocketImpl.connect(Unknown Source)
java.net.Socket.connect(Unknown Source)
java.net.Socket.connect(Unknown Source)

We are using only the OutputStream, so that is what I am closing. Isn't it necessary to close the socket after closing the OutputStream?

Now I am calling setReuseAddress() before the call to connect().
EJP
IOException:
Connection refused: connect
Thank you.

This exception means that the server has stopped listening at the port. This is either a software bug or an error condition in the server application. There is nothing whatsoever you can do about it at the client end except sleep and retry. If it takes you 30 minutes to establish a new connection, it means that the server is taking somewhere around that time to start listening again. Possibly it has crashed. It is not a network problem; and it is not due to any application code or Java or O/S code in the client.
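A sketch of how a client might tell these failure modes apart when a connect attempt fails. The class name and the result strings are assumptions for illustration. Note that ConnectException covers "Connection refused" but can also carry other operating-system messages, so the message text is kept in the result.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class ConnectDiagnosis {
    /** Tries one connect and reports the outcome as text (hypothetical labels). */
    static String diagnose(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return "CONNECTED";
        } catch (ConnectException e) {
            // Typically "Connection refused": nothing is listening at that port.
            return "CONNECT_EXCEPTION: " + e.getMessage();
        } catch (SocketTimeoutException e) {
            // No answer at all within timeoutMs: host down, or packets being dropped.
            return "TIMED_OUT: " + e.getMessage();
        } catch (IOException e) {
            return "OTHER: " + e;
        }
    }

    public static void main(String[] args) throws IOException {
        int port;
        try (ServerSocket server = new ServerSocket(0)) {
            port = server.getLocalPort();
            System.out.println("listening:     " + diagnose("127.0.0.1", port, 2_000));
        }
        // The listener is closed now, so this attempt is expected to fail.
        System.out.println("not listening: " + diagnose("127.0.0.1", port, 2_000));
    }
}
```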
We are using OutputStream only so I am closing the same. Isn't it needed to close socket after closing OutputStream?
No, closing the output stream is sufficient.
843790
The following exception occurs when the connection attempt fails.

Connection refused: connect
java.net.PlainSocketImpl.socketConnect(Native Method)
java.net.PlainSocketImpl.doConnect(Unknown Source)
java.net.PlainSocketImpl.connectToAddress(Unknown Source)
java.net.PlainSocketImpl.connect(Unknown Source)
java.net.SocksSocketImpl.connect(Unknown Source)
java.net.Socket.connect(Unknown Source)
java.net.Socket.connect(Unknown Source)

If connection drops out while sending data, the following exception occurs.

Connection reset by peer: socket write error
java.net.SocketOutputStream.socketWrite0(Native Method)
java.net.SocketOutputStream.socketWrite(Unknown Source)
java.net.SocketOutputStream.write(Unknown Source)
EJP
I suspect the entire server process is crashing and taking several minutes to be restarted.

In any case it is definitely not your problem. It is definitely a problem at the server end. If you lose a connection for any reason you should be able to establish another one practically instantaneously if the server program is competently implemented.
843790
After an idle period (from 2:36:00 PM to 4:01:03 PM) of around 1 hr 25 min, I got the following exception while writing data to the socket output stream.

The IOException trace is:
Software caused connection abort: socket write error
The exception trace is :
java.net.SocketOutputStream.socketWrite0(Native Method)
java.net.SocketOutputStream.socketWrite(Unknown Source)
java.net.SocketOutputStream.write(Unknown Source)

But it reconnected immediately, within 5 sec.

Can you please explain the reason for this exception? Under what circumstances may it occur?

Tushar.
843790
Tush wrote:
After idle period (from 2:36:00 PM to 4:01:03 PM) of around 1 hrs 25 min while writing the data to socket output stream I got the following exception.

The IOException trace is:
Software caused connection abort: socket write error
The exception trace is :
java.net.SocketOutputStream.socketWrite0(Native Method)
java.net.SocketOutputStream.socketWrite(Unknown Source)
java.net.SocketOutputStream.write(Unknown Source)

But Connected immediately in 5 sec.

Can you please explain what will be the reason behind getting this exception? under what circumstances this exception may occur?
I really think you need to put a lot more thought and effort into investigating, understanding and resolving your own issues.

We already went through this in this thread.

The connection is being closed after long idle periods. So the reason is that the connection is idle for "too long". What "too long" means is dependent on the server and any network devices between you and it.

I find it difficult to understand why at this point you could observe that behaviour, see the stack trace and then ask "what the reason" and "under what circumstances". You can see the reason and you can see the circumstances.
EJP
Can you please explain what will be the reason behind getting this exception?
I did that on 27/11/2007 09:06 (reply 3 of 26), nearly eleven weeks ago.

I agree with cotton.m.
843790
Hi,

I am experiencing a similar problem where during an active transfer an unexpected connection loss on the client side will result in the connection remaining open on the server side. Is there a way to close that particular connection after regaining internet connectivity rather than waiting for the timeout (which can be too long for my purposes)? My guess is no, but maybe I'm wrong.

Also, I am connecting to an FTP server which allows a limited number of open connections per IP, so multiple scenarios like the one above can create a functional problem for the program for quite a while (long timeout). But the bigger problem is that if the program was transferring via an output stream to the socket and is disrupted, then the server essentially keeps a lock on the file (or keeps it under a temp name) until the connection is closed (gracefully or by timeout). This makes resuming the upload impossible (as far as I know) until the file in question is available again.

Any advice would be appreciated.
EJP
I am experiencing a similar problem where during an active transfer an unexpected connection loss on the client side will result in the connection remaining open on the server side.
That's not a 'similar problem' at all. Please start your own thread. Locking this one.

Post Details

Locked on Oct 11 2008
Added on Nov 26 2007
29 comments
5,167 views