How to use TCP keep-alive to check whether the client is alive?

wangningyu

I am using WSAAsyncSelect, and the server keeps its clients in a vector. The server has a ListCtrl that shows the clients currently online: on FD_ACCEPT the client is push_back'ed into the vector, and on FD_CLOSE it is erased from the vector. However, the server often does not receive FD_CLOSE when the client loses power. I found setsockopt through Google, but I don't quite understand Windows Sockets TCP keep-alive. How do I use it? Thanks for your reply!

Code o mat

MSDN states: "If a connection is dropped as the result of keep-alives the error code WSAENETRESET is returned to any calls in progress on the socket, and any subsequent calls will fail with WSAENOTCONN. If keep-alive is enabled for a TCP socket with SO_KEEPALIVE, then the default TCP settings are used for the keep-alive timeout and interval unless these values have been changed by calling the WSAIoctl function with the SIO_KEEPALIVE_VALS option."

It also says: "For TCP, the default keep-alive timeout is 2 hours and the keep-alive interval is 1 second. The default number of keep-alive probes varies based on the version of Windows."

The documentation for KeepAliveTime states: "Specifies how often TCP sends keep-alive transmissions. TCP sends keep-alive transmissions to verify that an idle connection is still active. This entry is used when the remote system is responding to TCP..." and for KeepAliveInterval: "Specifies how often TCP repeats keep-alive transmissions when no response is received."

So this whole thing boils down (to me at least) to this:

1. If keep-alive is enabled and the "other side" has disappeared, you will receive WSAENETRESET or WSAENOTCONN when you try to read from or write to the socket.
2. By default, the system "pings" the other side every 2 hours on an idle connection. If no answer is received, it starts "pinging" every second, who knows how many times, before giving up.

2 hours is a lot of time, you might want to shorten that.
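As a minimal sketch of shortening the timers for a single socket rather than system-wide: on Windows the quoted SIO_KEEPALIVE_VALS option takes a tcp_keepalive structure (declared in mstcpip.h) with the timeout and interval in milliseconds. The helper name enable_keepalive and the values in the usage note are made up for illustration. The non-Windows branch is only an assumption added so the sketch can be tried on Linux, where the equivalent options are expressed in seconds.

```cpp
#ifdef _WIN32
#include <winsock2.h>
#include <mstcpip.h>      // struct tcp_keepalive, SIO_KEEPALIVE_VALS
using socket_t = SOCKET;  // caller must have called WSAStartup and link Ws2_32.lib
#else
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>  // TCP_KEEPIDLE, TCP_KEEPINTVL (Linux)
#include <unistd.h>
using socket_t = int;
#endif

// Hypothetical helper: enable keep-alive on ONE socket with custom timers.
// idle_ms:     how long the connection may be silent before the first probe.
// interval_ms: gap between probes while no response is received.
// Returns true on success.
bool enable_keepalive(socket_t s, unsigned long idle_ms, unsigned long interval_ms) {
#ifdef _WIN32
    tcp_keepalive ka;
    ka.onoff = 1;                        // turn keep-alive on for this socket
    ka.keepalivetime = idle_ms;          // milliseconds
    ka.keepaliveinterval = interval_ms;  // milliseconds
    DWORD bytes_returned = 0;
    return WSAIoctl(s, SIO_KEEPALIVE_VALS, &ka, sizeof(ka),
                    nullptr, 0, &bytes_returned, nullptr, nullptr) == 0;
#else
    // Linux expresses these timers in whole seconds; round up.
    int on = 1;
    int idle_s = static_cast<int>((idle_ms + 999) / 1000);
    int interval_s = static_cast<int>((interval_ms + 999) / 1000);
    if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0) return false;
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) != 0) return false;
    return setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof(interval_s)) == 0;
#endif
}
```

A server would call something like enable_keepalive(client_socket, 10000, 1000) on each socket returned by accept. A dead peer then shows up as an error on the socket (WSAENETRESET or WSAENOTCONN on Windows, as quoted above), not as a separate notification, so the error paths of recv and send still have to be handled.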

> The problem with computers is that they do what you tell them to do and not what you want them to do. < > //TODO: Implement signature here<

Emilio Garavaglia

Code-o-mat wrote:

2 hours is a lot of time, you might want to shorten that.

Dangerous: those timers are expected to be consistent across a very wide range of systems and, because of the way TCP works, there is no need to make them shorter. The keep-alive is sent on a silent socket so that it is no longer silent. But the question is: why is it silent?

If there is data to send, TCP itself recovers from transmission errors and, if that cannot be done, reports WSAENETRESET/WSAENOTCONN to the socket independently of any keep-alive feature. If there is no data to send for a long time, then considering server queues, routing tables, switch CAM tables, firewall caches, ARP tables, etc., assuming that everything still works the same is risky. It is better to close the socket, reopen it when it is needed again, and use some "session layer command or data" (the application should define what they should be) to track the state of the session or re-sync the client and the server.

In other words, the OP would do better to implement a more robust application protocol that uses the underlying socket not just to transfer data, but also to exchange some status information about the existence and activity of the client and the server (this can be done more frequently) if he wants to react promptly to an unexpected event.
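One way the server side of such a status exchange could be tracked is sketched below, assuming each client is identified by an integer id and the server records when it last heard anything from that client. The names SessionTable, touch, and expire are hypothetical, and the time is passed in explicitly so the logic can be exercised without real sockets.

```cpp
#include <chrono>
#include <cstddef>
#include <unordered_map>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical session table: client id -> time the client was last heard from.
class SessionTable {
public:
    explicit SessionTable(std::chrono::milliseconds timeout) : timeout_(timeout) {}

    // Call on FD_ACCEPT and whenever any message (data or status) arrives.
    void touch(int client_id, Clock::time_point now) { last_seen_[client_id] = now; }

    // Call on FD_CLOSE, in the cases where the close is actually reported.
    void remove(int client_id) { last_seen_.erase(client_id); }

    // Call from a periodic timer: removes and returns every client that has
    // been silent for longer than the timeout.
    std::vector<int> expire(Clock::time_point now) {
        std::vector<int> dead;
        for (auto it = last_seen_.begin(); it != last_seen_.end();) {
            if (now - it->second > timeout_) {
                dead.push_back(it->first);
                it = last_seen_.erase(it);
            } else {
                ++it;
            }
        }
        return dead;
    }

    std::size_t size() const { return last_seen_.size(); }

private:
    std::chrono::milliseconds timeout_;
    std::unordered_map<int, Clock::time_point> last_seen_;
};
```

The ids returned by expire are the ones whose sockets would be closed and whose rows would be removed from the ListCtrl. The timeout is an application choice and needs to be comfortably longer than the interval at which clients send their status messages.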

2 bugs found. > recompile ... 65534 bugs found. :doh:

Code o mat

That all might be true, but he asked how keep-alive works (and correct me if I am wrong in what I answered to that), not how he should design his client/server protocol. :) I agree that instead of relying on keep-alive he should implement his own mechanism for detecting whether the other side is gone, because that would give him much better control over the whole thing, but of course that's just my opinion. Detecting whether a long-silent connection is silent because the other side doesn't have anything to say OR because someone poured some coffee over the router would require, as far as I know, some kind of "pinging" and "timeouting", and keep-alive does exactly that, so the approach isn't fundamentally flawed. As for "shortening the 2 hours": as I understood it, you can set the timings for sockets individually. I meant that, not fiddling with the system-wide settings.

jschell

Koma Wang wrote:

how to use it ?

Simple - don't. No one wants to know if a connection is alive. What they want to know is whether the business logic is still functioning. And the only way to do that is to implement something that actually tests the functionality. The more fully it tests it, the better. But at a minimum you can confirm that the server is still responsive enough to return a response to a message.

A simple implementation is to send a do-nothing request from the client at a configured interval if no other request has been sent in that period. And do NOT rely on such functionality to get around error checking of legitimate requests. A server can go down right in the middle of a legitimate request.
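The "do-nothing request if nothing else was sent" rule can be sketched as a small client-side helper. NoopScheduler and its method names are hypothetical, and the actual sending of the request and checking of the reply are left to the application.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical client-side helper: decides when a do-nothing request is due.
class NoopScheduler {
public:
    NoopScheduler(std::chrono::milliseconds interval, Clock::time_point start)
        : interval_(interval), last_sent_(start) {}

    // Record every request that is actually sent, real or do-nothing.
    void on_request_sent(Clock::time_point now) { last_sent_ = now; }

    // True when no request has been sent for at least one full interval.
    bool noop_due(Clock::time_point now) const {
        return now - last_sent_ >= interval_;
    }

private:
    std::chrono::milliseconds interval_;
    Clock::time_point last_sent_;
};
```

A client loop would check noop_due(Clock::now()) on a timer, send the do-nothing request when it returns true, and call on_request_sent afterwards. Real requests reset the timer as well, so a busy client sends no extra traffic.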