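1. A client used multiple threads to stress-test a server with repeated connect/disconnect cycles. After close to 4,000 connections it could no longer connect to the server. After a while connections worked again, then the failure came back, and the cycle repeated. Process Explorer showed a large number of sockets in the TIME_WAIT state listed under System Idle Process.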
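Explanation: TCP TIME-WAIT delay. When a TCP connection is closed, the socket pair is placed into a state called TIME-WAIT. This ensures that a new connection does not use the same protocol, source IP address, destination IP address, source port, and destination port until enough time has passed to guarantee that any segments that may have been misrouted or delayed are not delivered unexpectedly. RFC 793 specifies the length of time that the socket pair should not be reused as 2 MSL (twice the maximum segment lifetime), or 4 minutes. This is the default setting for Windows NT and Windows 2000. With this default, however, a network application that makes many outbound connections in a short time may use up all available ports before they are recycled.
Windows NT and Windows 2000 offer two ways to control this behavior. The first is the TcpTimedWaitDelay registry parameter, which changes this interval. On Windows NT and Windows 2000 it can be set as low as 30 seconds, which should not cause problems in most environments. The second is the MaxUserPort registry parameter, which configures the number of user-accessible ephemeral ports (used as source ports for outbound connections). By default, when an application requests any socket from the system for an outbound call, it is given a port numbered between 1024 and 5000. MaxUserPort sets the highest port number the administrator allows for outbound connections. For example, setting the value to 10,000 (decimal) makes roughly 9,000 user ports available for outbound connections. For details on this concept see RFC 793. See also the MaxFreeTcbs and MaxHashTableSize registry parameters.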
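The numbers above explain why the test stalled near 4,000 connections. The following is a rough back-of-the-envelope sketch, not a measurement. It assumes the port range is inclusive, that every connection goes to the same server address and port, and that each closed connection holds its source port for the full TIME-WAIT interval. The function name and its parameters are made up for illustration.

```python
def ephemeral_port_budget(max_user_port=5000, first_port=1024, time_wait_seconds=240):
    """Return (usable_ports, sustainable_connections_per_second).

    Assumption: ports first_port..max_user_port inclusive are usable and each
    closed connection keeps its port unavailable for time_wait_seconds.
    """
    usable_ports = max_user_port - first_port + 1
    sustainable_rate = usable_ports / time_wait_seconds
    return usable_ports, sustainable_rate


# Defaults described in the text: ports 1024-5000, 4-minute TIME-WAIT.
default_ports, default_rate = ephemeral_port_budget()
print(default_ports, round(default_rate, 1))   # 3977 16.6

# Both tuning knobs applied: highest port 10000, 30-second TIME-WAIT.
tuned_ports, tuned_rate = ephemeral_port_budget(max_user_port=10000, time_wait_seconds=30)
print(tuned_ports, round(tuned_rate, 1))       # 8977 299.2
```

Under these assumptions a client that connects and disconnects faster than about 16 times per second will run out of ports with the default settings, then recover as old entries leave TIME-WAIT, which matches the stall-and-recover cycle seen in the test.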
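Registry location:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
The side that actively closes the connection enters the TIME_WAIT state, and the side that closes passively enters the CLOSE_WAIT state. Do not try to avoid the TIME_WAIT state by setting SO_DONTLINGER!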
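As a concrete illustration of the two registry settings under the key above, here is a small sketch that only builds `reg add` command lines as strings. It does not touch the registry. The helper name `reg_add_dword` is hypothetical, the values 30 and 10000 are the examples from the text, and the value name MaxUserPort is the spelling used in KB 196271. Check the commands against your Windows version before running them. A restart is generally needed before TCP/IP parameter changes take effect.

```python
# Registry key from the text, written with the HKLM abbreviation that reg.exe accepts.
TCPIP_PARAMETERS_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"


def reg_add_dword(value_name, data):
    """Build a 'reg add' command that writes a REG_DWORD value under the TCP/IP key."""
    return (
        f'reg add "{TCPIP_PARAMETERS_KEY}" '
        f"/v {value_name} /t REG_DWORD /d {data} /f"
    )


# Example values taken from the text: 30-second TIME-WAIT, highest user port 10000.
for command in (
    reg_add_dword("TcpTimedWaitDelay", 30),
    reg_add_dword("MaxUserPort", 10000),
):
    print(command)
```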
On a busy HTTP server, the number of sockets in this TIME_WAIT state can far exceed those in the ESTABLISHED state. For instance, I checked an IIS 6.0 box that serves a fairly busy corporate site earlier today and got 124 ESTABLISHED connections versus 431 in TIME_WAIT.
The side that shuts down the connection first gets the TIME_WAIT. It's typical for a web server to shut down the connection immediately after sending a response. If you can instead stash the connection away and give the client time to close it, you can push the TIME_WAIT to the client. I believe Apache does something like this. In the worst case, if the client doesn't shut down the connection within, say, 5 seconds, you'll have to do it yourself and suffer the TIME_WAIT. However, if the client does initiate the close in that time, you have avoided a TIME_WAIT.
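The "give the client time to close first" idea can be sketched with plain blocking sockets. This is a minimal illustration, not how Apache or IIS implements it. The function name `close_after_peer` and the 5-second default are assumptions, and a real server would do this without tying up a thread per connection.

```python
import socket
import time


def close_after_peer(conn, grace_seconds=5.0):
    """Wait up to grace_seconds for the peer to close first, then close conn.

    Returns True if the peer closed first (so the peer takes the TIME_WAIT),
    False if the grace period ran out and this side had to close first.
    """
    deadline = time.monotonic() + grace_seconds
    peer_closed = False
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break                      # grace period used up
            conn.settimeout(remaining)
            try:
                data = conn.recv(4096)
            except socket.timeout:
                break                      # peer never closed in time
            except OSError:
                break                      # reset or other error, not an orderly close
            if data == b"":
                peer_closed = True         # empty read means the peer sent its FIN
                break
            # Any stray bytes received while waiting are discarded.
    finally:
        conn.close()
    return peer_closed
```

A server would call this right after sending its response instead of closing the socket immediately.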
2. If you are using I/O completion ports, note that the order of calls made to WSASend/WSARecv is also the order in which the buffers are populated.
WSASend/WSARecv should not be called on the same socket simultaneously from different threads, since it can result in an unpredictable buffer order.
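One common way to keep the issue order well defined is to serialize the posting of requests per socket. The sketch below is a model of that idea in Python, not Winsock code. The class `OrderedPoster` is hypothetical, and the list `issued` merely stands in for the order in which the system would see the requests. In real code the WSASend or WSARecv call would sit where the append is, still inside the lock.

```python
import threading


class OrderedPoster:
    """Serializes the issuing of I/O requests for a single socket."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_seq = 0
        self.issued = []   # stand-in for the order the system sees the requests

    def post(self, payload):
        # Assigning the sequence number and issuing the request happen under
        # one lock, so no other thread can slip a request in between.
        with self._lock:
            seq = self._next_seq
            self._next_seq += 1
            self.issued.append((seq, payload))   # real code: issue the overlapped call here
            return seq


poster = OrderedPoster()
workers = [
    threading.Thread(target=lambda: [poster.post(b"chunk") for _ in range(100)])
    for _ in range(8)
]
for w in workers:
    w.start()
for w in workers:
    w.join()

# Sequence numbers appear in exactly the order the requests were issued.
print([seq for seq, _ in poster.issued] == list(range(800)))
```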
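3. A client opened 1,024 stable connections to the server and was then forcibly killed with Process Explorer, so all 1,024 connections were broken abnormally at the same instant. The server was unable to detect the disconnection of every socket. With I/O completion ports, even with KeepAlive enabled and an asynchronous WSARecv pending, GetQueuedCompletionStatus may still fail to return an error indicating that the client has closed, and the socket resources are left behind. I had assumed this happened in only three situations: sudden power loss, an operating system crash, or an unplugged network cable. Apparently a process crash can cause it too. Earlier tests simply did not have enough sockets to overwhelm the operating system's cleanup.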
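The notes above do not give a remedy. One common mitigation, offered here as an assumption rather than something from the original test, is an application-level idle timeout: the server records when each connection last showed activity and periodically closes those that have been silent too long. The class `IdleTracker`, its method names, and the 30-second limit in the example are all hypothetical.

```python
import time


class IdleTracker:
    """Tracks last-activity times so silent connections can be reclaimed."""

    def __init__(self, idle_limit_seconds, clock=time.monotonic):
        self._limit = idle_limit_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = {}

    def touch(self, conn_id):
        """Call whenever data (or a heartbeat) arrives on the connection."""
        self._last_seen[conn_id] = self._clock()

    def forget(self, conn_id):
        """Call when the connection is closed normally or reclaimed."""
        self._last_seen.pop(conn_id, None)

    def expired(self):
        """Return the connections that have been idle longer than the limit."""
        now = self._clock()
        return [cid for cid, seen in self._last_seen.items()
                if now - seen > self._limit]


# Example with a fake clock: "a" goes silent, "b" stays active for a while.
fake_now = [0.0]
tracker = IdleTracker(30, clock=lambda: fake_now[0])
tracker.touch("a")
fake_now[0] = 20.0
tracker.touch("b")
fake_now[0] = 40.0
print(tracker.expired())   # ['a']
```

A periodic sweep would close the socket for each expired id and then call `forget` on it, so the resources are released even when no completion ever reports the disconnect.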
References:
Windows 2000 TCP/IP Implementation Details
http://www.port80software.com/200ok/archive/2004/12/07/205.aspx
http://support.microsoft.com/kb/q196271/