Winsock 10053 `WSAECONNABORTED` and TCP Retransmission Timeout
Summary
At work I was developing software that communicates over LAN with one of our measuring instruments, and I ran into an error where communication was cut off partway through. The software runs on Windows. The application sends commands to the instrument, the instrument performs the requested processing, and then it returns the result to the application. The communication layer was implemented with Winsock.
The processing on the instrument side can take as long as about 60 seconds, and in the program I had set the socket timeout to 120 seconds. I thought that should be more than enough, but in practice communication was timing out after only about 20 seconds.
Cause
Tracing the failure in the debugger, I found that the recv call waiting for the response, made after setting the timeout with select, returned -1 (SOCKET_ERROR). WSAGetLastError then returned 10053, which corresponds to WSAECONNABORTED.
I looked up the error code but could not find the underlying cause. When I asked my manager, it turned out that the root cause was a TCP retransmission timeout.
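The select-then-recv pattern described above can be sketched as follows. This is a minimal illustration in Python rather than the original Winsock code, and `recv_with_timeout` is a hypothetical helper name. The relevant point is that the timeout passed to select is only an upper bound on the wait: if the OS aborts the connection first, select returns early and the following recv fails. In Python on Windows, error 10053 surfaces as `ConnectionAbortedError`.

```python
import select
import socket


def recv_with_timeout(sock, timeout_s, bufsize=4096):
    """Wait up to timeout_s seconds for data on sock, then read it.

    Returns the received bytes, or None if nothing arrived in time.
    If the OS has already aborted the connection, select reports the
    socket as readable and recv raises ConnectionAbortedError
    (WinError 10053 on Windows), regardless of how long timeout_s is.
    """
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        return None  # application-level timeout: no data and no error yet
    return sock.recv(bufsize)
```

A caller would wrap this in `try: ... except ConnectionAbortedError:` to tell "the instrument is still busy" (a `None` result) apart from "the connection is gone".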
In TCP/IP, if no acknowledgment comes back from the remote side, the data segment is retransmitted. Once the retransmission count exceeds the configured limit, the OS aborts the connection. On Windows, the default is five retransmissions. Because TCP backs off exponentially, waiting longer before each successive retry, increasing that count extends the effective timeout considerably.
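To get a feel for how the retransmission limit translates into wall-clock time, here is a rough model. It assumes the retransmission timeout (RTO) simply doubles after every retry and that the connection is aborted when the timer following the last retransmission expires. Real stacks derive the initial RTO from the measured round-trip time and clamp it, so the numbers are illustrative only, and `retransmission_schedule` is a hypothetical helper.

```python
def retransmission_schedule(initial_rto_s, max_retransmissions):
    """Model TCP retransmission timing with a doubling RTO.

    Returns (offsets, abort_time): offsets[i] is the time in seconds,
    measured from the original send, at which retransmission i+1 goes
    out; abort_time is when the timer after the last one expires.
    Assumption: the RTO doubles on each retry with no upper clamp.
    """
    offsets = []
    elapsed = 0.0
    rto = initial_rto_s
    for _ in range(max_retransmissions):
        elapsed += rto          # wait one RTO, then retransmit
        offsets.append(elapsed)
        rto *= 2                # exponential backoff
    abort_time = elapsed + rto  # final wait before giving up
    return offsets, abort_time


offsets, abort_time = retransmission_schedule(1.0, 5)
print(offsets)     # [1.0, 3.0, 7.0, 15.0, 31.0]
print(abort_time)  # 63.0
```

Under this model the total is the initial RTO times 63 for five retransmissions. With a hypothetical initial RTO of 0.3 seconds that comes to about 18.9 seconds, the same order as the roughly 20 seconds I observed, though I did not verify the actual RTO used in my environment.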
To change the setting, add a value called TcpMaxDataRetransmissions under the key below. The value type should be REG_DWORD. The value was not present by default in my environment, so it had to be added manually. If you change the value, a reboot is required for it to take effect.
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Tcpip\Parameters
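Before editing the registry it can be useful to check whether the value already exists. The sketch below reads it with Python's standard `winreg` module and falls back to the default of five mentioned above when the value is absent. The function name is hypothetical, and on non-Windows platforms it just returns the default so that the snippet stays importable.

```python
import sys

KEY_PATH = r"SYSTEM\CurrentControlSet\services\Tcpip\Parameters"
VALUE_NAME = "TcpMaxDataRetransmissions"
DEFAULT_RETRANSMISSIONS = 5  # Windows default cited in this article


def read_max_data_retransmissions():
    """Return the configured retransmission limit, or the default.

    The default is returned when the value has not been added to the
    registry, or when this is not running on Windows at all.
    """
    if sys.platform != "win32":
        return DEFAULT_RETRANSMISSIONS
    import winreg  # standard library, Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, value_type = winreg.QueryValueEx(key, VALUE_NAME)
    except FileNotFoundError:
        return DEFAULT_RETRANSMISSIONS  # key or value is missing
    if value_type != winreg.REG_DWORD:
        raise TypeError(f"{VALUE_NAME} should be REG_DWORD")
    return value


print(read_max_data_retransmissions())
```

Reading the value needs no special privileges. Writing it would need an elevated process, which is why this sketch only reads.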
Capturing packets with Wireshark
I had barely used Wireshark before, but I installed it and captured the packets as an experiment. The image below is the packet capture from when the error actually occurred.

Packets No.1 through No.3 perform the three-way handshake, and in No.4 the application sends a command to the instrument. Because the instrument does not respond to No.4, the same data is retransmitted in No.5 through No.9, for a total of five retransmissions. Wireshark labels these packets TCP Retransmission, meaning the contents of No.4 are being sent again.
I still do not understand why a sixth retransmission appears at No.12, or why there is a gap between the moment the retransmissions finish and the point where recv actually returns the error. But I was able to understand the overall flow.