200 milliseconds doesn’t sound like a lot, but it’s an eternity for latency-sensitive code (light could circle the Earth in about 133ms). If you’re working with latency-sensitive code over the network, you might have found that some requests take much longer than expected: up to 200ms, even for basic requests on localhost.
Nagle’s Algorithm was introduced in 1984 to reduce the number of packets sent over TCP/IP. When an application writes many small chunks of data in a short time, they are buffered rather than sent immediately. The buffered data goes out once the outstanding packet is acknowledged or once enough has accumulated to fill a full-sized packet.
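To make the rule concrete, here is a minimal sketch of the send decision in Python. It is a toy model, not how any real TCP stack is written: the function name `nagle_should_send` and its parameters are made up for illustration, and real implementations also account for window size, timers, and flags like `PSH`.

```python
def nagle_should_send(pending_bytes: int, mss: int, unacked_in_flight: bool) -> bool:
    """Toy model of Nagle's rule: should buffered data be sent right now?

    pending_bytes     -- bytes the application has written but TCP has not sent
    mss               -- maximum segment size (a "full-sized packet")
    unacked_in_flight -- True if previously sent data has not been ACKed yet
    """
    if pending_bytes <= 0:
        return False  # nothing to send
    if pending_bytes >= mss:
        return True   # enough data for a full-sized packet: send it
    if not unacked_in_flight:
        return True   # nothing outstanding: a small packet may go out
    return False      # small data + outstanding packet: keep buffering


# A 10-byte write while an earlier packet is still unacknowledged is held back.
print(nagle_should_send(pending_bytes=10, mss=1460, unacked_in_flight=True))
```

The last branch is the one that matters for latency: a small write sits in the buffer until an ACK comes back.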
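Around the same time, Delayed ACK was introduced, which has the receiver wait for a short time (e.g., up to 200ms) before acknowledging, as a bet that more packets will arrive or that a response will go out soon and can carry the ACK. I’ve seen the combination of the two called “silly window syndrome,” although that name more properly belongs to a related problem of exchanging tiny segments. When both are enabled, you have two systems each implementing delays and waiting on the other (in the name of performance): the sender holds back its next small packet until it gets an ACK, and the receiver holds back that ACK for up to 200ms hoping for another packet.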
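A back-of-the-envelope model shows where the stall comes from. This sketch assumes the classic "write, write, read" pattern: the application sends a request as two small writes, and the receiver has nothing to reply with until both arrive. The function name and parameters are hypothetical, and the model ignores everything except round-trip time and the ACK delay.

```python
def second_write_delay_ms(nagle: bool, delayed_ack_ms: float, rtt_ms: float) -> float:
    """Toy estimate of how long the second small write waits before leaving the sender.

    Assumes the first write was sent immediately and the receiver cannot
    piggyback its ACK on a response (it is still waiting for the rest).
    """
    if not nagle:
        return 0.0  # with Nagle disabled, the second write goes out right away
    # Nagle holds the second write until the first is ACKed. The first packet
    # takes rtt/2 to arrive, the receiver sits on the ACK for delayed_ack_ms,
    # and the ACK takes another rtt/2 to come back.
    return rtt_ms + delayed_ack_ms


# On localhost the round trip is tiny, so the ACK delay dominates.
print(second_write_delay_ms(nagle=True, delayed_ack_ms=200.0, rtt_ms=0.1))
print(second_write_delay_ms(nagle=False, delayed_ack_ms=200.0, rtt_ms=0.1))
```

Under these assumptions the network itself contributes almost nothing; nearly all of the delay is the two algorithms waiting on each other.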
It’s a hard problem to debug. Requests seem to get delayed at random or depending on their position in the stream: the first few packets might get delayed, and later ones go through fine. Nagle’s Algorithm is enabled by default on most operating systems, which surprises most programmers. Some languages (like Go) disable it by default, but many do not.
It turns out you may only need `TCP_NODELAY` (to disable Nagle’s Algorithm) and maybe `TCP_QUICKACK` (to disable Delayed ACK, where the platform supports it) if your system is dealing with RPC calls or is especially latency-sensitive. That is why Go has Nagle’s Algorithm disabled: Go is rarely used for web development where there is a human on the other side. It just might solve your mysterious latency issues.
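As a rough sketch, here is how you might set these options with Python’s standard `socket` module. The helper names `make_low_latency_socket` and `enable_quickack` are made up for this example. `TCP_QUICKACK` is Linux-specific, so the code checks for it before using it.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's Algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 tells the kernel to send small writes immediately.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def enable_quickack(sock: socket.socket) -> bool:
    """Ask for immediate ACKs where supported. Returns False if unavailable.

    On Linux this flag is not permanent: the kernel can leave quickack mode
    again later, so latency-critical code often re-sets it after each recv().
    """
    if not hasattr(socket, "TCP_QUICKACK"):
        return False  # not exposed on this platform
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)
    return True


sock = make_low_latency_socket()
# A non-zero value means Nagle's Algorithm is off for this socket.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
sock.close()
```

With Nagle’s Algorithm off, every small write can become its own packet, so it is usually worth assembling a full message in memory and sending it with a single call.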
Here’s John Nagle talking about the problems between Delayed ACK and Nagle’s Algorithm in a Hacker News comment.