HTTP Connection reuse in Go clients

The other day I was running some validation and performance tests for a new web service. To do this, I was effectively making a bunch of queries against a local HTTP endpoint. Both the server and client are written in Go, and the code querying the endpoint looks somewhat like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
func main() {
    // shared HTTP transport and client for efficient connection reuse
	tr := &http.Transport{
		MaxIdleConns:          10,
		IdleConnTimeout:       15 * time.Second,
		ResponseHeaderTimeout: 15 * time.Second,
		DisableKeepAlives:     false,
	}
	httpClient := &http.Client{
		Transport: tr,
		// do not follow redirects
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}

    for _, record := range records {
        runRequestAndCompare(httpClient)
    }
}

func runRequest(httpClient *http.Client, record record) (string, error) {
	var resp *http.Response
    // set up new HTTP request
    req, err := http.NewRequest(record.method, record.url, nil)
    if err != nil {
        return "", fmt.Errorf("%s %s got error: %s", record.method, record.url, err)
	}
    // execute HTTP request with the shared client
    resp, err = httpClient.Do(req)
    if err != nil {
        return "", fmt.Errorf("%s %s got error: %s", record.method, url, err)
    }

    return resp.Header.Get("Location"), nil
}

This simplified setup has two functions. The main function iterates over all records (what exactly these records are does not matter in this context). Crucially, it also sets up a shared HTTP transport and client, since this is the standard way of doing HTTP connection reuse in Golang. The runRequest function takes this shared HTTP client as an argument and uses it to send the request to the server.

However, after a short while, my server (running with go run -race cmd/server) crashed with the following error message:

race: limit on 8128 simultaneously alive goroutines is exceeded, dying.

Woah, that’s a pretty serious error message!

After a quick online search, I found out that when running a Go program with the race detector enabled, it will automatically limit the number of active goroutines to 8128. This indicates that the application is leaking goroutines somewhere, so I connected to the pprof endpoint and looked at the active goroutines. And indeed, there are thousands of active routines “stuck”:

5342 @ 0x470436 0x469097 0x49c6c9 0x53253d 0x5350e5 0x5350b5 0x652b4a 0x66fcc5 0x66e4c8 0x7d55a7 0x7d4f7d 0xbd2569 0xbd20f5 0x470072 0x4a1f41
#  0x49c6c8    internal/poll.runtime_pollWait+0x88       /usr/lib/golang/src/runtime/netpoll.go:302
#  0x53253c    internal/poll.(*pollDesc).wait+0xbc       /usr/lib/golang/src/internal/poll/fd_poll_runtime.go:83
#  0x5350e4    internal/poll.(*pollDesc).waitRead+0x424  /usr/lib/golang/src/internal/poll/fd_poll_runtime.go:88
#  0x5350b4    internal/poll.(*FD).Accept+0x3f4          /usr/lib/golang/src/internal/poll/fd_unix.go:614
#  0x652b49    net.(*netFD).accept+0x49                  /usr/lib/golang/src/net/fd_unix.go:172
#  0x66fcc4    net.(*TCPListener).accept+0x44            /usr/lib/golang/src/net/tcpsock_posix.go:139
#  0x66e4c7    net.(*TCPListener).Accept+0x67            /usr/lib/golang/src/net/tcpsock.go:288
#  0x7d55a6    net/http.(*Server).Serve+0x5a6            /usr/lib/golang/src/net/http/server.go:3039
#  0x7d4f7c    net/http.(*Server).ListenAndServe+0xbc    /usr/lib/golang/src/net/http/server.go:2968
#  0xbd2568    main.serve+0x448                          /app/cmd/server/main.go:98
#  0xbd20f4    main.main+0x7d4                           /app/cmd/server/main.go:61
#  0x470071    runtime.main+0x211                        /usr/lib/golang/src/runtime/proc.go:250

This is strange, because I thought I had connection reuse enabled… and furthermore, the client is performing all requests sequentially.

It turns out that the http.Client will only reuse connections when the response body has been fully read and closed! Otherwise, the application keeps accumulating open connections, especially because the number of connections per host is unlimited by default (transport.MaxConnsPerHost).

Upon reading the Go documentation about HTTP responses very carefully, I found this:

The default HTTP client’s Transport may not reuse HTTP/1.x “keep-alive” TCP connections if the Body is not read to completion and closed.

Note that just closing the body is not enough, the body also has to be read fully before closing it. This slightly unintuitive behavior was also discussed in https://stackoverflow.com/a/17953506.

To get connection reuse working in this scenario, I had to modify the code as follows:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
    // execute HTTP request with the shared client
    resp, err = httpClient.Do(req)
    if err != nil {
        return "", fmt.Errorf("%s %s got error: %s", record.method, url, err)
    }

    // ensure that we close the request so the httpClient can reuse the connection
    io.Copy(io.Discard, resp.Body) // equivalent to `cp body /dev/null`
    resp.Body.Close()

    return resp.Header.Get("Location"), nil

Et voilĂ , no more thousands of active goroutines (and associated TCP connections)! This was another good example of why it’s useful to always run Go programs with the race detector during development.

Happy coding!