Monitoring HTTP traffic of Kubernetes applications with mitmproxy
In this post I want to share a method I used recently for understanding the network activity of an application running on Kubernetes. Specifically, the application I was looking at was velero (a backup and recovery tool): I could not figure out which object storage endpoints it was talking to and which credentials it was using.
I searched the internet for a transparent HTTP proxy that can intercept outgoing network traffic and log details about the HTTP requests and responses.
Of course, at a lower layer this is possible with good old tcpdump, but that quickly stops being useful when you’re dealing with encrypted HTTPS traffic.
In that case the only information we could see would be the origin and destination IP and port pairs plus the server hostname (via SNI) – not nearly enough useful information for debugging.
At the same time one could also make the argument that this is a perfect use case for a service mesh (Istio, Linkerd & co), but for debugging a single application setting up a service mesh would be complete overkill.
I came across the mitmproxy project, which bills itself as an interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
mitmproxy can capture outgoing HTTP(S) requests and allows you to modify requests and responses in real time via a TUI (terminal client), web interface or Python API.
For the purpose of my debugging I only needed traffic logging, which can be achieved with the bundled mitmdump command.
To get started, let’s spin up a new Deployment using the official Docker image:
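A manifest roughly along these lines does the job (the labels and overall shape are my own sketch; the official image is mitmproxy/mitmproxy and the proxy listens on port 8080 by default). I’m also adding a Service so that other pods can later reach the proxy as mitmproxy:8080:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mitmproxy
  labels:
    app: mitmproxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mitmproxy
  template:
    metadata:
      labels:
        app: mitmproxy
    spec:
      containers:
        - name: mitmproxy
          image: mitmproxy/mitmproxy
          # Run the non-interactive dump tool instead of the interactive TUI.
          command: ["mitmdump"]
          env:
            # Flush stdout immediately so flows show up in the pod logs right away.
            - name: PYTHONUNBUFFERED
              value: "1"
            # mitmproxy writes its CA files to $HOME/.mitmproxy; /tmp is writable.
            - name: HOME
              value: /tmp
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: mitmproxy
spec:
  selector:
    app: mitmproxy
  ports:
    - port: 8080
      targetPort: 8080
```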
We’re setting the command of the container to mitmdump so that the requests and responses are printed to the terminal (we don’t need any interactivity).
The environment variable PYTHONUNBUFFERED is set so the output is printed immediately (more details).
The environment variable HOME is set to /tmp because this directory is guaranteed to be writable.
Upon starting, mitmproxy will generate a CA private key and root certificate and put it there.
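We can confirm that by listing the directory inside the running pod; mitmproxy-ca-cert.pem is the file we will need in a moment:

```sh
# HOME=/tmp, so the CA material ends up in /tmp/.mitmproxy.
# mitmproxy-ca-cert.pem is the public certificate that clients need to trust.
kubectl exec deployment/mitmproxy -- ls /tmp/.mitmproxy
```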
This leads us into the difficult topic of certificate management.
In 2024 most HTTP connections are encrypted by default, especially when they leave the context of your Kubernetes cluster.
However, to be able to see the contents of the requests, mitmproxy needs to decrypt the outgoing request and then re-encrypt it before sending it to the upstream server.
For this purpose the above-mentioned certificate authority (CA) is used.
But this CA is not trusted by other clients (yet), such as the velero application.
To change that, we will use a little trick and make the client trust the mitmproxy CA certificate.
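One way to do this (a sketch; the ConfigMap name, the label selector and the velero-specific hints below are my own assumptions) is to copy the generated certificate out of the proxy pod and distribute it to the client:

```sh
# Grab the name of the running mitmproxy pod.
POD="$(kubectl get pod -l app=mitmproxy -o jsonpath='{.items[0].metadata.name}')"

# Copy the root certificate out of the pod (HOME=/tmp, so the CA files
# live under /tmp/.mitmproxy).
kubectl cp "${POD}:/tmp/.mitmproxy/mitmproxy-ca-cert.pem" mitmproxy-ca-cert.pem

# Make the certificate available to other workloads as a ConfigMap.
kubectl create configmap mitmproxy-ca --from-file=mitmproxy-ca-cert.pem

# Then make the client trust it, e.g. by mounting the ConfigMap into the pod
# and pointing SSL_CERT_FILE at it (Go programs such as velero honor this
# variable), or, for velero specifically, via the caCert field of the
# BackupStorageLocation.
```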
Lastly we need to tell the application that it should send its traffic through the mitmproxy service we created earlier.
Luckily, most well-behaved applications support the HTTP_PROXY and HTTPS_PROXY environment variables, like velero does:
kubectl set env deployment/velero HTTP_PROXY=http://mitmproxy:8080 HTTPS_PROXY=http://mitmproxy:8080 NO_PROXY=172.30.0.1
In my case I’m setting NO_PROXY to 172.30.0.1 (a.k.a. kubernetes.default.svc.cluster.local) so that connections to the Kubernetes API server do not get proxied, since I was only interested in the external requests.
Note: make sure that no NetworkPolicies are blocking connections between the client application and the mitmproxy deployment; blocked connections can lead to very cryptic errors that are hard to troubleshoot.
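If you do use NetworkPolicies, an ingress rule for the proxy roughly like the following can help (the velero label is illustrative; if the client’s egress is restricted, a matching egress rule on its side is needed as well):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-client-to-mitmproxy
spec:
  podSelector:
    matchLabels:
      app: mitmproxy
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Label of the client application whose traffic we want to inspect (illustrative).
        - podSelector:
            matchLabels:
              app: velero
      ports:
        - protocol: TCP
          port: 8080
```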
If the application does not support configuring an HTTP proxy for outgoing requests, the mitmproxy documentation also has instructions for setting up a transparent proxy, i.e. a mode where no client configuration is required.
Finally, we should be able to see the intercepted HTTP requests and responses in the logs.
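The flows are written to the proxy’s standard output, so we can follow them with kubectl logs; mitmdump prints, roughly, one line per request (method and URL) and one per response (status code and size):

```sh
# Follow the proxy output; every intercepted flow is logged here.
kubectl logs -f deployment/mitmproxy
```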
Happy debugging!