Switching from Nginx to Caddy - or not?
In my homelab I’m self-hosting a couple of static websites from Minio S3 buckets - including the blog you are reading this article on. Using S3 buckets for static file hosting is great, because while the S3 interface was originally proprietary to AWS, it is nowadays widely supported by many tools and services. In addition, the S3 API comes with excellent built-in authorization primitives: in practice this means it’s very easy to generate temporary access keys and distribute them to various people, machines, etc., because they come with least privileges and can be revoked trivially.
However, there is one issue: while the S3 API is directly available via HTTP, it is not “web browser friendly”, i.e. it does not support pretty URLs like `http://s3.example.com/foobar/`. Instead, one always needs to navigate to a particular file, such as `http://s3.example.com/foobar/index.html` (otherwise the user is shown an ugly XML error page). This might have been acceptable behavior 20 years ago, but today users are accustomed to better UX.
Therefore, an intermediate proxy is necessary to perform URL rewriting: for example, when the requested path ends with a slash (`/`), the `index.html` suffix should automatically be appended before requesting the file from the S3 endpoint.
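For illustration, the desired behavior looks like this (example paths):

```
GET /foobar/           →  GET /foobar/index.html   (rewritten)
GET /foobar/style.css  →  GET /foobar/style.css    (passed through unchanged)
```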
In addition, this proxy should also cache the content for some time. This way, the service can be scaled to handle a large number of requests by simply increasing the number of proxies (spread across multiple machines), without impacting the backend - a “mini” CDN.
So far, I have been running such a setup with Nginx. However, over the last couple of years a new contender in the realm of static file serving and reverse proxying has emerged: Caddy.
Caddy boasts easy configuration (many people have been bitten by footguns in Nginx’s configuration format), a built-in configuration API, automatic HTTPS, high performance, and extensibility (thanks to its modules). Especially the easy configuration is often cited on Hacker News et al. as one of Caddy’s major advantages.
So I decided to give it a try and convert the following Nginx configuration into an equivalent Caddy config (Caddyfile). In this post I will go over reverse proxying with Caddy, URL rewriting, modifying headers, disallowing methods, caching and metrics monitoring.
In this particular example, the reverse proxy should listen on `blog.cubieserver.de` and forward requests to `s3.cubieserver.de/blog-cubieserver-de/` (while preserving the rest of the URL path).
I won’t explain the details of the Nginx configuration, since this post will focus on the Caddy part.
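The config is roughly along the following lines (a reconstruction for context, not the verbatim original; cache parameters and the exact header names are illustrative):

```nginx
# cache zone for upstream responses (sizes illustrative)
proxy_cache_path /var/cache/nginx keys_zone=s3_cache:10m max_size=1g inactive=60m;

server {
    listen 443 ssl;
    server_name blog.cubieserver.de;
    # TLS certificate configuration omitted

    location / {
        # only allow GET (and the implied HEAD) requests
        limit_except GET {
            deny all;
        }

        # rewrite "folder" requests to their index file
        rewrite ^(.*)/$ $1/index.html;

        # forward to the MinIO bucket, preserving the rest of the path
        proxy_pass https://s3.cubieserver.de/blog-cubieserver-de$uri;
        proxy_set_header Host s3.cubieserver.de;

        # hide S3-specific response headers
        proxy_hide_header x-amz-request-id;

        # cache upstream responses locally
        proxy_cache s3_cache;
        proxy_cache_valid 200 10m;
    }
}
```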
# Basic reverse proxy
Let’s try to replicate the Nginx configuration above step-by-step with Caddy. A basic Caddyfile for reverse proxying looks like this:
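Something along these lines (a sketch; `{uri}` and `{upstream_hostport}` are standard Caddy placeholders):

```
blog.cubieserver.de {

    # prefix the bucket name to every request path
    rewrite * /blog-cubieserver-de{uri}

    reverse_proxy https://s3.cubieserver.de {
        # present the upstream's hostname to MinIO instead of the client's
        header_up Host {upstream_hostport}
    }
}
```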
Here we have the first difference from Nginx: if the `to` address contains the upstream path (`to https://s3.cubieserver.de/blog-cubieserver-de/`), Caddy will refuse the configuration, complaining that upstream addresses may not contain a URL path.
Therefore we need to first rewrite the URL internally (the `rewrite` directive in the snippet above), before we can forward the request to the upstream (the `reverse_proxy` directive).
Let’s give it a try:
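A quick smoke test (output illustrative):

```
$ curl -sI https://blog.cubieserver.de/index.html | head -n 1
HTTP/2 200
```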
So far, so good!
# Inflight URL rewriting
As explained in the introduction, when requesting a folder (instead of an existing file), we get a nasty 404 response.
Let’s fix that next: essentially we need to rewrite all requests that have a trailing slash (regex: `^(.*)/$`) to the index file.
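A working version looks roughly like this - note that the trailing-slash rule has to produce the complete final path, bucket prefix included (the reason follows below):

```
# "folder" requests are rewritten directly to their final path
@trailingSlash path_regexp dir ^(.*)/$
rewrite @trailingSlash /blog-cubieserver-de{re.dir.1}/index.html

# all other requests just get the bucket prefix
rewrite * /blog-cubieserver-de{uri}
```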
Here I came across my first footgun with Caddy: Rewrite rules are always mutually exclusive - they are not processed in sequential order! Initially, I was expecting to be able to use the following snippet:
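Something like this (sketch):

```
# does NOT work as intended: only one rewrite is applied per request
@trailingSlash path_regexp dir ^(.*)/$
rewrite @trailingSlash {re.dir.1}/index.html
rewrite * /blog-cubieserver-de{uri}
```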
In my mind, this should rewrite `/foobar/` first to `/foobar/index.html` and then to `/blog-cubieserver-de/foobar/index.html`.
While Caddy will happily accept this configuration, it will silently apply just one of the rules - it took me quite some time to identify the issue.
To be honest, I was shocked to find such unintuitive behavior so early during my exploration.
But anyway, with the first snippet above it works:
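Requesting a folder now returns its index file (output illustrative):

```
$ curl -sI https://blog.cubieserver.de/ | head -n 1
HTTP/2 200
```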
# Modifying HTTP headers
Now we still have a bunch of unnecessary headers in the response. Let’s strip those:
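A sketch of the extended `reverse_proxy` block (the exact header names MinIO sends may vary):

```
reverse_proxy https://s3.cubieserver.de {
    header_up Host {upstream_hostport}

    # strip S3-specific headers from the response
    header_down -Server
    header_down -X-Amz-*
}
```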
In a Caddyfile, the `header_up` directive is used for modifying request headers sent to the backend (upstream), whereas the `header_down` directive is used for modifying response headers sent to the client.
The fact that headers are removed by simply prefixing them with a minus (`-`) makes me a bit iffy, but as long as it works reliably, it’s fine for me.
It’s definitely nice to see that we can remove headers with wildcards (and don’t need to explicitly override each of them).
# Restricting HTTP verbs
Next, we also want to ensure that we only proxy GET and HEAD requests, just like the initial Nginx config:
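This can be done with a negated named matcher (sketch):

```
# reject everything except GET and HEAD with 405 Method Not Allowed
@disallowedMethods not method GET HEAD
respond @disallowedMethods 405
```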
And let’s try:
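A POST request should now be rejected (output illustrative):

```
$ curl -s -o /dev/null -w '%{http_code}\n' -X POST https://blog.cubieserver.de/
405
```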
It works!
# Caching
Alright, let’s move on to setting up the proxy cache. After some searching around online, I was rather disappointed: Caddy does not have a built-in solution for caching upstream responses. There are several plugins (“Caddy modules”) available online: CDP Cache, the Souin Caddy module, Caddy Cache Handler and Caddy Cache - oh no wait, that last one is only for Caddy v1 and already deprecated. This is exactly the reason why I’m not a fan of “external” plugins: when there is a serious bug or security vulnerability, you have no idea whether it will ever be addressed.
Anyway, looking at the remaining options, Souin and Cache Handler seem rather complex and focused on distributed caching, which is not something I necessarily need in my setup. CDP Cache looks much simpler, but the quality of the repository is not very convincing.
In any case, all these modules have a huge drawback: they are external modules that need to be compiled together with Caddy. Caddy offers the `xcaddy` command-line tool, which allows you to easily build Caddy with plugins.
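For example, building Caddy with the cache-handler module would look something like this:

```
$ xcaddy build --with github.com/caddyserver/cache-handler
```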
That’s all well and good for local development, but how is that supposed to work in a containerized environment? You don’t want to build the server binary every time you start the container!
The recommended approach is building your container image with the required modules:
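This follows the multi-stage builder pattern from the official caddy image documentation (module choice illustrative):

```dockerfile
FROM caddy:2-builder AS builder

RUN xcaddy build --with github.com/caddyserver/cache-handler

FROM caddy:2

COPY --from=builder /usr/bin/caddy /usr/bin/caddy
```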
Fantastic! Now I not only need to maintain the Caddy version and configuration, but also an additional container image and CI pipeline that needs to be kept up-to-date and rebuilt regularly!
I’m skipping this step for now.
# Metrics
Let’s shift the focus to another aspect of a cloud-native deployment: monitoring. To get OpenMetrics-compatible data (suitable for Prometheus and friends) out of Nginx, an external exporter is required. Caddy already has this feature built-in (it’s one of the “standard modules”).
The following Caddy config snippet exposes the metrics path on a separate port (here: `8081`).
This makes it easy to scrape the metrics internally (e.g. inside a Kubernetes cluster), while avoiding exposing them globally without authentication, which is what the default configuration does.
This snippet also sets up a small healthcheck endpoint, which can be useful for health probes from the orchestration manager (e.g. Kubernetes Readiness Probes or Nomad Service Checks).
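A sketch of that configuration:

```
# internal-only site, reachable on port 8081
:8081 {
    # Prometheus-compatible metrics endpoint
    metrics /metrics

    # trivial healthcheck endpoint returning 200 OK
    respond /healthz 200
}
```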
Let’s give it a try:
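Illustrative output (abridged; metric names and values are examples):

```
$ curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8081/healthz
200

$ curl -s http://localhost:8081/metrics | grep caddy_http_requests_total
caddy_http_requests_total{handler="reverse_proxy",server="srv0"} 42
```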
The healthcheck endpoint works as expected.
The metrics endpoint also returns data, but this data is not very useful because it just has a rather uninformative `server="srv0"` label.
Ideally, I’d like to be able to distinguish the metrics based on hostnames / vhosts.
This (obvious?) feature is currently not supported, but has been requested on Caddy’s issue tracker.
Additionally, the way Caddy generates metrics appears to be implemented rather inefficiently - so much so that a feature toggle was introduced to disable the metrics endpoint.
While these issues will probably be fixed eventually, they don’t leave a good impression for a “modern” web server.
However, it is laudable that the Caddy docs have a dedicated page explaining what each of these metrics means.
# Putting it all together
Finally, let’s put all of the small configuration snippets together into a `Caddyfile`:
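A condensed sketch, assembling the snippets from above:

```
blog.cubieserver.de {

    # rewrite "folder" requests to their index file (incl. bucket prefix)
    @trailingSlash path_regexp dir ^(.*)/$
    rewrite @trailingSlash /blog-cubieserver-de{re.dir.1}/index.html

    # prefix the bucket name for all other requests
    rewrite * /blog-cubieserver-de{uri}

    # only allow GET and HEAD requests
    @disallowedMethods not method GET HEAD
    respond @disallowedMethods 405

    reverse_proxy https://s3.cubieserver.de {
        header_up Host {upstream_hostport}

        # strip S3-specific response headers
        header_down -Server
        header_down -X-Amz-*
    }
}

# internal monitoring endpoints
:8081 {
    metrics /metrics
    respond /healthz 200
}
```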
The complete Caddyfile has 30 lines of content (`grep -v -E '^[[:space:]]*#' Caddyfile | grep . | wc -l`).
The Nginx config at the top of this post has 32 lines - but that one also implements a local file cache!
If we omit the statements relating to caching, it comes in at 20 lines - one third less compared to Caddy!
# Conclusion
I was definitely not expecting this result when I started this exploration quest. Using the number of config lines is certainly not the best comparison metric. But to be completely honest, I’m kind of disappointed that I already had to discover two significant stumbling blocks while configuring Caddy (multiple rewrite directives and metrics). Maybe I simply had too high expectations for Caddy after all the hype I have been hearing online.
In addition, after this short experience I’m also not a fan of the Caddyfile format, because depending on the level of detail, you need to use different kinds of directives. For example, basic URL rewriting can be achieved with:
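For example, a simple path-to-path rewrite:

```
rewrite /old.html /new.html
```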
But for slightly more advanced regex matching, I cannot use the same syntax:
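An inline form like the following is not accepted by the Caddyfile parser (sketch of the intuitive attempt):

```
# does NOT parse: regex matchers cannot be written inline
rewrite path_regexp ^(.*)/$ {re.1}/index.html
```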
Instead, I need to define a separate named matcher:
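Along these lines:

```
@trailingSlash path_regexp dir ^(.*)/$
rewrite @trailingSlash {re.dir.1}/index.html
```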
I can totally understand this behavior from a technical point of view, but from the user’s perspective it is quite confusing.
Overall, I’m not satisfied with the experience. For now, it looks like I will stick with Nginx - at least until Caddy has “proper” metrics support.