Namespace-scoped Kubernetes Service Discovery for Prometheus

Recently I was helping a colleague set up Prometheus monitoring for an application. This application must be installed into the Kubernetes cluster without cluster-admin rights, so ClusterRoleBindings are not an option. Instead we need to use a regular RoleBinding to bind the ServiceAccount to the Role.

If you are not familiar with the relation between roles, service accounts and role bindings, please refer to Kubernetes' documentation on RBAC.

It boils down to adding the following YAML snippets to the Prometheus deployment (inspired by the Prometheus Helm Chart):

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-server-role
  labels:
    app.kubernetes.io/name: prometheus-server
rules:
  - apiGroups: [""]
    resources:
      - endpoints
      - pods
      - services
    verbs:
      - get
      - list
      - watch
---
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: true
metadata:
  name: prometheus-server-sa
  labels:
    app.kubernetes.io/name: prometheus-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-server-rb
  labels:
    app.kubernetes.io/name: prometheus-server
subjects:
- kind: ServiceAccount
  name: prometheus-server-sa
  namespace: mynamespace
roleRef:
  kind: Role
  name: prometheus-server-role
  apiGroup: rbac.authorization.k8s.io

and adding the service account to the PodTemplate specification:

apiVersion: apps/v1
kind: StatefulSet
...
spec:
  ...
  template:
    spec:
      serviceAccountName: prometheus-server-sa

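Please note that with a namespace-scoped service account Prometheus won’t be able to scrape the commonly used jobs kubernetes-apiservers, kubernetes-nodes and kubernetes-nodes-cadvisor, because they depend on resources outside the namespace: the node role discovers cluster-scoped Node objects, and the API server endpoints live in the default namespace.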

So far so good. Unfortunately, it still didn’t work. We were using the following, simplified Prometheus configuration:

global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 1m

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets:
          - localhost:9090

  - job_name: 'kubernetes-service-endpoints'
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs: ...

And Prometheus was giving us these errors:

$ kubectl logs deploy/prometheus-server -c prometheus-server | tail
Failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:mynamespace:prometheus-server" cannot list resource "pods" in API group "" at the cluster scope

The last two words of the log message contain the key: cluster scope. Prometheus is still trying to discover targets across the whole cluster, even though the associated service account only allows it to list resources inside its own namespace.

Unfortunately, the Prometheus documentation on Kubernetes Service Discovery is not very clear about how to configure namespace-only scraping. This GitHub issue comment led me to the correct configuration option:

kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
        - "mynamespace"

For each entry under the kubernetes_sd_configs key we need to specify the namespace(s) in which it is active. If no namespaces are specified, Prometheus will try to discover targets in all namespaces, and if it is not allowed to do that, discovery fails with the error shown above.
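As a variation on the snippet above, the documented namespaces block also has an own_namespace flag next to names. The following is only a minimal sketch: it assumes a Prometheus release recent enough to support own_namespace, so please check the kubernetes_sd_config reference for your version before relying on it.

```yaml
scrape_configs:
  - job_name: 'kubernetes-service-endpoints'
    kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          # Assumption: this Prometheus version supports own_namespace.
          # Restricts discovery to the namespace Prometheus itself runs in,
          # so the namespace name does not have to be hard-coded.
          own_namespace: true
```

This can be handy when the same manifests are installed into differently named namespaces; with older versions, listing the namespace under names remains the way to go.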

The simplified configuration from earlier, updated accordingly, looks like this:

  scrape_configs:
    - job_name: prometheus
      static_configs:
        - targets:
          - localhost:9090

    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
        - role: endpoints
+         namespaces:
+           names:
+             - "mynamespace"
      relabel_configs: ...

After this configuration change, Prometheus discovered all the namespace-local services.
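The same approach should extend to more than one namespace, since names is a list. Because a Role only grants access inside the namespace it lives in, every additional namespace listed under names needs its own Role and RoleBinding. Here is a sketch for a hypothetical second namespace called othernamespace (the name is made up for illustration, and it assumes you are allowed to create RBAC objects there):

```yaml
# Sketch only: "othernamespace" is a hypothetical second namespace.
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-server-role
  namespace: othernamespace
rules:
  - apiGroups: [""]
    resources: [endpoints, pods, services]
    verbs: [get, list, watch]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-server-rb
  namespace: othernamespace
subjects:
  - kind: ServiceAccount
    name: prometheus-server-sa
    # The service account stays in the namespace where Prometheus runs.
    namespace: mynamespace
roleRef:
  kind: Role
  name: prometheus-server-role
  apiGroup: rbac.authorization.k8s.io
```

With these objects in place, adding "othernamespace" as a second entry under names should let the same job discover targets in both namespaces.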

If you want to achieve the same effect when using the Prometheus Helm Chart (v13+), you need to specify the following Helm values:

server:
  useExistingClusterRoleName: false
  namespaces:
    - "mynamespace"

Happy scraping!