
Running Pihole on K3s

2024-08-30
Mark Hughes

K3s on Pi

I've been running Pihole and a few other services on my Raspberry Pi for a while now, with a Wireguard VPN enabling me to block adverts on my phone when I’m out and about.

For no objective reason, and simply because I want to, I've decided to set my Pi up as a single-node Kubernetes cluster (until I buy more Pis to join it) that will run all the applications and services it currently runs, some via docker-compose, others as services running locally. Even though I decided to do this on a whim, there are some benefits:

- It makes it easy to set up monitoring for all services that end up running in the cluster.
- Kubernetes scheduling and self-healing mean that if anything goes wrong, or the Pi reboots, the services will come back online without intervention.
- I'm aiming to do the CKA exam soon and this will be good practice for me.
- Lastly, and least importantly, I'll eventually buy another Pi and join it to the cluster, effectively doubling the resources available to all my services with minimal configuration.

First steps

Up until now I've been running my Pi with just a 64GB SD card, but with a Kubernetes cluster running a lot of services, I figured now would be a great time to start with a fresh Ubuntu install on a 1TB USB hard drive. Installing an OS is outside the scope of what I wanted to write about, and there are plenty of guides on that, so I won't cover it here. But, reader, if you need any advice, feel free to comment or email me.

Installing Kubernetes

This step is very simple, but if you have a domain name and want to use it to connect to the cluster from outside your network, you'll need to make sure you add it to the --tls-san argument or you'll be denied access with certificate errors. I've decided to use K3s because it's a lightweight distribution of Kubernetes with some unnecessary stuff stripped out, and it comes bundled with Traefik, a reverse proxy and ingress controller that is very useful when it comes to routing to our services. The command to download and install K3s on your Pi is this (replace <YOURDOMAIN> with your domain name):

curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--tls-san <YOURDOMAIN>" sh -
[INFO]  Finding release for channel stable
[INFO]  Using v1.30.3+k3s1 as release
[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.30.3+k3s1/sha256sum-arm64.txt
[INFO]  Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.30.3+k3s1/k3s-arm64
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Skipping installation of SELinux RPM
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Skipping /usr/local/bin/ctr symlink to k3s, command exists in PATH at /usr/bin/ctr
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO]  systemd: Starting k3s

You can check it's up and running using the kubectl utility:

root@pi:~# kubectl get no
NAME   STATUS   ROLES                  AGE   VERSION
pi     Ready    control-plane,master   99m   v1.30.3+k3s1

To access the cluster from another machine, you'll need to copy the config from the Pi (it lives at /etc/rancher/k3s/k3s.yaml) into your machine's ~/.kube/config. I've renamed the cluster and user from the defaults, and replaced the local IP in the server address with my domain name so I can access it from outside the local network, but it should look something like this:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDjZz09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
    server: https://my.domain:6443
  name: pi-k3s
contexts:
- context:
    cluster: pi-k3s
    user: pi
  name: pi-k3s
current-context: pi-k3s
kind: Config
preferences: {}
users:
- name: pi
  user:
    client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ.....1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    client-key-data: LS0tLS1CRUdJTiBFQyBQUkl.....QgRUMgUFJJVkFURSBLRVktLS0tLQo=

If you copy that into your local kube config, you should be able to connect to the cluster and interact via kubectl.
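If you want to script that step, here's a rough sketch of how I'd do it from my laptop; the paths are the K3s defaults, but the placeholder IP, the file name, and the sed edit for the server address are just illustrative assumptions:

# Copy the kubeconfig off the Pi (it's readable only by root) to a separate file
scp root@<PI_LOCAL_IP>:/etc/rancher/k3s/k3s.yaml ~/.kube/pi-k3s.yaml
# The default server entry is https://127.0.0.1:6443 - swap it for your domain
sed -i 's/127.0.0.1/my.domain/' ~/.kube/pi-k3s.yaml
# Sanity check: this should list the pi node as Ready
kubectl --kubeconfig ~/.kube/pi-k3s.yaml get nodes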

Lens

Lens is a nifty tool that provides a UI for accessing your cluster. What I really like about it is that it can install a lightweight Prometheus stack into the cluster, which enables it to visualise basic monitoring in the app. I can see at a glance how many pods I've got running, as well as memory/CPU usage.

Here you can see how it lets you look at different resources on the cluster, by application/node/workload/namespace etc.; there's a terminal for interacting with the API, and you can inspect any resource.

To use Lens, once you've installed it, go to Catalog > Clusters in the left sidebar, then use the + button in the bottom corner and choose "Add from kubeconfig" to open a dialog where you can paste the config that was created earlier.

Lens will connect to your cluster and enable you to view all your kubernetes resources in the GUI.

Monitoring with Grafana Cloud

I think it would be interesting to have Grafana Cloud monitoring to keep an eye on performance over time, but I didn't want to spend a huge amount of time on this part, so I've used the official Helm chart to deploy it.

Sign in to Grafana Cloud; during the initial setup a Grafana URL and a hosted Prometheus instance will be created. You'll need the details for those later.

Prometheus config

We are going to need the connection values for the hosted Prometheus instance before we can install the chart. Here is the page for mine: https://grafana.com/orgs/markphughes17/hosted-metrics/6190172. Of course, only I have access to view mine, but the URL for yours will look similar.

Here we can see the URL and username, and we can generate a token. Make a note of all of these, and ensure the token's access policy allows it to write.

Do the same for Loki and Tempo, making a note of the values.

Helm chart install

First, create a values.yaml file:

cluster:
  name: k8spi
externalServices:
  prometheus:
    host: <PROMETHEUS_URL>
    basicAuth:
      username: <PROMETHEUS_USER>
      password: <PROMETHEUS_TOKEN>
  loki:
    host: <LOKI_URL>
    basicAuth:
      username: <LOKI_USER>
      password: <LOKI_TOKEN>
  tempo:
    host: <TEMPO_URL>
    basicAuth:
      username: <TEMPO_USER>
      password: <TEMPO_TOKEN>
metrics:
  enabled: true
  alloy:
    metricsTuning:
      useIntegrationAllowList: true
  cost:
    enabled: true
  node-exporter:
    enabled: true
    metricsTuning:
      includeMetrics: ["node_hwmon_temp_celsius", "node_thermal_zone_temp", "node_netstat.*", "node_load.*"]
logs:
  enabled: true
  pod_logs:
    enabled: true
  cluster_events:
    enabled: true
traces:
  enabled: true
receivers:
  grpc:
    enabled: true
  http:
    enabled: true
  zipkin:
    enabled: true
  grafanaCloudMetrics:
    enabled: true
opencost:
  enabled: true
  opencost:
    exporter:
      defaultClusterId: k8spi
    prometheus:
      external:
        url: <PROMETHEUS_URL>
kube-state-metrics:
  enabled: true
prometheus-node-exporter:
  enabled: true
prometheus-operator-crds:
  enabled: true
alloy: {}
alloy-events: {}
alloy-logs: {}

Add Grafana's Helm repo and refresh it with these commands:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

then run the install (I've added --create-namespace so the monitoring namespace is created if it doesn't already exist):

helm upgrade --install --atomic --timeout 300s grafana-k8s-monitoring grafana/k8s-monitoring --namespace "monitoring" --create-namespace --values values.yaml

If it's all correct, you should soon start seeing your cluster appear in Grafana Cloud.
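A quick way to confirm the collectors are actually running before heading over to Grafana Cloud (pod names will differ slightly depending on the chart version):

# Alloy, kube-state-metrics and node-exporter pods should all be Running
kubectl get pods -n monitoring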

I have imported the Raspberry Pi K3S Nodes Overview dashboard (17586) for my metrics. At first I noticed that it wasn't showing the Pi temperature, which I'm interested in, along with some other metrics the dashboard expects. After digging around I found that the chart uses an allowlist which doesn't include those metrics, and I was missing this part of the config to enable them. So if you find a metric you expect to see isn't being collected, have a look at the includeMetrics list.

metricsTuning:
  includeMetrics: ["node_hwmon_temp_celsius", "node_thermal_zone_temp", "node_netstat.*", "node_load.*"]

With that enabled, I immediately started getting the temperature information and other missing metrics.

Traefik

Now that I have the cluster up and running, I want to make sure I can access it externally via the Pi's IP address and, ideally, my domain too. Traefik, the ingress controller I mentioned earlier, comes with a dashboard which is disabled by default, so I enabled it by creating a HelmChartConfig file and applying it:

apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    additionalArguments:
      - "--api"
      - "--api.dashboard=true"
      - "--api.insecure=true"
    ports:
      traefik:
        expose: true
    providers:
      kubernetesCRD:
        allowCrossNamespace: true

This simply enables the dashboard and tells K3s to expose it on the default port (which happens to be 9000).

Now, from my laptop or PC I can access the dashboard at my.domain:9000/dashboard/ (the trailing slash matters).
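If you'd rather not expose port 9000 beyond the host, a kubectl port-forward to the bundled Traefik deployment works just as well for an occasional look (the deployment is named traefik in kube-system on a default K3s install):

# Forward the dashboard locally, then browse to http://localhost:9000/dashboard/
kubectl -n kube-system port-forward deploy/traefik 9000:9000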

TLS Certificates

I want to be able to use TLS certificates to secure my applications. I tried a few different ways of using cert-manager with no success, before I found that Traefik can manage Let's Encrypt certificates for me, and it makes the whole process super simple.

First, add the following arguments to the additionalArguments block in the Traefik HelmChartConfig and apply it:

      - "--log.level=DEBUG"
      - "--certificatesresolvers.le.acme.email=youremmail@gmail.com"
      - "--certificatesresolvers.le.acme.storage=/data/acme.json"
      - "--certificatesresolvers.le.acme.tlschallenge=true"
      - "--certificatesresolvers.le.acme.caServer=https://acme-staging-v02.api.letsencrypt.org/directory"

Next I deployed a whoami pod with an IngressRoute to test it. You can skip this if you like; I only did it to see the IngressRoute working with certificates.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: whoami
  namespace: default
  labels:
    app: whoami
spec:
  replicas: 1
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
      - image: docker.io/containous/whoami:v1.5.0
        name: whoami
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: whoami
  namespace: default
spec:
  ports:
  - name: whoami
    port: 80
    targetPort: 80
  selector:
    app: whoami
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: whoami
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
  - kind: Rule
    match: Host(`whoami.my.domain`)
    services:
    - name: whoami
      port: 80
  tls:
    certResolver: le

Apply that with kubectl: kubectl apply -f whoami.yaml. In your browser, navigate to whoami.my.domain and you should see some text on the page, and the padlock in the address bar to show you have valid certificates.
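To check the certificate from the command line instead of the browser, something like this works (substitute your own hostname; openssl is almost certainly already installed):

# Show who issued the certificate Traefik is serving and when it expires
openssl s_client -connect whoami.my.domain:443 -servername whoami.my.domain </dev/null 2>/dev/null | openssl x509 -noout -issuer -subject -dates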

Pihole

Now that I have the cluster up and running with monitoring, I want to migrate all the services running on the server into it, starting with Pihole. There is a popular Helm chart that can be used to set up a Pihole installation, which I considered using, but I'd have to make changes to get it working the way I want (using my domain, with Pihole routed to a subdomain via Traefik), and I don't think I'd get the same learning value as I would from incrementally creating the components myself, which is the way I've chosen to do it.

Storage

The first thing I need to do is set up the storage that Pihole will need to use. Because Kubernetes pods are ephemeral (that is, they, and any information they hold, are completely deleted when a pod restarts or is lost), we need to make sure that when a pod is lost, the configuration and query logs are kept and reloaded when a new pod is created. I first created a namespace for it with kubectl create ns pihole, then I created two PersistentVolume and two PersistentVolumeClaim resources: one pair for Pihole's configuration in /etc/pihole, and one for dnsmasq. The volume doesn't actually need a namespace, but the PVC does. This enables the deployment to bind the pod to the PVC when it's created.

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pihole-etc-pv
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  hostPath:
    path: /tmp/pihole-etc
    type: DirectoryOrCreate
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pihole-dnsmasq-pv
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  hostPath:
    path: /tmp/pihole-dnsmasq
    type: DirectoryOrCreate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pihole-etc-claim
  namespace: pihole
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  volumeName: pihole-etc-pv
  storageClassName: ""
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pihole-dnsmasq-claim
  namespace: pihole
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  volumeName: pihole-dnsmasq-pv
  storageClassName: ""
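Assuming those manifests are saved in a file called pihole-storage.yaml (the file name is just my choice here), applying them and checking that each claim binds to its volume looks like this:

kubectl apply -f pihole-storage.yaml
# Both volumes and both claims should show STATUS "Bound"
kubectl get pv
kubectl get pvc -n pihole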

Deployment

The next step is to create a Deployment and get a Pihole pod running, so I wrote a deployment file. You can see where the volumeMounts have been created at the bottom of the deployment, and the volumes part of the spec references the volumeMounts and PVCs by name. Also note the environment variables: the VIRTUAL_HOST value is the domain I'll be using for it, and the ServerIP and FTLCONF_LOCAL_IPV4 values are the local IP address of the Pi. Note all the ports I've enabled too: we need 80 and 443 for the web UI, 53 for DNS, 67 for DHCP, and 547 for DHCPv6 (the DHCP ones are optional).

Another important detail here is the imagePullPolicy value. I didn't have this set, and I would find that when a new pod spins up, the node momentarily has no internet (the pod that isn't running is the DNS resolver), and the deployment would by default try to pull the image from the external registry, which of course it can't do. Setting the value to IfNotPresent tells it to use the image that has already been downloaded, so it can initialise the container, get DNS running and restore internet to the node. It took me some time to realise this, and it was becoming a cause of frustration before I figured it out.

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pihole
  namespace: pihole
  labels:
    app: pihole
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pihole
  template:
    metadata:
      labels:
        app: pihole
    spec:
      volumes:
      - name: pihole-local-etc-volume
        persistentVolumeClaim:
          claimName: pihole-etc-claim
      - name: pihole-local-dnsmasq-volume
        persistentVolumeClaim:
          claimName: pihole-dnsmasq-claim
      containers:
      - name: pihole
        image: pihole/pihole:latest
        imagePullPolicy: "IfNotPresent"
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        ports:
        - containerPort: 80
          name: pihole-http
          protocol: TCP
        - containerPort: 53
          name: dns
          protocol: TCP
        - containerPort: 53
          name: dns-udp
          protocol: UDP
        - containerPort: 443
          name: pihole-ssl
          protocol: TCP
        - containerPort: 67
          name: client-udp
          protocol: UDP
        - containerPort: 547
          name: dhcp-ipv6
          protocol: UDP
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
        env:
        - name: VIRTUAL_HOST
          value: 'pihole.my.domain'
        - name: 'ServerIP'
          value: '192.168.0.215'
        - name: WEBPASSWORD
          value: "PASSWORD"
        - name: TZ
          value: "Europe/London"
        - name: FTLCONF_LOCAL_IPV4
          value: "192.168.0.215"
        volumeMounts:
        - name: pihole-local-etc-volume
          mountPath: "/etc/pihole"
        - name: pihole-local-dnsmasq-volume
          mountPath: "/etc/dnsmasq.d"

When I deploy this with kubectl create -f deployment.yaml, a pod gets created in the pihole namespace. Next I need to figure out how to access it.
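One aside before moving on: I've put WEBPASSWORD straight into the manifest as plain text, which is fine for a home lab but not ideal if the file ever ends up in a git repo. A Kubernetes Secret is the usual alternative; here's a minimal sketch, where the secret name and key are my own choices rather than anything from the setup above:

# Create the secret once, outside of the manifest
kubectl create secret generic pihole-webpassword -n pihole --from-literal=password='CHANGE-ME'

# Then, in the Deployment, the WEBPASSWORD entry would reference it instead:
#   - name: WEBPASSWORD
#     valueFrom:
#       secretKeyRef:
#         name: pihole-webpassword
#         key: password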

Services and IngressRoutes

So the pod exists and is running, but we can't access it yet. To have it open to the internet and reachable on the necessary ports, we need to create Service and ingress resources. I've set up services for TCP and UDP separately, and used a Traefik Custom Resource Definition (CRD) for the ingress called an IngressRoute; it's basically a specialised Ingress resource designed for Traefik. We created one for whoami as well, but I didn't talk about it then. There's a standard IngressRoute for the web UI, with a URL, that will make it reachable in a browser, with certificates, like before. Then there are three IngressRouteUDP resources: one for DNS, and two for DHCP on IPv4 and IPv6. The DHCP ones are optional, only necessary if you need to use Pihole as a DHCP server on your network.

---
apiVersion: v1
kind: Service
metadata:
  name: pihole-svc-tcp
  namespace: pihole
spec:
  selector:
    app: pihole
  ports:
  - name: pihole-admin
    port: 80
    targetPort: pihole-http
  - name: pihole-dns
    port: 53
    targetPort: dns
    protocol: TCP
  - name: pihole-dns-67
    port: 67
    targetPort: 67
    protocol: TCP
  externalTrafficPolicy: Local
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: 192.168.0.215
---
apiVersion: v1
kind: Service
metadata:
  name: pihole-svc
  namespace: pihole
spec:
  selector:
    app: pihole
  ports:
  - name: pihole-admin
    port: 80
    targetPort: 80
    protocol: UDP
  - name: dns-udp
    port: 53
    targetPort: 53
    protocol: UDP
  - name: dhcp-udp
    port: 67
    targetPort: 67
    protocol: UDP
  - name: dhcp-ipv6
    port: 547
    targetPort: 547
    protocol: UDP
  externalTrafficPolicy: Local
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: 192.168.0.215
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: pihole
  namespace: pihole
spec:
  entryPoints:
    - websecure
  routes:
  - kind: Rule
    match: Host(`pihole.my.domain`)
    services:
    - name: pihole-svc-tcp
      port: 80
  tls:
    certResolver: le
---
apiVersion: traefik.io/v1alpha1
kind: IngressRouteUDP
metadata:
  name: ingressrouteudp
  namespace: pihole
spec:
  entryPoints:
    - dns-udp
  routes:
  - services:
    - name: pihole-svc
      port: 53
      weight: 10
      nativeLB: true
---
apiVersion: traefik.io/v1alpha1
kind: IngressRouteUDP
metadata:
  name: ingressroutedhcp
  namespace: pihole
spec:
  entryPoints:
    - dhcp-udp
  routes:
  - services:
    - name: pihole-svc
      port: 67
      weight: 10
      nativeLB: true
---
apiVersion: traefik.io/v1alpha1
kind: IngressRouteUDP
metadata:
  name: ingressroutedhcpipv6
  namespace: pihole
spec:
  entryPoints:
    - dhcp-ipv6
  routes:
  - services:
    - name: pihole-svc
      port: 547
      weight: 10
      nativeLB: true
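With those applied, a quick way to check that DNS is actually being answered on the Pi's address is to query it directly (dig ships with dnsutils/bind-utils; swap in your own Pi IP and a domain you know is on one of your adlists):

# A normal domain should resolve as usual
dig @192.168.0.215 example.com +short
# A domain on the blocklist should come back as 0.0.0.0 (depending on blocking mode)
dig @192.168.0.215 some-blocked-ad-domain.example +short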

Persistence

Now, I've described above the steps that got me a working Pihole deployment, blocking adverts on my network with a web UI I could access on its own subdomain, but it still wasn't quite right. I was facing an issue whereby, when the pod or node got restarted, all my config and data would be lost, as if it were a clean install, even with the storage set up correctly. I checked and tweaked the PVs, the PVCs and the volume mounts dozens of times trying to find what was wrong, but there was nothing I could find wrong there. I opened an SSH connection to the Pi and checked whether the information and configs I'd set up were on the filesystem, and there they were. It seemed they just weren't being used when a new pod was created, so I did some thinking (and some Googling) and after some time found the solution.

I have the directory /tmp/pihole-etc/ on the Pi mounted to /etc/pihole/ in the container, and I was able to sort this with some changes to the configs in there. From the Pi, I added my adlists to /tmp/pihole-etc/migration_backup/adlists.list:

https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
https://raw.githubusercontent.com/PolishFiltersTeam/KADhosts/master/KADhosts.txt
https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt
https://v.firebog.net/hosts/AdguardDNS.txt

Then I changed /tmp/pihole-etc/pihole-FTL.conf to include these entries (some may not be needed, but I wanted to try them out). I think DBIMPORT was the important one for getting clients and queries back; it seems that without it, FTL won't import the old database and will just create a brand new one.

#; Pi-hole FTL config file
#; Comments should start with #; to avoid issues with PHP and bash reading this file
MACVENDORDB=/macvendor.db
LOCAL_IPV4=192.168.0.215
RATE_LIMIT=900/60
BLOCK_ICLOUD_PR=true
SHOW_DNSSEC=true
NAMES_FROM_NETDB=true
DBIMPORT=yes
DEBUG_DATABASE=true
DEBUG_QUERIES=true
DEBUG_CLIENTS=true
DEBUG_ALIASCLIENTS=true

I tested this by deleting the pihole pod, and when the new one came up, I was pleased to see that it was now showing the existing clients and queries that were made before I restarted it.
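That test, expressed as commands, is roughly this (the -l app=pihole selector matches the label set in the Deployment above):

# Delete the running pod; the Deployment will immediately schedule a replacement
kubectl delete pod -n pihole -l app=pihole
# Watch the new pod start, then check the web UI still shows the old clients/queries
kubectl get pods -n pihole -w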

So now I have a Kubernetes cluster up and running, with Pihole running in a self-healing deployment blocking adverts on my home network, a web dashboard I can access from anywhere, and Grafana automatically monitoring the whole cluster, which will grow over time as more applications are added to it. Running it locally or in a Docker container was much simpler, but I think the benefits of using Kubernetes for it are worth the effort, and I've gained a lot more from the exercise than I would have from watching tutorial videos.