Introduction

In this guide we will launch a highly-available three Node Kubernetes cluster on Equinix Metal using Talos as the Node OS, as well as bootstrap, and controlPlane provider for Cluster-API.

Cluster API is a Kubernetes sub-project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters.

Talos is a modern OS designed to be secure, immutable, and minimal.

A globally-available bare metal “as-a-service” that can be deployed and interconnected in minutes.

The folks over at Equinix Metal have a wonderful heart for supporting Open Source communities.

  • Why is this important?: In general: Orchestrating a container based OS such as Talos (Flatcar, Fedora CoreOS, or RancherOS) shifts focus from the Nodes to the workloads. In terms of Talos: Currently the documentation for running an OS such as Talos in Equinix Metal for Kubernetes with Cluster-API is not so well documented and therefore inaccessible. It's important to fill in the gaps of knowledge.

Dependencies

What you'll need for this guide:

  • talosctl

  • kubectl

  • packet-cli

  • the ID and API token of existing Equinix Metal project

  • an existing Kubernetes cluster with a public IP (such as kind, minikube, or a cluster already on Equinix Metal)

Prelimiary steps

In order to talk to Equinix Metal, we'll export environment variables to configure resources and talk via packet-cli.

Set the correct project to create and manage resources in:

read -p 'PACKET_PROJECT_ID: ' PACKET_PROJECT_ID

The API key for your account or project:

read -p 'PACKET_API_KEY: ' PACKET_API_KEY

Export the variables to be accessible from packet-cli and clusterctl later on:

export PACKET_PROJECT_ID PACKET_API_KEY PACKET_TOKEN=$PACKET_API_KEY

In the existing cluster, a public LoadBalancer IP will be needed. I have already installed nginx-ingress in this cluster, which has got a Service with the cluster's elastic IP. We'll need this IP address later for use in booting the servers. If you have set up your existing cluster differently, it'll just need to be an IP that we can use.

export LOAD_BALANCER_IP="$(kubectl -n nginx-ingress get svc nginx-ingress-ingress-nginx-controller -o=jsonpath='{.status.loadBalancer.ingress[0].ip}')"

Setting up Cluster-API

Install Talos providers for Cluster-API bootstrap and controlplane in your existing cluster:

clusterctl init -b talos -c talos -i packet

This will install Talos's bootstrap and controlPlane controllers as well as the Packet / Equinix Metal infrastructure provider.

Important note:

  • the bootstrap-talos controller in the cabpt-system namespace must be running a version greater than v0.2.0-alpha.8. The version can be displayed in with clusterctl upgrade plan when it's installed.

Setting up Matchbox

Currently, since Equinix Metal have not yet added support for Talos, it is necessary to install Matchbox to boot the servers (There is an issue in progress and feedback for adding support).

  • What is Matchbox?:

Matchbox is a service that matches bare-metal machines to profiles that PXE boot and provision clusters.

Here is the manifest for a basic matchbox installation:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: matchbox
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      name: matchbox
  template:
    metadata:
      labels:
        name: matchbox
    spec:
      containers:
        - name: matchbox
          image: quay.io/poseidon/matchbox:v0.9.0
          env:
            - name: MATCHBOX_ADDRESS
              value: "0.0.0.0:8080"
            - name: MATCHBOX_LOG_LEVEL
              value: "debug"
          ports:
            - name: http
              containerPort: 8080
          livenessProbe:
            initialDelaySeconds: 5
            httpGet:
              path: /
              port: 8080
          resources:
            requests:
              cpu: 30m
              memory: 20Mi
            limits:
              cpu: 50m
              memory: 50Mi
          volumeMounts:
            - name: data
              mountPath: /var/lib/matchbox
            - name: assets
              mountPath: /var/lib/matchbox/assets
      volumes:
        - name: data
          hostPath:
            path: /var/local/matchbox/data
        - name: assets
          hostPath:
            path: /var/local/matchbox/assets
---
apiVersion: v1
kind: Service
metadata:
  name: matchbox
  annotations:
    metallb.universe.tf/allow-shared-ip: nginx-ingress
spec:
  type: LoadBalancer
  selector:
    name: matchbox
  ports:
    - name: http
      protocol: TCP
      port: 8080
      targetPort: 8080

Save it as matchbox.yaml

The manifests above were inspired by the manifests in the matchbox repo. For production it might be wise to use:

  • an Ingress with full TLS
  • a ReadWriteMany storage provider instead hostPath for scaling

With the manifests ready to go, we'll install Matchbox into the matchbox namespace on the existing cluster with the following commands:

kubectl create ns matchbox
kubectl -n matchbox apply -f ./matchbox.yaml

You may need to patch the Service.spec.externalIPs to have an IP to access it from if one is not populated:

kubectl -n matchbox patch \
  service matchbox \
  -p "{\"spec\":{\"externalIPs\":[\"$LOAD_BALANCER_IP\"]}}"

Once the pod is live, We'll need to create a directory structure for storing Talos boot assets:

kubectl -n matchbox exec -it \
  deployment/matchbox -- \
  mkdir -p /var/lib/matchbox/{profiles,groups} /var/lib/matchbox/assets/talos

Inside the Matchbox container, we'll download the Talos boot assets for Talos version 0.8.1 into the assets folder:

kubectl -n matchbox exec -it \
  deployment/matchbox -- \
  wget -P /var/lib/matchbox/assets/talos \
  https://github.com/talos-systems/talos/releases/download/v0.8.1/initramfs-amd64.xz \
  https://github.com/talos-systems/talos/releases/download/v0.8.1/vmlinuz-amd64

Now that the assets have been downloaded, run a checksum against them to verify:

kubectl -n matchbox exec -it \
  deployment/matchbox -- \
  sh -c "cd /var/lib/matchbox/assets/talos && \
    wget -O- https://github.com/talos-systems/talos/releases/download/v0.8.1/sha512sum.txt 2> /dev/null \
    | sed 's,_out/,,g' \
    | grep 'initramfs-amd64.xz\|vmlinuz-amd64' \
    | sha512sum -c -"

Since there's only one Pod in the Matchbox deployment, we'll export it's name to copy files into it:

export MATCHBOX_POD_NAME=$(kubectl -n matchbox get pods -l name=matchbox -o=jsonpath='{.items[0].metadata.name}')

Profiles in Matchbox are JSON configurations for how the servers should boot, where from, and their kernel args. Save this file as profile-default-amd64.json

{
  "id": "default-amd64",
  "name": "default-amd64",
  "boot": {
    "kernel": "/assets/talos/vmlinuz-amd64",
    "initrd": [
      "/assets/talos/initramfs-amd64.xz"
    ],
    "args": [
      "initrd=initramfs-amd64.xz",
      "init_on_alloc=1",
      "init_on_free=1",
      "slub_debug=P",
      "pti=on",
      "random.trust_cpu=on",
      "console=tty0",
      "console=ttyS1,115200n8",
      "slab_nomerge",
      "printk.devkmsg=on",
      "talos.platform=packet",
      "talos.config=none"
    ]
  }
}

Groups in Matchbox are a way of letting servers pick up profiles based on selectors. Save this file as group-default-amd64.json

{
  "id": "default-amd64",
  "name": "default-amd64",
  "profile": "default-amd64",
  "selector": {
    "arch": "amd64"
  }
}

We'll copy the profile and group into their respective folders:

kubectl -n matchbox \
  cp ./profile-default-amd64.json \
  $MATCHBOX_POD_NAME:/var/lib/matchbox/profiles/default-amd64.json
kubectl -n matchbox \
  cp ./group-default-amd64.json \
  $MATCHBOX_POD_NAME:/var/lib/matchbox/groups/default-amd64.json

List the files to validate that they were written correctly:

kubectl -n matchbox exec -it \
  deployment/matchbox -- \
  sh -c 'ls -alh /var/lib/matchbox/*/'

Testing Matchbox

Using curl, we can verify Matchbox's running state:

curl http://$LOAD_BALANCER_IP:8080

To test matchbox, we'll create an invalid userdata configuration for Talos, saving as userdata.txt:

#!talos

Feel free to use a valid one.

Now let's talk to Equinix Metal to create a server pointing to the Matchbox server:

packet-cli device create \
 --hostname talos-pxe-boot-test-1 \
 --plan c1.small.x86 \
 --facility sjc1 \
 --operating-system custom_ipxe \
 --project-id "$PACKET_PROJECT_ID" \
 --ipxe-script-url "http://$LOAD_BALANCER_IP:8080/ipxe?arch=amd64" \
 --userdata-file=./userdata.txt

In the meanwhile, we can watch the logs to see how things are:

kubectl -n matchbox logs deployment/matchbox -f --tail=100

Looking at the logs, there should be some get requests of resources that will be used to boot the OS.

Notes:

  • fun fact: you can run Matchbox on Android using Termux.

The cluster

Preparing the cluster

Here we will declare the template that we will shortly generate our usable cluster from:

kind: TalosControlPlane
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
metadata:
  name: "${CLUSTER_NAME}-control-plane"
spec:
  version: ${KUBERNETES_VERSION}
  replicas: ${CONTROL_PLANE_MACHINE_COUNT}
  infrastructureTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: PacketMachineTemplate
    name: "${CLUSTER_NAME}-control-plane"
  controlPlaneConfig:
    init:
      generateType: init
      configPatches:
        - op: replace
          path: /machine/install
          value:
            disk: /dev/sda
            image: ghcr.io/talos-systems/installer:v0.8.1
            bootloader: true
            wipe: false
            force: false
        - op: add
          path: /machine/kubelet/extraArgs
          value:
            cloud-provider: external
        - op: add
          path: /cluster/apiServer/extraArgs
          value:
            cloud-provider: external
        - op: add
          path: /cluster/controllerManager/extraArgs
          value:
            cloud-provider: external
        - op: add
          path: /cluster/extraManifests
          value:
          - https://github.com/packethost/packet-ccm/releases/download/v1.1.0/deployment.yaml
        - op: add
          path: /cluster/allowSchedulingOnMasters
          value: true
    controlplane:
      generateType: controlplane
      configPatches:
        - op: replace
          path: /machine/install
          value:
            disk: /dev/sda
            image: ghcr.io/talos-systems/installer:v0.8.1
            bootloader: true
            wipe: false
            force: false
        - op: add
          path: /machine/kubelet/extraArgs
          value:
            cloud-provider: external
        - op: add
          path: /cluster/apiServer/extraArgs
          value:
            cloud-provider: external
        - op: add
          path: /cluster/controllerManager/extraArgs
          value:
            cloud-provider: external
        - op: add
          path: /cluster/allowSchedulingOnMasters
          value: true
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: PacketMachineTemplate
metadata:
  name: "${CLUSTER_NAME}-control-plane"
spec:
  template:
    spec:
      OS: custom_ipxe
      ipxeURL: "http://${IPXE_SERVER_IP}:8080/ipxe?arch=amd64"
      billingCycle: hourly
      machineType: "${CONTROLPLANE_NODE_TYPE}"
      sshKeys:
        - "${SSH_KEY}"
      tags: []
---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: Cluster
metadata:
  name: "${CLUSTER_NAME}"
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - ${POD_CIDR:=192.168.0.0/16}
    services:
      cidrBlocks:
        - ${SERVICE_CIDR:=172.26.0.0/16}
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: PacketCluster
    name: "${CLUSTER_NAME}"
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
    kind: TalosControlPlane
    name: "${CLUSTER_NAME}-control-plane"
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: PacketCluster
metadata:
  name: "${CLUSTER_NAME}"
spec:
  projectID: "${PACKET_PROJECT_ID}"
  facility: "${FACILITY}"
---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineDeployment
metadata:
  name: ${CLUSTER_NAME}-worker-a
  labels:
    cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
    pool: worker-a
spec:
  replicas: ${WORKER_MACHINE_COUNT}
  clusterName: ${CLUSTER_NAME}
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
      pool: worker-a
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
        pool: worker-a
    spec:
      version: ${KUBERNETES_VERSION}
      clusterName: ${CLUSTER_NAME}
      bootstrap:
        configRef:
          name: ${CLUSTER_NAME}-worker-a
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: TalosConfigTemplate
      infrastructureRef:
        name: ${CLUSTER_NAME}-worker-a
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: PacketMachineTemplate
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: PacketMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-worker-a
spec:
  template:
    spec:
      OS: custom_ipxe
      ipxeURL: "http://${IPXE_SERVER_IP}:8080/ipxe?arch=amd64"
      billingCycle: hourly
      machineType: "${WORKER_NODE_TYPE}"
      sshKeys:
        - "${SSH_KEY}"
      tags: []
---
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
kind: TalosConfigTemplate
metadata:
  name: ${CLUSTER_NAME}-worker-a
  labels:
    cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
spec:
  template:
    spec:
      generateType: init

Inside of TalosControlPlane.spec.controlPlaneConfig.init, I'm very much liking the use of generateType: init paired with configPatches. This enables:

  • configuration to be generated;
  • management of certificates out of the cluster operator's hands;
  • another level of standardisation; and
  • overrides to be added where needed

Notes:

  • the ClusterAPI template above uses Packet-Cloud-Controller manager version 1.1.0

Templating your configuration

Set environment variables for configuration:

export CLUSTER_NAME="talos-metal"
export FACILITY=sjc1
export KUBERNETES_VERSION=v1.20.2
export POD_CIDR=10.244.0.0/16
export SERVICE_CIDR=10.96.0.0/12
export CONTROLPLANE_NODE_TYPE=c1.small.x86
export CONTROL_PLANE_MACHINE_COUNT=3
export WORKER_NODE_TYPE=c1.small.x86
export WORKER_MACHINE_COUNT=0
export SSH_KEY=""
export IPXE_URL=$LOAD_BALANCER_IP

In the variables above, we will create a cluster which has three small controlPlane nodes to run workloads.

Render the manifests

Render your cluster configuration from the template:

clusterctl config cluster "$CLUSTER_NAME" \
  --from ./talos-packet-cluster-template.yaml \
  -n "$CLUSTER_NAME" > "$CLUSTER_NAME"-cluster-capi.yaml

Creating the cluster

With the template for the cluster rendered to how wish to deploy it, it's now time to apply it:

kubectl create ns "$CLUSTER_NAME"
kubectl -n "$CLUSTER_NAME" apply -f ./"$CLUSTER_NAME"-cluster-capi.yaml

The cluster will now be brought up, we can see the progress by taking a look at the resources:

kubectl -n "$CLUSTER_NAME" get machines,clusters,packetmachines,packetclusters

Note: As expected, the cluster may take some time to appear and be accessible.

Not long after applying, a KubeConfig is available. Fetch the KubeConfig from the existing cluster with:

kubectl -n "$CLUSTER_NAME" get secrets \
  "$CLUSTER_NAME"-kubeconfig -o=jsonpath='{.data.value}' \
  | base64 -d > $HOME/.kube/"$CLUSTER_NAME"

Using the KubeConfig from the new cluster, check out the status of it:

kubectl --kubeconfig $HOME/.kube/"$CLUSTER_NAME" cluster-info

Once the APIServer is reachable, create configuration for how the Packet-Cloud-Controller-Manager should talk to Equinix-Metal:

kubectl --kubeconfig $HOME/.kube/"$CLUSTER_NAME" -n kube-system \
  create secret generic packet-cloud-config \
  --from-literal=cloud-sa.json="{\"apiKey\": \"${PACKET_API_KEY}\",\"projectID\": \"${PACKET_PROJECT_ID}\"}"

Since we're able to talk to the APIServer, we can check how all Pods are doing:

export CLUSTER_NAME="talos-metal"
kubectl --kubeconfig $HOME/.kube/"$CLUSTER_NAME"\
  -n kube-system get pods

Listing Pods shows that everything is live and in a good state:

NAMESPACE     NAME                                                     READY   STATUS    RESTARTS   AGE
kube-system   coredns-5b55f9f688-fb2cb                                 1/1     Running   0          25m
kube-system   coredns-5b55f9f688-qsvg5                                 1/1     Running   0          25m
kube-system   kube-apiserver-665px                                     1/1     Running   0          19m
kube-system   kube-apiserver-mz68q                                     1/1     Running   0          19m
kube-system   kube-apiserver-qfklt                                     1/1     Running   2          19m
kube-system   kube-controller-manager-6grxd                            1/1     Running   0          19m
kube-system   kube-controller-manager-cf76h                            1/1     Running   0          19m
kube-system   kube-controller-manager-dsmgf                            1/1     Running   0          19m
kube-system   kube-flannel-brdxw                                       1/1     Running   0          24m
kube-system   kube-flannel-dm85d                                       1/1     Running   0          24m
kube-system   kube-flannel-sg6k9                                       1/1     Running   0          24m
kube-system   kube-proxy-flx59                                         1/1     Running   0          24m
kube-system   kube-proxy-gbn4l                                         1/1     Running   0          24m
kube-system   kube-proxy-ns84v                                         1/1     Running   0          24m
kube-system   kube-scheduler-4qhjw                                     1/1     Running   0          19m
kube-system   kube-scheduler-kbm5z                                     1/1     Running   0          19m
kube-system   kube-scheduler-klsmp                                     1/1     Running   0          19m
kube-system   packet-cloud-controller-manager-77cd8c9c7c-cdzfv         1/1     Running   0          20m
kube-system   pod-checkpointer-4szh6                                   1/1     Running   0          19m
kube-system   pod-checkpointer-4szh6-talos-metal-control-plane-j29lb   1/1     Running   0          19m
kube-system   pod-checkpointer-k7w8h                                   1/1     Running   0          19m
kube-system   pod-checkpointer-k7w8h-talos-metal-control-plane-lk9f2   1/1     Running   0          19m
kube-system   pod-checkpointer-m5wrh                                   1/1     Running   0          19m
kube-system   pod-checkpointer-m5wrh-talos-metal-control-plane-h9v4j   1/1     Running   0          19m

With the cluster live, it's now ready for workloads to be deployed!

Talos Configuration

In order to manage Talos Nodes outside of Kubernetes, we need to create and set up configuration to use.

Create the directory for the config:

mkdir -p $HOME/.talos

Discover the IP for the first controlPlane:

export TALOS_ENDPOINT=$(kubectl -n "$CLUSTER_NAME" \
  get machines \
  $(kubectl -n "$CLUSTER_NAME" \
    get machines -l cluster.x-k8s.io/control-plane='' \
    --no-headers --output=jsonpath='{.items[0].metadata.name}') \
    -o=jsonpath="{.status.addresses[?(@.type=='ExternalIP')].address}" | awk '{print $2}')

Fetch the talosconfig from the existing cluster:

kubectl get talosconfig \
  -n $CLUSTER_NAME \
  -l cluster.x-k8s.io/cluster-name=$CLUSTER_NAME \
  -o yaml -o jsonpath='{.items[0].status.talosConfig}' > $HOME/.talos/"$CLUSTER_NAME"-management-plane-talosconfig.yaml

Write in the configuration the endpoint IP and node IP:

talosctl \
  --talosconfig $HOME/.talos/"$CLUSTER_NAME"-management-plane-talosconfig.yaml \
  config endpoint $TALOS_ENDPOINT
talosctl \
  --talosconfig $HOME/.talos/"$CLUSTER_NAME"-management-plane-talosconfig.yaml \
  config node $TALOS_ENDPOINT

Now that the talosconfig has been written, try listing all containers:

export CLUSTER_NAME="talos-metal"
# removing ip; omit ` | sed ...` for regular use
talosctl --talosconfig $HOME/.talos/"$CLUSTER_NAME"-management-plane-talosconfig.yaml containers | sed -r 's/(\b[0-9]{1,3}\.){3}[0-9]{1,3}\b'/"x.x.x.x      "/

Here's the containers running on this particular node, in containerd (not k8s related):

NODE            NAMESPACE   ID         IMAGE                                  PID    STATUS
x.x.x.x         system      apid       talos/apid                             3046   RUNNING
x.x.x.x         system      etcd       gcr.io/etcd-development/etcd:v3.4.14   3130   RUNNING
x.x.x.x         system      networkd   talos/networkd                         2879   RUNNING
x.x.x.x         system      routerd    talos/routerd                          2888   RUNNING
x.x.x.x         system      timed      talos/timed                            2976   RUNNING
x.x.x.x         system      trustd     talos/trustd                           3047   RUNNING

Clean up

Tearing down the entire cluster and resources associated with it, can be achieved by

i. Deleting the cluster:

kubectl -n "$CLUSTER_NAME" delete cluster "$CLUSTER_NAME"

ii. Deleting the namespace:

kubectl delete ns "$CLUSTER_NAME"

iii. Removing local configurations:

rm \
  $HOME/.talos/"$CLUSTER_NAME"-management-plane-talosconfig.yaml \
  $HOME/.kube/"$CLUSTER_NAME"

What have I learned from this?

  • (always learning) how wonderful the Kubernetes community is: there are so many knowledgable individuals who are so ready for collaboration and adoption - it doesn't matter the SIG or group.
  • how modular Cluster-API is: Cluster-API components (bootstrap, controlPlane, core, infrastructure) can be swapped out and meshed together in very cool ways.

Credits

Integrating Talos into this project would not be possible without help from Andrew Rynhard (Talos Systems), huge thanks to him for reaching out for pairing and co-authoring.

Notes and references


Hope you've enjoyed the output of this project! Thank you!