frjansso, Cisco Employee
NOTE: This is WIP, and I'm aiming to have this hosted on dCloud as a hands on lab. Stay tuned!
 
The aim of this lab is to build a Kubernetes (k8s) cluster with three nodes. In this cluster we will install NSO with one master and two slaves. Using these, we are going to experiment with leader election and failover.
 
The master node (kube-1) hosts the bulk of the k8s machinery, i.e. the API server, controller manager and scheduler. In a production environment, you would typically not schedule any pods on the master node. For this lab we'll remove that restriction and use the master node as well.
 
The first test will not have persistent storage for NSO, so if all pods die, all data is lost. We will therefore move on to install a persistent storage provider (GlusterFS) on all three nodes. With this in place, we will set up NSO again, but now on persistent storage. Leader election and failover will still work as before, but now we can also survive an outage of all three NSO pods.
 
The nodes are called kube-1, kube-2 and kube-3. Kube-1 will serve as the k8s master.
 
Credentials
The credentials on the nodes are:
username: test
password: cisco123
 
Install Kubernetes on Three Nodes
 
SSH to the Master Node (kube-1)
ssh test@X.X.X.X
 
Initialize k8s on the master node
Make sure docker is working and can pull images
test@kube-1:~$ docker run --rm -it alpine:latest /bin/sh
Unable to find image 'alpine:latest' locally
latest: Pulling from library/alpine
8e3ba11ec2a2: Pull complete
Digest: sha256:7043076348bf5040220df6ad703798fd8593a0918d06d3ce30c6c93be117e430
Status: Downloaded newer image for alpine:latest
/ # exit
 
Next we're going to use kubeadm to set up the master node, followed later by the slave nodes.
 
Sometimes kubeadm times out when downloading the images, therefore it's a good idea to pull the images first. Please run the following until all images are pulled.
 
test@kube-1:~$ kubeadm config images pull
[config/images] Pulled k8s.gcr.io/kube-apiserver-amd64:v1.11.2
[config/images] Pulled k8s.gcr.io/kube-controller-manager-amd64:v1.11.2
[config/images] Pulled k8s.gcr.io/kube-scheduler-amd64:v1.11.2
[config/images] Pulled k8s.gcr.io/kube-proxy-amd64:v1.11.2
[config/images] Pulled k8s.gcr.io/pause:3.1
[config/images] Pulled k8s.gcr.io/etcd-amd64:3.2.18
failed to pull image "k8s.gcr.io/coredns:1.1.3": exit status 1
 
If you get an error like the one above, please run the command again.
test@kube-1:~$ kubeadm config images pull
[config/images] Pulled k8s.gcr.io/kube-apiserver-amd64:v1.11.2
[config/images] Pulled k8s.gcr.io/kube-controller-manager-amd64:v1.11.2
[config/images] Pulled k8s.gcr.io/kube-scheduler-amd64:v1.11.2
[config/images] Pulled k8s.gcr.io/kube-proxy-amd64:v1.11.2
[config/images] Pulled k8s.gcr.io/pause:3.1
[config/images] Pulled k8s.gcr.io/etcd-amd64:3.2.18
[config/images] Pulled k8s.gcr.io/coredns:1.1.3
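If you prefer to automate the retry, a simple shell loop works as well (a sketch, assuming kubeadm exits non-zero when a pull fails):

test@kube-1:~$ until kubeadm config images pull; do echo "pull failed, retrying in 5 seconds"; sleep 5; done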
 
Pod Networking
When installing k8s, you have a wide variety of pod networking to choose from. For the lab, we've decided to use Flannel (https://github.com/coreos/flannel). "Flannel is responsible for providing a layer 3 IPv4 network between multiple nodes in a cluster. Flannel does not control how containers are networked to the host, only how the traffic is transported between hosts."
 
To install k8s, we'll use the kubeadm init command, and since we're going to use Flannel as our pod network, we will pass the 10.244.0.0/16 CIDR to kubeadm. This means that our containers will end up with 10.244.x.y addresses.
 
test@kube-1:~$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16
[init] using Kubernetes version: v1.11.2
[preflight] running pre-flight checks
I0820 14:49:38.345563    5854 kernel_validator.go:81] Validating kernel version
I0820 14:49:38.345804    5854 kernel_validator.go:96] Validating kernel config
[preflight/images] Pulling images required for setting up a Kubernetes cluster
[preflight/images] This might take a minute or two, depending on the speed of your internet connection
[preflight/images] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[preflight] Activating the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [kube-1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.0.0.4]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [kube-1 localhost] and IPs [127.0.0.1 ::1]
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [kube-1 localhost] and IPs [10.0.0.4 127.0.0.1 ::1]
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] this might take a minute or longer if the control plane images have to be pulled
[apiclient] All control plane components are healthy after 40.001870 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.11" in namespace kube-system with the configuration for the kubelets in the cluster
[markmaster] Marking the node kube-1 as master by adding the label "node-role.kubernetes.io/master=''"
[markmaster] Marking the node kube-1 as master by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "kube-1" as an annotation
[bootstraptoken] using token: pahwji.frci87slyux729v0
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
 
Your Kubernetes master has initialized successfully!
 
To start using your cluster, you need to run the following as a regular user:
 
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
 
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
 
You can now join any number of machines by running the following on each node
as root:
 
  kubeadm join 10.0.0.4:6443 --token pahwji.frci87slyux729v0 --discovery-token-ca-cert-hash sha256:8fdd45a2a0004788cde9c6b89fcdec56931fcda1c74322ac7bccfa6f88a14ee2
 
Now copy the kubeconfig into the user's home directory so kubectl can use it
test@kube-1:~$ mkdir -p $HOME/.kube
test@kube-1:~$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
test@kube-1:~$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
 
If we check the nodes, we'll see our master node, but it's not ready. To fix that, we need to install a pod network.
test@kube-1:~$ kubectl get nodes
NAME      STATUS     ROLES     AGE       VERSION
kube-1    NotReady  master    33s       v1.11.2
 
Install Flannel
To install Flannel we simply apply the YAML provided by the Flannel project:
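The exact manifest location depends on the Flannel release; at the time of writing the command looked something like this:

test@kube-1:~$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml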
 
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created
daemonset.extensions/kube-flannel-ds-arm64 created
daemonset.extensions/kube-flannel-ds-arm created
daemonset.extensions/kube-flannel-ds-ppc64le created
daemonset.extensions/kube-flannel-ds-s390x created
 
After Flannel is installed, you should shortly see the master node become Ready
test@kube-1:~$ kubectl get nodes
NAME      STATUS    ROLES     AGE       VERSION
kube-1    Ready     master    21m       v1.11.2
 
Allow scheduling on the master node
To allow scheduling of workloads on the master node, we need to untaint it.
test@kube-1:~$ kubectl taint nodes --all node-role.kubernetes.io/master-
node/kube-1 untainted
 
If we don't do this, no pods will be scheduled on the master node. In a production system this makes sense, but for this lab we want to use all three nodes.
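If you want to confirm that the taint is gone, check the node description; the Taints line should now show <none>:

test@kube-1:~$ kubectl describe node kube-1 | grep -i taints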
 
Add Slave Nodes
 
Next thing is to add our two slave nodes, kube-2 and kube-3, to the k8s cluster. 
 
The kubeadm init command prints a join command that can be used by the slaves, but you can also generate a new one:
test@kube-1:~$ kubeadm token create --print-join-command
kubeadm join 10.0.0.4:6443 --token w23h6m.ze6f4j7c0xanskd2 --discovery-token-ca-cert-hash sha256:bbb0d19cdc33a13db449121676326dcf5b140925bc0ab973365d5a17b7158008
 
NOTE: You cannot copy the join commands from here; you have to generate your own as explained above, or copy the one output by the kubeadm init command.
 
NOTE: You have to do the following on both kube-2 and kube-3.
 
SSH to kube-2 and kube-3
 
test@kube-1:~$ ssh kube-2
 
test@kube-2:~$ sudo kubeadm join 10.0.0.4:6443 --token qohtw7.ufsn3xma6imzj80n --discovery-token-ca-cert-hash sha256:4629539b39ea7f7663e80fc2ea902a7b39e467df396d848c4242c13088ad61a2
[preflight] running pre-flight checks
I0821 09:28:45.475918     673 kernel_validator.go:81] Validating kernel version
I0821 09:28:45.476369     673 kernel_validator.go:96] Validating kernel config
[discovery] Trying to connect to API Server "10.0.0.4:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.0.0.4:6443"
[discovery] Requesting info from "https://10.0.0.4:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.0.0.4:6443"
[discovery] Successfully established connection with API Server "10.0.0.4:6443"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.11" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[preflight] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "kube-2" as an annotation
 
This node has joined the cluster:
* Certificate signing request was sent to master and a response
  was received.
* The Kubelet was informed of the new secure connection details.
 
Run 'kubectl get nodes' on the master to see this node join the cluster.
 
test@kube-2:~$ exit
 
Please repeat the same steps on kube-3.
 
After a while the new nodes should come up
test@kube-1:~$ kubectl get nodes
NAME      STATUS    ROLES     AGE       VERSION
kube-1    Ready     master    59m       v1.11.2
kube-2    Ready     <none>    35m       v1.11.2
kube-3    Ready     <none>    36s       v1.11.2
 
We now have a working k8s cluster.
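As a quick sanity check, you can also verify that all the system pods (etcd, API server, controller manager, scheduler, CoreDNS, kube-proxy and Flannel) are up and running:

test@kube-1:~$ kubectl get pods -n kube-system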
 
Create a local Docker registry
We're going to push our containers to a local registry; this will speed up and simplify things when we build our own NSO containers.
 
To start the registry, on kube-1:
test@kube-1:~$ docker run -d -p 5000:5000 --restart=always --name registry registry:2
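NOTE: This registry is served over plain HTTP. The lab nodes are expected to already accept that, but if pushes or pulls from kube-1:5000 fail with TLS errors, each node needs kube-1:5000 listed as an insecure registry, roughly like this (this overwrites any existing /etc/docker/daemon.json, so merge by hand if the file already has content):

test@kube-1:~$ echo '{ "insecure-registries": ["kube-1:5000"] }' | sudo tee /etc/docker/daemon.json
test@kube-1:~$ sudo systemctl restart docker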
 
To test the docker registry, first pull an image from Docker Hub
test@kube-1:~$ docker pull alpine:latest
latest: Pulling from library/alpine
Digest: sha256:7043076348bf5040220df6ad703798fd8593a0918d06d3ce30c6c93be117e430
Status: Image is up to date for alpine:latest
 
Next, let's tag it for our local registry
test@kube-1:~$ docker tag alpine:latest kube-1:5000/alpine:latest
 
Now push it into the registry
test@kube-1:~$ docker push kube-1:5000/alpine:latest
The push refers to a repository [10.0.0.4:5000/alpine]
73046094a9b8: Pushed
latest: digest: sha256:7e8e16517347ec0b19fac1700682cc0b905b4c10d91a5f409452d2327fe49b8d size: 528
 
To make sure we can pull from our registry, we'll remove the local image
test@kube-1:~$ docker rmi kube-1:5000/alpine
Untagged: kube-1:5000/alpine:latest
Untagged: kube-1:5000/alpine@sha256:7e8e16517347ec0b19fac1700682cc0b905b4c10d91a5f409452d2327fe49b8d
Deleted: sha256:11cd0b38bc3ceb958ffb2f9bd70be3fb317ce7d255c8a4c3f4af30e298aa1aab
Deleted: sha256:73046094a9b835e443af1a9d736fcfc11a994107500e474d0abf399499ed280c
 
Then we pull it back
test@kube-1:~$ docker pull kube-1:5000/alpine:latest
latest: Pulling from alpine
f031366bd5fe: Pull complete
Digest: sha256:7e8e16517347ec0b19fac1700682cc0b905b4c10d91a5f409452d2327fe49b8d
Status: Downloaded newer image for kube-1:5000/alpine:latest
 
All good!
 
Create NSO base image
We're going to create an NSO base image that we'll base the project image on. This is not necessary, but I've found that it saves time during development to have a base image onto which I install the NSO project.
 
Go to the k8s-lab/base-container directory, where you'll find a number of files; the most interesting is the Dockerfile.
 
It will install a couple of packages needed by NSO. It also installs curl which will be helpful later on.
FROM ubuntu:16.04
 
COPY nso-4.6.1.3.linux.x86_64.installer.bin /tmp/nso
RUN apt-get update; \
    apt-get install -y openssh-client default-jre-headless python curl; \
    /tmp/nso /app/nso; \
    echo '. /app/nso/ncsrc' >> /root/.bashrc; \
    apt-get -y clean autoclean; \
    apt-get -y autoremove; \
    rm -rf /tmp/* /var/tmp/* /var/lib/{apt,dpkg,cache,log}/
 
EXPOSE 8080 830 2022 2023 4569
 
It installs NSO in /app/nso and cleans up after apt. Finally, it exposes a number of ports that NSO uses.
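The build.sh script itself is not shown here; in essence it does something like the following sketch (assuming the image name kube-1:5000/nso-base seen in the push output below):

#!/bin/bash
# Build the NSO base image and push it to the local registry
docker build -t kube-1:5000/nso-base:latest .
docker push kube-1:5000/nso-base:latest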
 
To build and push the image to the local registry, please run the following (NOTE: this can take a couple of minutes).
test@kube-1:~$ ./build.sh
Sending build context to Docker daemon 188.5 MB
Step 1/4 : FROM ubuntu:16.04
---> 7aa3602ab41e
Step 2/4 : COPY nso-4.6.1.3.linux.x86_64.installer.bin /tmp/nso
---> Using cache
---> 5a3f5d3eb822
Step 3/4 : RUN apt-get update;     apt-get install -y openssh-client default-jre-headless python git curl;     /tmp/nso /app/nso;     echo '. /app/nso/ncsrc' >> /root/.bashrc;     apt-get -y clean autoclean;     apt-get -y autoremove;     rm -rf /tmp/* /var/tmp/* /var/lib/{apt,dpkg,cache,log}/
 
...
 
Successfully built a35fb1e9e6d3
The push refers to a repository [kube-1:5000/nso-base]
6b7ec8654278: Pushed
869384c52942: Pushed
bcff331e13e3: Pushed
2166dba7c95b: Pushed
5e95929b2798: Pushed
c2af38e6b250: Pushed
0a42ee6ceccb: Pushed
latest: digest: sha256:bbd34574f8af11322ce726d160f48bb9766367d9ebcdd8aece8629bfd1f6f0d6 size: 1783
 
NSO Container
Next we will build our NSO container that contains the packages we use for this test. Please navigate to k8s-lab/nso-container
 
This Dockerfile is quite simple, just adding the NSO project to the base container.
 
from kube-1:5000/nso-base
 
COPY nso-app /app/nso_project
COPY run-nso.sh /.
 
EXPOSE 4570
 
ENTRYPOINT ["/run-nso.sh"]
 
To build, simply run
test@kube-1:~$ ./build.sh
Sending build context to Docker daemon     64kB
Step 1/5 : from kube-1:5000/nso-base
---> 7c3dc8b25152
Step 2/5 : COPY nso-app /app/nso_project
---> cec6b567551b
Step 3/5 : COPY run-nso.sh /.
---> c671eee6e239
Step 4/5 : EXPOSE 4570
---> Running in f8c5440dd665
Removing intermediate container f8c5440dd665
---> 53814cc9fce8
Step 5/5 : ENTRYPOINT ["/run-nso.sh"]
---> Running in 51cc1c16771a
Removing intermediate container 51cc1c16771a
---> f1dd50c21e62
Successfully built f1dd50c21e62
Successfully tagged kube-1:5000/nso-k8s-lab:latest
The push refers to repository [kube-1:5000/nso-k8s-lab]
c8c8f4c5557e: Pushed
7f37e62a2dda: Pushed
e291435ad52b: Mounted from nso-base
4458e05c75c6: Mounted from nso-base
d7232280c8c4: Mounted from nso-base
07663827a77f: Mounted from nso-base
87a2d0000622: Mounted from nso-base
4a7a5ec0f29e: Mounted from nso-base
8823818c4748: Mounted from nso-base
latest: digest: sha256:3a80da141f03dc3205c91d9afec2c7606edaa5d19acaa951c434ef7a6bcf2ab9 size: 2199
 
Now we have the NSO container we'll use for the rest of the lab.
 
NSO Service and RBAC
To access NSO we'll create a couple of k8s services; we'll also install a k8s role for leader election.
 
Please go to the k8s-lab/nso-service directory
 
We will register two services for NSO; please see nso-service.yml
kind: Service
apiVersion: v1
metadata:
  name: nso-svc
spec:
  clusterIP: None
  ports:
  - protocol: TCP
    port: 8080
    name: webui
  selector:
    app: nso-app
---
kind: Service
apiVersion: v1
metadata:
  name: nso-svc-master
spec:
  ports:
  - protocol: TCP
    port: 2024
    name: ssh
  - protocol: TCP
    port: 8080
    name: webui
 
nso-svc 
This headless service covers all NSO instances and could be used for read-only access; port 8080 is exposed.
 
nso-svc-master 
This service points only to the NSO HA master and exposes ports 8080 and 2024.
 
To get leader election to work, we also need to give the default service account access to k8s endpoints; please see rbac.yml
# Create role to allow user to read endpoints
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: default
  name: endpoint-reader
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["endpoints"]
  verbs: ["get", "watch", "list", "update"]
---
# This role binding allows "default" to read endpoints in the "default" namespace.
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: read-endpoints
  namespace: default
subjects:
- kind: User
  name: system:serviceaccount:default:default
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: endpoint-reader
  apiGroup: rbac.authorization.k8s.io
 
Make sure you apply both these files
test@kube-1:~$ kubectl create -f nso-service.yml
service/nso-svc created
service/nso-svc-master created
 
test@kube-1:~$ kubectl create -f rbac.yml
role.rbac.authorization.k8s.io/endpoint-reader created
rolebinding.rbac.authorization.k8s.io/read-endpoints created
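You can optionally verify that the default service account now has the intended access; the following should answer yes:

test@kube-1:~$ kubectl auth can-i get endpoints --as=system:serviceaccount:default:default -n default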
 
Make sure the services are registered
test@kube-1:~/nfs/kube-ha/k8s-lab/nso-service$ kubectl get svc
NAME             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
kubernetes       ClusterIP   10.96.0.1       <none>        443/TCP             6h
nso-svc          ClusterIP   None            <none>        8080/TCP            14s
nso-svc-master   ClusterIP   10.108.126.92   <none>        2024/TCP,8080/TCP   14s
 
It will also simplify things later if we export the IP address of the NSO master service
 
test@kube-1:~$ export NSO_IP=$(kubectl get service nso-svc-master | tail -n +2 | awk '{print $3}')
 
test@kube-1:~$ echo $NSO_IP
10.102.2.60
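An equivalent, slightly more robust way to grab the same address is to ask kubectl for the cluster IP directly:

test@kube-1:~$ export NSO_IP=$(kubectl get service nso-svc-master -o jsonpath='{.spec.clusterIP}')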
 
Test deployment of NSO
In our first test, we will deploy NSO with one master and two slaves, distributed on the three nodes. 
 
To elect a master among the NSO instances, we launch a leader-elector container as a sidecar to NSO. The leader elector exposes an HTTP API that each pod can query to find the master: an HTTP GET on http://localhost:4040 returns the hostname of the current master.
 
Please go to the k8s-lab/test-deployment directory and look at nso.yml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nso
spec:
  selector:
    matchLabels:
      app: nso-app
  serviceName: nso-svc
  # Deploy three instances of NSO
  replicas: 3
  template:
    metadata:
      labels:
        app: nso-app
    spec:
      # Specify anti-affinity to make sure NSO nodes are not
      # scheduled on the same k8s node
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - nso-app
              topologyKey: "kubernetes.io/hostname"
      containers:
        - name: nso-master
          image: kube-1:5000/nso-k8s-lab
          # Probes for liveness and readiness
          livenessProbe:
            httpGet:
              path: /restconf/data/ncs-state
              port: 8080
              httpHeaders:
                - name: Authorization
                  # echo -n "admin:admin" | base64
                  value: Basic YWRtaW46YWRtaW4=
          readinessProbe:
            httpGet:
              path: /restconf/data/ncs-state
              port: 8080
              httpHeaders:
                - name: Authorization
                  # echo -n "admin:admin" | base64
                  value: Basic YWRtaW46YWRtaW4=
          ports:
          - containerPort: 2024
            name: ssh
          - containerPort: 8080
            name: webui
        # The leader elector side car. HTTP GET on http://localhost:4040
        # will return the leader
        - name: elector
          image: fredrikjanssonse/leader-elector:0.6
          args:
            - --election=nso-svc
            - --http=localhost:4040
          ports:
            - containerPort: 4040
              protocol: TCP
 
And to deploy the test
 
test@kube-1:~$ kubectl create -f nso.yml
statefulset.apps/nso created
 
NOTE: This might take a while since the NSO image has to be pulled from the registry to each of the k8s nodes.
 
To continuously check status you can watch the pods.
 
NOTE: The node on which each NSO pod is scheduled may vary on your system.
test@kube-1:~$ watch kubectl get pod -o=wide
NAME      READY     STATUS    RESTARTS   AGE       IP           NODE      NOMINATED NODE
nso-0     2/2       Running   0          2m        10.244.2.2   kube-3    <none>
nso-1     2/2       Running   0          1m        10.244.1.2   kube-2    <none>
nso-2     2/2       Running   0          1m        10.244.0.4   kube-1    <none>
 
After a while you should see the pods deployed, one on each node.
 
To see which pod was elected leader, you have a couple of options
test@kube-1:~$ kubectl describe endpoints nso-svc
Name:         nso-svc
Namespace:    default
Labels:       <none>
Annotations:  control-plane.alpha.kubernetes.io/leader={"holderIdentity":"nso-0","leaseDurationSeconds":10,"acquireTime":"2018-08-21T22:17:15Z","renewTime":"2018-08-21T22:19:44Z","leaderTransitions":0}
Subsets:
  Addresses:          10.244.0.4,10.244.1.2,10.244.2.2
  NotReadyAddresses:  <none>
  Ports:
    Name   Port  Protocol
    ----   ----  --------
    webui  8080  TCP
 
Events:  <none>
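The current leader is recorded in the control-plane.alpha.kubernetes.io/leader annotation above (the holderIdentity field). A quick way to pull out just that annotation:

test@kube-1:~$ kubectl get endpoints nso-svc -o yaml | grep control-plane.alpha.kubernetes.io/leader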
 
Or you can log into one of the pods
 
test@kube-1:~$ kubectl exec -it nso-0 bash
Defaulting container name to nso-master.
Use 'kubectl describe pod/nso-0 -n default' to see all of the containers in this pod.
 
root@nso-0:/# curl http://localhost:4040
{"name":"nso-0"}
 
root@nso-0:/# exit
 
 
We can also query the NSO HA status
test@kube-1:~$ kubectl exec -it nso-0 bash
Defaulting container name to nso-master.
Use 'kubectl describe pod/nso-0 -n default' to see all of the containers in this pod.
 
root@nso-0:/# ncs_load -F o -o -p /ncs-state/ha
{
  "data": {
    "tailf-ncs-monitoring:ncs-state": {
      "ha": {
        "mode": "master",
        "node-id": "nso-0",
        "connected-slave": ["nso-1", "nso-2"]
      }
    }
  }
}
 
root@nso-0:/# exit
 
So we can see that the NSO master has two connected slaves.
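The same HA state can also be queried over RESTCONF from the host, via the master service, without exec'ing into a pod (the same call is used again later in this lab):

test@kube-1:~$ http_proxy="" curl -u admin:admin http://$NSO_IP:8080/restconf/data/ncs-state/ha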
 
Test HA
To test HA we can modify some data in NSO. For this there is a simple action in NSO that increments a counter (in config) and stores which host (i.e. the master) last made the change.
 
We can invoke the test action as follows
 
test@kube-1:~$ http_proxy="" curl -H "Content-Type: application/yang-data+json" -d "" -u admin:admin http://$NSO_IP:8080/restconf/data/ha-test/ping
 
And to check the result
 
test@kube-1:~$ http_proxy="" curl -u admin:admin http://$NSO_IP:8080/restconf/data/ha-test
 
<ha-test xmlns="http://example.com/ha-test"  xmlns:ha-test="http://example.com/ha-test">
  <count>2</count>
  <last-host>nso-0</last-host>
</ha-test>
 
So in my run above, I can see that nso-0 is the master.
 
Let's kill the nso-0 pod and force a leadership change. 
 
Sometimes the restart of the master is so quick that a new leader is not elected. If that happens, we can use a trick to force selection of a new leader. The pods are configured not to run on the same node (see nso.yml), so if we disable scheduling on one node, we can prevent one of the pods from coming up.
 
First let's see where each pod is running
test@kube-1:~$ kubectl get pods -o=wide
NAME                     READY     STATUS    RESTARTS   AGE       IP            NODE      NOMINATED NODE
nso-0                    2/2       Running   0          1m        10.244.2.10   kube-3   <none>
nso-1                    2/2       Running   0          59s       10.244.1.11   kube-2    <none>
nso-2                    2/2       Running   0          42s       10.244.0.4    kube-1    <none>
 
Here I see that nso-0, the leader, is running on kube-3. Let's disable scheduling on that node
 
NOTE: Make sure you select the correct node for your system.
 
test@kube-1:~$ kubectl cordon kube-3
node/kube-3 cordoned
 
And now if we delete nso-0, it will not come up until we uncordon the node
 
test@kube-1:~$ kubectl delete pod nso-0
pod "nso-0" deleted
 
Now we can run the action test again
 
test@kube-1:~$ http_proxy="" curl -H "Content-Type: application/yang-data+json" -d "" -u admin:admin http://$NSO_IP:8080/restconf/data/ha-test/ping
 
And to check the result
 
test@kube-1:~$ http_proxy="" curl -u admin:admin http://$NSO_IP:8080/restconf/data/ha-test
 
 
<ha-test xmlns="http://example.com/ha-test"  xmlns:ha-test="http://example.com/ha-test">
  <count>4</count>
  <last-host>nso-1</last-host>
</ha-test>
 
And voilà, the new master is nso-1 and, more importantly, the data is not lost.
 
Let's uncordon our node again to let nso-0 be scheduled.
 
test@kube-1:~$ kubectl uncordon kube-3
node/kube-3 uncordoned
 
test@kube-1:~/nfs/kube-ha/k8s-lab/test-deployment$ kubectl get pods
NAME                     READY     STATUS    RESTARTS   AGE
nso-0                    1/2       Running   0          1m
nso-1                    2/2       Running   0          14m
nso-2                    2/2       Running   0          14m
 
All good... but what will happen if we kill all pods?
test@kube-1:~$ kubectl get pods | grep nso | awk '{print $1}' | xargs kubectl delete pod
pod "nso-0" deleted
pod "nso-1" deleted
pod "nso-2" deleted
 
Wait for the pods to come back up
test@kube-1:~$ watch kubectl get pod -o=wide
NAME      READY     STATUS    RESTARTS   AGE       IP           NODE      NOMINATED NODE
nso-0     2/2       Running   0          2m        10.244.2.2   kube-3    <none>
nso-1     2/2       Running   0          1m        10.244.1.2   kube-2    <none>
nso-2     2/2       Running   0          1m        10.244.0.4   kube-1    <none>
 
Now let's check the config again
 
test@kube-1:~$ http_proxy="" curl -u admin:admin http://$NSO_IP:8080/restconf/data/ha-test
 
Empty CDB: when all pods are gone and there is no persistent storage, all data is lost.
 
Let's delete the current deployment
$ kubectl delete -f nso.yml
statefulset.apps "nso" deleted
 
Shortly the pods should be removed
$ kubectl get pod
No resources found.
 
 
NSO with Persistent Storage
 
Install GlusterFS
We'll install GlusterFS as our replicated storage. 
 
On the master node, we've already checked out a git repo that will help with setting up GlusterFS.
 
The first thing we need to do is load a couple of kernel modules on the k8s nodes. Please go to k8s-lab/install-glusterfs.
 
There you'll run an Ansible playbook that loads the relevant modules on all the hosts.
 
test@kube-1:~$ ansible-playbook -i hosts.yml glusterfs.yml
 
PLAY [Build Hosts] ***************************************************************************************************
 
TASK [Gathering Facts] ***********************************************************************************************
ok: [kube-2]
ok: [kube-3]
ok: [kube-1]
 
TASK [Modprobe] ******************************************************************************************************
ok: [kube-3] => (item=dm_mirror)
ok: [kube-1] => (item=dm_mirror)
ok: [kube-2] => (item=dm_mirror)
ok: [kube-2] => (item=dm_snapshot)
ok: [kube-3] => (item=dm_snapshot)
ok: [kube-1] => (item=dm_snapshot)
ok: [kube-2] => (item=dm_thin_pool)
ok: [kube-1] => (item=dm_thin_pool)
ok: [kube-3] => (item=dm_thin_pool)
 
PLAY RECAP ***********************************************************************************************************
kube-1                     : ok=2    changed=0    unreachable=0    failed=0
kube-2                     : ok=2    changed=0    unreachable=0    failed=0
kube-3                     : ok=2    changed=0    unreachable=0    failed=0
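For reference, the playbook is equivalent to running something like the following on each of the three nodes by hand:

$ sudo modprobe -a dm_mirror dm_snapshot dm_thin_pool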
 
 
Please go to ~/gluster-k8s/deploy. 
 
There you'll find a file called topology.json
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": [
                "kube-1"
              ],
              "storage": [
                "10.0.0.4"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdb"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "kube-2"
              ],
              "storage": [
                "10.0.0.5"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdb",
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "kube-3"
              ],
              "storage": [
                "10.0.0.6"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdb",
          ]
        }
      ]
    }
  ]
}
 
It already has the correct names and IP addresses for your k8s nodes.
 
To install GlusterFS we simply run the gk-deploy script, with -g to provision the gluster pods as well.
$ ./gk-deploy -v -g
...
 
Deployment complete!
 
Now navigate back to k8s-lab/install-glusterfs; there you'll find a file called storage-class.yml.
 
NOTE: This file needs to be modified to point to the heketi/glusterfs service.
 
$ kubectl get service heketi
NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
heketi                     ClusterIP   10.98.79.86     <none>        8080/TCP            15m
 
Edit storage-class.yml to match the IP address above
$ vim storage-class.yml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-storage
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://10.98.79.86:8080"
 
NOTE: If you're behind a proxy, you'll need to add the IP address above to the no_proxy setting for the k8s controller manager. Please modify /etc/kubernetes/manifests/kube-controller-manager.yaml
 
...
    - name: no_proxy
      value: .cisco.com,localhost,127.0.0.1,10.0.0.0/8,kube-1,kube-2,192.168.0.0/16,10.98.79.86
...
Modifying this file will make the kubelet automatically restart the controller manager.
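You can verify that the kubelet picked up the change by checking that the controller manager pod has restarted:

$ kubectl -n kube-system get pods | grep kube-controller-manager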
 
Now let's create the storage class
$ kubectl create -f storage-class.yml
storageclass.storage.k8s.io/glusterfs-storage created
 
To test the storage, we will create a persistent volume claim; please see pvc.yml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: glusterfs-storage
 
$ kubectl create -f pvc.yml
persistentvolumeclaim/test created
 
After a while you should see
$ kubectl get pvc
NAME      STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
test      Bound     pvc-be2c9bcd-a61e-11e8-9703-0050568923a4   1Gi        RWO            glusterfs-storage   24s
 
We can also check the persistent volume created from this claim
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM          STORAGECLASS        REASON    AGE
pvc-be2c9bcd-a61e-11e8-9703-0050568923a4   1Gi        RWO            Delete           Bound     default/test   glusterfs-storage             1m
 
Now that we know it works, we can delete the volume claim
$ kubectl delete -f pvc.yml
persistentvolumeclaim "test" deleted
 
After about a minute, the claim and the volume should be gone
$ kubectl get pv,pvc
No resources found.
 
With that, we have persistent, distributed storage installed.
 
Let's move back to NSO to utilize this.
 
NSO on GlusterFS
 
The next step is to run our NSO containers on the persistent storage. 
 
The directories that need to be persisted in an NSO project are logs, ncs-cdb and state.
 
Please navigate to k8s-lab/persistent-nso and check out nso.yml. It has been modified from the previous version
 
FIXME: Export NC port
FIXME: Use HA state for readiness and liveness
 
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nso
spec:
  selector:
    matchLabels:
      app: nso-app
  serviceName: nso-svc
  replicas: 3
  template:
    metadata:
      labels:
        app: nso-app
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - nso-app
              topologyKey: "kubernetes.io/hostname"
      containers:
        - name: nso-master
          image: kube-1:5000/nso-k8s-lab
          livenessProbe:
            httpGet:
              path: /restconf/data/ncs-state
              port: 8080
              httpHeaders:
                - name: Authorization
                  # echo -n "admin:admin" | base64
                  value: Basic YWRtaW46YWRtaW4=
          readinessProbe:
            httpGet:
              path: /restconf/data/ncs-state
              port: 8080
              httpHeaders:
                - name: Authorization
                  # echo -n "admin:admin" | base64
                  value: Basic YWRtaW46YWRtaW4=
          ports:
          - containerPort: 2024
            name: ssh
          - containerPort: 8080
            name: webui
          volumeMounts:
            - name: logs
              mountPath: /app/nso_project/logs
            - name: state
              mountPath: /app/nso_project/state
            - name: cdb
              mountPath: /app/nso_project/ncs-cdb
 
        - name: elector
          image: fredrikjanssonse/leader-elector:0.6
          args:
            - --election=nso-svc
            - --http=localhost:4040
          ports:
            - containerPort: 4040
              protocol: TCP
  volumeClaimTemplates:
  - metadata:
      name: cdb
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: glusterfs-storage
      resources:
        requests:
          storage: 1Gi
  - metadata:
      name: logs
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: glusterfs-storage
      resources:
        requests:
          storage: 1Gi
  - metadata:
      name: state
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: glusterfs-storage
      resources:
        requests:
          storage: 1Gi
 
FIXME: Certificates for NSO SSL, from Bruce
 
So for each of cdb, logs and state, we will create a persistent volume claim for 1Gi of storage.
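Once the pods are up (after the deployment below), you can verify that CDB really lives on a Gluster-backed volume by checking the mount from inside a pod, for example:

$ kubectl exec -it nso-0 -c nso-master -- df -h /app/nso_project/ncs-cdb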
 
Let's deploy this
$ kubectl create -f nso.yml
statefulset.apps/nso created
 
This will take some time; you can check the status with
$ watch kubectl get pods,pvc,pv
 
Every 2.0s: kubectl get pods,pvc,pv                                                                                                                                                                                                                       kube-1: Wed Aug 22 10:15:32 2018
 
NAME                         READY     STATUS    RESTARTS   AGE
pod/glusterfs-4dgmv          1/1       Running   0          17m
pod/glusterfs-b45nm          1/1       Running   2          2h
pod/glusterfs-gvqfq          1/1       Running   1          2h
pod/heketi-86f98754c-cmt6s   1/1       Running   4          2h
pod/nso-0                    2/2       Running   0          1m
pod/nso-1                    2/2       Running   0          43s
pod/nso-2                    0/2       Pending   0          8s
 
NAME                                STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
persistentvolumeclaim/cdb-nso-0     Bound     pvc-cf316753-a62e-11e8-a520-0050568923a4   1Gi        RWO            glusterfs-storage   1m
persistentvolumeclaim/cdb-nso-1     Bound     pvc-e05c7112-a62e-11e8-a520-0050568923a4   1Gi        RWO            glusterfs-storage   43s
persistentvolumeclaim/cdb-nso-2     Pending                                                                        glusterfs-storage   8s
persistentvolumeclaim/logs-nso-0    Bound     pvc-cf31f0ac-a62e-11e8-a520-0050568923a4   1Gi        RWO            glusterfs-storage   1m
persistentvolumeclaim/logs-nso-1    Bound     pvc-e059086b-a62e-11e8-a520-0050568923a4   1Gi        RWO            glusterfs-storage   43s
persistentvolumeclaim/logs-nso-2    Pending                                                                        glusterfs-storage   8s
persistentvolumeclaim/state-nso-0   Bound     pvc-cf3289c0-a62e-11e8-a520-0050568923a4   1Gi        RWO            glusterfs-storage   1m
persistentvolumeclaim/state-nso-1   Bound     pvc-e05ba39d-a62e-11e8-a520-0050568923a4   1Gi        RWO            glusterfs-storage   43s
persistentvolumeclaim/state-nso-2   Pending                                                                        glusterfs-storage   8s
 
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                 STORAGECLASS        REASON    AGE
persistentvolume/pvc-cf316753-a62e-11e8-a520-0050568923a4   1Gi        RWO            Delete           Bound     default/cdb-nso-0     glusterfs-storage             55s
persistentvolume/pvc-cf31f0ac-a62e-11e8-a520-0050568923a4   1Gi        RWO            Delete           Bound     default/logs-nso-0    glusterfs-storage             59s
persistentvolume/pvc-cf3289c0-a62e-11e8-a520-0050568923a4   1Gi        RWO            Delete           Bound     default/state-nso-0   glusterfs-storage             55s
persistentvolume/pvc-e059086b-a62e-11e8-a520-0050568923a4   1Gi        RWO            Delete           Bound     default/logs-nso-1    glusterfs-storage             25s
persistentvolume/pvc-e05ba39d-a62e-11e8-a520-0050568923a4   1Gi        RWO            Delete           Bound     default/state-nso-1   glusterfs-storage             21s
persistentvolume/pvc-e05c7112-a62e-11e8-a520-0050568923a4   1Gi        RWO            Delete           Bound     default/cdb-nso-1     glusterfs-storage             29s
 
After a while you should see all pods reach status Running; at that point press Ctrl-C to break the watch.
 
Let's create some configuration data in NSO
http_proxy="" curl -H "Content-Type: application/yang-data+json" -d "" -u admin:admin http://$NSO_IP:8080/restconf/data/ha-test/ping
 
And to check the result
 
$ http_proxy="" curl -u admin:admin http://$NSO_IP:8080/restconf/data/ha-test
 
<ha-test xmlns="http://example.com/ha-test"  xmlns:ha-test="http://example.com/ha-test">
  <count>1</count>
  <last-host>nso-0</last-host>
</ha-test>
 
 
Now let's get the HA status
 
$ http_proxy="" curl -u admin:admin http://$NSO_IP:8080/restconf/data/ncs-state/ha
 
  <mode>master</mode>
  <node-id>nso-0</node-id>
  <connected-slave>nso-1</connected-slave>
  <connected-slave>nso-2</connected-slave>
</ha>
 
Test HA Failover
 
By running the command above, we can see that nso-0 is the master. 
 
The first test is simply to kill that pod and make sure everything comes up again.
 
NOTE: Make sure you kill the pod that is the current leader when you run this.
 
$ kubectl delete pod nso-0
pod "nso-0" deleted
 
After a short while, nso-0 will be back. Sometimes, depending on timing, nso-0 might still be the leader.
 
$ http_proxy="" curl -u admin:admin http://$NSO_IP:8080/restconf/data/ncs-state/ha
 
  <mode>master</mode>
  <node-id>nso-0</node-id>
  <connected-slave>nso-1</connected-slave>
  <connected-slave>nso-2</connected-slave>
</ha>
 
If nso-0 is still the leader, we can use the cordon/uncordon trick to force selection of a new leader. 
 
First let's see where each pod is running
$ kubectl get pods -o=wide
NAME                     READY     STATUS    RESTARTS   AGE       IP            NODE      NOMINATED NODE
glusterfs-4dgmv          1/1       Running   0          1h        10.0.0.4      kube-1    <none>
glusterfs-b45nm          1/1       Running   2          3h        10.0.0.6      kube-3    <none>
glusterfs-gvqfq          1/1       Running   1          3h        10.0.0.5      kube-2    <none>
heketi-86f98754c-cmt6s   1/1       Running   4          3h        10.244.1.7    kube-2    <none>
nso-0                    2/2       Running   0          1m        10.244.2.10   kube-3   <none>
nso-1                    2/2       Running   0          59s       10.244.1.11   kube-2    <none>
nso-2                    2/2       Running   0          42s       10.244.0.4    kube-1    <none>
 
Here I see that nso-0, the leader, is running on kube-3. Let's disable scheduling on that node
 
$ kubectl cordon kube-3
node/kube-3 cordoned
 
And now if we delete nso-0, it will not come up until we uncordon the node
 
$ kubectl delete pod nso-0
pod "nso-0" deleted
 
And now if we check the pods
test@kube-1:~/nfs/kube-ha/k8s-lab/persistent-nso$ kubectl get pods
NAME                     READY     STATUS    RESTARTS   AGE
glusterfs-4dgmv          1/1       Running   0          1h
glusterfs-b45nm          1/1       Running   2          3h
glusterfs-gvqfq          1/1       Running   1          3h
heketi-86f98754c-cmt6s   1/1       Running   4          3h
nso-0                    0/2       Pending  0          23s
nso-1                    2/2       Running   0          3m
nso-2                    2/2       Running   0          3m
 
We'll also see a new leader elected
 
$ http_proxy="" curl -u admin:admin http://$NSO_IP:8080/restconf/data/ncs-state/ha
 
  <mode>master</mode>
  <node-id>nso-1</node-id>
  <connected-slave>nso-2</connected-slave>
</ha>
 
Last, we can uncordon the node to let nso-0 be scheduled.
 
$ kubectl uncordon kube-3
node/kube-3 uncordoned
 
$ kubectl get pods
NAME                     READY     STATUS    RESTARTS   AGE
glusterfs-4dgmv          1/1       Running   0          1h
glusterfs-b45nm          1/1       Running   2          3h
glusterfs-gvqfq          1/1       Running   1          3h
heketi-86f98754c-cmt6s   1/1       Running   4          3h
nso-0                    2/2       Running   0          2m
nso-1                    2/2       Running   0          5m
nso-2                    2/2       Running   0          4m
 
$ http_proxy="" curl -u admin:admin http://$NSO_IP:8080/restconf/data/ncs-state/ha
 
  <mode>master</mode>
  <node-id>nso-1</node-id>
  <connected-slave>nso-0</connected-slave>
  <connected-slave>nso-2</connected-slave>
</ha>
 
Let's do the HA test again
http_proxy="" curl -H "Content-Type: application/yang-data+json" -d "" -u admin:admin http://$NSO_IP:8080/restconf/data/ha-test/ping
 
And to check the result
 
$ http_proxy="" curl -u admin:admin http://$NSO_IP:8080/restconf/data/ha-test
 
<ha-test xmlns="http://example.com/ha-test"  xmlns:ha-test="http://example.com/ha-test">
  <count>3</count>
  <last-host>nso-1</last-host>
</ha-test>
 
All good! 
 
Now if we redo the test where we kill all the NSO pods
 
$ kubectl get pods | grep nso | awk '{print $1}' | xargs kubectl delete pod
pod "nso-0" deleted
pod "nso-1" deleted
pod "nso-2" deleted
 
Deleting and spinning up the pods may take a while; watch the status
$ watch kubectl get pod -o=wide
 
When the NSO pods are up, let's check the config again
 
$ http_proxy="" curl -u admin:admin http://$NSO_IP:8080/restconf/data/ha-test
 
<ha-test xmlns="http://example.com/ha-test"  xmlns:ha-test="http://example.com/ha-test">
  <count>3</count>
  <last-host>nso-1</last-host>
</ha-test>
 
All came up with no data lost.
 
Clean up Persistent Volume Claims
By default, when the NSO stateful set is deleted, k8s will not delete the volume claims. To clean up we can do this (NOTE: this will remove all PVCs):
 
$ kubectl get pvc | tail -n +2 | awk '{print $1}'  | xargs kubectl delete pvc
persistentvolumeclaim "cdb-nso-0" deleted
persistentvolumeclaim "cdb-nso-1" deleted
persistentvolumeclaim "cdb-nso-2" deleted
persistentvolumeclaim "logs-nso-0" deleted
persistentvolumeclaim "logs-nso-1" deleted
persistentvolumeclaim "logs-nso-2" deleted
persistentvolumeclaim "state-nso-0" deleted
persistentvolumeclaim "state-nso-1" deleted
persistentvolumeclaim "state-nso-2" deleted
Comments
missmansirao (Level 1)

Thanks for sharing the info regarding cluster building using 3 nodes in Kubernetes (k8s). In the article the explanation part is very impressive and overall the entire procedure was very informative.

Thanks for the article.

Jinhoe (Level 1)

Hi, I could not follow at "Create NSO base image"

It stated "Go to the k8s-lab/base-container directory, where you'll find a number of files, the most interesting files is the Dockerfile."

But where is the k8s-lab/base-container directory?
