# Manual Installation
This document describes how to clone the Contiv repository and then use [kubeadm][1] to manually install Kubernetes
with Contiv-VPP networking on one or more bare metal or VM hosts.

## Clone the Contiv Repository
To clone the Contiv repository, enter the following command:
```
git clone https://github.com/contiv/vpp <repository-name>
```
**Note:** Replace `<repository-name>` with the name you want assigned to your cloned Contiv repository.

The cloned repository contains several folders that are referenced in this Contiv documentation; they are listed below:
```
vpp-contiv2$ ls
build       build-root  doxygen  gmod       LICENSE      Makefile   RELEASE.md   src
build-data  docs        extras   INFO.yaml  MAINTAINERS  README.md  sphinx_venv  test
```
## Preparing Your Hosts

### Host-specific Configurations
- **VMware VMs**: the vmxnet3 driver is required on each interface that will
  be used by VPP. Please see [here][13] for instructions on how to install the
  vmxnet3 driver on VMware Fusion.

### Setting up Network Adapter(s)
#### Setting up DPDK
DPDK setup must be completed **on each node** as follows:

- Load the PCI UIO driver:
  ```
  $ sudo modprobe uio_pci_generic
  ```

- Verify that the PCI UIO driver has loaded successfully:
  ```
  $ lsmod | grep uio
  uio_pci_generic        16384  0
  uio                    20480  1 uio_pci_generic
  ```

  Please note that this driver must be loaded at every server boot,
  so you may want to add `uio_pci_generic` to the `/etc/modules` file,
  or to a file in the `/etc/modules-load.d/` directory. For example, the
  `/etc/modules` file could look as follows:
  ```
  # /etc/modules: kernel modules to load at boot time.
  #
  # This file contains the names of kernel modules that should be loaded
  # at boot time, one per line. Lines beginning with "#" are ignored.
  uio_pci_generic
  ```
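
For the `/etc/modules-load.d/` alternative, a minimal sketch could look like the helper below. The function name `persist_uio_module` and the file name `uio_pci_generic.conf` are illustrative choices, not part of the Contiv tooling; the target directory is a parameter so that the helper can first be tried on a scratch directory.

```bash
# Write a modules-load.d style file that names the uio_pci_generic module.
# $1: target directory (defaults to /etc/modules-load.d, which needs root).
# The file name uio_pci_generic.conf is an assumption made for this sketch.
persist_uio_module() (
  dir="${1:-/etc/modules-load.d}"
  mkdir -p "$dir" || exit 1
  printf '%s\n' 'uio_pci_generic' > "$dir/uio_pci_generic.conf"
)
```

Calling `persist_uio_module` without an argument from a root shell writes to `/etc/modules-load.d/`.
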
#### Determining Network Adapter PCI Addresses
You need the PCI address of the network interface that VPP will use for the multi-node pod interconnect. On Debian-based
distributions, you can use `lshw`:

```
$ sudo lshw -class network -businfo
Bus info          Device      Class      Description
====================================================
pci@0000:00:03.0  ens3        network    Virtio network device
pci@0000:00:04.0  ens4        network    Virtio network device
```
**Note:** On CentOS/RedHat/Fedora distributions, `lshw` may not be available by default. Install it by issuing the following command:
```
yum -y install lshw
```

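If you want to pick the PCI address out of the `lshw` listing in a script, a small filter along the following lines may help. It is only a sketch: the function name `pci_addr_of` is made up for this example, and it assumes the two leading columns (`Bus info`, `Device`) look like the output shown above.

```bash
# Print the PCI address (for example 0000:00:04.0) of a named interface.
# Reads `lshw -class network -businfo` output on stdin.
# $1: interface name as shown in the Device column.
pci_addr_of() {
  awk -v dev="$1" '$2 == dev && $1 ~ /^pci@/ { sub(/^pci@/, "", $1); print $1 }'
}
```

For example, `sudo lshw -class network -businfo | pci_addr_of ens4` would print `0000:00:04.0` for the listing above.
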
#### Configuring vswitch to Use Network Adapters
Finally, you need to set up the vswitch to use the network adapters:

- [Setup on a node with a single NIC][14]
- [Setup on a node with multiple NICs][15]

### Using a Node Setup Script
You can perform the above steps using the [node setup script][17].

## Installing Kubernetes with Contiv-VPP CNI plugin
After the nodes you will be using in your K8s cluster are prepared, you can
install the cluster using [kubeadm][1].

### (1/4) Installing Kubeadm on Your Hosts
For first-time installation, see [Installing kubeadm][6]. To update an
existing installation, run `apt-get update && apt-get upgrade`
or `yum update` to get the latest version of kubeadm.

On each host with multiple NICs where the NIC that will be used for Kubernetes
management traffic is not the one pointed to by the default route out of the
host, a [custom management network][12] for Kubernetes must be configured.

#### Using Kubernetes 1.10 and Above
K8s 1.10 introduced support for huge pages in a pod. For now, either this
feature must be disabled or a memory limit must be defined for the vswitch container.

To disable huge pages, perform the following
steps as root:
* Using your favorite editor, disable huge pages in the kubelet configuration
  file (`/etc/systemd/system/kubelet.service.d/10-kubeadm.conf` or `/etc/default/kubelet` for version 1.11+):
```
  Environment="KUBELET_EXTRA_ARGS=--feature-gates HugePages=false"
```
* Restart the kubelet daemon:
```
  systemctl daemon-reload
  systemctl restart kubelet
```

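To double-check the edit before restarting the kubelet, you could grep the configuration file for the feature gate. The helper below is a hedged sketch: `hugepages_disabled` is a hypothetical name, and the pattern only covers the `--feature-gates` forms shown in this document (separated from its value by a space or `=`).

```bash
# Succeed if the given kubelet configuration file sets HugePages=false
# in a --feature-gates argument.
# $1: path to the file, e.g. /etc/default/kubelet
hugepages_disabled() {
  grep -Eq -- '--feature-gates[= ]([^" ]*,)?HugePages=false' "$1"
}
```

For example: `hugepages_disabled /etc/default/kubelet && echo "huge pages disabled"`.
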
To define a memory limit, append the following snippet to the vswitch container in the deployment YAML file:
```
            resources:
              limits:
                hugepages-2Mi: 1024Mi
                memory: 1024Mi
```
or set `contiv.vswitch.defineMemoryLimits` to `true` in [helm values](https://github.com/contiv/vpp/blob/master/k8s/contiv-vpp/README.md).

### (2/4) Initializing Your Master
Before initializing the master, you may want to [remove][8] any
previously installed K8s components. Then, proceed with master initialization
as described in the [kubeadm manual][3]. Execute the following command as
root:
```
kubeadm init --token-ttl 0 --pod-network-cidr=10.1.0.0/16
```
**Note:** `kubeadm init` will autodetect the network interface to advertise
the master on as the interface with the default gateway. If you want to use a
different interface (i.e. a custom management network setup), specify the
`--apiserver-advertise-address=<ip-address>` argument to `kubeadm init`. For
example:
```
kubeadm init --token-ttl 0 --pod-network-cidr=10.1.0.0/16 --apiserver-advertise-address=192.168.56.106
```
**Note:** The CIDR specified with the flag `--pod-network-cidr` is used by
kube-proxy, and it **must include** the `PodSubnetCIDR` from the `IPAMConfig`
section in the Contiv-vpp config map in Contiv-vpp's deployment file
[contiv-vpp.yaml](https://github.com/contiv/vpp/blob/master/k8s/contiv-vpp/values.yaml). Pods in the host network namespace
are a special case; they share their respective interfaces and IP addresses with
the host. For proxying to work properly for services with backends running on
the host, the `--pod-network-cidr` subnet must therefore also **include the
node management IP**. For example, with the default
`PodSubnetCIDR=10.1.0.0/16` and `PodIfIPCIDR=10.2.1.0/24`, the subnet
`10.3.0.0/16` could be allocated for the management network and
`--pod-network-cidr` could be defined as `10.0.0.0/8`, so as to include IP
addresses of all pods in all network namespaces:
```
kubeadm init --token-ttl 0 --pod-network-cidr=10.0.0.0/8 --apiserver-advertise-address=10.3.1.1
```

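The containment requirement above can be checked with a little integer arithmetic before running `kubeadm init`. The following is a sketch only: `ip_to_int` and `in_cidr` are names invented for this example, they handle IPv4 only, and they assume a shell with 64-bit arithmetic (as in bash and dash on common platforms).

```bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if the address or subnet $1 lies entirely inside the CIDR $2.
in_cidr() (
  inner_ip=${1%/*}; inner_len=32
  case $1 in */*) inner_len=${1#*/} ;; esac
  outer_ip=${2%/*}; outer_len=${2#*/}
  # A shorter inner prefix describes a larger block, which cannot fit.
  [ "$inner_len" -ge "$outer_len" ] || exit 1
  mask=$(( (0xFFFFFFFF << (32 - $outer_len)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$inner_ip") & $mask )) -eq $(( $(ip_to_int "$outer_ip") & $mask )) ]
)
```

With the values from the example, `in_cidr 10.1.0.0/16 10.0.0.0/8`, `in_cidr 10.2.1.0/24 10.0.0.0/8` and `in_cidr 10.3.1.1 10.0.0.0/8` all succeed, whereas `in_cidr 192.168.56.106 10.1.0.0/16` fails.
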
If Kubernetes was initialized successfully, it prints out this message:
```
Your Kubernetes master has initialized successfully!
```

After successful initialization, don't forget to set up your .kube directory
as a regular user (as instructed by `kubeadm`):
```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```

### (3/4) Installing the Contiv-VPP Pod Network
If you have used the Contiv-VPP plugin before, you may need to pull
the most recent Docker images on each node:
```
bash <(curl -s https://raw.githubusercontent.com/contiv/vpp/master/k8s/pull-images.sh)
```

Install the Contiv-VPP network for your cluster as follows:

- If you do not use the STN feature, install Contiv-vpp as follows:
  ```
  kubectl apply -f https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml
  ```

- If you use the STN feature, download the `contiv-vpp.yaml` file:
  ```
  wget https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml
  ```
  Then edit the STN configuration as described [here][16]. Finally, create
  the Contiv-vpp deployment from the edited file:
  ```
  kubectl apply -f ./contiv-vpp.yaml
  ```

Beware: contiv-etcd data is persisted in `/var/etcd` by default. It has to be cleaned up manually after `kubeadm reset`.
Otherwise, outdated data will be loaded by a subsequent deployment.

Alternatively, you can store the data in a randomly named subfolder:

```
curl --silent https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml | sed "s/\/var\/etcd\/contiv-data/\/var\/etcd\/contiv-data\/$RANDOM/g" | kubectl apply -f -
```

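To see what the `sed` rewrite in the command above does, the same substitution can be wrapped in a small filter and run on a sample line. This is a sketch under assumptions: `with_etcd_subdir` is a hypothetical helper name, it uses `|` as the `sed` delimiter to avoid escaping slashes, and it takes the subfolder name as a parameter instead of `$RANDOM` (which is a bash feature).

```bash
# Append a subfolder to every occurrence of /var/etcd/contiv-data.
# Reads YAML on stdin and writes the rewritten YAML to stdout.
# $1: subfolder name to append.
with_etcd_subdir() {
  sed "s|/var/etcd/contiv-data|/var/etcd/contiv-data/$1|g"
}
```

For example, `echo 'path: /var/etcd/contiv-data' | with_etcd_subdir 12345` prints `path: /var/etcd/contiv-data/12345`; the `path:` line is an illustrative input, not a quote from `contiv-vpp.yaml`.
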
#### Deployment Verification
After some time, all contiv containers should enter the running state:
```
root@cvpp:/home/jan# kubectl get pods -n kube-system -o wide | grep contiv
NAME                           READY     STATUS    RESTARTS   AGE       IP               NODE
...
contiv-etcd-gwc84              1/1       Running   0          14h       192.168.56.106   cvpp
contiv-ksr-5c2vk               1/1       Running   2          14h       192.168.56.106   cvpp
contiv-vswitch-l59nv           2/2       Running   0          14h       192.168.56.106   cvpp
```
In particular, make sure that the Contiv-VPP pod IP addresses are the same as
the IP address specified in the `--apiserver-advertise-address=<ip-address>`
argument to kubeadm init.

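Rather than reading the listing by eye, you could filter it for contiv pods that have not reached the running state yet. The helper below is a sketch; the name `contiv_not_running` is made up here, and it assumes the column order shown above (`NAME READY STATUS ...`).

```bash
# Print the names of contiv pods whose STATUS column is not "Running".
# Reads `kubectl get pods -n kube-system -o wide` output on stdin.
contiv_not_running() {
  awk '$1 ~ /^contiv-/ && $3 != "Running" { print $1 }'
}
```

Empty output from `kubectl get pods -n kube-system -o wide | contiv_not_running` means that every contiv pod in the listing reports `Running`.
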
Verify that the VPP successfully grabbed the network interface specified in
the VPP startup config (`GigabitEthernet0/4/0` in our case):
```
$ sudo vppctl
vpp# sh inter
              Name               Idx       State          Counter          Count
GigabitEthernet0/4/0              1         up       rx packets                  1294
                                                     rx bytes                  153850
                                                     tx packets                   512
                                                     tx bytes                   21896
                                                     drops                        962
                                                     ip4                         1032
host-40df9b44c3d42f4              3         up       rx packets                126601
                                                     rx bytes                44628849
                                                     tx packets                132155
                                                     tx bytes                27205450
                                                     drops                         24
                                                     ip4                       126585
                                                     ip6                           16
host-vppv2                        2         up       rx packets                132162
                                                     rx bytes                27205824
                                                     tx packets                126658
                                                     tx bytes                44634963
                                                     drops                         15
                                                     ip4                       132147
                                                     ip6                           14
local0                            0        down
```

You should also see the interface to kube-dns (`host-40df9b44c3d42f4`) and to the
node's IP stack (`host-vppv2`).

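The interface check can also be scripted by extracting the `State` column for one interface. This is a minimal sketch: `vpp_if_state` is an illustrative name, and it assumes the three leading columns (`Name`, `Idx`, `State`) are laid out as in the output above.

```bash
# Print the State column (up or down) for one interface.
# Reads VPP `sh inter` output on stdin.
# $1: interface name, e.g. GigabitEthernet0/4/0
vpp_if_state() {
  awk -v name="$1" '$1 == name { print $3; exit }'
}
```

Assuming `vppctl` accepts the CLI command as arguments on your installation, `sudo vppctl sh inter | vpp_if_state GigabitEthernet0/4/0` would print `up` for the listing above.
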
#### Master Isolation (Optional)
By default, your cluster will not schedule pods on the master for security
reasons. If you want to be able to schedule pods on the master (e.g., for a
single-machine Kubernetes cluster for development), then run:

```
kubectl taint nodes --all node-role.kubernetes.io/master-
```
More details about installing the pod network can be found in the
[kubeadm manual][4].

### (4/4) Joining Your Nodes
To add a new node to your cluster, run as root the command that was output
by `kubeadm init`. For example:
```
kubeadm join --token <token> <master-ip>:<master-port> --discovery-token-ca-cert-hash sha256:<hash>
```
More details can be found in the [kubeadm manual][5].

#### Deployment Verification
After some time, all contiv containers should enter the running state:
```
root@cvpp:/home/jan# kubectl get pods -n kube-system -o wide | grep contiv
NAME                           READY     STATUS    RESTARTS   AGE       IP               NODE
contiv-etcd-gwc84              1/1       Running   0          14h       192.168.56.106   cvpp
contiv-ksr-5c2vk               1/1       Running   2          14h       192.168.56.106   cvpp
contiv-vswitch-h6759           2/2       Running   0          14h       192.168.56.105   cvpp-slave2
contiv-vswitch-l59nv           2/2       Running   0          14h       192.168.56.106   cvpp
etcd-cvpp                      1/1       Running   0          14h       192.168.56.106   cvpp
kube-apiserver-cvpp            1/1       Running   0          14h       192.168.56.106   cvpp
kube-controller-manager-cvpp   1/1       Running   0          14h       192.168.56.106   cvpp
kube-dns-545bc4bfd4-fr6j9      3/3       Running   0          14h       10.1.134.2       cvpp
kube-proxy-q8sv2               1/1       Running   0          14h       192.168.56.106   cvpp
kube-proxy-s8kv9               1/1       Running   0          14h       192.168.56.105   cvpp-slave2
kube-scheduler-cvpp            1/1       Running   0          14h       192.168.56.106   cvpp
```
In particular, verify that a vswitch pod and a kube-proxy pod are running on
each joined node, as shown above.

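The per-node part of this check (every node runs a vswitch pod) can be sketched as a filter over the same listing. The name `nodes_missing_vswitch` is hypothetical, and the filter assumes the seven-column layout shown above, with `NODE` in the seventh column.

```bash
# Print each node that appears in the listing but has no Running
# contiv-vswitch pod. Reads `kubectl get pods -n kube-system -o wide`
# output on stdin; the header line is skipped.
nodes_missing_vswitch() {
  awk '$1 != "NAME" && NF >= 7 {
         nodes[$7] = 1
         if ($1 ~ /^contiv-vswitch-/ && $3 == "Running") ok[$7] = 1
       }
       END { for (n in nodes) if (!(n in ok)) print n }'
}
```

Empty output means that every node in the listing has a running vswitch pod.
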
On each joined node, verify that the VPP successfully grabbed the network
interface specified in the VPP startup config (`GigabitEthernet0/4/0` in
our case):
```
$ sudo vppctl
vpp# sh inter
              Name               Idx       State          Counter          Count
GigabitEthernet0/4/0              1         up
...
```
From the vpp CLI on a joined node you can also ping kube-dns to verify
node-to-node connectivity. For example:
```
vpp# ping 10.1.134.2
64 bytes from 10.1.134.2: icmp_seq=1 ttl=64 time=.1557 ms
64 bytes from 10.1.134.2: icmp_seq=2 ttl=64 time=.1339 ms
64 bytes from 10.1.134.2: icmp_seq=3 ttl=64 time=.1295 ms
64 bytes from 10.1.134.2: icmp_seq=4 ttl=64 time=.1714 ms
64 bytes from 10.1.134.2: icmp_seq=5 ttl=64 time=.1317 ms

Statistics: 5 sent, 5 received, 0% packet loss
```
### Deploying Example Applications
#### Simple Deployment
You can go ahead and create a simple deployment:
```
$ kubectl run nginx --image=nginx --replicas=2
```

Use `kubectl describe pod` to get the IP address of a pod, e.g.:
```
$ kubectl describe pod nginx | grep IP
```
You should see two IP addresses, for example:
```
IP:             10.1.1.3
IP:             10.1.1.4
```

You can check the pods' connectivity in one of the following ways:
* Connect to the VPP debug CLI and ping any pod:
```
  sudo vppctl
  vpp# ping 10.1.1.3
```
* Start busybox and ping any pod:
```
  kubectl run busybox --rm -ti --image=busybox /bin/sh
  If you don't see a command prompt, try pressing enter.
  / #
  / # ping 10.1.1.3
```
* Ping any pod from the host:
```
  ping 10.1.1.3
```

#### Deploying Pods on Different Nodes
To enable pod deployment on the master, untaint the master first:
```
kubectl taint nodes --all node-role.kubernetes.io/master-
```

In order to verify inter-node pod connectivity, we need to tell Kubernetes
to deploy one pod on the master node and one pod on the worker. For this,
we can use node selectors.

In your deployment YAMLs, add the `nodeSelector` sections that refer to
preferred node hostnames, e.g.:
```
  nodeSelector:
    kubernetes.io/hostname: vm5
```

Examples of complete YAML files:
```
apiVersion: v1
kind: Pod
metadata:
  name: nginx1
spec:
  nodeSelector:
    kubernetes.io/hostname: vm5
  containers:
    - name: nginx
      image: nginx
```

```
apiVersion: v1
kind: Pod
metadata:
  name: nginx2
spec:
  nodeSelector:
    kubernetes.io/hostname: vm6
  containers:
    - name: nginx
      image: nginx
```

After deploying the YAMLs, verify that the pods were deployed on different hosts:
```
$ kubectl get pods -o wide
NAME      READY     STATUS    RESTARTS   AGE       IP           NODE
nginx1    1/1       Running   0          13m       10.1.36.2    vm5
nginx2    1/1       Running   0          13m       10.1.219.3   vm6
```

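If you want this placement check to be scriptable, one option is to count the distinct values in the `NODE` column. The helper below is a sketch with a made-up name, `on_different_nodes`; it assumes `NODE` is the seventh column, as in the output above.

```bash
# Succeed if the listed pods are spread over at least two distinct nodes.
# Reads `kubectl get pods -o wide` output on stdin; the header is skipped.
on_different_nodes() {
  awk '$1 != "NAME" && NF >= 7 { seen[$7] = 1 }
       END { n = 0; for (k in seen) n++; exit (n >= 2 ? 0 : 1) }'
}
```

For example: `kubectl get pods -o wide | on_different_nodes && echo "pods are on different nodes"`.
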
Now you can verify the connectivity to both nginx pods from a busybox pod:
```
kubectl run busybox --rm -it --image=busybox /bin/sh

/ # wget 10.1.36.2
Connecting to 10.1.36.2 (10.1.36.2:80)
index.html           100% |*******************************************************************************************************************************************************************|   612   0:00:00 ETA

/ # rm index.html

/ # wget 10.1.219.3
Connecting to 10.1.219.3 (10.1.219.3:80)
index.html           100% |*******************************************************************************************************************************************************************|   612   0:00:00 ETA
```

### Uninstalling Contiv-VPP
To uninstall the network plugin itself, use `kubectl`:
```
kubectl delete -f https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml
```

### Tearing down Kubernetes
* First, drain the node and make sure that the node is empty before
shutting it down:
```
  kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
  kubectl delete node <node name>
```
* Next, on the node being removed, reset all kubeadm installed state:
```
  rm -rf $HOME/.kube
  sudo su
  kubeadm reset
```

* If you added environment variable definitions into
  `/etc/systemd/system/kubelet.service.d/10-kubeadm.conf` (as described in the [Custom Management Network file][10]), remove those definitions now.

### Troubleshooting
Some of the issues that can occur during the installation are:

- Forgetting to create and initialize the `.kube` directory in your home
  directory (as instructed by `kubeadm init --token-ttl 0`). This can manifest
  itself as the following error:
  ```
  W1017 09:25:43.403159    2233 factory_object_mapping.go:423] Failed to download OpenAPI (Get https://192.168.209.128:6443/swagger-2.0.0.pb-v1: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")), falling back to swagger
  Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
  ```
- A previous installation lingering on the file system.
  `kubeadm init --token-ttl 0` fails to initialize kubelet with one or more
  of the following error messages:
  ```
  ...
  [kubelet-check] It seems like the kubelet isn't running or healthy.
  [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
  ...
  ```

If you run into any of the above issues, try to clean up and reinstall as root:
```
sudo su
rm -rf $HOME/.kube
kubeadm reset
kubeadm init --token-ttl 0
rm -rf /var/etcd/contiv-data
rm -rf /var/bolt/bolt.db
```

[1]: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
[3]: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#initializing-your-master
[4]: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#pod-network
[5]: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#joining-your-nodes
[6]: https://kubernetes.io/docs/setup/independent/install-kubeadm/
[8]: #tearing-down-kubernetes
[10]: https://github.com/contiv/vpp/blob/master/docs/CUSTOM_MGMT_NETWORK.md#setting-up-a-custom-management-network-on-multi-homed-nodes
[11]: ../vagrant/README.md
[12]: https://github.com/contiv/vpp/tree/master/docs/CUSTOM_MGMT_NETWORK.md
[13]: https://github.com/contiv/vpp/tree/master/docs/VMWARE_FUSION_HOST.md
[14]: https://github.com/contiv/vpp/tree/master/docs/SINGLE_NIC_SETUP.md
[15]: https://github.com/contiv/vpp/tree/master/docs/MULTI_NIC_SETUP.md
[16]: https://github.com/contiv/vpp/tree/master/docs/SINGLE_NIC_SETUP.md#configuring-stn-in-contiv-vpp-k8s-deployment-files
[17]: https://github.com/contiv/vpp/tree/master/k8s/README.md#setup-node-sh
 471 [16]: https://github.com/contiv/vpp/tree/master/docs/SINGLE_NIC_SETUP.md#configuring-stn-in-contiv-vpp-k8s-deployment-files
 472 [17]: https://github.com/contiv/vpp/tree/master/k8s/README.md#setup-node-sh