Deploying a Kubernetes + GPU Cluster with Ansible Playbooks

This post records the deployment process. It is mainly based on "Developing Ansible Playbooks to Deploy a Kubernetes v1.11.x HA Cluster" and extends kube-ansible. Thanks to KaiRen, whose updated version adds NVIDIA Docker and k8s-device-plugin to the Ansible deployment flow, so the Node machines can expose their GPU resources. (This post documents a bare-metal deployment with GPUs attached.)


Versions installed in this walkthrough:

  • Kubernetes v1.11.2
  • Etcd v3.2.9
  • containerd v1.1.2

Node information

The operating system is Ubuntu 16.04 Desktop; the test environment consists of physical machines:

IP Address     Hostname   CPU   Memory   Extra Device
192.168.0.98   VIP        -     -        -
192.168.0.81   k8s-m1     4     16G      -
192.168.0.82   k8s-m2     4     16G      -
192.168.0.83   k8s-m3     4     16G      -
192.168.0.84   k8s-g1     4     16G      GTX 1060 6G
192.168.0.85   k8s-g2     4     16G      GTX 1060 6G
192.168.0.86   k8s-g3     4     16G      GTX 1060 6G
192.168.0.87   k8s-g4     4     16G      GTX 1060 6G

Preparation on all nodes

Confirm the following before installing:

  • All nodes can reach one another over the network.
  • The deploy node can SSH into the other nodes without a password.
  • Every node has sudoer privileges and is not prompted for a password.
  • Every node has Python installed.
  • Every node's /etc/hosts resolves all hosts.
  • The deploy node has Ansible installed.

Allow the ubuntu user to sudo without a password on every node:

$ echo "ubuntu ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ubuntu && sudo chmod 440 /etc/sudoers.d/ubuntu
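For the password-less SSH requirement itself, a minimal sketch to run on the deploy node (the ubuntu user and the IPs are the ones from the node table; adjust to your environment):

```shell
# Create a key pair on the deploy node if one does not exist yet.
mkdir -p "$HOME/.ssh"
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa" -q

# Push the public key to every node. Echoed here as a dry run; remove
# the echo to actually copy (each node asks for the password once).
for ip in 192.168.0.81 192.168.0.82 192.168.0.83 192.168.0.84 \
          192.168.0.85 192.168.0.86 192.168.0.87; do
  echo ssh-copy-id "ubuntu@${ip}"
done
```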

Confirm the environment's DNS settings:

$ echo "nameserver 8.8.8.8" >> /etc/resolvconf/resolv.conf.d/tail
$ resolvconf -u

$ cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.1.1
nameserver 8.8.8.8
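The /etc/hosts requirement from the checklist is not shown above; a minimal sketch, assuming the hostnames and IPs from the node table (the append at the bottom must be run on every node):

```shell
# Host entries for every node in this lab (hypothetical helper name).
hosts_entries() {
  cat <<'EOF'
192.168.0.81 k8s-m1
192.168.0.82 k8s-m2
192.168.0.83 k8s-m3
192.168.0.84 k8s-g1
192.168.0.85 k8s-g2
192.168.0.86 k8s-g3
192.168.0.87 k8s-g4
EOF
}

# Append the entries once per node:
# hosts_entries | sudo tee -a /etc/hosts
```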

GPU node preparation (Node)

Using the GPUs requires CUDA and the NVIDIA driver to be installed on each GPU node beforehand.

Install the NVIDIA driver (v410.79) and CUDA 10 via APT:

$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_10.0.130-1_amd64.deb
$ sudo dpkg -i cuda-repo-ubuntu1604_10.0.130-1_amd64.deb
$ sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
$ sudo apt-get update && sudo apt-get install -y cuda

Deploy node (Master)

Install Ansible on Ubuntu 16.04:

$ sudo apt-get install -y software-properties-common git cowsay
$ sudo apt-add-repository -y ppa:ansible/ansible
$ sudo apt-get update && sudo apt-get install -y ansible

Verify that the NVIDIA driver and CUDA were installed correctly:

$ cat /usr/local/cuda/version.txt
CUDA Version 10.0.130

$ sudo nvidia-smi

Fri Dec 9 10:25:24 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48 Driver Version: 410.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 106... Off | 00000000:03:00.0 Off | N/A |
| 38% 28C P8 5W / 120W | 0MiB / 6077MiB | 0% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+

Deploying Kubernetes with Ansible

This walkthrough uses the kube-ansible project written by kairen to deploy a Kubernetes HA cluster with Ansible. Fetch the project with Git:

$ git clone https://github.com/kairen/kube-ansible.git
$ cd kube-ansible

Kubernetes cluster + GPU

Edit inventory/hosts.ini to describe the nodes to be deployed and their group membership.

Every node's /etc/hosts already resolves all hosts, so the SSH login information is appended directly after each host IP:

$ vim inventory/hosts.ini

[etcds]
192.168.0.[81:83] ansible_user=ubuntu ansible_password=password

[masters]
192.168.0.[81:83] ansible_user=ubuntu ansible_password=password

[nodes]
192.168.0.84 ansible_user=ubuntu ansible_password=password
192.168.0.85 ansible_user=ubuntu ansible_password=password
192.168.0.86 ansible_user=ubuntu ansible_password=password
192.168.0.87 ansible_user=ubuntu ansible_password=password

[kube-cluster:children]
masters
nodes

ansible_user is the SSH user name on the node.
ansible_password is that SSH user's password.
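Plaintext passwords in the inventory are fine for a lab; once SSH keys are distributed, an alternative is to point Ansible at the key instead (ansible_ssh_private_key_file is a standard Ansible inventory variable; the path here is an example):

```ini
[nodes]
192.168.0.84 ansible_user=ubuntu ansible_ssh_private_key_file=~/.ssh/id_rsa
```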

Next, edit group_vars/all.yml to enable features as needed, for example:

$ vim group_vars/all.yml

---

kube_version: 1.11.2

# Container runtime,
# Supported: docker, nvidia-docker, containerd.
container_runtime: nvidia-docker

# Container network,
# Supported: calico, flannel.
cni_enable: true
container_network: calico
cni_iface: "enp0s25" # NIC the CNI network binds to

# Kubernetes HA extra variables.
vip_interface: "enp0s25" # NIC the VIP binds to
vip_address: 192.168.0.98 # VIP address

# etcd extra variables.
etcd_iface: "enp0s25" # NIC etcd binds to

# Kubernetes extra addons
enable_ingress: true
enable_dashboard: true
enable_logging: false
enable_monitoring: true
enable_metric_server: true

grafana_user: "admin"
grafana_password: "p@ssw0rd"

If the NIC bindings above are left empty, the node's default NIC (usually the first one) is used.

Testing showed that you must first confirm the NIC name is identical on every node; in this lab every Ubuntu 16.04 node's NIC is named enp0s25.
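A quick way to confirm this is to list the interface names on each node and compare (locally as below, or across all nodes with something like `ansible -i inventory/hosts.ini all -a "ls /sys/class/net"`):

```shell
# List interface names on this node; repeat on every node and check
# that the data NIC carries the same name everywhere (enp0s25 here).
ls /sys/class/net
```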

With group_vars/all.yml in place, first use Ansible to check the cluster state:

$ ansible -i inventory/hosts.ini all -m ping

192.168.0.81 | SUCCESS => {
"changed": false,
"ping": "pong"
}
192.168.0.82 | SUCCESS => {
"changed": false,
"ping": "pong"
}
192.168.0.83 | SUCCESS => {
"changed": false,
"ping": "pong"
}
192.168.0.84 | SUCCESS => {
"changed": false,
"ping": "pong"
}
192.168.0.85 | SUCCESS => {
"changed": false,
"ping": "pong"
}
192.168.0.86 | SUCCESS => {
"changed": false,
"ping": "pong"
}
192.168.0.87 | SUCCESS => {
"changed": false,
"ping": "pong"
}

Next, check that the GPU driver is running correctly on each Node:

$ ansible -i inventory/hosts.ini all -a "nvidia-smi" -b

192.168.0.84 | SUCCESS | rc=0 >>
Thu Dec 9 12:00:54 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48 Driver Version: 410.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 106... Off | 00000000:03:00.0 Off | N/A |
| 38% 29C P8 4W / 120W | 0MiB / 6077MiB | 0% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+


192.168.0.87 | SUCCESS | rc=0 >>
Thu Dec 9 12:00:57 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48 Driver Version: 410.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 106... Off | 00000000:03:00.0 On | N/A |
| 40% 33C P8 7W / 120W | 323MiB / 6077MiB | 0% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1202 G /usr/lib/xorg/Xorg 171MiB |
| 0 3191 G compiz 149MiB |
+-----------------------------------------------------------------------------+

Once the cluster checks out, run cluster.yml to deploy the Kubernetes cluster:

$ ansible-playbook -i inventory/hosts.ini cluster.yml

Check the component status:

$ kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-1 Healthy {"health": "true"}
etcd-2 Healthy {"health": "true"}
etcd-0 Healthy {"health": "true"}
$ kubectl get no
NAME STATUS ROLES AGE VERSION
k8s-m1 Ready master 2m v1.11.2
k8s-m2 Ready master 2m v1.11.2
k8s-m3 Ready master 2m v1.11.2
k8s-n1 Ready <none> 2m v1.11.2
k8s-n2 Ready <none> 2m v1.11.2
k8s-n3 Ready <none> 2m v1.11.2
k8s-n4 Ready <none> 2m v1.11.2

Verify that the GPU nodes work

Deploy a simple gpu-pod to confirm that the Kubernetes device plugin on the node works:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  restartPolicy: Never
  containers:
  - image: nvidia/cuda
    name: cuda
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
pod "gpu-pod" created


$ kubectl get po -a -o wide
Flag --show-all has been deprecated, will be removed in an upcoming release
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
gpu-pod 0/1 Completed 0 1h 10.244.1.5 k8s-n1 <none>

$ kubectl logs gpu-pod
Sun Dec 9 10:26:43 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79 Driver Version: 410.79 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 105... Off | 00000000:02:00.0 On | N/A |
| 40% 24C P8 N/A / 75W | 62MiB / 4032MiB | 1% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+

Deploying addons

$ ansible-playbook -i inventory/hosts.ini addons.yml

When it completes, check the services with kubectl, e.g. kubernetes-dashboard:

$ kubectl get po,svc -n kube-system -l k8s-app=kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
pod/kubernetes-dashboard-6948bdb78-7424h 1/1 Running 0 2m

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes-dashboard ClusterIP 10.108.226.213 <none> 443/TCP 1h

Once done, the dashboard can be reached through the API server's proxy at https://192.168.0.98:8443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/#!/login

Look up a token for logging in to kubernetes-dashboard:

$ kubectl -n kube-system get secret
NAME TYPE DATA AGE
deployment-controller-token-kmcmz kubernetes.io/service-account-token 3 1h


$ kubectl -n kube-system describe secret deployment-controller-token-kmcmz
Name: deployment-controller-token-kmcmz
Namespace: kube-system
Labels: <none>
Annotations: kubernetes.io/service-account.name=deployment-controller
kubernetes.io/service-account.uid=e4e91ed4-fb9b-11e8-baef-d05099d079fb

Type: kubernetes.io/service-account-token

Data
====
ca.crt: 1428 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkZXBsb3ltZW50LWNvbnRyb2xsZXItdG9rZW4ta21jbXoiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGVwbG95bWVudC1jb250cm9sbGVyIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiZTRlOTFlZDQtZmI5Yi0xMWU4LWJhZWYtZDA1MDk5ZDA3OWZiIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRlcGxveW1lbnQtY29udHJvbGxlciJ9.IRQUhsVU4AJ36-qNClW7htzFJis1Mf_YSySIBKYuZ7uuaCzGcXRZtJ-nPo0SFBq7XufBMydjKwKP6tmsG1NsjttC3ETX-OnCV7u9BW0DK4HX6YloS-6Ik2rN9nHOa5iRpSNwCB2l6axGofoLkIosRCYMhdUyI5E9ZIrNKV-AvKehZkFtxXQCE3DbWGiklj1QPVq2oypfkwBEZG4GSlFkxPoIkzQQTbmZDfH036hi9DpBcUJIU41IJb9npdx65NA39Oskjdwiym1z_JlAhlhnE-uCPc-IjHirw_bEcn7mhDBf-1O2kr0IVmAbczFi82aoCagTDtUjBLP7BJ3k0v0gxQ
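Instead of copying the token out of `describe` by hand, a hedged one-liner using jsonpath works too (the secret name is the one listed above; this assumes kubectl is already configured on the node where you run it):

```shell
# Print the decoded token of a kube-system service-account secret.
# The token in .data is base64-encoded, hence the decode step.
get_token() {
  kubectl -n kube-system get secret "$1" -o jsonpath='{.data.token}' | base64 -d
}

# Usage:
# get_token deployment-controller-token-kmcmz
```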

(Screenshot: dashboard login screen.)

Resetting the cluster

Finally, if you want to redeploy the cluster, clear it with reset-cluster.yml:

$ ansible-playbook -i inventory/hosts.ini reset-cluster.yml

[Note] Make sure the NIC names are consistent before deploying.

[Note] If HA is not required (single master, single node, for testing)

IP Address     Hostname   CPU   Memory
192.168.0.13   VIP        -     -
192.168.0.10   k8s-m1     4     16G
192.168.0.11   k8s-n1     4     16G

Example inventory/hosts.ini:

$ vim inventory/hosts.ini
[etcds]
192.168.0.10 ansible_user=ubuntu ansible_password=password

[masters]
192.168.0.10 ansible_user=ubuntu ansible_password=password

[nodes]
192.168.0.11 ansible_user=ubuntu ansible_password=password

[kube-cluster:children]
masters
nodes

Example of the adjusted inventory/group_vars/all.yml:

$ vim inventory/group_vars/all.yml
---

kube_version: 1.11.2

# Container runtime,
# Supported: docker, nvidia-docker, containerd.
container_runtime: nvidia-docker

# Container network,
# Supported: calico, flannel.
cni_enable: true
container_network: calico
cni_iface: "eth0"

# Kubernetes HA extra variables.
vip_interface: "eth0"
vip_address: 192.168.0.13

# etcd extra variables.
etcd_iface: "eth0"

# Kubernetes extra addons
enable_ingress: true
enable_dashboard: true
enable_logging: false
enable_monitoring: true
enable_metric_server: true

grafana_user: "admin"
grafana_password: "p@ssw0rd"

The [Note] example above deploys one master and one node. The NIC names were confirmed to be consistent, the VIP was set to the master's address, and rerunning the ansible HA playbook then completes successfully.
