Installation requirements

2 CPU cores or more, 2 GB of RAM or more
Master: 192.168.1.100
node1: 192.168.1.101
node2: 192.168.1.102
CentOS 7.9.2009
k8s v1.25.0

Reference:

https://kubernetes.io/zh-cn/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

Prepare the environment

Run all of the following steps on every host (a quick verification sketch follows the numbered steps).

  1. Disable the swap partition; otherwise, the later kubeadm init will report the following error
    [WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
    # swapoff -a
    # sed -ri 's/.*swap.*/#&/' /etc/fstab
  2. Add hostname resolution; otherwise, the later kubeadm init will report the following errors
    [WARNING Hostname]: hostname "master" could not be reached
    [WARNING Hostname]: hostname "master": lookup master on 8.8.8.8:53: no such host
    error execution phase preflight: [preflight] Some fatal errors occurred:
    cat >> /etc/hosts <<EOF
    192.168.1.100 master
    192.168.1.101 node1
    192.168.1.102 node2
    EOF
  3. Disable SELinux (set it to permissive mode)
    $ setenforce 0
    $ sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
  4. Enable IPv4 forwarding and let iptables see bridged traffic; without this configuration, initialization fails with the following errors
    error: exit status 1
    [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
    $ cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
    overlay
    br_netfilter
    EOF

    $ modprobe overlay
    $ modprobe br_netfilter

    $ cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-iptables = 1
    net.bridge.bridge-nf-call-ip6tables = 1
    net.ipv4.ip_forward = 1
    EOF

    $ sysctl --system
    Reference:

    https://kubernetes.io/zh-cn/docs/setup/production-environment/container-runtimes/
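
Before moving on, it is worth confirming on every host that the four preparation steps above actually took effect. A minimal verification sketch (plain read-only checks, not part of the original guide):

$ swapon --show          # should print nothing once swap is off
$ getenforce             # should report Permissive (or Disabled)
$ lsmod | grep br_netfilter
$ sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward   # both values should be 1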

Install a container runtime

Four container runtimes are currently supported:

  1. containerd
  2. CRI-O
  3. Docker Engine
  4. Mirantis Container Runtime

Here we choose to install containerd.

# Install some required system tools
$ yum install -y yum-utils device-mapper-persistent-data lvm2
# Add the package repository
$ yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Install containerd.io
$ yum -y install containerd.io
# Point the containerd sandbox image at the Aliyun registry; without this change the k8s.gcr.io/pause:3.6 image may fail to download, with an error like the following

"RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"k8s.gcr.io/pause:3.6\": failed to pull image \"k8s.gcr.io/pause:3.6\":

# Export the default configuration
$ containerd config default > /etc/containerd/config.toml
# Edit the configuration file
$ vim /etc/containerd/config.toml
# Change the sandbox_image line to the Aliyun pause image

sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"
:wq

Alternatively, edit the stock config file shipped by the package directly
$ vim /etc/containerd/config.toml
Change
disabled_plugins = ["cri"] to disabled_plugins = []

Add
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"

:wq

# Start the containerd service
$ systemctl start containerd

If the container runtime is not installed properly, not running, or failing, kubeadm init reports an error like the following

[preflight] Some fatal errors occurred:  
[ERROR CRI]: container runtime is not running: output: time="2022-08-25T07:12:12-04:00" level=fatal msg="unable to determine runtime API version: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/containerd/containerd.sock: connect: no such file or directory\""
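
Before retrying kubeadm init, it helps to rule this out by making sure containerd is enabled, running, and has picked up the sandbox_image override. A quick sketch of such a check (ordinary systemctl and containerd commands, not from the original guide):

$ systemctl enable --now containerd
$ systemctl status containerd
# The effective configuration should show the Aliyun pause image configured above
$ containerd config dump | grep sandbox_image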

Install CNI plugins

Download the tgz package: https://github.com/containernetworking/plugins/releases
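
For example, to fetch the exact version used below directly (the URL follows the usual GitHub releases pattern for this project; adjust the version and architecture as needed):

$ curl -LO https://github.com/containernetworking/plugins/releases/download/v1.1.1/cni-plugins-linux-amd64-v1.1.1.tgz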

$ mkdir -p /opt/cni/bin
$ tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.1.1.tgz
./
./macvlan
./static
./vlan
./portmap
./host-local
./vrf
./bridge
./tuning
./firewall
./host-device
./sbr
./loopback
./dhcp
./ptp
./ipvlan
./bandwidth

Reference:

https://kubernetes.io/zh-cn/docs/setup/production-environment/container-runtimes/

Install kubeadm, kubelet and kubectl

$ cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

$ yum install -y --nogpgcheck kubelet kubeadm kubectl
Because the upstream repository does not offer a sync mechanism, the repo index GPG check fails, so install with the --nogpgcheck option.

Enable and start kubelet
$ systemctl enable kubelet && systemctl start kubelet
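
If you want the packages to match the v1.25.0 version used throughout this guide rather than whatever is newest in the mirror, you can pin the versions instead; a hedged sketch assuming the mirror's usual package naming:

$ yum install -y --nogpgcheck kubelet-1.25.0 kubeadm-1.25.0 kubectl-1.25.0
$ systemctl enable kubelet && systemctl start kubelet

At this point kubelet restarts every few seconds in a crash loop; that is expected until kubeadm init (or kubeadm join) gives it a configuration.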

Initialize Kubernetes

$ kubeadm init --image-repository=registry.aliyuncs.com/google_containers --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16 --v=5

# Without the --service-cidr=10.96.0.0/12 and --pod-network-cidr=10.244.0.0/16 options, installing the flannel pod network add-on later fails with the following error
E0827 14:33:49.890426 1 main.go:330] Error registering network: failed to acquire lease: node "master" pod cidr not assigned

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.1.100:6443 --token 9lc7bn.3ui7volivxt9sqbr --discovery-token-ca-cert-hash sha256:fb4183a090fddb58ea61f847e49629d459bb7f760bc2db8b160206567d364348

If initialization fails with the following error, check the settings in /etc/containerd/config.toml, in particular disabled_plugins and sandbox_image.

[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
timed out waiting for the condition

This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'

.......

runtime.main
/usr/local/go/src/runtime/proc.go:250
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1594
error execution phase wait-control-plane
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
cmd/kubeadm/app/cmd/phases/workflow/runner.go:235
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
cmd/kubeadm/app/cmd/phases/workflow/runner.go:421

Re-initialize Kubernetes

# Reset the previous configuration
$ kubeadm reset

# Initialize again
$ kubeadm init --image-repository=registry.aliyuncs.com/google_containers --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16 --v=5

Configure a user to manage the cluster with kubectl

# Regular user
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

# root user
$ export KUBECONFIG=/etc/kubernetes/admin.conf
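
Note that the export only applies to the current shell session. If root should keep using kubectl after logging back in, one option (an assumption about your shell setup, not part of the original steps) is to persist it and then verify access:

$ echo 'export KUBECONFIG=/etc/kubernetes/admin.conf' >> /root/.bash_profile
$ kubectl cluster-info
$ kubectl get nodes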

Kubernetes installed successfully

You can now use kubectl for the remaining operations; crictl is roughly the equivalent of the former docker command.

$ crictl img
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
ERRO[0000] unable to determine image API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory"

Fix:
Change the default endpoint by pointing crictl at containerd's CRI socket
$ crictl config runtime-endpoint unix:///run/containerd/containerd.sock
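
crictl config writes the setting to its configuration file, by default /etc/crictl.yaml; the result should look roughly like the sketch below (assumed layout), and the image endpoint can optionally be set the same way:

$ cat /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
$ crictl config image-endpoint unix:///run/containerd/containerd.sock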


$ crictl img
IMAGE TAG IMAGE ID SIZE
docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin v1.1.0 fcecffc7ad4af 3.82MB
docker.io/rancher/mirrored-flannelcni-flannel v0.19.1 252b2c3ee6c86 20.5MB
registry.aliyuncs.com/google_containers/coredns v1.9.3 5185b96f0becf 14.8MB
registry.aliyuncs.com/google_containers/etcd 3.5.4-0 a8a176a5d5d69 102MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.25.0 4d2edfd10d3e3 34.2MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.25.0 1a54c86c03a67 31.3MB
registry.aliyuncs.com/google_containers/kube-proxy v1.25.0 58a9a0c6d96f2 20.3MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.25.0 bef2cf3115095 15.8MB
registry.aliyuncs.com/google_containers/pause 3.6 6270bb605e12e 302kB
registry.aliyuncs.com/google_containers/pause 3.8 4873874c08efc 311kB
$ crictl pods
POD ID CREATED STATE NAME NAMESPACE ATTEMPT RUNTIME
20ae213111853 22 minutes ago Ready kube-flannel-ds-8r6w5 kube-flannel 0 (default)
2da9542f8d5db 25 minutes ago Ready kube-proxy-n6qgc kube-system 0 (default)
b648098bb9180 25 minutes ago Ready kube-scheduler-master kube-system 0 (default)
e37e41edcc6d7 25 minutes ago Ready kube-controller-manager-master kube-system 0 (default)
7cab0673a83ee 25 minutes ago Ready kube-apiserver-master kube-system 0 (default)
ce1bdf631686f 25 minutes ago Ready etcd-master kube-system 0 (default)

Note: alternatively, you can export a YAML file with kubeadm config print init-defaults > kubeadm.yaml, edit the fields that need to change, and then run kubeadm init --config=kubeadm.yaml.
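
For example, a minimal sketch of the fields you would typically edit so the file matches the flags used above (the exported file also contains an InitConfiguration section; podSubnet usually has to be added because it is not in the defaults):

$ kubeadm config print init-defaults > kubeadm.yaml
$ vim kubeadm.yaml
# In the ClusterConfiguration section set:
#   imageRepository: registry.aliyuncs.com/google_containers
#   kubernetesVersion: v1.25.0
#   networking:
#     serviceSubnet: 10.96.0.0/12
#     podSubnet: 10.244.0.0/16
$ kubeadm init --config=kubeadm.yaml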

The worker nodes also need kubectl, kubelet, and kubeadm installed and the same preparation as above, but instead of kubeadm init they run the kubeadm join command

kubeadm join 192.168.1.100:6443 --token 9lc7bn.3ui7volivxt9sqbr --discovery-token-ca-cert-hash sha256:fb4183a090fddb58ea61f847e49629d459bb7f760bc2db8b160206567d364348
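
If you no longer have the original join command, or the token has expired (bootstrap tokens are valid for 24 hours by default), you can generate a fresh one on the master:

$ kubeadm token create --print-join-command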

Check the node status on the master host

$ kubectl get node
NAME STATUS ROLES AGE VERSION
master NotReady control-plane 4h38m v1.25.0
node1 NotReady <none> 1m9s v1.25.0
node2 NotReady <none> 20s v1.25.0

Both the NotReady node status and coredns being stuck in Pending are caused by the missing pod network add-on.

Reference:

coredns stuck in the Pending state
Installing Addons

$ kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-c676cc86f-cqxqt 0/1 Pending 0 141m
coredns-c676cc86f-kc5fg 0/1 Pending 0 141m
etcd-master 1/1 Running 0 141m
kube-apiserver-master 1/1 Running 0 141m
kube-controller-manager-master 1/1 Running 0 141m
kube-proxy-wdcn4 1/1 Running 0 141m
kube-scheduler-master 1/1 Running 0 141m

Install the flannel pod network add-on

flannel installation instructions

$ kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml

$ kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-c676cc86f-cqxqt 1/1 Running 0 4h24m
coredns-c676cc86f-kc5fg 1/1 Running 0 4h24m

$ kubectl get node
NAME STATUS ROLES AGE VERSION
master Ready control-plane 4h47m v1.25.0
node1 Ready <none> 10m11s v1.25.0
node2 Ready <none> 9m22s v1.25.0
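
Optionally, you can also confirm the flannel DaemonSet pods themselves are running; the current manifest deploys them in the kube-flannel namespace, as already seen in the crictl pods output above:

$ kubectl get pod -n kube-flannel -o wide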

Test the Kubernetes cluster

$ vim nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

:wq

$ kubectl apply -f nginx-deployment.yaml
deployment.apps/nginx-deployment created

$ kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-deployment-7fb96c846b-jvlpd 0/1 ContainerCreating 0 35s
nginx-deployment-7fb96c846b-xdrkp 0/1 ContainerCreating 0 35s
nginx-deployment-7fb96c846b-z452n 0/1 ContainerCreating 0 35s

$ kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-deployment-7fb96c846b-jvlpd 1/1 Running 0 42s
nginx-deployment-7fb96c846b-xdrkp 1/1 Running 0 42s
nginx-deployment-7fb96c846b-z452n 1/1 Running 0 42s
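
To actually reach nginx from outside the cluster, one option (a quick sketch, not part of the original steps) is to expose the deployment as a NodePort service and curl a node on the allocated port:

$ kubectl expose deployment nginx-deployment --port=80 --type=NodePort
$ kubectl get service nginx-deployment
# Note the allocated NodePort (a port in the 30000-32767 range), then from any machine that can reach the nodes:
$ curl http://192.168.1.101:<nodeport>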

Reference:

Ports used by a running service
kubeadm init