更新时间:2025 年 4 月
Kubernetes 版本:1.32.5
Kubespray 版本:2.28.0
containerd 版本:
简介
kubespray 官方文档:Readme (kubespray.io)
kubespray 的 github 地址:kubernetes-sigs/kubespray: Deploy a Production Ready Kubernetes Cluster (github.com)
kubespray 当前支持的操作系统版本
- Flatcar Container Linux by Kinvolk
- Debian Bookworm, Bullseye
- Ubuntu 20.04, 22.04, 24.04
- CentOS/RHEL 8, 9
- Fedora 39, 40
- Fedora CoreOS (see fcos Note)
- openSUSE Leap 15.x/Tumbleweed
- Oracle Linux 8, 9
- Alma Linux 8, 9
- Rocky Linux 8, 9
- Kylin Linux Advanced Server V10 (experimental: see kylin linux notes)
- Amazon Linux 2 (experimental: see amazon linux notes)
- UOS Linux (experimental: see uos linux notes)
- openEuler (experimental: see openEuler notes)
注:
- 基于 Upstart 和 SysV init 的操作系统版本暂不支持
- Kernel requirements (Linux 内核版本最好 >= 4.19)
版本选择
kubespray 版本与 kubernetes 版本及相关组件对应关系: Releases · kubernetes-sigs/kubespray (github.com)
Kubernetes 的 github 地址:Releases · kubernetes/kubernetes (github.com)
例如:选择 kubespray v2.27.0
对应默认 Kubernetes 版本为 Kubernetes 1.32.2
最新主线支持的组件版本
- Core
- kubernetes v1.32.5
- etcd v3.5.16
- docker 28.0
- containerd v2.0.5
- cri-o v1.32.0 (experimental: see CRI-O Note. Only on fedora, ubuntu and centos based OS)
- Network Plugin
- cni-plugins v1.4.1
- calico 3.29.3
- cilium v1.17.2
- flannel 0.22.0
- kube-ovn 1.12.21
- kube-router 2.1.1
- multus 4.1.0
- kube-vip 0.8.0
- Application
- cert-manager 1.15.3
- coredns 1.11.3
- ingress-nginx 1.12.1
- argocd 2.14.5
- helm 3.16.4
- metallb 0.13.9
- registry 2.8.1
- Storage Plugin
- aws-ebs-csi-plugin 0.5.0
- azure-csi-plugin 1.10.0
- cinder-csi-plugin 1.30.0
- gcp-pd-csi-plugin 1.9.2
- local-path-provisioner 0.0.24
- local-volume-provisioner 2.5.0
- node-feature-discovery 0.16.4
安装要求
软件需求
- Kubernetes 版本 v1.30+
- 运行 Ansible 命令的计算机上需要安装 Ansible v2.14+、 Jinja 2.11+ 和 python-netaddr
- 目标服务器需要能够上网,以下载对应的 docker 镜像。否则需要事先下载好镜像导入私网仓库,并进行额外配置
- 服务器需要配置允许 IPv4 转发
- 如果使用到了 IPv6,则需要配置允许 IPv6 转发
- 运行 Ansible 命令主机的 SSH 密钥必须复制到部署集群的所有服务器中
- 防火墙设置适当的规则策略(配置放行端口),或者直接关闭防火墙
- 如果从非 root 用户帐户运行 kubespray,则应在目标服务器中配置正确的特权升级方法,并指定 ansible_become 标志或命令参数 --become(或 -b)
硬件需求
- Master(控制面节点)
- 内存 2 GB 以上
- Node(工作节点)
- 内存 1 GB 以上
主机列表
操作系统均为 AlmaLinux release 9.6
| 主机 | IP | 配置 | 安装服务 | 角色 |
|---|---|---|---|---|
| admin.kubespray.local | 192.168.111.190(需联网下载资源) | 2C2G | nginx、docker、kubespray(自带 ansible) | ansible 控制主机、局域网 DNS 服务(coredns)、harbor 私有仓库、http 文件下载服务、DNF/YUM 仓库、kubespray 运行主机 |
| kube-cp-01.kubespray.local | 192.168.111.191 | 2C4G | etcd | k8s 控制面节点-1 |
| kube-node-01.kubespray.local | 192.168.111.192 | 2C2G | | k8s 工作节点-1 |
| kube-node-02.kubespray.local | 192.168.111.193 | 2C2G | | k8s 工作节点-2 |
| kube-node-03.kubespray.local | 192.168.111.194 | 2C2G | | k8s 工作节点-3 |
其中 admin 主机作为管理主机,用于运行 ansible 命令,并且提供一些下载功能
Admin 主机准备
仅在 admin 主机上操作
环境准备
时间设置
注:后面同 k8s 主机一起批量操作
内核参数设置
注:后面同 k8s 主机一起批量操作
防火墙设置
SELinux 设置
setenforce 0
sed -ri 's@(^SELINUX)=.*@\1=disabled@g' /etc/selinux/config && sed -ri 's@(^SELINUX)=.*@\1=disabled@g' /etc/sysconfig/selinux
firewalld 设置
放开一些防火墙端口
# 持久化当前配置
$ firewall-cmd --runtime-to-permanent
# 资源下载端口
$ firewall-cmd --permanent --zone=public --add-port=8000/tcp
# harbor 端口
$ firewall-cmd --permanent --zone=public --add-service=https
# 加载配置到运行时
$ firewall-cmd --reload
也可以用 ipset 的方法
dnf -y install ipset
# 创建区域
firewall-cmd --permanent --new-zone=kubernetes-local
firewall-cmd --permanent --zone=kubernetes-local --set-target=ACCEPT
# 创建 ipset
firewall-cmd --permanent --new-ipset=kubernetes-local-ips --type=hash:net
# 往 ipset 添加 IP
firewall-cmd --permanent --ipset=kubernetes-local-ips --add-entry=192.168.111.191/32
firewall-cmd --permanent --ipset=kubernetes-local-ips --add-entry=192.168.111.192/32
firewall-cmd --permanent --ipset=kubernetes-local-ips --add-entry=192.168.111.193/32
firewall-cmd --permanent --ipset=kubernetes-local-ips --add-entry=192.168.111.194/32
# 区域设置 ipset
firewall-cmd --permanent --zone=kubernetes-local --add-source=ipset:kubernetes-local-ips
# 重新加载配置
firewall-cmd --reload
# Pod 网络 CIDR 和 Service 网络 CIDR(需与后文 kube_pods_subnet、kube_service_addresses 保持一致)
firewall-cmd --permanent --ipset=kubernetes-local-ips --add-entry=10.233.64.0/18
firewall-cmd --permanent --ipset=kubernetes-local-ips --add-entry=10.233.0.0/18
firewall-cmd --reload
firewall-cmd --info-ipset=kubernetes-local-ips
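配置完成后,可以进一步检查新建区域是否生效(以下命令仅作验证示例):
# 查看 kubernetes-local 区域的完整配置,应包含 target: ACCEPT 与 ipset 来源
firewall-cmd --permanent --zone=kubernetes-local --list-all
# 查看当前活动区域,确认来源 IP 已匹配到新区域
firewall-cmd --get-active-zones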
常用工具安装
$ dnf -y install vim wget tree
安装 Docker
下载仓库文件(RHEL 系)
dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
替换仓库为国内源
官方源在国内访问速度较慢,可以替换为国内源(此处列举『清华源』和『阿里源』,选一即可)
# 替换软件仓库为 TUNA
$ sed -i 's+download.docker.com+mirrors.tuna.tsinghua.edu.cn/docker-ce+' /etc/yum.repos.d/docker-ce.repo
# 替换软件仓库为 ali
$ sed -i 's+download.docker.com+mirrors.aliyun.com/docker-ce+' /etc/yum.repos.d/docker-ce.repo
更新缓存
$ dnf makecache
安装
# 查看版本
$ dnf list docker-ce.x86_64 --showduplicates | sort -r
docker-ce.x86_64 3:28.1.1-1.el9 docker-ce-stable
docker-ce.x86_64 3:28.1.0-1.el9 docker-ce-stable
docker-ce.x86_64 3:28.0.4-1.el9 docker-ce-stable
docker-ce.x86_64 3:28.0.3-1.el9 docker-ce-stable
docker-ce.x86_64 3:28.0.2-1.el9 docker-ce-stable
docker-ce.x86_64 3:28.0.1-1.el9 docker-ce-stable
docker-ce.x86_64 3:28.0.0-1.el9 docker-ce-stable
docker-ce.x86_64 3:27.5.1-1.el9 docker-ce-stable
.......
# 安装最新稳定版本
dnf -y install docker-ce
# 安装指定版本 dnf -y install docker-ce-[VERSION].[架构]
$ dnf -y install docker-ce-25.0.5-1.el9.x86_64
配置(可选)
# 创建数据目录
$ mkdir -p /data/docker
# 创建配置目录
$ mkdir -p /etc/docker/
# 注:192.168.111.1:10811 是外网的代理,没有可以不配置 proxies 段
# 注:storage-opt 的 overlay2.size 仅在数据目录为 xfs 且挂载启用 pquota 时可用,否则 docker 启动会报错,可按需删除该行
$ cat > /etc/docker/daemon.json << EOF
{
  "bip": "172.17.0.1/16",
  "data-root": "/data/docker",
  "dns": [
    "114.114.114.114",
    "119.29.29.29"
  ],
  "registry-mirrors": [
    "https://docker.m.daocloud.io",
    "https://dockerproxy.net",
    "https://docker.aiden-work.tech/"
  ],
  "proxies": {
    "http-proxy": "socks5://192.168.111.1:10811",
    "https-proxy": "socks5://192.168.111.1:10811",
    "no-proxy": "127.0.0.1,localhost,admin.kubespray.local"
  },
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "20m",
    "max-file": "5"
  },
  "storage-driver": "overlay2",
  "storage-opt": [ "overlay2.size=200G" ],
  "live-restore": false
}
EOF
启动
# 启动并设置开机启动
$ systemctl enable --now docker
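启动后可以做一次简单验证,确认 data-root、镜像加速等配置已生效(示例命令,输出以实际环境为准):
# 确认 Docker 运行状态与关键配置
$ docker info | grep -E -A 3 "Docker Root Dir|Registry Mirrors"
# 拉取一个小镜像,验证镜像加速/代理是否可用
$ docker pull hello-world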
部署 CoreDNS
部署 CoreDNS 作为局域网 DNS
部署
基础变量设置
COREDNS_VERSION=1.12.1
COREDNS_HOME=/opt/coredns
COREDNS_CONF_HOME=/etc/coredns
COREDNS_PID_FILE=/var/run/coredns.pid
mkdir -p ${COREDNS_HOME}/bin
mkdir -p ${COREDNS_CONF_HOME}
下载解压
下载地址:Releases · coredns/coredns
curl -x http://192.168.111.1:10811 \
-o /usr/local/src/coredns_${COREDNS_VERSION}_linux_amd64.tgz \
-L https://github.com/coredns/coredns/releases/download/v${COREDNS_VERSION}/coredns_${COREDNS_VERSION}_linux_amd64.tgz
tar \
-zxvf /usr/local/src/coredns_${COREDNS_VERSION}_linux_amd64.tgz \
-C ${COREDNS_HOME}/bin
配置环境变量
cat > /etc/profile.d/coredns.sh << EOF
export COREDNS_HOME=${COREDNS_HOME}
export COREDNS_CONF_HOME=${COREDNS_CONF_HOME}
export PATH=\${PATH}:\${COREDNS_HOME}/bin
EOF
source /etc/profile
配置
用户使用 Corefile 来配置 CoreDNS。当 CoreDNS 启动时,如果未给出 -conf 标志,它将在当前目录中查找名为 Corefile 的文件。该文件由一个或多个服务器块组成,每个服务器块列出一个或多个插件,这些插件可以通过指令进一步配置
注意:Corefile 中插件的顺序并不决定插件链的执行顺序,插件的执行顺序由 plugin.cfg 中的顺序决定
其余特点
- Corefile 中的注释以 # 开头
- 可以在 Corefile 的任何地方使用环境变量替换,语法是 {$ENV_VAR}
- 可以使用 import 插件导入其他文件
注:本文主要使用到 hosts 插件,详情参考:hosts
配置:引用外部文件
创建 hosts 文件
cat > /etc/coredns/hosts.kubespray.local << EOF
192.168.111.190 admin.kubespray.local
192.168.111.191 kube-cp-01.kubespray.local
192.168.111.192 kube-node-01.kubespray.local
192.168.111.193 kube-node-02.kubespray.local
192.168.111.194 kube-node-03.kubespray.local
EOF
主配置
cat > ${COREDNS_CONF_HOME}/Corefile << EOF
# 内部局域网解析:只处理 *.local 域名
local:53 {
    # 使用 hosts 插件读取内部解析记录
    hosts /etc/coredns/hosts.kubespray.local {
        fallthrough
    }
    # 缓存 DNS 查询结果
    cache 300
    # 日志配置
    log
    errors
    # 如果有多条记录,轮询响应
    loadbalance
}
# 全局解析:转发其他所有查询
.:53 {
    forward . 8.8.8.8 114.114.114.114 {
        max_concurrent 1000
    }
    log
    errors
    cache 600
}
EOF
启动
测试启动
coredns -conf ${COREDNS_CONF_HOME}/Corefile
加入 systemd 管理
$ cat > /usr/lib/systemd/system/coredns.service << EOF
[Unit]
Description=CoreDNS DNS Server
After=network.target

[Service]
Type=simple
ExecStart=${COREDNS_HOME}/bin/coredns -conf ${COREDNS_CONF_HOME}/Corefile -pidfile ${COREDNS_PID_FILE}
# PIDFile 需与 -pidfile 路径一致;systemd 不支持行尾注释,说明需单独成行
PIDFile=${COREDNS_PID_FILE}
# 服务停止后删除 PID 文件
ExecStopPost=/bin/rm -f ${COREDNS_PID_FILE}
Restart=on-failure
User=root

[Install]
WantedBy=multi-user.target
EOF
启动并设置开机启动
systemctl daemon-reload
systemctl enable --now coredns
配置局域网 DNS
注:局域网内主机配置
nmcli c modify ens160 ipv4.dns 192.168.111.190
nmcli con up ens160
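配置完成后,可在任意一台局域网主机上验证解析是否已走新的 CoreDNS(nslookup 来自 bind-utils 等工具包,此处仅为验证示例):
# 指定 DNS 服务器查询内部域名
nslookup kube-cp-01.kubespray.local 192.168.111.190
# 验证外部域名的转发是否正常
nslookup www.baidu.com 192.168.111.190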
部署 HTTP 服务
Nginx 主要为私网主机提供部分安装资源的下载
安装 Nginx
配置 nginx 的 yum 仓库
# 此处的配置默认使用 stable 版本,根据需要安装的版本修改 enabled
$ cat << EOF > /etc/yum.repos.d/nginx.repo
[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/\$releasever/\$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true
[nginx-mainline]
name=nginx mainline repo
baseurl=http://nginx.org/packages/mainline/centos/\$releasever/\$basearch/
gpgcheck=1
enabled=0
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true
EOF
安装稳定版本
# 查看版本
$ dnf list nginx.x86_64 --showduplicates | sort -r
nginx.x86_64 2:1.28.0-1.el9.ngx nginx-stable
nginx.x86_64 2:1.26.3-1.el9.ngx nginx-stable
......
# 默认配置即为稳定版本,直接安装即可
$ dnf install nginx -y
# 或者:安装指定版本
$ dnf -y install nginx-1.26.3-1.el9.ngx.x86_64
# 查看版本信息
$ nginx -v
nginx version: nginx/1.28.0
# 启动 nginx,并设置开机启动
$ systemctl enable --now nginx
配置 Nginx
配置 Nginx 共享本地目录(如果有其他监听配置记得注释,防止端口冲突)
$ mkdir -p /data/files
$ cat > /etc/nginx/conf.d/default.conf << EOF
server {
    listen 8000;

    location ^~ /files {
        root /data/;
        autoindex on;
        autoindex_exact_size on;
        autoindex_localtime on;
    }

    location ^~ /repos {
        root /data/;
        autoindex on;
        autoindex_exact_size on;
        autoindex_localtime on;
    }
}
EOF
加载配置
$ systemctl restart nginx
# 或
$ nginx -s reload
测试访问是否正常
$ curl -L http://admin.kubespray.local:8000/files
部署私有镜像仓库(harbor)
证书准备
harbor 证书相关配置参考:Harbor docs | Configure HTTPS Access to Harbor (goharbor.io)
创建 CA
创建 CA 私钥
mkdir -p /etc/pki/tls/
openssl genrsa -out /etc/pki/tls/ca.key 4096
创建 CA 证书
openssl req -x509 -new -nodes -sha512 -days 3650 \
-subj "/C=CN/ST=Shanghai/L=Shanghai/O=KMUST/OU=Personal/CN=kubespray.local" \
-key /etc/pki/tls/ca.key \
-out /etc/pki/tls/ca.crt
创建 Harbor 密钥证书
创建私钥
mkdir -p /etc/harbor/certs
openssl genrsa -out /etc/harbor/certs/admin.kubespray.local.key 4096
创建证书请求(CSR)
openssl req -sha512 -new \
-subj "/C=CN/ST=Shanghai/L=Shanghai/O=KMUST/OU=Personal/CN=admin.kubespray.local" \
-key /etc/harbor/certs/admin.kubespray.local.key \
-out /etc/harbor/certs/admin.kubespray.local.csr
创建 x509 v3 扩展文件
cat > /etc/harbor/certs/v3.ext <<-EOF
authorityKeyIdentifier=keyid,issuer
basicConstraints=CA:FALSE
keyUsage = digitalSignature, nonRepudiation, keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth
subjectAltName = @alt_names
[alt_names]
DNS.1=admin.kubespray.local
IP.1=192.168.111.190
#IP.2=172.17.0.1
EOF
基于扩展文件,使用 CA 签发 Harbor 证书
openssl x509 -req -sha512 -days 3650 \
-extfile /etc/harbor/certs/v3.ext \
-CA /etc/pki/tls/ca.crt \
-CAkey /etc/pki/tls/ca.key \
-CAcreateserial \
-in /etc/harbor/certs/admin.kubespray.local.csr \
-out /etc/harbor/certs/admin.kubespray.local.crt
转换证书后缀
# docker daemon 会把 .crt 结尾的证书认为是 CA 的证书,把 .cert 结尾的证书认为是客户端证书
openssl x509 -inform PEM \
-in /etc/harbor/certs/admin.kubespray.local.crt \
-out /etc/harbor/certs/admin.kubespray.local.cert
查看生成的证书
$ ls -l /etc/harbor/certs/
total 20
-rw-r--r--. 1 root root 2159 Mar 12 16:29 admin.kubespray.local.cert
-rw-r--r--. 1 root root 2159 Mar 12 16:23 admin.kubespray.local.crt
-rw-r--r--. 1 root root 1716 Mar 12 16:22 admin.kubespray.local.csr
-rw-------. 1 root root 3272 Mar 12 16:22 admin.kubespray.local.key
-rw-r--r--. 1 root root 280 Mar 12 16:22 v3.ext
复制到对应位置
mkdir -p /data/certs/harbor
cp /etc/harbor/certs/admin.kubespray.local.{key,crt} /data/certs/harbor
安装与检查
安装文档参考:Harbor docs | Harbor Installation and Configuration (goharbor.io)
注:kubespray 提供了 kubespray-{version}/contrib/offline/manage-offline-container-images.sh 脚本,可以在互联网下载镜像、构建本地 Registry 并上传镜像。但 Registry 不便于管理镜像,项目中建议使用 harbor 作为本地镜像仓库
下载
$ HARBOR_VERSION="v2.13.1"
$ wget -P /usr/local/src/ \
https://github.com/goharbor/harbor/releases/download/${HARBOR_VERSION}/harbor-offline-installer-${HARBOR_VERSION}.tgz
# 有代理可以使用代理下载
$ wget -e https_proxy=http://192.168.111.1:10811 \
-P /usr/local/src/ \
https://github.com/goharbor/harbor/releases/download/${HARBOR_VERSION}/harbor-offline-installer-${HARBOR_VERSION}.tgz
解压
mkdir -p /opt/
tar -xf /usr/local/src/harbor-offline-installer-${HARBOR_VERSION}.tgz -C /opt/
配置
# 创建数据存放路径
$ mkdir -p /data/harbor/
$ cd /opt/harbor/
$ cp harbor.yml.tmpl harbor.yml
$ vim harbor.yml
# 修改域名为当前主机或当前主机 IP,需要与颁发证书时设置的一致
hostname: admin.kubespray.local
# 若无证书则需要注释 https 相关配置
# 配置 https 证书
https:
port: 443
certificate: /data/certs/harbor/admin.kubespray.local.crt
private_key: /data/certs/harbor/admin.kubespray.local.key
# 指定 harbor 登录 admin 用户的密码
harbor_admin_password: Harbor12345
# 数据存放目录
data_volume: /data/harbor
安装 harbor
$ ./install.sh
......
✔ ----Harbor has been installed and started successfully.----
启动 harbor
设置 systemd 管理
$ cat > /usr/lib/systemd/system/harbor.service << EOF
[Unit]
Description=Harbor
After=docker.service systemd-resolved.service
Requires=docker.service
Documentation=https://goharbor.io/
[Service]
RemainAfterExit=yes
ExecStart=/usr/bin/docker compose -f /opt/harbor/docker-compose.yml up
ExecStop=/usr/bin/docker compose -f /opt/harbor/docker-compose.yml down
[Install]
WantedBy=multi-user.target
EOF
启动
systemctl daemon-reload
systemctl stop harbor && systemctl enable --now harbor
检查
将私有 CA 证书加入系统的 CA 证书存储库
\cp /etc/pki/tls/ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract
测试 TLS 握手是否正常
# 添加 hosts 解析(如果没有局域网 dns)
# $ echo "192.168.111.190 admin.kubespray.local" >> /etc/hosts
$ openssl s_client -connect admin.kubespray.local:443
测试 WEB 访问是否正常
$ curl https://admin.kubespray.local/
Docker 登录
创建证书存放路径
$ mkdir -p /etc/docker/certs.d/admin.kubespray.local/
复制 CA 证书到 Docker 路径
客户端登录只需要一个文件:CA 证书(ca.crt)或域名证书(<域名>.crt)
但如果路径下存在私钥,则至少需要三个文件:
- CA 证书(ca.crt)或域名证书(<域名>.crt)
- 证书:xx.cert
- 证书对应的私钥:xx.key
\cp /etc/pki/tls/ca.crt /etc/docker/certs.d/admin.kubespray.local/
登录
docker login -u admin admin.kubespray.local
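登录成功后,可以向 Harbor 自带的 library 项目推送一个小镜像做端到端验证(假设 library 项目存在且当前用户有推送权限,镜像名仅为示例):
# 打标签并推送测试镜像
docker pull hello-world
docker tag hello-world admin.kubespray.local/library/hello-world:test
docker push admin.kubespray.local/library/hello-world:test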
构建本地 DNF/YUM 源
创建软件存储目录
SYSTEM_PKG_PATH=/data/repos/system/almalinux/9/x86_64/Packages
DOCKER_PKG_PATH=/data/repos/apps/almalinux/docker-ce/9/x86_64/Packages
mkdir -p ${SYSTEM_PKG_PATH}
mkdir -p ${DOCKER_PKG_PATH}
创建下载脚本
$ mkdir -p /opt/k8s/ && cd /opt/k8s/
$ cat > /opt/k8s/download_packages.sh << EOF
#!/bin/bash
# 目标下载目录
DOWNLOAD_DIR="${SYSTEM_PKG_PATH}"
# 确保下载目录存在
mkdir -p "\${DOWNLOAD_DIR}"
# 包列表文件
PACKAGE_FILE="./packages.txt"
# 检查包列表文件是否存在
if [ ! -f "\${PACKAGE_FILE}" ]; then
    echo "Package file \${PACKAGE_FILE} does not exist."
    exit 1
fi
# 从文件读取包名并下载每个包及其依赖项
while IFS= read -r pkg || [[ -n "\${pkg}" ]]; do
    echo "Processing \${pkg}..."
    dnf download --resolve --alldeps --destdir="\${DOWNLOAD_DIR}" "\${pkg}" || {
        echo "Failed to download \${pkg} and its dependencies."
        exit 1
    }
done < "\${PACKAGE_FILE}"
echo "Download completed successfully."
EOF
下载所需的 rpm 包及依赖
# 一些基础软件,ipvsadm 在使用 ipvs 模式时必须安装
$ cat > /opt/k8s/packages.txt << EOF
wget
vim
ipvsadm
ipset
make
EOF
# 由文件 {KUBESPRAY_HOME}/roles/kubernetes/preinstall/vars/centos.yml 确定需要安装的软件
$ cat >> /opt/k8s/packages.txt << EOF
device-mapper-libs
nss
conntrack
container-selinux
libseccomp
EOF
# 由文件 {KUBESPRAY_HOME}/roles/kubernetes/preinstall/defaults/main.yml 确定需要安装的软件
$ cat >> /opt/k8s/packages.txt << EOF
curl
rsync
socat
unzip
e2fsprogs
xfsprogs
ebtables
bash-completion
tar
EOF
# 下载
$ chmod +x /opt/k8s/download_packages.sh
$ /opt/k8s/download_packages.sh
下载 Docker/Containerd 的 rpm 包及依赖
$ cat << EOF | xargs -I{} sh -c "dnf repoquery --requires --resolve {} | xargs dnf download --downloaddir=${DOCKER_PKG_PATH}"
docker-ce
containerd.io
EOF
安装 DNF 源构建软件
$ dnf -y install createrepo
生成索引
# 生成索引
createrepo $(dirname ${SYSTEM_PKG_PATH})
createrepo $(dirname ${DOCKER_PKG_PATH})
# 检查
ls -lh $(dirname ${SYSTEM_PKG_PATH})
ls -lh $(dirname ${DOCKER_PKG_PATH})
配置 DNF 源
$ cat > /etc/yum.repos.d/local.repo << EOF
[local-os-repo]
name=local-os-repo
baseurl=http://admin.kubespray.local:8000/repos/system/almalinux/\$releasever/\$basearch/
enabled=1
gpgcheck=0
[local-docker-repo]
name=local-docker-repo
baseurl=http://admin.kubespray.local:8000/repos/apps/almalinux/docker-ce/\$releasever/\$basearch/
enabled=1
gpgcheck=0
EOF
测试
$ dnf install -y make --disablerepo=* --enablerepo=local-os-repo
Kubespray 安装
方法一:部署主线版本
升级 python 版本
注:AlmaLinux 默认 Python 版本为 3.9,不支持 ansible 9.5.1+,需要升级版本
参考:Install Python 3.11 on Rocky Linux 9 / AlmaLinux 9 | ComputingForGeeks
dnf -y install python3.12
dnf -y install python3.12-pip
下载 kubespray
kubespray 下载地址:Releases · kubernetes-sigs/kubespray (github.com)
dnf -y install git
cd /opt
git clone https://ghfast.top/https://github.com/kubernetes-sigs/kubespray.git
安装 kubespray 依赖
参考:kubespray/docs/ansible.md at master · kubernetes-sigs/kubespray
# 查看 kubespray 依赖
# requirements 有多个版本,选择自己适合的版本即可
$ cd /opt/kubespray && cat requirements.txt
ansible==9.13.0
# Needed for community.crypto module
cryptography==44.0.1
# Needed for jinja2 json_query templating
jmespath==1.0.1
# Needed for ansible.utils.ipaddr
netaddr==1.3.0
# 创建虚拟环境
$ cd /opt/ && VENVDIR=kubespray-venv && python3.12 -m venv $VENVDIR
# 启用虚拟环境
$ cd /opt/ && source $VENVDIR/bin/activate
# 更新虚拟环境中 pip 版本
$ /opt/${VENVDIR}/bin/python3.12 -m pip install --upgrade pip
# ----------直接安装 kubespray 依赖-------------
# 不推荐,容易由于环境影响导致部分依赖无法安装
# $ pip3 install -r requirements.txt
# ----------下载后再安装(推荐)-----------------
# 创建目录
$ mkdir -p /opt/kubespray-req
# 下载(使用阿里云的 pip 源加速)
$ pip3 download -i https://mirrors.aliyun.com/pypi/simple/ \
-d /opt/kubespray-req \
-r /opt/kubespray/requirements.txt
# 安装依赖
$ pip3 install -U --no-index \
--find-links=/opt/kubespray-req \
-r /opt/kubespray/requirements.txt
方法二:部署稳定版本
参考:kubespray/docs/setting-up-your-first-cluster.md at master
安装 kubespray 及依赖
kubespray 下载地址:Releases · kubernetes-sigs/kubespray (github.com)
升级 python 版本
注:AlmaLinux 默认 Python 版本为 3.9,不支持 ansible 9.5.1+,需要升级版本
参考:Install Python 3.11 on Rocky Linux 9 / AlmaLinux 9 | ComputingForGeeks
dnf -y install python3.12
dnf -y install python3.12-pip
下载 kubespray
KUBESPRAY_VERSION=2.28.0
# 下载
$ wget -O /usr/local/src/kubespray-${KUBESPRAY_VERSION}.tar.gz \
https://codeload.github.com/kubernetes-sigs/kubespray/tar.gz/refs/tags/v${KUBESPRAY_VERSION}
# 通过代理下载
$ wget -e https_proxy=http://192.168.111.1:10811 \
-O /usr/local/src/kubespray-${KUBESPRAY_VERSION}.tar.gz \
https://codeload.github.com/kubernetes-sigs/kubespray/tar.gz/refs/tags/v${KUBESPRAY_VERSION}
# 解压
$ tar -xf /usr/local/src/kubespray-${KUBESPRAY_VERSION}.tar.gz -C /opt/
# 创建软连接
$ ln -sf /opt/kubespray-${KUBESPRAY_VERSION} /opt/kubespray
安装 kubespray 依赖
参考:kubespray/docs/ansible/ansible.md at master · kubernetes-sigs/kubespray
# 查看 kubespray 依赖
# requirements 有多个版本,选择自己适合的版本即可
$ cd /opt/kubespray && cat requirements.txt
ansible==9.13.0
# Needed for community.crypto module
cryptography==45.0.2
# Needed for jinja2 json_query templating
jmespath==1.0.1
# Needed for ansible.utils.ipaddr
netaddr==1.3.0
# 创建虚拟环境,需要使用 python 3.12 启用虚拟环境
$ cd /opt/ && VENVDIR=kubespray-venv && python3.12 -m venv $VENVDIR
# 启用虚拟环境
$ cd /opt/ && source $VENVDIR/bin/activate
# 更新虚拟环境中 pip 版本
$ /opt/kubespray-venv/bin/python3.12 -m pip install --upgrade pip
# ----------直接安装 kubespray 依赖-------------
# 不推荐,容易由于环境影响导致部分依赖无法安装
# $ pip3 install -r requirements.txt
# ----------下载后再安装(推荐)-----------------
# 创建目录
$ mkdir -p /opt/kubespray-req
# 下载(使用阿里云的 pip 源加速)
$ pip3 download -i https://mirrors.aliyun.com/pypi/simple/ \
-d /opt/kubespray-req \
-r /opt/kubespray/requirements.txt
# 安装依赖
$ pip3 install -U --no-index \
--find-links=/opt/kubespray-req \
-r /opt/kubespray/requirements.txt
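依赖安装完成后,可以在虚拟环境中做一次简单检查,确认 ansible 与关键 Python 依赖可用(版本号以实际输出为准):
# 检查 ansible 与依赖
$ ansible --version | head -n 1
$ python3 -c "import jmespath, netaddr, jinja2; print('kubespray python deps ok')"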
方法三:使用容器部署(未测试)
基础变量设置
cd /opt/
KUBESPRAY_VERSION=2.28.0
KUBESPRAY_HOME=/opt/kubespray-${KUBESPRAY_VERSION}
mkdir -p ${KUBESPRAY_HOME}/inventory/kubespray.local
# 写入环境变量
cat > /etc/profile.d/kubespray.sh << EOF
export KUBESPRAY_VERSION=2.28.0
export KUBESPRAY_HOME=/opt/kubespray-${KUBESPRAY_VERSION}
EOF
source /etc/profile
# 如果没有生成过密钥
ssh-keygen
安装运行
$ docker pull quay.io/kubespray/kubespray:v${KUBESPRAY_VERSION}
$ docker run -it \
--mount type=bind,source="${HOME}"/.ssh/id_rsa,dst=/root/.ssh/id_rsa \
--mount type=bind,source=${KUBESPRAY_HOME}/inventory/kubespray.local,dst=/kubespray/inventory/kubespray.local \
quay.io/kubespray/kubespray:v${KUBESPRAY_VERSION} bash
# 编写 /kubespray/inventory/kubespray.local/inventory.ini
# 执行安装
> ansible-playbook -i /kubespray/inventory/kubespray.local/inventory.ini --private-key /root/.ssh/id_rsa cluster.yml
K8S 主机准备
所有主机均需设置,此处将会适当使用 ansible 批量执行命令
注:以下所有操作均在虚拟环境 kubespray-venv 中、目录 /opt/k8s 下进行
创建 Ansible 主机清单
# 进入虚拟环境
$ VENVDIR=kubespray-venv && cd /opt/ && source $VENVDIR/bin/activate
# 创建 ansible 项目目录
$ mkdir -p /opt/k8s && cd /opt/k8s
# 生成默认配置文件,并修改主机清单位置
$ ansible-config init --disabled > ansible.cfg
# 设置主机清单位置为当前目录 inventory=./ansible-hosts
$ sed -ri 's@;(inventory)=.*@\1=./ansible-hosts@g' ansible.cfg
# 创建 ansible 主机清单
$ cat > ./ansible-hosts << EOF
[k8s_hosts]
192.168.111.191
192.168.111.192
192.168.111.193
192.168.111.194
[local]
192.168.111.190
EOF
SSH 免密登录
配置 admin 主机对其他 k8s 主机的 ssh 免密登录,方便 ansible 运行命令
修改 sshd 配置
# 修改 admin 主机的全局 host key 校验配置
$ sed -ri 's@.*(StrictHostKeyChecking).*@\1 no@g' /etc/ssh/ssh_config
创建主机列表和脚本
# 创建主机列表
$ cat > k8s-hosts.list << EOF
192.168.111.190
192.168.111.191
192.168.111.192
192.168.111.193
192.168.111.194
EOF
# 创建密钥分发脚本
$ cat > ssh_key_auth.sh << EOF
#!/bin/bash
rpm -q sshpass &> /dev/null || dnf -y install sshpass
[ -f /root/.ssh/id_rsa ] || ssh-keygen -f /root/.ssh/id_rsa -P ''
# 根据情况修改密码
export SSHPASS=520123
while read IP;do
sshpass -e /usr/bin/ssh-copy-id -o StrictHostKeyChecking=no \$IP
done < k8s-hosts.list
EOF
分发密钥
$ bash ssh_key_auth.sh
验证
免密登录配置完成后,可以验证 ansible 是否能正常连接各主机
$ ansible all -m ping
# 输出信息
192.168.111.191 | SUCCESS => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/bin/python3"
},
"changed": false,
"ping": "pong"
}
......
......
时间设置
时区设置
ansible all -m shell -a 'timedatectl set-timezone Asia/Shanghai'
时间同步设置
复制当前的 chrony 配置
cp /etc/chrony.conf ./chrony.conf
修改 chrony 配置
#### 设置 ####
NTP_SERVERS=( \
"0.cn.pool.ntp.org" \
"ntp.tuna.tsinghua.edu.cn" \
"ntp.tencent.com" \
"ntp.aliyun.com" \
"ntp.ntsc.ac.cn" \
)
# 删除所有现有的 pool 行
sed -i '/^pool /d' ./chrony.conf
# 在第三行插入新的 NTP 服务器
for NTP_SERVER in "${NTP_SERVERS[@]}"; do
sed -i "3i\pool ${NTP_SERVER} iburst" ./chrony.conf
done
# 注:admin 主机(192.168.111.190)也在 ansible 的 all 组中,下一步统一分发配置后再重启 chronyd 并检查
分发参数配置
# 分发参数配置
$ ansible all -m copy -a 'src=./chrony.conf dest=/etc/chrony.conf'
# 重启 chronyd
$ ansible all -m shell -a 'systemctl restart chronyd'
# 检查同步情况
chronyc sources
关闭 SELinux 策略
所有主机关闭 SELinux
# 关闭 SELinux
$ ansible all -m shell -a "setenforce 0"
$ ansible all -m shell -a "sed -ri 's@(^SELINUX)=.*@\1=disabled@g' /etc/selinux/config && sed -ri 's@(^SELINUX)=.*@\1=disabled@g' /etc/sysconfig/selinux"
加载所需的模块
所有主机加载模块,并写入 /etc/modules-load.d/ 配置,保证重启后自动加载
加载一些基础模块
# 加载 overlay 模块
ansible all -m shell -a 'modprobe -- overlay'
ansible all -m shell -a "echo overlay > /etc/modules-load.d/overlay.conf"
# 加载 br_netfilter 模块
ansible all -m shell -a 'modprobe -- br_netfilter'
ansible all -m shell -a "echo br_netfilter > /etc/modules-load.d/br_netfilter.conf"
# 加载 ip_conntrack 模块
ansible all -m shell -a 'modprobe -- ip_conntrack'
ansible all -m shell -a "echo ip_conntrack > /etc/modules-load.d/ip_conntrack.conf"
# 加载 nf_conntrack 模块
ansible all -m shell -a 'modprobe -- nf_conntrack'
ansible all -m shell -a "echo nf_conntrack > /etc/modules-load.d/nf_conntrack.conf"
加载 ipvs 相关模块
ansible all -m shell -a 'modprobe -- ip_vs'
ansible all -m shell -a 'modprobe -- ip_vs_rr'
ansible all -m shell -a 'modprobe -- ip_vs_wrr'
ansible all -m shell -a 'modprobe -- ip_vs_sh'
ansible all -m shell -a 'modprobe -- ip_vs_wlc'
ansible all -m shell -a 'modprobe -- ip_vs_lc'
$ cat > ./ipvs.conf << EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
ip_vs_wlc
ip_vs_lc
EOF
# 分发文件到其他主机
$ ansible all -m copy -a 'src=./ipvs.conf dest=/etc/modules-load.d/'
检查加载的模块
$ lsmod | grep -e 'ip_vs' -e 'nf_conntrack' -e 'ip_conntrack' -e 'br_netfilter'
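上面的 lsmod 只检查了当前主机,也可以用 ansible 在所有主机上批量确认模块已加载(示例):
ansible all -m shell -a "lsmod | grep -E 'ip_vs|nf_conntrack|br_netfilter'"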
系统参数配置
所有主机配置系统参数优化
SWAP 配置
$ ansible all -m shell -a 'swapoff -a'
$ ansible all -m shell -a 'sed -ri "/swap/ s@^@#@g" /etc/fstab'
资源限制
创建资源限制配置(PAM)
$ cat > ./pam-limits-sys.conf <<EOF
* - core unlimited
* - nproc unlimited
* - nofile 1048576
* - memlock unlimited
* - msgqueue unlimited
* - stack unlimited
EOF
创建资源限制配置(systemd)
$ cat > ./systemd-user-sys.conf << EOF
[Manager]
DefaultLimitCORE=infinity
DefaultLimitNPROC=infinity
DefaultLimitNOFILE=1048576
DefaultLimitMEMLOCK=infinity
DefaultLimitMSGQUEUE=infinity
EOF
$ cp /etc/systemd/system.conf ./system.conf
sed -ri 's@^#* *(DefaultLimitCORE).*@\1=infinity@' ./system.conf
sed -ri 's@^#* *(DefaultLimitNPROC).*@\1=infinity@' ./system.conf
sed -ri 's@^#* *(DefaultLimitNOFILE).*@\1=1048576@' ./system.conf
sed -ri 's@^#* *(DefaultLimitMEMLOCK).*@\1=infinity@' ./system.conf
sed -ri 's@^#* *(DefaultLimitMSGQUEUE).*@\1=infinity@' ./system.conf
分发参数配置
# 分发参数配置
ansible all -m file -a 'path=/etc/systemd/user.conf.d state=directory'
ansible all -m copy -a 'src=./pam-limits-sys.conf dest=/etc/security/limits.d/'
ansible all -m copy -a 'src=./systemd-user-sys.conf dest=/etc/systemd/user.conf.d/'
ansible all -m copy -a 'src=./system.conf dest=/etc/systemd/'
# 应用参数配置
$ ansible all -m shell -a 'systemctl daemon-reexec'
内核参数
注:内核参数文档参考:kernel.org/doc/Documentation/sysctl/README
注:Pod 级别的内核参数设置可以参考:在 Kubernetes 集群中使用 sysctl | Kubernetes
创建内核参数配置
$ cat > ./90-sysctl.conf << EOF
###### TCP 连接快速释放设置 ######
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 3
###### TIME_WAIT 过多时设置 ######
net.ipv4.tcp_tw_reuse = 1
#net.ipv4.tcp_tw_recycle = 0
# 限制 TIME_WAIT 最大值,默认 8192
net.ipv4.tcp_max_tw_buckets=5000
###### 端口相关设置 ######
# 设定允许系统主动打开的端口范围,根据需要设置,默认 32768 60999
net.ipv4.ip_local_port_range = 32768 65530
###### 防 SYN 攻击设置 ######
net.ipv4.tcp_syncookies=1
net.ipv4.tcp_syn_retries=3
net.ipv4.tcp_synack_retries=2
net.ipv4.tcp_max_syn_backlog=8192
# 配置 TCP 重传的最大次数减少到 5 次,超时时间约为 6 秒,方便及时发现节点故障
# net.ipv4.tcp_retries2=5
###### 其他 TCP 设置 ######
# 系统当前因后台进程无法处理的新连接而溢出,则允许系统重置新连接
net.ipv4.tcp_abort_on_overflow=1
####### nf_conntrack 相关设置(k8s、docker 防火墙的 nat) #######
net.netfilter.nf_conntrack_max = 262144
net.nf_conntrack_max = 262144
net.netfilter.nf_conntrack_tcp_timeout_established = 86400
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 3600
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
####### socket 相关设置 ######
net.core.somaxconn = 32768
net.core.netdev_max_backlog = 32768
###### 其他设置 #######
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.default.accept_source_route=0
net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1
#
net.ipv4.conf.all.forwarding=1
net.ipv6.conf.all.forwarding=1
# 转发端口(nodeport)
net.ipv4.ip_local_reserved_ports=30000-32767
###### 内存相关设置 #######
vm.swappiness = 0
vm.max_map_count = 655360
# vm.min_free_kbytes = 1048576
vm.overcommit_memory = 1
vm.panic_on_oom = 0
###### 文件相关 #######
fs.file-max = 6573688
fs.nr_open = 1048576
fs.aio-max-nr = 1048576
####### K8S 相关设置 ######
# 必须先加载 br_netfilter 模块
# 二层的网桥在转发包时也会被 arptables/ip6tables/iptables 的 FORWARD 规则所过滤
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
###### 进程相关 #######
# 最大进程 id,默认值为 32768,最大值根据发行版有所不同
kernel.pid_max = 132768
kernel.threads-max = 123342
###### 其他内核相关 #######
kernel.keys.root_maxbytes=25000000
kernel.keys.root_maxkeys=1000000
kernel.panic=10
kernel.panic_on_oops=1
EOF
# kube-bench 相关参数配置,使用 ansible 安装时会自动设置
# 参考:{KUBESPRAY_HOME}/roles/kubernetes/preinstall/tasks/0080-system-configurations.yml
分发参数配置并应用
# 分发参数配置
$ ansible all -m copy -a 'src=./90-sysctl.conf dest=/etc/sysctl.d/'
# 应用参数配置
$ ansible all -m shell -a 'sysctl --system'
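应用后可以抽查几个关键参数,确认在所有主机上已生效(示例,仅核对转发与网桥相关项):
$ ansible all -m shell -a "sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables net.ipv4.ip_local_reserved_ports"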
配置 DNS 解析
DNS 配置
注:有局域网 DNS 的情况
nmcli
模块相关文档:community.general.nmcli module – Manage Networking — Ansible Community Documentation
ansible all -m community.general.nmcli -a "type=ethernet conn_name=ens160 dns4=192.168.111.190 state=present" --become
ansible all -m shell -a "nmcli con up ens160" --become
HOSTS 配置
注:没有局域网 DNS 的情况
编写 hosts 解析文件
# 只需要解析 192.168.111.190 主机,其余主机 kubespray 会自动添加解析
$ cat > ./hosts << EOF
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.111.190 admin.kubespray.local
EOF
复制文件到各个主机
$ ansible k8s_hosts -m copy -a 'src=./hosts dest=/etc/'
创建 DNF 仓库文件
禁用所有仓库(除了 local-*-repo 之外)
# 备份仓库配置
$ ansible k8s_hosts -m copy -a "src=/etc/yum.repos.d/ dest=/etc/yum.repos.d.bak-`date +'%Y%m%d'` remote_src=yes" -b
# 禁用所有仓库
$ ansible k8s_hosts -m shell -a "sed -i 's/enabled=1/enabled=0/g' /etc/yum.repos.d/*.repo " --become
启用新仓库(局域网)
ansible all -m ansible.builtin.yum_repository \
-a "name=local-os-repo
description=local-os-repo
baseurl=http://admin.kubespray.local:8000/repos/system/almalinux/\$releasever/\$basearch/
enabled=1
gpgcheck=0" \
--become
ansible all -m ansible.builtin.yum_repository \
-a "name=local-docker-repo
description=local-docker-repo
baseurl=http://admin.kubespray.local:8000/repos/apps/almalinux/docker-ce/\$releasever/\$basearch/
enabled=1
gpgcheck=0" \
--become
创建缓存
$ ansible k8s_hosts -m shell -a "dnf makecache" --become
添加证书信任
harbor 使用的是自签证书,所以需要在各个 k8s 主机添加证书信任
# 创建证书目录
$ ansible k8s_hosts -m shell -a "mkdir -p /etc/containerd/certs.d/admin.kubespray.local/" --become
# 复制证书
# 此处直接使用 CA 证书
$ ansible k8s_hosts -m copy -a 'src=/etc/pki/tls/ca.crt dest=/etc/containerd/certs.d/admin.kubespray.local/'
防火墙设置
节点数据
注:如果项目所有主机均已配置防火墙互通,则这一步可以省略
添加一个 zone,默认设置为 ACCEPT,并将集群中的主机添加到该 zone 当中
创建 playbook
$ cat << EOF > ./firewalld-nodes-playbook.yaml
---
- name: Configure firewall for kubespray-local zone
  hosts: all
  become: yes  # 修改防火墙配置通常需要 root 权限
  tasks:
    - name: Create kubespray-local zone
      ansible.posix.firewalld:
        state: present
        zone: kubespray-local
        permanent: true
        immediate: false
    - name: Set default policy for kubespray-local to ACCEPT
      ansible.posix.firewalld:
        zone: kubespray-local
        target: ACCEPT
        state: present
        permanent: true
        immediate: false
    - name: Add sources to kubespray-local zone
      ansible.posix.firewalld:
        zone: kubespray-local
        source: "{{ item }}"
        state: enabled
        permanent: true
        immediate: false
      loop:
        - "192.168.111.190/32"
        - "192.168.111.191/32"
        - "192.168.111.192/32"
        - "192.168.111.193/32"
        - "192.168.111.194/32"
    - name: Reload firewall to apply changes
      ansible.builtin.command: firewall-cmd --reload
EOF
应用 playbook
$ ansible-playbook -i ./ansible-hosts ./firewalld-nodes-playbook.yaml
容器数据
注:需要与 pod 和 service 的 CIDR 配置一致
创建 playbook
$ cat << EOF > ./firewalld-k8s-internal-playbook.yaml
---
- name: Configure Firewalld For K8S Net CIDR
  hosts: k8s_hosts
  # 需要管理员权限来修改防火墙规则
  become: yes
  tasks:
    - name: Add Service CIDR 10.233.0.0/18 to trusted zone
      ansible.posix.firewalld:
        zone: trusted
        source: 10.233.0.0/18
        state: enabled
        permanent: true
        immediate: yes
    - name: Add Pod CIDR 10.233.64.0/18 to trusted zone
      ansible.posix.firewalld:
        zone: trusted
        source: 10.233.64.0/18
        state: enabled
        permanent: true
        immediate: yes
    - name: Reload firewall to apply changes
      ansible.builtin.command: firewall-cmd --reload
EOF
应用 playbook
$ ansible-playbook -i ./ansible-hosts ./firewalld-k8s-internal-playbook.yaml
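playbook 执行完成后,可以批量检查 trusted 区域是否已经包含 Pod 与 Service 网段(示例):
$ ansible k8s_hosts -m shell -a "firewall-cmd --zone=trusted --list-sources" --become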
配置
进入虚拟环境
$ VENVDIR=kubespray-venv && cd /opt/ && source $VENVDIR/bin/activate
复制一份默认的配置
$ cp -a /opt/kubespray/inventory/sample /opt/kubespray/inventory/kubespray.local \
&& cd /opt/kubespray/inventory/kubespray.local
配置 k8s 主机清单(新)
注:如果是容器安装,需要进入容器执行该操作
参考:Inventory
Kubespray 2.27.0 版本之后,不再提供生成主机清单的脚本,可以参考文档或者 /opt/kubespray/inventory/sample/inventory.ini
编辑主机清单
$ vim /opt/kubespray/inventory/kubespray.local/inventory.ini
kube-cp-01.kubespray.local ansible_host=192.168.111.191 etcd_member_name=etcd-01.kubespray.local # ip=私网IP
kube-node-01.kubespray.local ansible_host=192.168.111.192 # ip=私网IP
kube-node-02.kubespray.local ansible_host=192.168.111.193 # ip=私网IP
kube-node-03.kubespray.local ansible_host=192.168.111.194 # ip=私网IP
[kube_control_plane]
kube-cp-01.kubespray.local
[etcd:children]
kube_control_plane
[kube_node]
kube-node-01.kubespray.local
kube-node-02.kubespray.local
kube-node-03.kubespray.local
另一种主机清单格式(yaml)
# 可以根据实际情况修改作为控制面节点的主机、安装 etcd 的主机等配置
$ cat /opt/kubespray/inventory/kubespray.local/hosts.yaml
all:
hosts:
kube-cp-01.kubespray.local:
ansible_host: 192.168.111.191
ip: 192.168.111.191
access_ip: 192.168.111.191
kube-node-01.kubespray.local:
ansible_host: 192.168.111.192
ip: 192.168.111.192
access_ip: 192.168.111.192
kube-node-02.kubespray.local:
ansible_host: 192.168.111.193
ip: 192.168.111.193
access_ip: 192.168.111.193
kube-node-03.kubespray.local:
ansible_host: 192.168.111.194
ip: 192.168.111.194
access_ip: 192.168.111.194
children:
kube_control_plane:
hosts:
kube-cp-01.kubespray.local:
kube_node:
hosts:
kube-node-01.kubespray.local:
kube-node-02.kubespray.local:
kube-node-03.kubespray.local:
etcd:
hosts:
kube-cp-01.kubespray.local:
k8s_cluster:
children:
kube_control_plane:
kube_node:
calico_rr:
hosts: {}
个性化配置
Containerd 相关配置
注:当前 containerd(v1.7+)的配置不支持配置 registry 认证凭证
关注开发进展 - 1:https://github.com/containerd/containerd/issues/8228
关注开发进展 - 2:[WIP] Introduce credential plugin by lengrongfu · Pull Request #9872 · containerd/containerd (github.com)
关注开发进展 - 3:Support credential domain aliases in host configuration · Issue #10540 · containerd/containerd
生成 base64 请求头
$ echo -n "admin:Harbor12345" | base64
YWRtaW46SGFyYm9yMTIzNDU=
# $ echo -n "18487357220:0320Yxc520.." | base64
# MTg0ODczNTcyMjA6MDMyMFl4YzUyMC4u
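生成的 base64 字符串就是 HTTP Basic 认证的凭证部分,可以先用 curl 直接验证它能否通过 Harbor 的认证(API 路径以 Harbor v2 为例,仅作验证示意):
$ curl -s -H "Authorization: Basic YWRtaW46SGFyYm9yMTIzNDU=" https://admin.kubespray.local/api/v2.0/projects | head -c 300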
修改配置模板
参考:Private Registry auth config when using hosts.toml · containerd/containerd · Discussion #6468
# 备份
cp /opt/kubespray/roles/container-engine/containerd/templates/hosts.toml.j2 /opt/kubespray/roles/container-engine/containerd/templates/hosts.toml.j2.bak-`date +"%Y%m%d"`
# 新增 auth 配置
$ vim /opt/kubespray/roles/container-engine/containerd/templates/hosts.toml.j2
server = "{{ item.server | default("https://" + item.prefix) }}"
{% for mirror in item.mirrors %}
[host."{{ mirror.host }}"]
capabilities = ["{{ ([ mirror.capabilities ] | flatten ) | join('","') }}"]
skip_verify = {{ mirror.skip_verify | default('false') | string | lower }}
override_path = {{ mirror.override_path | default('false') | string | lower }}
{% if mirror.ca is defined %}
ca = ["{{ ([ mirror.ca ] | flatten ) | join('","') }}"]
{% endif %}
{% if mirror.client is defined %}
client = [{% for pair in mirror.client %}["{{ pair[0] }}", "{{ pair[1] }}"]{% if not loop.last %},{% endif %}{% endfor %}]
{% endif %}
{% if mirror.auth is defined %}
[host."{{ mirror.host }}".header]
authorization = "Basic {{ mirror.auth }}"
{% endif %}
{% endfor %}
修改配置
$ vim /opt/kubespray/inventory/kubespray.local/group_vars/all/containerd.yml
# 修改以下内容
containerd_storage_dir: "/data/containerd"
......
# Registries defined within containerd.
containerd_registries_mirrors:
  - prefix: docker.io
    mirrors:
      - host: https://docker.m.daocloud.io
        capabilities: ["pull", "resolve"]
      - host: https://dockerproxy.cn
        capabilities: ["pull", "resolve"]
      - host: https://dockerpull.com
        capabilities: ["pull", "resolve"]
      - host: https://docker.aiden-work.tech
        capabilities: ["pull", "resolve"]
  - prefix: admin.kubespray.local
    mirrors:
      - host: https://admin.kubespray.local
        capabilities: ["pull", "resolve", "push"]
        skip_verify: false
        ca: "/etc/containerd/certs.d/admin.kubespray.local/ca.crt"
        auth: YWRtaW46SGFyYm9yMTIzNDU=
#
# header 携带 basic auth 的方式,不适用于 docker.io、阿里云 ACR
# - prefix: registry.cn-hangzhou.aliyuncs.com
#   mirrors:
#     - host: https://registry.cn-hangzhou.aliyuncs.com
#       capabilities: ["pull", "resolve", "push"]
#       skip_verify: false
#       auth: MTg0ODczNTcyMjA6MDMyMFl4YzUyMC4u
# 注:暂时不能配置 containerd_registry_auth
# containerd_registry_auth:
#   - registry: 10.0.0.2:5000
#     username: user
#     password: pass
......
生成的配置示例
server = "https://admin.kubespray.local"

[host."https://admin.kubespray.local"]
  capabilities = ["pull","resolve","push"]
  skip_verify = false
  ca = ["/etc/containerd/certs.d/admin.kubespray.local/ca.crt"]

  [host."https://admin.kubespray.local".header]
    authorization = "Basic YWRtaW46SGFyYm9yMTIzNDU="
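集群部署完成后,可以在任一节点上用 crictl 验证 containerd 是否能按上述配置从 Harbor 拉取镜像(示例镜像名为假设,需是已推送到 k8s-cluster 项目中的镜像):
# 在 k8s 节点上执行,验证私有仓库拉取与 Basic 认证
crictl pull admin.kubespray.local/k8s-cluster/coredns/coredns:v1.11.3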
K8S 集群相关配置
注:当前 kubespray 支持的 kubernetes 版本可以参考 kubespray/roles/kubespray-defaults/defaults/main/checksums.yml
注:不推荐使用 kubespray 部署的 NodeLocal DNS,推荐自己手动安装,参考:在 Kubernetes 集群中使用 NodeLocal DNSCache | Kubernetes
$ vim /opt/kubespray/inventory/kubespray.local/group_vars/k8s_cluster/k8s-cluster.yml
# 是否部署 nodelocaldns,小项目无需 nodelocaldns,大项目根据需要配置
enable_nodelocaldns: false
# 网络插件类型,默认 calico。可以选择 cilium, calico, kube-ovn, weave 或 flannel
kube_network_plugin: calico
# Service 的网段
kube_service_addresses: 10.233.0.0/18
# Pod 的网段
kube_pods_subnet: 10.233.64.0/18
# 每个节点分配 pod 网段的大小
# 常用于限制每个 node 的 pod 数量
kube_network_node_prefix: 24
# 直接限制每个 node 的 pod 数量
kubelet_max_pods: 110
# kube-proxy 的模式选择,默认为 ipvs,可以切换为 iptables
kube_proxy_mode: ipvs
# Kubernetes 集群名称
cluster_name: cluster.local
# 容器运行时选择
container_manager: containerd
# event 保留时间
## Amount of time to retain events. (default 1h0m0s)
event_ttl_duration: "1h0m0s"
##### 资源预留与 Pod 驱逐(根据实际情况配置) #######
##### 参考示例:https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#example-scenario
## kubernetes 资源预留
## 预留资源给 Kubernetes 的守护进程(如:kube-apiserver、kube-scheduler、kube-controller-manager 等)
# kube_reserved: false
## Uncomment to override default values
## The following two items need to be set when kube_reserved is true
# kube_reserved_cgroups_for_service_slice: kube.slice
# kube_reserved_cgroups: "/{{ kube_reserved_cgroups_for_service_slice }}"
# kube_memory_reserved: 256Mi
# kube_cpu_reserved: 100m
# kube_ephemeral_storage_reserved: 2Gi
# kube_pid_reserved: "1000"
## 针对主节点(master node)预留的资源
# Reservation for master hosts
# kube_master_memory_reserved: 512Mi
# kube_master_cpu_reserved: 200m
# kube_master_ephemeral_storage_reserved: 2Gi
# kube_master_pid_reserved: "1000"
# 系统资源预留
# system_reserved: true
## Uncomment to override default values
## The following two items need to be set when system_reserved is true
# system_reserved_cgroups_for_service_slice: system.slice
# system_reserved_cgroups: "/{{ system_reserved_cgroups_for_service_slice }}"
# system_memory_reserved: 512Mi
# system_cpu_reserved: 500m
# system_ephemeral_storage_reserved: 2Gi
## 针对主节点(master node)预留的资源
## Reservation for master hosts
# system_master_memory_reserved: 256Mi
# system_master_cpu_reserved: 250m
# system_master_ephemeral_storage_reserved: 2Gi
system_reserved: true
system_memory_reserved: 512Mi
system_cpu_reserved: 500m
system_ephemeral_storage_reserved: 2Gi
# 什么情况下执行 Pod 驱逐,参考 https://kubernetes.io/zh-cn/docs/tasks/administer-cluster/kubelet-config-file/
## Eviction Thresholds to avoid system OOMs
# https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#eviction-thresholds
# eviction_hard: {}
# eviction_hard_control_plane: {}
eviction_hard: {memory.available: "500Mi", nodefs.available: "15%", nodefs.inodesFree: "15%", imagefs.available: "15%"}
etcd 相关配置
$ vim /opt/kubespray/inventory/kubespray.local/group_vars/all/etcd.yml
# etcd 数据存储目录
etcd_data_dir: /data/etcd
# etcd 安装位置,默认 host,即安装在主机,可以设置为 docker 或 containerd
etcd_deployment_type: host
附加组件配置
可以配置一些常用的组件,例如 Dashboard、Ingress、Metrics Server 等等组件
$ vim /opt/kubespray/inventory/kubespray.local/group_vars/k8s_cluster/addons.yml
# 例如安装 Kubernetes dashboard,则设置为 true
# dashboard_enabled: true
证书配置
自动更新证书配置
$ vim /opt/kubespray/inventory/kubespray.local/group_vars/k8s_cluster/k8s-cluster.yml
## Automatically renew K8S control plane certificates on first Monday of each month
auto_renew_certificates: true
注:certificates_duration 只影响 etcd 证书的有效期,其余证书由 kubeadm 生成,默认一年(ca 十年)
$ vim /opt/kubespray/roles/kubespray_defaults/defaults/main/main.yml
certificates_duration: 36500
Calico 配置
$ vim /opt/kubespray/inventory/kubespray.local/group_vars/k8s_cluster/k8s-net-calico.yml
# Set calico network backend: "bird", "vxlan" or "none"
# bird enable BGP routing, required for ipip and no encapsulation modes
# calico_network_backend: vxlan
# IP in IP and VXLAN is mutualy exclusive modes.
# set IP in IP encapsulation mode: "Always", "CrossSubnet", "Never"
# calico_ipip_mode: 'Never'
# set VXLAN encapsulation mode: "Always", "CrossSubnet", "Never"
# calico_vxlan_mode: 'Always'
NodePort 端口设置
根据需要进行设置
$ vim /opt/kubespray/roles/kubernetes/control-plane/defaults/main/main.yml
......
# 默认 30000-32767
kube_apiserver_node_port_range: "30000-32767"
.......
$ vim /opt/kubespray/roles/kubernetes/node/defaults/main.yml
......
# 默认 30000-32767
kube_apiserver_node_port_range: "30000-32767"
.......
CoreDNS(集群内)设置
$ vim /opt/kubespray/inventory/kubespray.local/group_vars/all/all.yml
## Upstream dns servers
upstream_dns_servers:
- 192.168.111.190
- 114.114.114.114
- 223.5.5.5
- 119.29.29.29
- 8.8.8.8
允许不安全的内核参数(可选)
$ vim /opt/kubespray/roles/kubernetes/node/defaults/main.yml
###### 影响所有 kubelet,包括控制节点
# 根据实际需要添加配置,支持通配符
## Support parameters to be passed to kubelet via kubelet-config.yaml
kubelet_config_extra_args:
  allowedUnsafeSysctls:
    - "net.core.*"
    - "net.ipv4.*"
## Parameters to be passed to kubelet via kubelet-config.yaml when cgroupfs is used as cgroup driver
kubelet_config_extra_args_cgroupfs:
  systemCgroups: /system.slice
  cgroupRoot: /
###### 只影响工作节点上的 kubelet,不包括控制节点
## Support parameters to be passed to kubelet via kubelet-config.yaml only on nodes, not masters
kubelet_node_config_extra_args: {}
###### 这两个设置,是以标签形式作为 kubelet 参数,现在已经弃用
## Support custom flags to be passed to kubelet
# kubelet_custom_flags: []
## Support custom flags to be passed to kubelet only on nodes, not masters
# kubelet_node_custom_flags: []
部署资源下载
kubespray/contrib/offline/manage-offline-container-images.sh 脚本可以在互联网下载镜像、构建本地 Registry 并上传镜像。但 Registry 不便于管理镜像,此处将使用 Harbor 作为本地镜像仓库
查看所需资源
离线环境检查
# 修改下载组件的版本,不修改的话,默认用当前支持的最高版本
#$ vim /opt/kubespray/roles/kubespray_defaults/defaults/main/main.yml
# kube_version: 1.32.5
# 运行离线环境检查脚本
# 进入虚拟环境
$ VENVDIR=kubespray-venv && cd /opt/ && source $VENVDIR/bin/activate
$ bash /opt/kubespray/contrib/offline/generate_list.sh -i /opt/kubespray/inventory/kubespray.local/inventory.ini
资源文件下载地址
$ cat /opt/kubespray/contrib/offline/temp/files.list
https://dl.k8s.io/release/v1.32.5/bin/linux/amd64/kubelet
https://dl.k8s.io/release/v1.32.5/bin/linux/amd64/kubectl
https://dl.k8s.io/release/v1.32.5/bin/linux/amd64/kubeadm
......
镜像下载地址
$ cat /opt/kubespray/contrib/offline/temp/images.list
docker.io/mirantis/k8s-netchecker-server:v1.2.2
docker.io/mirantis/k8s-netchecker-agent:v1.2.2
quay.io/coreos/etcd:v3.5.16
quay.io/cilium/cilium:v1.17.3
......
镜像资源准备
注:registry.k8s.io 的镜像资源需要连接外网
剔除镜像
registry.k8s.io/sig-storage/local-volume-provisioner:v2.5.0
sed -ri '/local-volume-provisioner/d' /opt/kubespray/contrib/offline/temp/images.list
下载镜像资源
$ cat /opt/kubespray/contrib/offline/temp/images.list | xargs -n 1 docker pull
harbor 批量创建项目
注:harbor 的项目名为 k8s-cluster,访问级别可以设为公开
web 页面创建,步骤较为简单,省略
重新打标签
# 定义 harbor 地址变量
$ HARBOR_DOMAIN=admin.kubespray.local/k8s-cluster
# 该命令可以过滤镜像列表,并将原 url 替换成 harbor 地址
# 传给 docker tag,批量修改镜像标签
$ cat /opt/kubespray/contrib/offline/temp/images.list \
| awk -F'/' -v OFS='/' '{$1=$0" '$HARBOR_DOMAIN'";print}' | xargs -n 2 docker tag
# 查看是否正常更新标签
$ docker images | grep admin.kubespray.local
上传至 Harbor
$ docker images | grep $HARBOR_DOMAIN \
| awk -F' ' -v OFS=':' '{print $1,$2}' \
| xargs -i docker push {}
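推送完成后,可以通过 Harbor API 粗略核对仓库数量是否与镜像列表大致一致(示例,凭证与项目名按实际情况修改;不同 tag 会归并为同一仓库,数量可能略少于镜像条目数):
# 镜像列表中的条目数
wc -l /opt/kubespray/contrib/offline/temp/images.list
# Harbor 中 k8s-cluster 项目下的仓库数
curl -s -u admin:Harbor12345 "https://admin.kubespray.local/api/v2.0/projects/k8s-cluster/repositories?page_size=100" | grep -o '"name"' | wc -l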
下载文件资源
提取路径并创建本地路径(脚本)
$ cd /opt/k8s/ && vim /opt/k8s/create_dir.sh
#!/bin/bash
# 定义下载文件列表
FILES="/opt/kubespray/contrib/offline/temp/files.list"
# 定义基础目录
BASE_DIR="/data/files"
# 创建一个新文件用于存放 URL 和对应的本地目录路径
OUTPUT_FILE="./url_and_paths.list"
rm -f ${OUTPUT_FILE}
# 读取并处理文件中的每一行
while read -r line; do
    # 使用 awk 去掉 http(s)://
    without_http=$(echo "$line" | awk -F'//' '{print $2}')
    # 使用 dirname 去掉最后的文件名
    path_only=$(dirname "$without_http")
    # 拼接目录
    file_path="$BASE_DIR/$path_only/"
    # 创建相应的目录
    mkdir -p "$file_path"
    # 在 URL 后面追加相应的本地目录路径,并保存到新文件中
    echo "$line $file_path" >> "$OUTPUT_FILE"
    # 输出创建的目录信息
    echo "Created directory: $file_path"
done < "$FILES"
运行脚本,查看生成的路径文件
$ chmod 700 /opt/k8s/create_dir.sh && bash /opt/k8s/create_dir.sh
$ cat /opt/k8s/url_and_paths.list
......
https://dl.k8s.io/release/v1.32.5/bin/linux/amd64/kubelet /data/files/dl.k8s.io/release/v1.32.5/bin/linux/amd64/
https://dl.k8s.io/release/v1.32.5/bin/linux/amd64/kubectl /data/files/dl.k8s.io/release/v1.32.5/bin/linux/amd64/
https://dl.k8s.io/release/v1.32.5/bin/linux/amd64/kubeadm /data/files/dl.k8s.io/release/v1.32.5/bin/linux/amd64/
......
查看
$ tree /data/files/
/data/files/
├── dl.k8s.io
│ └── release
│ └── v1.32.5
│ └── bin
│ └── linux
│ └── amd64
├── get.helm.sh
├── github.com
│ ├── cilium
│ │ └── cilium-cli
│ │ └── releases
│ │ └── download
│ │ └── v0.16.0
......
根据生成的文件,下载组件到指定路径
# 外网代理地址:https_proxy="http://192.168.111.1:10811"
$ cat /opt/k8s/url_and_paths.list | awk -F' ' '{ cmd="https_proxy=http://192.168.111.1:10811 wget --no-check-certificate " $1 " -P " $2; system(cmd) }'
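下载结束后,可以简单比对 URL 数量与实际落盘的文件数量,确认没有遗漏(示例):
# 需要下载的文件数
wc -l /opt/k8s/url_and_paths.list
# 实际下载到本地的文件数
find /data/files/ -type f | wc -l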
完整的目录
$ tree /data/files/
私网部署相关配置
修改文件下载地址
# 修改 files_repo: "http://admin.kubespray.local:8000/files"
$ sed -ri 's@^#* *(files_repo:).*@\1 "http://admin.kubespray.local:8000/files"@' /opt/kubespray/inventory/kubespray.local/group_vars/all/offline.yml
修改私有镜像仓库地址
注:Harbor 的项目名为 k8s-cluster
# 修改 registry_host: "admin.kubespray.local/k8s-cluster"
$ sed -ri 's@^#* *(registry_host:).*@\1 "admin.kubespray.local/k8s-cluster"@' /opt/kubespray/inventory/kubespray.local/group_vars/all/offline.yml
修改 DNF 源
# 修改 yum_repo: "http://admin.kubespray.local:8000/repos/apps/almalinux/"
$ sed -ri 's@^#* *(yum_repo:).*@\1 "http://admin.kubespray.local:8000/repos/apps/almalinux/"@' /opt/kubespray/inventory/kubespray.local/group_vars/all/offline.yml
确定需要私网下载的资源
# 放开容器私网下载
$ sed -ri 's@^#* +(.*registry_host.*)@\1@g' /opt/kubespray/inventory/kubespray.local/group_vars/all/offline.yml
# 文件资源私网下载
$ sed -ri 's@^#* +(.*files_repo.*)@\1@g' /opt/kubespray/inventory/kubespray.local/group_vars/all/offline.yml
部署 Kubernetes
部署 K8S
# 进入虚拟环境
$ VENVDIR=kubespray-venv && cd /opt/ && source $VENVDIR/bin/activate && cd /opt/kubespray/
# 开启日志
$ vim /opt/kubespray/inventory/kubespray.local/group_vars/all/all.yml
unsafe_show_logs: true
# 部署
$ ansible-playbook -i /opt/kubespray/inventory/kubespray.local/inventory.ini \
--become --become-user=root -b /opt/kubespray/cluster.yml
检查状态
cp 节点执行
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kube-cp-01.kubespray.local Ready control-plane 8m49s v1.32.5
kube-node-01.kubespray.local Ready <none> 7m22s v1.32.5
kube-node-02.kubespray.local Ready <none> 7m21s v1.32.5
kube-node-03.kubespray.local Ready <none> 7m21s v1.32.5
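除了节点状态,还可以检查系统组件 Pod 是否全部就绪(示例命令):
$ kubectl get pods -n kube-system -o wide
$ kubectl get svc -A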
删除节点
# 进入虚拟环境
$ VENVDIR=kubespray-venv && cd /opt/ && source $VENVDIR/bin/activate
$ cd /opt/kubespray/
# 执行 remove-node.yml,限制只对 node=kube-node-02.kubespray.local 进行修改
$ ansible-playbook \
-i /opt/kubespray/inventory/kubespray.local/inventory.ini \
-b /opt/kubespray/remove-node.yml \
-e "node=kube-node-02.kubespray.local" --limit=kube-node-02.kubespray.local
# 在主机清单中删除对应的节点信息
$ vim /opt/kubespray/inventory/kubespray.local/inventory.ini
# 查看节点信息
$ kubectl get nodes
添加节点
参考:kubespray/nodes.md at master · kubernetes-sigs/kubespray (github.com)
# 编辑主机清单文件
$ vim /opt/kubespray/inventory/kubespray.local/inventory.ini
# 执行,限制在 kube-node-02.kubespray.local 进行修改
$ ansible-playbook \
-i /opt/kubespray/inventory/kubespray.local/inventory.ini \
/opt/kubespray/cluster.yml \
-b --limit=kube-node-02.kubespray.local
# 注:非控制面节点可以使用 scale.yml,控制面节点只能够使用 cluster.yml
# 查看节点信息
$ kubectl get nodes
测试
下载私有仓库镜像
创建凭证
# 方式一:通过命令直接创建
$ kubectl create secret docker-registry secret-ali-acr \
--docker-email=sky.nemo@outlook.com \
--docker-username=18487357220 \
--docker-password=20141040Ezra.. \
--docker-server=registry.cn-hangzhou.aliyuncs.com
# 方式二:通过 docker 认证文件创建 secret
$ kubectl create secret generic secret-ali-acr-2 \
--from-file=.dockerconfigjson=/root/.docker/config.json \
--type=kubernetes.io/dockerconfigjson
创建测试 yaml
mkdir -p demo
$ cat << EOF > ./demo/stress-ng.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stress-ng
  labels:
    app.kubernetes.io/name: stress-ng
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: stress-ng
  template:
    metadata:
      labels:
        app.kubernetes.io/name: stress-ng
    spec:
      # 业务容器
      containers:
      - name: stress-ng
        image: registry.cn-hangzhou.aliyuncs.com/kmust/stress-ng-alpine:0.14.00-r0
        # 资源限制
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
        args: ["--cpu", "1", "--vm", "1", "--vm-bytes", "50M"]
      imagePullSecrets:
      - name: secret-ali-acr
EOF
应用
kubectl apply -f ./demo/stress-ng.yaml
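应用后可以确认 Pod 能正常从私有仓库拉取镜像并运行(示例命令):
$ kubectl get pods -l app.kubernetes.io/name=stress-ng -o wide
$ kubectl describe pod -l app.kubernetes.io/name=stress-ng | grep -A 5 Events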
测试域名及网络
mkdir -p demo
$ cat << EOF > ./demo/nginx.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
      #securityContext:
      #  sysctls:
      #  - name: net.ipv4.tcp_fin_timeout
      #    value: "30"
      #  - name: net.ipv4.tcp_keepalive_time
      #    value: "1200"
      #  - name: net.ipv4.tcp_keepalive_intvl
      #    value: "30"
      #  - name: net.ipv4.tcp_keepalive_probes
      #    value: "3"
      #  - name: net.core.somaxconn
      #    value: "1024"
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: ClusterIP
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
EOF
部署 busybox
mkdir -p demo
$ cat << EOF > ./demo/busybox-1.28.4.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox-selector
  template:
    metadata:
      labels:
        app: busybox-selector
    spec:
      restartPolicy: Always
      containers:
      - name: busybox
        command:
        - sleep
        - "864000"
        image: busybox:1.28.4
        resources:
          requests:
            cpu: 50m
            memory: "32Mi"
          limits:
            cpu: 1
            memory: "512Mi"
EOF
应用
kubectl apply -f ./demo/nginx.yaml
kubectl apply -f ./demo/busybox-1.28.4.yaml
测试
$ kubectl exec -it deploy/busybox -- sh
/ # nslookup nginx
Server: 10.233.0.3
Address 1: 10.233.0.3 coredns.kube-system.svc.cluster.local
Name: nginx
Address 1: 10.233.62.127 nginx.default.svc.cluster.local
/ # wget -O - nginx.default.svc.cluster.local
其他设置
放开 DNF 仓库文件
进入之前的虚拟环境
# 进入虚拟环境
$ VENVDIR=kubespray-venv && cd /opt/ && source $VENVDIR/bin/activate && cd /opt/k8s
恢复所有仓库
# 删除当前的仓库配置
ansible k8s_hosts -m shell -a "rm -rf /etc/yum.repos.d/" -b
# 恢复原本的备份
$ ansible k8s_hosts -m copy -a "src=/etc/yum.repos.d.bak-`date +'%Y%m%d'`/ dest=/etc/yum.repos.d remote_src=yes" -b
创建缓存
$ ansible k8s_hosts -m shell -a "dnf makecache" --become
配置 kubectl 命令补全
kubespray 部署默认已安装
安装 Helm(二进制)
每个 Helm 版本都提供了各种操作系统的二进制版本,可以手动下载和安装
# 设置版本
$ HELM_VERSION=v3.18.0
# 设置路径
$ HELM_HOME=/opt/helm
# 创建目录
$ mkdir -p ${HELM_HOME}
# 下载
$ curl -o /usr/local/src/helm-${HELM_VERSION}-linux-amd64.tar.gz \
-L https://get.helm.sh/helm-${HELM_VERSION}-linux-amd64.tar.gz
# 通过代理下载
$ curl -x http://192.168.111.1:10811 \
-o /usr/local/src/helm-${HELM_VERSION}-linux-amd64.tar.gz \
-L https://get.helm.sh/helm-${HELM_VERSION}-linux-amd64.tar.gz
# 解压
$ tar -zxvf /usr/local/src/helm-${HELM_VERSION}-linux-amd64.tar.gz -C ${HELM_HOME}
# 配置路径
$ cat << EOF > /etc/profile.d/helm.sh
export HELM_HOME=${HELM_HOME}
export PATH=\${PATH}:\${HELM_HOME}/linux-amd64/
EOF
$ source /etc/profile
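安装完成后可以做一次简单验证(示例;helm list 需要能访问集群的 kubeconfig,在没有 kubeconfig 的主机上只验证版本即可):
$ helm version --short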
安装 krew
安装 git
命令
$ dnf -y install git
下载安装
# 设置外网 HTTPS_PROXY
export HTTPS_PROXY=http://192.168.111.1:10811
(
set -x; cd "$(mktemp -d)" &&
OS="$(uname | tr '[:upper:]' '[:lower:]')" &&
ARCH="$(uname -m | sed -e 's/x86_64/amd64/' -e 's/\(arm\)\(64\)\?.*/\1\2/' -e 's/aarch64$/arm64/')" &&
KREW="krew-${OS}_${ARCH}" &&
curl -k -fsSLO "https://github.com/kubernetes-sigs/krew/releases/latest/download/${KREW}.tar.gz" &&
tar zxvf "${KREW}.tar.gz" &&
./"${KREW}" install krew
)
配置路径
$ cat > /etc/profile.d/krew.sh << EOF
# krew enviroment
export PATH="${KREW_ROOT:-$HOME/.krew}/bin:\$PATH"
EOF
$ source /etc/profile
检查
$ kubectl krew
测试安装插件
kubectl krew update
kubectl krew install sniff
安装插件(先下载后安装)
plugin=access-matrix
os=$(uname | tr '[:upper:]' '[:lower:]')
arch=$(uname -m | sed 's/x86_64/amd64/' | sed 's/aarch64/arm64/')
version=$(curl -s https://api.github.com/repos/kubernetes-sigs/krew-index/contents/plugins/$plugin.yaml \
| grep '"download_url"' | cut -d '"' -f 4)
# 下载 yaml
$ curl -O https://raw.githubusercontent.com/kubernetes-sigs/krew-index/master/plugins/$plugin.yaml
# 查看下载地址
$ grep 'uri:' $plugin.yaml
uri: https://github.com/corneliusweig/rakkess/releases/download/v0.5.0/access-matrix-amd64-linux.tar.gz
uri: https://github.com/corneliusweig/rakkess/releases/download/v0.5.0/access-matrix-amd64-darwin.tar.gz
uri: https://github.com/corneliusweig/rakkess/releases/download/v0.5.0/access-matrix-arm64-darwin.tar.gz
uri: https://github.com/corneliusweig/rakkess/releases/download/v0.5.0/access-matrix-amd64-windows.zip
# 下载
$ curl -LO https://github.com/corneliusweig/rakkess/releases/download/v0.5.0/access-matrix-amd64-linux.tar.gz
# 安装
$ kubectl krew install --manifest=$plugin.yaml --archive=access-matrix-amd64-linux.tar.gz
Installing plugin: access-matrix
Installed plugin: access-matrix
\
| Use this plugin:
| kubectl access-matrix
| Documentation:
| https://github.com/corneliusweig/rakkess
| Caveats:
| \
| | Usage:
| | kubectl access-matrix
| | kubectl access-matrix for pods
| /
/
安装 Metric Server
参考(GitHub):kubernetes-sigs/metrics-server
参考(Kubernetes):Resource metrics pipeline | Kubernetes
Metrics Server 是 Kubernetes 内置自动缩放管道的可扩展、高效的容器资源指标来源
Metrics Server 从 Kubelets 收集资源指标,并通过 Metrics API 在 Kubernetes apiserver 中公开这些指标,供 Horizontal Pod Autoscaler 和 Vertical Pod Autoscaler 使用。指标 API 也可以通过 kubectl top [node|pod]
访问
注:Metrics Server 不适用于指标监控,如果需要指标监控,请直接从 Kubelet
/metrics/resource
端点收集指标
部署最新版本
$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
指定版本部署(推荐)
下载 yaml 文件
METRICS_SERVER_VERSION=v0.7.2
# 可以提前下载镜像
export HTTPS_PROXY=http://192.168.111.1:10811
nerdctl -n k8s.io pull registry.k8s.io/metrics-server/metrics-server:${METRICS_SERVER_VERSION}
mkdir -p kube-pkg/metrics-server
cd kube-pkg/metrics-server
# 下载
$ curl -LO https://github.com/kubernetes-sigs/metrics-server/releases/download/${METRICS_SERVER_VERSION}/components.yaml
修改配置
# 修改配置
$ vim components.yaml
......
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        # 添加配置:不校验 kubelet 的证书
        - --kubelet-insecure-tls
        image: registry.k8s.io/metrics-server/metrics-server:v0.7.2
        imagePullPolicy: IfNotPresent
......
应用
kubectl apply -f ./components.yaml
检查
检查 Pod 运行状态
$ kubectl get pods -n kube-system
# 输出信息
......
metrics-server-8467fcc7b7-4gnfl 1/1 Running 0 28s
......
检查 Metric Server API 是否正常
$ kubectl top nodes
# 输出信息
NAME CPU(cores) CPU(%) MEMORY(bytes) MEMORY(%)
kube-cp-01.kubespray.local 189m 13% 1436Mi 52%
kube-node-01.kubespray.local 67m 4% 846Mi 97%
kube-node-02.kubespray.local 78m 5% 874Mi 100%
kube-node-03.kubespray.local 77m 5% 904Mi 103%
......
ETCD 操作
控制面节点操作
# 本例只有一个 etcd 成员;多成员时以空格分隔列出各成员 IP
export ETCD_IPS="192.168.111.191"
for ip in ${ETCD_IPS};do
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://${ip}:2379 \
    --cacert=/etc/ssl/etcd/ssl/ca.pem \
    --cert=/etc/ssl/etcd/ssl/admin-kube-cp-01.kubespray.local.pem \
    --key=/etc/ssl/etcd/ssl/admin-kube-cp-01.kubespray.local-key.pem \
    endpoint health;
done
#### 输出信息
https://192.168.111.191:2379 is healthy: successfully committed proposal: took = 30.631813ms
for ip in ${ETCD_IPS};do
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://${ip}:2379 \
    --cacert=/etc/ssl/etcd/ssl/ca.pem \
    --cert=/etc/ssl/etcd/ssl/admin-kube-cp-01.kubespray.local.pem \
    --key=/etc/ssl/etcd/ssl/admin-kube-cp-01.kubespray.local-key.pem \
    endpoint status \
    --write-out=table;
done
CoreDNS 设置
新增对外部主机的映射,此处仅作为示例,一般映射的都是非 k8s 的主机
$ kubectl edit configmap coredns -n kube-system
apiVersion: v1
data:
Corefile: |
.:53 {
errors {
}
health {
lameduck 5s
}
ready
hosts {
192.168.111.191 kube-cp-01.kubespray.local
192.168.111.192 kube-node-01.kubespray.local
192.168.111.193 kube-node-02.kubespray.local
192.168.111.194 kube-node-03.kubespray.local
fallthrough
}
kubernetes kubespray.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
}
prometheus :9153
forward . 114.114.114.114 223.5.5.5 119.29.29.29 8.8.8.8 {
prefer_udp
max_concurrent 1000
}
cache 30
loop
reload
loadbalance
}
# 以下为配置示例,根据实际情况配置
db.local:53 {
errors
hosts {
192.168.111.191 mysql-01.db.local
192.168.111.192 mysql-02.db.local
192.168.111.193 mysql-03.db.local
192.168.111.194 mysql-04.db.local
fallthrough
}
}
......
附录
证书操作
查看证书有效期
$ kubeadm certs check-expiration
# 输出信息
[check-expiration] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[check-expiration] Use 'kubeadm init phase upload-config --config your-config.yaml' to re-upload it.
W0408 17:47:55.938978 9625 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Apr 08, 2026 07:13 UTC 364d ca no
apiserver Apr 08, 2026 07:13 UTC 364d ca no
apiserver-kubelet-client Apr 08, 2026 07:13 UTC 364d ca no
controller-manager.conf Apr 08, 2026 07:13 UTC 364d ca no
front-proxy-client Apr 08, 2026 07:13 UTC 364d front-proxy-ca no
scheduler.conf Apr 08, 2026 07:13 UTC 364d ca no
super-admin.conf Apr 08, 2026 07:13 UTC 364d ca no
CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Apr 06, 2035 07:13 UTC 9y no
front-proxy-ca Apr 06, 2035 07:13 UTC 9y no
更新证书有效期
需要重启 kube-apiserver、kube-controller-manager、kube-scheduler、etcd 后才会生效;kubeadm 管理的集群也可以选择重启 kubelet
在所有控制面节点依次执行以下命令
# 更新证书
$ kubeadm certs renew all
# 重启使证书生效
$ systemctl restart kubelet
etcd 证书操作
查看证书有效期
openssl x509 -in /etc/ssl/etcd/ssl/ca.pem -noout --dates
notBefore=Apr 8 07:10:11 2025 GMT
notAfter=Mar 15 07:10:11 2125 GMT
$ openssl x509 -in /etc/ssl/etcd/ssl/admin-kube-cp-01.kubespray.local.pem -noout --dates
notBefore=Apr 8 07:10:12 2025 GMT
notAfter=Mar 15 07:10:12 2125 GMT
Token 操作
当使用 --upload-certs 调用 kubeadm init 时,主控制平面的证书被加密并上传到 kubeadm-certs Secret 中,生成 Token。其他节点加入集群时,需要提供 Token 和密钥(由 --certificate-key 选项指定)
查看 Token 有效期
$ kubeadm token list
# 输出信息
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
abcdef.0123456789abcdef 23h 2024-04-03T05:39:20Z authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
grwcij.envbizeafuqxwajz 1h 2024-04-02T07:39:19Z <none> Proxy for managing TTL for the kubeadm-certs secret <none>
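Token 过期后可以重新生成加入命令;如需新的 certificate-key,也可以重新上传控制面证书(以下为 kubeadm 的常规用法示例,在控制面节点执行):
# 生成新的 bootstrap token 并打印 worker 节点加入命令
$ kubeadm token create --print-join-command
# 重新上传控制面证书并生成新的 certificate-key(供新增控制面节点使用)
$ kubeadm init phase upload-certs --upload-certs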
离线升级 pip3
下载 pip3 软件包
下载地址:pip · PyPI
$ wget https://files.pythonhosted.org/packages/47/6a/453160888fab7c6a432a6e25f8afe6256d0d9f2cbd25971021da6491d899/pip-23.3.1-py3-none-any.whl
放到离线主机升级
# 进入虚拟环境
$ VENVDIR=kubespray-venv
$ cd /opt/ && source $VENVDIR/bin/activate
# 升级
$ /opt/kubespray-venv/bin/python3 -m pip install --upgrade ./pip-23.3.1-py3-none-any.whl