Table of contents

Preface
1. Environment information
2. Deployment steps
   2.1 Preparing the base environment
   2.2 Installing Docker on each node
   2.3 Setting up SSH mutual trust
   2.4 Downloading ceph-ansible
3. Configuring the deployment files
   3.1 Using the locally installed Docker
   3.2 Configuring the hosts inventory file
   3.3 Configuring group_vars/all.yml
   3.4 Starting the deployment
   3.5 Installing the ceph-common package
   3.6 Deployment result
4. Experiments
   4.1 Removing an OSD
   4.2 Adding an OSD
   4.3 Re-adding the OSD removed in 4.1 after replacing its disk
   4.4 Adding an OSD-only node
   4.5 Removing the newly added node04
Summary

Preface
This post records how I deployed Ceph 14 (Nautilus) with ceph-ansible.
ceph-ansible documentation: https://docs.ceph.com/projects/ceph-ansible/en/latest/osds/scenarios.html

1. Environment information
Operating system: CentOS 7.9
Hosts and disks:

Hostname  IP              Disk 1    Disk 2    Disk 3    Disk 4    Disk 5
node01    192.168.150.72  /dev/vdb  /dev/vdc  /dev/vdd  -         -
node02    192.168.150.73  /dev/vdb  /dev/vdc  /dev/vdd  /dev/vde  -
node03    192.168.150.74  /dev/vdb  /dev/vdc  /dev/vdd  /dev/vde  /dev/vdf

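2. Deployment steps
2.1 Preparing the base environment
For the base environment setup, see https://blog.csdn.net/baidu_35848778/article/details/145564790
2.2 Installing Docker on each node
My Docker setup points at my own local Harbor registry, and all images are mirrored there. Pulling from public registries inside China is currently unreliable, so it is best to download the images in advance and host them locally.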
# Add the Docker CE repository
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Set the cgroup driver to systemd and allow the local Harbor registry
mkdir -p /etc/docker/
cat > /etc/docker/daemon.json << EOF
{
  "insecure-registries": ["http://harbor.XXX.XX.XX:10002"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {"max-size": "100m"}
}
EOF
# Install the packages
yum install docker-ce docker-ce-cli -y
# Start the service and enable it at boot
systemctl restart docker
systemctl enable docker
# Log in to the registry
docker login http://harbor.XXX.XX.XX:10002

2.3 Setting up SSH mutual trust
There are several ways to set up mutual trust; here I collect the public keys into one authorized_keys file and distribute it.
On every node, add the hostnames to /etc/hosts:
cat >> /etc/hosts << EOF
192.168.150.72 node01
192.168.150.73 node02
192.168.150.74 node03
EOF
On every node, generate a key pair:
ssh-keygen -f ~/.ssh/id_rsa -P '' -q
On the primary node (node01, 192.168.150.72), push its public key to every node:
yum install -y sshpass
sshpass -p password ssh-copy-id -i /root/.ssh/id_rsa.pub -o StrictHostKeyChecking=no root@192.168.150.72
sshpass -p password ssh-copy-id -i /root/.ssh/id_rsa.pub -o StrictHostKeyChecking=no root@192.168.150.73
sshpass -p password ssh-copy-id -i /root/.ssh/id_rsa.pub -o StrictHostKeyChecking=no root@192.168.150.74
On the primary node, collect the public keys of the other nodes:
ssh root@192.168.150.73 cat ~/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
ssh root@192.168.150.74 cat ~/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
On the primary node, push the combined authorized_keys file to the other nodes:
scp /root/.ssh/authorized_keys 192.168.150.73:/root/.ssh/
scp /root/.ssh/authorized_keys 192.168.150.74:/root/.ssh/

2.4 Downloading ceph-ansible
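Download the installer. GitHub can be hard to reach from inside China; I rented an Alibaba Cloud preemptible VM in Hong Kong and downloaded it there.
yum install python2-pip ansible git python-netaddr -y
mkdir -p /data/installceph/ && cd /data/installceph/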
git config --global http.postBuffer 5242880
git clone https://github.com/ceph/ceph-ansible.git
cd ceph-ansible
# Switch branch: the target release is Ceph 14 (Nautilus)
git checkout stable-4.0
Branch and version compatibility:
stable-3.0 Supports Ceph versions jewel and luminous. This branch requires Ansible version 2.4.
stable-3.1 Supports Ceph versions luminous and mimic. This branch requires Ansible version 2.4.
stable-3.2 Supports Ceph versions luminous and mimic. This branch requires Ansible version 2.6.
stable-4.0 Supports Ceph version nautilus. This branch requires Ansible version 2.9.
stable-5.0 Supports Ceph version octopus. This branch requires Ansible version 2.9.
stable-6.0 Supports Ceph version pacific. This branch requires Ansible version 2.10.
stable-7.0 Supports Ceph version quincy. This branch requires Ansible version 2.12.
main Supports the main (devel) branch of Ceph. This branch requires Ansible version 2.12.
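
Picking the wrong branch or Ansible series is an easy mistake here, so the compatibility list can be turned into a small lookup. The following is only a sketch: the function names branch_for_release and ansible_series are made up for illustration, and for releases covered by several branches (luminous, mimic) it simply picks the newest branch in the list, which is an assumption rather than a recommendation from the ceph-ansible project.

```bash
#!/bin/sh
# Print "<ceph-ansible branch> <required Ansible series>" for a Ceph release,
# following the compatibility list above. For releases covered by several
# branches (luminous, mimic) the newest listed branch is chosen; that rule is
# an assumption of this sketch.
branch_for_release() {
  case "$1" in
    jewel)          echo "stable-3.0 2.4" ;;
    luminous|mimic) echo "stable-3.2 2.6" ;;
    nautilus)       echo "stable-4.0 2.9" ;;
    octopus)        echo "stable-5.0 2.9" ;;
    pacific)        echo "stable-6.0 2.10" ;;
    quincy)         echo "stable-7.0 2.12" ;;
    *) echo "unknown release: $1" >&2; return 1 ;;
  esac
}

# Extract "major.minor" from the first line of `ansible --version` on stdin.
ansible_series() {
  sed -n '1s/^[^0-9]*\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Look up the branch for this deployment and compare with the local Ansible.
set -- $(branch_for_release nautilus)
echo "check out $1, which expects Ansible $2.x"
if command -v ansible >/dev/null 2>&1; then
  installed=$(ansible --version | ansible_series)
  [ "$installed" = "$2" ] || echo "warning: installed Ansible is $installed.x"
fi
```

For this deployment the lookup yields stable-4.0 and Ansible 2.9, which matches the git checkout above.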
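
3. Configuring the deployment files
3.1 Using the locally installed Docker
Docker is already installed on every node, so comment out the container package installation task in /data/installceph/ceph-ansible/roles/ceph-container-engine/tasks/pre_requisites/prerequisites.yml: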
#- name: install container packages
#  package:
#    name: ['{{ container_package_name }}', '{{ container_binding_name }}']
#    update_cache: true
#  register: result
#  until: result is succeeded
#  tags: with_pkg

3.2 Configuring the hosts inventory file
The disk layout differs from node to node, so the devices are declared per host in the inventory:
cat > /data/installceph/ceph-ansible/hosts << EOF
[mons]
node01
node02
node03

[osds]
node01 devices="['/dev/vdb','/dev/vdc','/dev/vdd']"
node02 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde']"
node03 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde','/dev/vdf']"

[mgrs]
node01
node02
node03

[mdss]
node01
node02
node03

[clients]
node01

[rgws]
node01

[grafana-server]
node01
EOF

3.3 Configuring group_vars/all.yml
\cp /data/installceph/ceph-ansible/group_vars/all.yml.sample /data/installceph/ceph-ansible/group_vars/all.yml
cat >> /data/installceph/ceph-ansible/group_vars/all.yml << EOF

######################################################
#              INSTALL OPTIONS BY USER               #
#                                                    #
######################################################

# Install options
# -----------------------------
ceph_origin: repository
ceph_repository: community
ceph_mirror: http://mirrors.aliyun.com/ceph
ceph_stable_key: http://mirrors.aliyun.com/ceph/keys/release.asc
ceph_stable_release: nautilus
ceph_stable_repo: "{{ ceph_mirror }}/rpm-{{ ceph_stable_release }}"
# -----------------------------

ceph_docker_registry: harbor.XXX.XX.XX:10002
#node_exporter_container_image: "prom/node-exporter:v0.17.0"
#grafana_container_image: "grafana/grafana:5.4.3"
#prometheus_container_image: "prom/prometheus:v2.7.2"
#alertmanager_container_image: "prom/alertmanager:v0.16.2"

# Ceph options
# -----------------------------
generate_fsid: true
ceph_conf_key_directory: /etc/ceph
cephx: true
# -----------------------------

# Client options
# -----------------------------
rbd_cache: "false"
rbd_client_log_path: /var/log/ceph
# -----------------------------

# Monitor options
# -----------------------------
monitor_interface: eth0
# -----------------------------

# OSD options
# -----------------------------
journal_size: 5120
public_network: 192.168.150.0/24
cluster_network: 192.168.150.0/24
osd_objectstore: bluestore
# -----------------------------

# RGW options
# -----------------------------
radosgw_interface: eth0
# -----------------------------

# Testing mode
# -----------------------------
#common_single_host_mode: true
# -----------------------------

# DOCKER options
# -----------------------------
ceph_docker_image: ceph/daemon
ceph_docker_image_tag: latest-nautilus
containerized_deployment: true
# -----------------------------

# DASHBOARD options
# -----------------------------
dashboard_enabled: False
dashboard_protocol: http
dashboard_port: 8443
dashboard_admin_user: admin
dashboard_admin_password: admin123456
grafana_admin_user: admin
grafana_admin_password: admin
# -----------------------------
EOF
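
Before running the playbook it can be worth confirming that every node address from section 1 actually falls inside the public_network set in all.yml. The sketch below does this with plain shell arithmetic; ip_to_int and in_cidr are hypothetical helper names introduced for illustration and are not part of ceph-ansible.

```bash
#!/bin/sh
# Fold the four octets of a dotted-quad IPv4 address into one integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split "a.b.c.d" into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside the CIDR network $2 (for example 10.0.0.0/8).
in_cidr() {
  cidr_net=${2%/*}
  cidr_bits=${2#*/}
  cidr_mask=$(( (4294967295 << (32 - cidr_bits)) & 4294967295 ))
  [ $(( $(ip_to_int "$1") & cidr_mask )) -eq $(( $(ip_to_int "$cidr_net") & cidr_mask )) ]
}

# Values taken from the environment table and all.yml above.
public_network="192.168.150.0/24"
for addr in 192.168.150.72 192.168.150.73 192.168.150.74; do
  if in_cidr "$addr" "$public_network"; then
    echo "$addr is inside $public_network"
  else
    echo "$addr is OUTSIDE $public_network"
  fi
done
```

With the addresses used in this post, all three nodes are reported as inside 192.168.150.0/24.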

3.4 Starting the deployment
cp site-docker.yml.sample site-docker.yml
ansible-playbook -i /data/installceph/ceph-ansible/hosts /data/installceph/ceph-ansible/site-docker.yml

3.5 Installing the ceph-common package
I prefer to run ceph commands directly on the host, so I install ceph-common:
yum install epel-release -y
cat > /etc/yum.repos.d/ceph.repo << END
[Ceph]
name=Ceph packages for \$basearch
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/\$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
END
yum clean all
yum makecache
yum install -y ceph-common.x86_64

3.6 Deployment result
The OSD layout matches expectations:
[root@node01 ceph-ansible]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME       STATUS REWEIGHT PRI-AFF
 -1       1.17224 root default
 -3       0.29306     host node01
  0   hdd 0.09769         osd.0       up  1.00000 1.00000
  3   hdd 0.09769         osd.3       up  1.00000 1.00000
  6   hdd 0.09769         osd.6       up  1.00000 1.00000
 -7       0.39075     host node02
  1   hdd 0.09769         osd.1       up  1.00000 1.00000
  4   hdd 0.09769         osd.4       up  1.00000 1.00000
  7   hdd 0.09769         osd.7       up  1.00000 1.00000
  9   hdd 0.09769         osd.9       up  1.00000 1.00000
 -5       0.48843     host node03
  2   hdd 0.09769         osd.2       up  1.00000 1.00000
  5   hdd 0.09769         osd.5       up  1.00000 1.00000
  8   hdd 0.09769         osd.8       up  1.00000 1.00000
 10   hdd 0.09769         osd.10      up  1.00000 1.00000
 11   hdd 0.09769         osd.11      up  1.00000 1.00000

4. Experiments
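4.1 Removing an OSD
Goal: simulate osd.11 failing and no longer being able to serve requests, then remove it from the cluster.
# General form
ansible-playbook -vv -i hosts infrastructure-playbooks/shrink-osd.yml -e osd_to_kill=1,2,3
# Command used in this experiment
ansible-playbook -vv -i hosts infrastructure-playbooks/shrink-osd.yml -e osd_to_kill=11
Result: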
Thursday 13 February 2025  15:33:26 +0800 (0:00:00.373)       0:00:31.086 *****
ok: [node01] => changed=false
  cmd:
  - docker
  - exec
  - ceph-mon-node01
  - ceph
  - --cluster
  - ceph
  - -s
  delta: '0:00:00.547188'
  end: '2025-02-13 15:33:27.087717'
  rc: 0
  start: '2025-02-13 15:33:26.540529'
  stderr: ''
  stderr_lines: <omitted>
  stdout: |2-
      cluster:
        id:     84a44515-64c1-4f5c-b9c5-a0cc3e797074
        health: HEALTH_WARN
                Degraded data redundancy: 28/627 objects degraded (4.466%), 7 pgs degraded

      services:
        mon: 3 daemons, quorum node01,node02,node03 (age 76m)
        mgr: node02(active, since 74m), standbys: node01, node03
        mds: cephfs:1 {0=node03=up:active} 2 up:standby
        osd: 11 osds: 11 up (since 14s), 11 in (since 16s); 1 remapped pgs
        rgw: 1 daemon active (node01.rgw0)

      task status:

      data:
        pools:   6 pools, 144 pgs
        objects: 209 objects, 3.4 KiB
        usage:   11 GiB used, 1.1 TiB / 1.1 TiB avail
        pgs:     28/627 objects degraded (4.466%)
                 135 active+clean
                 3   active+recovery_wait+degraded
                 3   active+recovering+degraded
                 2   active+recovery_wait
                 1   active+recovery_wait+undersized+degraded+remapped

      io:
        recovery: 3 B/s, 1 keys/s, 2 objects/s

      progress:
        Rebalancing after osd.11 marked out
          [............]
  stdout_lines: <omitted>

TASK [show ceph osd tree] *****
task path: /data/installceph/ceph-ansible/infrastructure-playbooks/shrink-osd.yml:254
Thursday 13 February 2025  15:33:27 +0800 (0:00:00.999)       0:00:32.085 *****
ok: [node01] => changed=false
  cmd:
  - docker
  - exec
  - ceph-mon-node01
  - ceph
  - --cluster
  - ceph
  - osd
  - tree
  delta: '0:00:00.560455'
  end: '2025-02-13 15:33:28.017771'
  rc: 0
  start: '2025-02-13 15:33:27.457316'
  stderr: ''
  stderr_lines: <omitted>
  stdout: |-
    ID  CLASS WEIGHT  TYPE NAME       STATUS REWEIGHT PRI-AFF
     -1       1.07455 root default
     -3       0.29306     host node01
      0   hdd 0.09769         osd.0       up  1.00000 1.00000
      3   hdd 0.09769         osd.3       up  1.00000 1.00000
      6   hdd 0.09769         osd.6       up  1.00000 1.00000
     -7       0.39075     host node02
      1   hdd 0.09769         osd.1       up  1.00000 1.00000
      4   hdd 0.09769         osd.4       up  1.00000 1.00000
      7   hdd 0.09769         osd.7       up  1.00000 1.00000
      9   hdd 0.09769         osd.9       up  1.00000 1.00000
     -5       0.39075     host node03
      2   hdd 0.09769         osd.2       up  1.00000 1.00000
      5   hdd 0.09769         osd.5       up  1.00000 1.00000
      8   hdd 0.09769         osd.8       up  1.00000 1.00000
     10   hdd 0.09769         osd.10      up  1.00000 1.00000
  stdout_lines: <omitted>
META: ran handlers

PLAY RECAP *****
node01 : ok=19 changed=3 unreachable=0 failed=0 skipped=12 rescued=0 ignored=0
node02 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
node03 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

Once the removal has finished, delete the corresponding disk from the inventory file:
[osds]
node01 devices="['/dev/vdb','/dev/vdc','/dev/vdd']"
node02 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde']"
# Entry before removing osd.11: node03 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde','/dev/vdf']"
# Entry after removing osd.11:
node03 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde']"

4.2 Adding an OSD
Goal: add a new disk, /dev/vde, to node01 as an OSD.
Add the new disk to the inventory file:
[osds]
node01 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde']"
node02 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde']"
# Entry before removing osd.11: node03 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde','/dev/vdf']"
# Entry after removing osd.11:
node03 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde']"
Run the playbook:
# General form
ansible-playbook -i /data/installceph/ceph-ansible/hosts site-docker.yml --limit osd-node-name
# Command used in this experiment
ansible-playbook -i /data/installceph/ceph-ansible/hosts site-docker.yml --limit node01
Result:
[root@node01 ceph-ansible]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME       STATUS REWEIGHT PRI-AFF
 -1       1.17224 root default
 -3       0.39075     host node01
  0   hdd 0.09769         osd.0       up  1.00000 1.00000
  3   hdd 0.09769         osd.3       up  1.00000 1.00000
  6   hdd 0.09769         osd.6       up  1.00000 1.00000
 11   hdd 0.09769         osd.11      up  1.00000 1.00000
 -7       0.39075     host node02
  1   hdd 0.09769         osd.1       up  1.00000 1.00000
  4   hdd 0.09769         osd.4       up  1.00000 1.00000
  7   hdd 0.09769         osd.7       up  1.00000 1.00000
  9   hdd 0.09769         osd.9       up  1.00000 1.00000
 -5       0.39075     host node03
  2   hdd 0.09769         osd.2       up  1.00000 1.00000
  5   hdd 0.09769         osd.5       up  1.00000 1.00000
  8   hdd 0.09769         osd.8       up  1.00000 1.00000
 10   hdd 0.09769         osd.10      up  1.00000 1.00000

4.3 Re-adding the OSD removed in 4.1 after replacing its disk
Add the replaced disk back to the inventory file:
[osds]
node01 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde']"
node02 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde']"
node03 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde','/dev/vdf']"
Run the playbook:
# General form
ansible-playbook -i /data/installceph/ceph-ansible/hosts site-docker.yml --limit osd-node-name
# Command used in this experiment
ansible-playbook -i /data/installceph/ceph-ansible/hosts site-docker.yml --limit node03
Result:
[root@node01 ceph-ansible]# ceph -s
  cluster:
    id:     84a44515-64c1-4f5c-b9c5-a0cc3e797074
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node01,node02,node03 (age 27m)
    mgr: node02(active, since 2h), standbys: node01, node03
    mds: cephfs:1 {0=node02=up:active} 2 up:standby
    osd: 13 osds: 13 up (since 69s), 13 in (since 69s)
    rgw: 1 daemon active (node01.rgw0)

  task status:

  data:
    pools:   6 pools, 144 pgs
    objects: 209 objects, 3.4 KiB
    usage:   13 GiB used, 1.3 TiB / 1.3 TiB avail
    pgs:     144 active+clean

[root@node01 ceph-ansible]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME       STATUS REWEIGHT PRI-AFF
 -1       1.26993 root default
 -3       0.39075     host node01
  0   hdd 0.09769         osd.0       up  1.00000 1.00000
  3   hdd 0.09769         osd.3       up  1.00000 1.00000
  6   hdd 0.09769         osd.6       up  1.00000 1.00000
 11   hdd 0.09769         osd.11      up  1.00000 1.00000
 -7       0.39075     host node02
  1   hdd 0.09769         osd.1       up  1.00000 1.00000
  4   hdd 0.09769         osd.4       up  1.00000 1.00000
  7   hdd 0.09769         osd.7       up  1.00000 1.00000
  9   hdd 0.09769         osd.9       up  1.00000 1.00000
 -5       0.48843     host node03
  2   hdd 0.09769         osd.2       up  1.00000 1.00000
  5   hdd 0.09769         osd.5       up  1.00000 1.00000
  8   hdd 0.09769         osd.8       up  1.00000 1.00000
 10   hdd 0.09769         osd.10      up  1.00000 1.00000
 12   hdd 0.09769         osd.12      up  1.00000 1.00000
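
4.4 Adding an OSD-only node
Prerequisite: prepare the base environment on the new node and extend the SSH mutual trust to it. I will not repeat the mutual trust steps here.
Add the new OSD node and its disks to the inventory file:
[osds]
node01 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde']"
node02 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde']"
node03 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde','/dev/vdf']"
node04 devices="['/dev/vdb','/dev/vdc','/dev/vdd','/dev/vde']"
Run the playbook:
# General form
ansible-playbook -i /data/installceph/ceph-ansible/hosts site-docker.yml --limit osd-node-name
# Command used in this experiment
ansible-playbook -i /data/installceph/ceph-ansible/hosts site-docker.yml --limit node04
Result: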
[root@node01 ceph-ansible]# ceph -s
  cluster:
    id:     84a44515-64c1-4f5c-b9c5-a0cc3e797074
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node01,node02,node03 (age 63s)
    mgr: node02(active, since 2h), standbys: node01, node03
    mds: cephfs:1 {0=node02=up:active} 2 up:standby
    osd: 17 osds: 17 up (since 111s), 17 in (since 111s)
    rgw: 1 daemon active (node01.rgw0)

  task status:

  data:
    pools:   6 pools, 144 pgs
    objects: 209 objects, 3.4 KiB
    usage:   17 GiB used, 1.6 TiB / 1.7 TiB avail
    pgs:     144 active+clean

[root@node01 ceph-ansible]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME       STATUS REWEIGHT PRI-AFF
 -1       1.66068 root default
 -3       0.39075     host node01
  0   hdd 0.09769         osd.0       up  1.00000 1.00000
  3   hdd 0.09769         osd.3       up  1.00000 1.00000
  6   hdd 0.09769         osd.6       up  1.00000 1.00000
 11   hdd 0.09769         osd.11      up  1.00000 1.00000
 -7       0.39075     host node02
  1   hdd 0.09769         osd.1       up  1.00000 1.00000
  4   hdd 0.09769         osd.4       up  1.00000 1.00000
  7   hdd 0.09769         osd.7       up  1.00000 1.00000
  9   hdd 0.09769         osd.9       up  1.00000 1.00000
 -5       0.48843     host node03
  2   hdd 0.09769         osd.2       up  1.00000 1.00000
  5   hdd 0.09769         osd.5       up  1.00000 1.00000
  8   hdd 0.09769         osd.8       up  1.00000 1.00000
 10   hdd 0.09769         osd.10      up  1.00000 1.00000
 12   hdd 0.09769         osd.12      up  1.00000 1.00000
 -9       0.39075     host node04
 13   hdd 0.09769         osd.13      up  1.00000 1.00000
 14   hdd 0.09769         osd.14      up  1.00000 1.00000
 15   hdd 0.09769         osd.15      up  1.00000 1.00000
 16   hdd 0.09769         osd.16      up  1.00000 1.00000
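
The checks in experiments 4.2 to 4.4 all come down to comparing the number of OSDs under each host with the number of devices listed for that host in the inventory. A rough way to get those counts without reading the tree by eye is sketched below; osds_per_host is a hypothetical helper name, and it assumes the plain-text column layout of ceph osd tree shown above (TYPE in the third column for bucket rows, NAME in the fourth).

```bash
#!/bin/sh
# Print "<host> <number of OSDs>" for every host bucket found in plain
# `ceph osd tree` output read from stdin.
osds_per_host() {
  awk '
    $3 == "host" { host = $4; count[host] = 0; order[++n] = host; next }
    $4 ~ /^osd\.[0-9]+$/ && host != "" { count[host]++ }
    END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
  '
}

# On a live cluster this would be: ceph osd tree | osds_per_host
# Here the tree from experiment 4.4 is fed in as sample input.
osds_per_host << 'EOF'
ID  CLASS WEIGHT  TYPE NAME       STATUS REWEIGHT PRI-AFF
 -1       1.66068 root default
 -3       0.39075     host node01
  0   hdd 0.09769         osd.0       up  1.00000 1.00000
  3   hdd 0.09769         osd.3       up  1.00000 1.00000
  6   hdd 0.09769         osd.6       up  1.00000 1.00000
 11   hdd 0.09769         osd.11      up  1.00000 1.00000
 -7       0.39075     host node02
  1   hdd 0.09769         osd.1       up  1.00000 1.00000
  4   hdd 0.09769         osd.4       up  1.00000 1.00000
  7   hdd 0.09769         osd.7       up  1.00000 1.00000
  9   hdd 0.09769         osd.9       up  1.00000 1.00000
 -5       0.48843     host node03
  2   hdd 0.09769         osd.2       up  1.00000 1.00000
  5   hdd 0.09769         osd.5       up  1.00000 1.00000
  8   hdd 0.09769         osd.8       up  1.00000 1.00000
 10   hdd 0.09769         osd.10      up  1.00000 1.00000
 12   hdd 0.09769         osd.12      up  1.00000 1.00000
 -9       0.39075     host node04
 13   hdd 0.09769         osd.13      up  1.00000 1.00000
 14   hdd 0.09769         osd.14      up  1.00000 1.00000
 15   hdd 0.09769         osd.15      up  1.00000 1.00000
 16   hdd 0.09769         osd.16      up  1.00000 1.00000
EOF
```

For the sample input this prints node01 4, node02 4, node03 5 and node04 4, one host per line, which matches the device lists in the inventory for this experiment.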
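
4.5 Removing the newly added node04
Goal: remove all OSDs on node04 first, then remove the host node04 itself.
Run the playbook:
# General form
ansible-playbook -vv -i hosts infrastructure-playbooks/shrink-osd.yml -e osd_to_kill=1,2,3
# Command used in this experiment
ansible-playbook -vv -i hosts infrastructure-playbooks/shrink-osd.yml -e osd_to_kill=13,14,15,16
Result: all the OSDs were removed, but the host entry is still present. I could not find a playbook for removing it among the bundled playbooks; my guess is that this branch is fairly old and the scenario is uncommon.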
[root@node01 ceph-ansible]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME       STATUS REWEIGHT PRI-AFF
 -1       1.26993 root default
 -3       0.39075     host node01
  0   hdd 0.09769         osd.0       up  1.00000 1.00000
  3   hdd 0.09769         osd.3       up  1.00000 1.00000
  6   hdd 0.09769         osd.6       up  1.00000 1.00000
 11   hdd 0.09769         osd.11      up  1.00000 1.00000
 -7       0.39075     host node02
  1   hdd 0.09769         osd.1       up  1.00000 1.00000
  4   hdd 0.09769         osd.4       up  1.00000 1.00000
  7   hdd 0.09769         osd.7       up  1.00000 1.00000
  9   hdd 0.09769         osd.9       up  1.00000 1.00000
 -5       0.48843     host node03
  2   hdd 0.09769         osd.2       up  1.00000 1.00000
  5   hdd 0.09769         osd.5       up  1.00000 1.00000
  8   hdd 0.09769         osd.8       up  1.00000 1.00000
 10   hdd 0.09769         osd.10      up  1.00000 1.00000
 12   hdd 0.09769         osd.12      up  1.00000 1.00000
 -9             0     host node04

Summary
This post is a record of the deployment and the experiments, kept for future reference.
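
As a follow-up to experiment 4.5: an empty host bucket left in the CRUSH map can normally be removed by hand with ceph osd crush rm followed by the bucket name, and Ceph refuses that command for a bucket that still contains items. This was not run as part of the experiments above, so treat the following as an untested sketch. It only prints the cleanup commands for hosts without any OSD instead of executing them; empty_host_cleanup_cmds is a hypothetical helper name.

```bash
#!/bin/sh
# Read plain `ceph osd tree` output on stdin and print (not run) a
# `ceph osd crush rm` command for every host bucket with no OSD under it.
empty_host_cleanup_cmds() {
  awk '
    $3 == "host" { host = $4; has_osd[host] = 0; order[++n] = host; next }
    $4 ~ /^osd\.[0-9]+$/ && host != "" { has_osd[host] = 1 }
    END {
      for (i = 1; i <= n; i++)
        if (!has_osd[order[i]]) print "ceph osd crush rm " order[i]
    }
  '
}

# On a live cluster this would be: ceph osd tree | empty_host_cleanup_cmds
# Here an excerpt of the tree from experiment 4.5 is used as sample input.
empty_host_cleanup_cmds << 'EOF'
 -5       0.48843     host node03
 10   hdd 0.09769         osd.10      up  1.00000 1.00000
 12   hdd 0.09769         osd.12      up  1.00000 1.00000
 -9             0     host node04
EOF
```

For the excerpt above the only line printed is ceph osd crush rm node04. Printing rather than executing keeps the final decision with the operator, who can review the command and then run it on a node that has the admin keyring.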