Ansible/Docker: Running Ceph on a Cluster

Written by Michael Sevilla

In this post we will configure and start Ceph on a cluster using Ansible and Docker. We assume that you are either using the Docker images provided by the Ceph team (on Docker Hub) or that you built your own images (we describe that process in Docker: Building Ceph Images) and pushed them to a registry (we cover that in Docker: Distributing Ceph Images).

There are many tools that deploy Ceph on a cluster. These tools format disks, set up keyrings, write configuration files, and start the daemons; doing all of this by hand is tedious and error-prone. Most people use ceph-deploy, but it installs a pile of packages directly on the host machines. Instead, we use Docker.

Prerequisites: Docker and sgdisk (apt-get install gdisk) must be installed on all nodes.
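For example, on Ubuntu nodes the prerequisites can be installed with something along these lines (the Docker convenience script is just one of several ways to install Docker; adapt this to your distribution):

~$ sudo apt-get update
~$ sudo apt-get install -y gdisk                  # provides sgdisk
~$ curl -fsSL https://get.docker.com | sudo sh    # installs Docker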

Using Ansible to Deploy Ceph

ceph-ansible is a tool that configures hardware and software for Ceph. We forked the project and made it less dependent on docker-py.

Because our lab follows the “Popper Convention” to make our research reproducible, we save the deploy code, configurations, and benchmarks for all our experiments in a separate repository. We have a repository called ceph-popper-template that helps administrators get started:

~$ git clone --recursive https://github.com/michaelsevilla/ceph-popper-template.git experiment
~$ cd experiment

This repository has submodules that point to ceph-ansible and our own custom roles; configuration files for our Ceph setup; and helper scripts written in bash that deploy Ceph and run the benchmarks.
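If you cloned without --recursive, or just want to verify that the submodules are in place, you can pull them in explicitly:

~/experiment$ git submodule update --init --recursive
~/experiment$ git submodule status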

Configuring your Cluster (hosts file)

Here we specify the IPs and remote users in your cluster. Ansible reads hosts from a hosts file, so edit the site/hosts file:

diff --git a/hosts b/hosts
index 664cf92..245297a 100644
--- a/hosts
+++ b/hosts
@@ -1,11 +1,20 @@
 [osds]
-<ADD OSDs>
+issdm-0 ansible_ssh_user=issdm
+issdm-1 ansible_ssh_user=issdm
+issdm-11 ansible_ssh_user=issdm
+issdm-14 ansible_ssh_user=issdm
+issdm-24 ansible_ssh_user=issdm
+issdm-27 ansible_ssh_user=issdm
+issdm-29 ansible_ssh_user=issdm
+issdm-34 ansible_ssh_user=issdm
+issdm-40 ansible_ssh_user=issdm
 
 [mons]
-<ADD MONs>
+issdm-3 ansible_ssh_user=issdm
 
 [mdss]
-<ADD MDSs>
+issdm-12 ansible_ssh_user=issdm
 
 [clients]
-<ADD CLIENTs>
+issdm-24 ansible_ssh_user=issdm
+issdm-27 ansible_ssh_user=issdm

Warning: make sure that the node that has the experiment directory is listed under [clients]; otherwise the cleanup.yml playbook will kill the Ansible container.
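If you happen to have Ansible installed on the head node, a quick sanity check of the inventory and SSH settings is to ping every host (this step is optional; the deploy script itself runs Ansible inside a container):

~/experiment$ ansible -i site/hosts all -m ping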

ceph-ansible requires proper hostnames; if you are using something like CloudLab, make sure that your hostname is the short name, not the FQDN:

msevilla@node-1:~$ hostname
node-1.msevilla-qv20111.cephfs-pg0.wisc.cloudlab.us
msevilla@node-1:~$ sudo hostname node-1
msevilla@node-1:~$ hostname
node-1
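Note that setting the hostname this way does not survive a reboot; to make the change persistent you typically also update /etc/hostname (and the matching entry in /etc/hosts), for example:

msevilla@node-1:~$ echo "node-1" | sudo tee /etc/hostname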

Finally, with regard to hosts, make sure that you have passwordless SSH from the head node to all the other nodes.
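One way to set this up is to generate a key on the head node and copy it to every host in the inventory (the hostnames and the issdm user below are just the ones from our example hosts file):

~$ ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
~$ for host in issdm-0 issdm-1 issdm-3; do ssh-copy-id issdm@$host; done    # repeat for every node in site/hosts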

Specifying the Ceph Services (site directory)

The site directory has code for deploying Ceph and its components:

~/experiment$ ls -l site/
total 40
-rw-r--r-- 1 issdm issdm  201 Nov  3 16:08 ansible.cfg
-rw-rw-r-- 1 issdm issdm  477 Nov  3 12:57 cephfs.yml
-rw-rw-r-- 1 issdm issdm  155 Nov  4 23:39 ceph_monitor.yml
-rw-rw-r-- 1 issdm issdm  300 Nov  3 12:56 ceph_pgs.yml
-rw-rw-r-- 1 issdm issdm  226 Nov  3 12:57 ceph_wait.yml
-rw-r--r-- 1 issdm issdm  340 Nov  4 21:39 ceph.yml
-rw-r--r-- 1 issdm issdm 1760 Nov  3 15:42 cleanup.yml
drwxr-xr-x 4 issdm issdm 4096 Nov  4 23:26 group_vars
-rw-rw-r-- 1 issdm issdm  560 Nov  3 16:21 hosts
drwxr-xr-x 4 issdm issdm 4096 Nov  3 13:07 roles

The *.yml files are Ansible playbooks that start and configure components: ceph.yml starts Ceph, ceph_pgs.yml configures the placement groups, ceph_wait.yml waits until Ceph reaches a healthy state, ceph_monitor.yml starts daemons that monitor performance, and cephfs.yml sets up the file system layer. We separate these components into different playbooks so users can mix and match Ceph services. The other files are Ansible configuration files used by the playbooks.

Users can edit site/ceph.yml to specify which Ceph daemons to launch in the cluster. It uses the hosts file we set up above and is based on the ceph-ansible site file (sketched below).
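The exact contents depend on the ceph-ansible version pinned by the submodule, but the playbook is roughly of this shape, mapping each inventory group to the corresponding ceph-ansible role (paraphrased, not a literal dump of the file):

- hosts: mons
  become: true
  roles: [ceph-mon]

- hosts: osds
  become: true
  roles: [ceph-osd]

- hosts: mdss
  become: true
  roles: [ceph-mds]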

Configuring Ceph (site/group_vars directory)

To configure the Ceph cluster with any of the variables from the Ceph configuration file documentation, edit the Ansible site/group_vars/all file. We also need to specify the image we built in the Docker: Building Ceph Images post. These settings can go in each daemon's own group_vars file, but for simplicity we put everything in the global variables file. If you are using an internal registry, you would set up your configuration files like this:

~/experiment$ sed -i "s/<DOCKER USR>/piha.soe.ucsc.edu:5000\/ceph/g" site/group_vars/all
~/experiment$ sed -i "s/<DOCKER IMG>/daemon/g" site/group_vars/all
~/experiment$ sed -i "s/<DOCKER VER>/master/g" site/group_vars/all

If you just want to use the Docker images provided by the Ceph team, use:

~/experiment$ sed -i "s/<DOCKER USR>/ceph/g" site/group_vars/all
~/experiment$ sed -i "s/<DOCKER IMG>/daemon/g" site/group_vars/all
~/experiment$ sed -i "s/<DOCKER VER>/build-master-jewel-ubuntu-14.04/g" site/group_vars/all
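Before deploying, it is worth checking that the nodes can actually pull the image you just configured, for example:

~$ sudo docker pull ceph/daemon:build-master-jewel-ubuntu-14.04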

Next, configure the network:

diff --git a/site/group_vars/all b/site/group_vars/all
index 3fa5e92..cfa1432 100644
--- a/site/group_vars/all
+++ b/site/group_vars/all
@@ -3,9 +3,9 @@
 ################
 # ceph ansible #
 ################
-monitor_interface: eth1
-ceph_mon_docker_interface: eth1
-ceph_mon_docker_subnet: 192.168.140.0/24
+monitor_interface: eth2
+ceph_mon_docker_interface: eth2
+ceph_mon_docker_subnet: 10.10.1.2/24
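These values are cluster-specific; to find the interface that sits on your private network and its subnet, inspect the addresses and routes on one of the monitor nodes:

~$ ip -4 addr show
~$ ip route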

Finally, configure the disks:

diff --git a/site/group_vars/osds b/site/group_vars/osds
index bbbae99..1b6ca6c 100644
--- a/site/group_vars/osds
+++ b/site/group_vars/osds
@@ -7,4 +7,4 @@ osd_containerized_deployment: true
 ceph_osd_docker_extra_env: "CEPH_DAEMON=OSD_CEPH_DISK_ACTIVATE"
 ceph_osd_docker_prepare_env: "CEPH_DAEMON=OSD_CEPH_DISK_PREPARE,OSD_FORCE_ZAP=1"
 ceph_osd_docker_devices:
- - /dev/sde
+ - /dev/sdc
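The device list is also cluster-specific. Pick a disk that does not hold the OS or any data you care about, since OSD_FORCE_ZAP=1 means the prepare step will wipe it. lsblk on an OSD node shows what is available:

~$ lsblk
~$ sudo sgdisk --print /dev/sdc    # double-check the disk you are about to hand to Ceph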

Starting Ceph

The deploy script copies the configuration files to the Ceph Ansible directory and deploys Ceph on the cluster using our Docker image.

./deploy.sh
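Once deploy.sh finishes, you can check on the cluster from the monitor node. Since all of the daemons run in containers, the ceph CLI lives inside them; one way to query the cluster is:

~$ sudo docker ps                              # find the monitor container's name
~$ sudo docker exec <mon container> ceph -s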

We configured our cluster with 1 MDS, 9 OSDs, and 1 MON. We bumped the number of placement groups to 512, as per the Placement Group Preselection Guide. After Ceph finishes starting up, we should see:

cluster e9570dd8-03ad-45f0-8a74-ec9b3bb7095f
 health HEALTH_OK
 monmap e1: 1 mons at {issdm-3=192.168.140.224:6789/0}
        election epoch 3, quorum 0 issdm-3
  fsmap e5: 1/1/1 up {0=issdm-12=up:active}
    mgr no daemons active
 osdmap e24: 9 osds: 9 up, 9 in
        flags sortbitwise,require_jewel_osds,require_kraken_osds
  pgmap v43: 1088 pgs, 3 pools, 2148 bytes data, 20 objects
        91121 MB used, 1648 GB / 1737 GB avail
            1087 active+clean
               1 active+clean+replay
 recovery io 0 B/s, 2 keys/s, 0 objects/s

Great! It looks like Ceph is healthy and running.


