Cluster setup

Requirements

The following requirements must be met before a CipherMail cluster can be set up:

  1. At least three servers (nodes) should be set up, with the network configured and the initial setup wizard finished.
  2. Every node of the cluster is configured with a fully qualified hostname.
  3. Every node can look up the IP address of every other node.
  4. Every node can access every other node on TCP ports 22, 4444, 4567 and 4568.

To fulfill requirement 3, the best option is to add the hostnames to DNS. If this is not feasible, a hostname-to-IP-address mapping should be added to the hosts file on every node.
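
Whether requirements 3 and 4 are met can be checked from the shell of every node against every other node. The following is a minimal sketch, assuming nc (netcat) is available on the system; replace node2.example.com with the hostname being tested:

getent hosts node2.example.com        # requirement 3: hostname resolution
for port in 22 4444 4567 4568; do     # requirement 4: TCP port reachability
    nc -zv node2.example.com "$port"
done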

Hostname mapping

Important

Only use explicit host mapping if the hostnames cannot be added to DNS.

On every node, do the following:

  1. Open the hosts page (Admin ‣ Network ‣ Hosts)

  2. For every node, add the IP to Hostname mapping

    Example:
    node1.example.com 10.7.7.100
    node2.example.com 10.7.7.101
    node3.example.com 10.7.7.102

Configure cluster

To configure the cluster, use the following procedure:

  1. Configure SSH authentication.
  2. Configure which hosts should be managed by the control node.
  3. Configure which hosts are part of the cluster.
  4. Run the Ansible playbook.

Configure SSH authentication

The cluster will be configured with Ansible. The first node of the cluster will act as the control node, which configures all other nodes. The control node must be able to log in to the other nodes over SSH as the root user with password-less authentication. The SSH public key of the control node should therefore be added to the other nodes.

To configure key-based SSH authentication, use the following procedure:

  1. Log in to the control node (node 1) with SSH.
  2. Copy the SSH public key from the control node.
  3. Assign the SSH public key to the root user of node 2 and node 3.
  4. Test password-less login.

Log in to the control node with SSH

Log in to the control node (node 1) with SSH and open the command line (File ‣ Open shell).

Copy the SSH public key

On the command line, display the SSH public key of the root user:

sudo cat /root/.ssh/id_rsa.pub

The output from the command should look similar to:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDcBjNzsXe8nZ8Sy9j9vvBgD08FOnNB85sjdpm8Rj1LIF...

Copy the complete SSH key.
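
If the cat command above reports that the file does not exist, a key pair can be generated first. A minimal sketch, assuming an RSA key without a passphrase is wanted, matching the example output above:

sudo ssh-keygen -t rsa -b 3072 -f /root/.ssh/id_rsa -N ''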

Assign the SSH public key to the root user

The copied SSH public key should be assigned to the root user of node 2 and node 3.

  1. Log in to the Cockpit app on node 2. The Cockpit app can be accessed at https://node2.example.com:9090 (replace node2.example.com with the correct hostname).

  2. Open account settings for the root user (Accounts ‣ root)

  3. Click Add key and paste the SSH key copied in the previous step.

  4. Repeat the above steps for node 3.
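
If the Cockpit interface is not reachable, the same result can be achieved by appending the copied public key to the authorized_keys file of the root user directly on node 2 and node 3. A minimal sketch, assuming shell access to those nodes and an existing /root/.ssh directory; the key below is a placeholder for the full key copied earlier:

echo 'ssh-rsa AAAA...' | sudo tee -a /root/.ssh/authorized_keys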

Test password-less login

  1. Log in to the control node (node 1) with SSH.

  2. On the command line, try to log in as root on node 2:

    sudo ssh root@node2.example.com
    

    Check the fingerprint and select yes if asked to continue. Check that the login was successful.

    Note

    Logging in to node 2 should not require the root password of node 2. However, because the command runs with sudo, you might have to provide the password of the local user.

  3. Log out of node 2:

    exit
    
  4. Repeat the above steps for node 3.
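
As an optional shortcut, the logins to node 2 and node 3 can also be tested in one go from the control node. A minimal sketch; replace the hostnames with the real node names:

for host in node2.example.com node3.example.com; do
    sudo ssh root@"$host" hostname
done

Each iteration should print the hostname of the remote node without prompting for a remote password.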

Configure which hosts should be managed by the control node

The control node will be responsible for configuring all nodes. The hostnames of all the nodes should therefore be added to the Ansible hosts inventory file.

sudo vim /etc/ciphermail/ansible/hosts

The hostnames of node 2 and node 3 should be added under ciphermail_all and ciphermail_gateway (if configuring Webmail Messenger, use ciphermail_webmail).

Example:
localhost ansible_connection=local

[ciphermail_all]
localhost
node2.example.com
node3.example.com

[ciphermail_gateway]
localhost
node2.example.com
node3.example.com

Note

Because the control node is accessible via localhost, the fully qualified hostname of the control node should not be added.
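
Before running the playbook, it can be useful to verify that the control node can reach every host in the inventory over SSH. A minimal check, assuming the same ANSIBLE_CONFIG path that is used in the Troubleshooting section below:

export ANSIBLE_CONFIG="/usr/share/ciphermail-ansible/ansible.cfg"
sudo -E ansible -m ping ciphermail_all

Every node should respond with "pong".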

Configure which hosts are part of the cluster

The list of hostnames of all the cluster nodes should be added as an Ansible variable override.

echo 'common__mysql_cluster_nodes: [node1.example.com,node2.example.com,node3.example.com]' | sudo tee /etc/ciphermail/ansible/group_vars/all/cluster.yml

The file /etc/ciphermail/ansible/group_vars/all/cluster.yml should look similar to:

common__mysql_cluster_nodes: [node1.example.com,node2.example.com,node3.example.com]

Important

The control node must be listed as the first entry in the list.
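
The value is a regular YAML list, so the override can also be written in block style. The following is equivalent to the flow-style example above, assuming no other overrides are present in the file:

common__mysql_cluster_nodes:
  - node1.example.com
  - node2.example.com
  - node3.example.com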

Run the Ansible playbook

The cluster will be configured by Ansible when the playbook is run:

sudo cm-run-playbook

The Ansible playbook will configure the local firewall, generate private keys for MariaDB, configure MariaDB, etc. If successful, the play recap should look similar to:

PLAY RECAP *******************************************************************************************************
localhost                  : ok=99   changed=21   unreachable=0    failed=0    skipped=8    rescued=0    ignored=1
node2.example.com          : ok=98   changed=20   unreachable=0    failed=0    skipped=8    rescued=0    ignored=1
node3.example.com          : ok=98   changed=20   unreachable=0    failed=0    skipped=8    rescued=0    ignored=1

To check whether all the nodes of the cluster are active, use the following command:

sudo cm-cluster-control --show

The cluster size (wsrep_cluster_size) should report that 3 nodes are active:

+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 8                                    |
| wsrep_cluster_size       | 3                                    |
| wsrep_cluster_state_uuid | 823e389b-eb11-11eb-9b32-d3c924e58f21 |
| wsrep_cluster_status     | Primary                              |
| wsrep_connected          | ON                                   |
| wsrep_gcomm_uuid         | 95e32cea-eb11-11eb-abd2-2bef173638db |
| wsrep_last_committed     | 0                                    |
| wsrep_local_state_uuid   | 823e389b-eb11-11eb-9b32-d3c924e58f21 |
| wsrep_ready              | ON                                   |
+--------------------------+--------------------------------------+
+-----------------------+---------------------------------------------------------------+
| Variable_name         | Value                                                         |
+-----------------------+---------------------------------------------------------------+
| wsrep_cluster_address | gcomm://node1.example.com,node2.example.com,node3.example.com |
| wsrep_cluster_name    | ciphermail                                                    |
| wsrep_node_address    | node1.example.com                                             |
| wsrep_node_name       | node1.example.com                                             |
+-----------------------+---------------------------------------------------------------+
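
The same wsrep status variables can also be queried directly from MariaDB on any individual node, for example to check that node's view of the cluster. A minimal sketch, assuming the root user can connect over the local socket:

sudo mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'"
sudo mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'"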

Troubleshooting

If the playbook runs into an issue on one of the nodes, the play recap will report a failure. In that case, it is advised to reset the cluster configuration, go over all the required steps again, and then re-run the playbook.

To reset the complete cluster config, run the following commands:

export ANSIBLE_CONFIG="/usr/share/ciphermail-ansible/ansible.cfg"
sudo -E ansible -m command -a 'rm /etc/my.cnf.d/ciphermail-cluster.cnf /var/lib/mysql/grastate.dat /etc/pki/tls/private/ciphermail.key /etc/pki/tls/private/ciphermail.pem /etc/pki/tls/certs/ciphermail.crt /etc/pki/tls/certs/ciphermail-ca.crt' ciphermail_all

Warning

Only run the above command when setting up the cluster. Do not run this on an already configured and functional cluster.

Then redo all the steps to set up the cluster and re-run the playbook.
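
If the cluster still fails to form after re-running the playbook, the MariaDB log on each node usually shows the reason. A minimal sketch for inspecting it, assuming the default mariadb service name:

sudo journalctl -u mariadb --since "1 hour ago"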