Cluster node sync-on-boot technology

From Rost Lab Open

Description

Sync-on-boot technology allows you to use standard root file systems to:

  • initialize computer cluster nodes
  • synchronize cluster nodes after an update to the standard

Its purpose is to reduce cluster node maintenance by:

  • keeping a central standard of the root file system of a node
  • automatically synchronizing the node to the standard upon node reboot

Master

Depends

  • bind9
  • dhcp3-server
  • initramfs-tools
  • make
  • nfs-kernel-server
  • syslinux
  • tftpd-hpa
  • util-linux

Recommends

  • shorewall
  • tftp-hpa

Node

Depends

  • busybox
  • e2fsprogs
  • rsync
  • sed
  • util-linux

Recommends

  • syslinux

Configuration

I recommend you perform the following steps in the order they are presented here.

DNS

You are going to need name services for the cluster nodes. Configure forward and reverse name resolution.
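
As a quick sanity check, the reverse (PTR) record name that must exist for a node address can be derived from the address itself. This is a minimal sketch; the function name reverse_name is made up for this illustration.

```shell
# Print the PTR record name for a dotted-quad IPv4 address.
# reverse_name is a hypothetical helper, not part of any package.
reverse_name() {
    echo "$1" | awk -F. '{ printf "%s.%s.%s.%s.in-addr.arpa\n", $4, $3, $2, $1 }'
}
```

For example, reverse_name 192.168.0.253 prints the name whose PTR record should point back to the node's host name. Once the zones are loaded, querying both the name and the address (for instance with the host command) should give matching answers.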

DHCP

You are going to need DHCP services as well. Have a look at this example DHCP server configuration file.
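
If the example file is not at hand, a per-node host declaration can be generated roughly as follows. This is only a sketch: the helper name dhcp_host_stanza and any host name and MAC address you pass in are placeholders, and the filename value assumes the cluster_node directory used in the SYSLINUX/PXELINUX step of this page. Your setup may need further statements (for example next-server) that are not shown.

```shell
# Print a dhcpd.conf host declaration.
# $1 = host name, $2 = MAC address, $3 = fixed IPv4 address
# dhcp_host_stanza is a hypothetical helper for illustration only.
dhcp_host_stanza() {
    cat <<EOF
host $1 {
    hardware ethernet $2;
    fixed-address $3;
    # Path is relative to the TFTP root (assumed: /var/lib/tftpboot)
    filename "cluster_node/pxelinux.0";
}
EOF
}
```

Calling dhcp_host_stanza updatenode 00:11:22:33:44:55 192.168.0.253 prints a declaration that could be pasted into dhcpd.conf; the MAC address here is invented.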

TFTP

During boot, the TFTP daemon (tftpd) serves the nodes' requests for the kernel and initramdisk.

Verify that your TFTP server is ready to answer calls. Usually you should have a line like this in /etc/inetd.conf:

tftp           dgram   udp     wait    root  /usr/sbin/in.tftpd /usr/sbin/in.tftpd -s /var/lib/tftpboot

/var/lib/tftpboot is the root of the file system visible to nodes during booting. The value of the dhcpd.conf filename statement is relative to this root.

  • Customize the -s option above (/var/lib/tftpboot) as necessary
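
To confirm which directory the TFTP server actually uses as its root, the -s argument can be read back from the inetd configuration. A minimal sketch, assuming the inetd.conf layout shown above; the function name tftp_root_from_inetd is invented for this example.

```shell
# Print the directory given to in.tftpd via -s in an inetd.conf file.
# Commented-out entries are skipped because their first field is not "tftp".
# $1 = path to inetd.conf (defaults to /etc/inetd.conf)
tftp_root_from_inetd() {
    conf=${1:-/etc/inetd.conf}
    awk '$1 == "tftp" {
             for (i = 1; i < NF; i++)
                 if ($i == "-s") { print $(i + 1); exit }
         }' "$conf"
}
```

With the line shown above in place, tftp_root_from_inetd prints /var/lib/tftpboot. If it prints nothing, the tftp entry is missing, commented out, or has no -s option.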

SYSLINUX/PXELINUX

PXELINUX is the network boot loader (docs: /usr/share/doc/syslinux/pxelinux.txt.gz). If you follow our DHCP example:

  1. Copy /usr/lib/syslinux/pxelinux.0 to /var/lib/tftpboot/cluster_node/pxelinux.0
  2. mkdir /var/lib/tftpboot/cluster_node/pxelinux.cfg
  3. Have at least these 2 files in pxelinux.cfg:
    1. pxelinux.cfg/default
    2. pxelinux.cfg/C0A800FD - configuration for the update node
  4. Have pxelinux-common in /var/lib/tftpboot/cluster_node. Customize this file to match your kernel.
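
Steps 1 and 2 can be scripted as in the sketch below. The function name setup_pxelinux is made up; the default paths are the ones used on this page. The configuration files from steps 3 and 4 still have to be put in place by hand.

```shell
# Copy the boot loader and create the configuration directory.
# $1 = path to pxelinux.0 (default: /usr/lib/syslinux/pxelinux.0)
# $2 = target directory   (default: /var/lib/tftpboot/cluster_node)
# setup_pxelinux is a hypothetical helper for illustration only.
setup_pxelinux() {
    src=${1:-/usr/lib/syslinux/pxelinux.0}
    dir=${2:-/var/lib/tftpboot/cluster_node}
    mkdir -p "$dir/pxelinux.cfg" &&
        cp "$src" "$dir/pxelinux.0"
}
```

Run setup_pxelinux without arguments as root on the master to use the default paths.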

Standard node root file system

Prepare the standard root file system. We will assume that it is exported - as in our case - from /srv/nfs4/clusternoderoot.

/etc/clustcontrol

This directory (on the master) holds node configuration parameters used during initialization and synchronization. Different standard node roots share this control directory.

  • Have at least these files in the directory:

NFS exports

  1. Bind mount /etc/clustcontrol to /srv/nfs4/clusternoderoot/etc/clustcontrol
    • This is because we want the standard roots to share just one /etc/clustcontrol
  2. NFS export to the nodes:
    1. /srv/nfs4/clusternoderoot
      • Use the appropriate 'exports' flag (crossmnt) to 'unhide' the bind-mounted /srv/nfs4/clusternoderoot/etc/clustcontrol
    2. /srv/nfs4/clusternoderoot/etc/clustcontrol
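
The two export entries could look like the output of the sketch below. Everything except crossmnt is an assumption: the client network 192.168.0.0/24, the rw/ro choice and the remaining options are placeholders to adapt, and clust_exports is a name invented here. The bind mount itself would be done with mount --bind /etc/clustcontrol /srv/nfs4/clusternoderoot/etc/clustcontrol.

```shell
# Print /etc/exports lines for the standard node root and the shared
# control directory.
# $1 = standard node root (default: /srv/nfs4/clusternoderoot)
# $2 = client spec        (default: 192.168.0.0/24, an assumed network)
# Options other than crossmnt are assumptions; adjust them to your site.
clust_exports() {
    root=${1:-/srv/nfs4/clusternoderoot}
    clients=${2:-192.168.0.0/24}
    # crossmnt makes the bind-mounted etc/clustcontrol visible to clients
    printf '%s %s(rw,no_root_squash,no_subtree_check,crossmnt)\n' "$root" "$clients"
    printf '%s %s(ro,no_root_squash,no_subtree_check)\n' "$root/etc/clustcontrol" "$clients"
}
```

After reviewing the output, append it to /etc/exports and reload the export table with exportfs -ra.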

Initramdisk

Work in your standard node root file system for this step. You can do this either by working on the host where the standard is prepared or by doing a chroot on the server that stores the file system.

  1. Add these files to /srv/nfs4/clusternoderoot/etc/initramfs-tools:
  2. Make the above files except conf.d/clustcontrol executable
  3. Have initramfs.conf in /srv/nfs4/clusternoderoot/etc/initramfs-tools look like the example under the link
  4. Create the initramdisk (e.g. update-initramfs -k all -u)

Kernel and initramdisk for TFTP

In this step we copy the kernel and the new initramdisk to the appropriate location within the TFTPD root directory /var/lib/tftpboot.

Either:

  • Manually copy them over to /var/lib/tftpboot/cluster_node/ (you will tire of this by the third time)

or

  • Place this Makefile into /var/lib/tftpboot/cluster_node, edit it as necessary and simply run 'make' from then on.
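
If the Makefile is not available, the copy step can be approximated in shell. This sketch is not the Makefile referred to above; it assumes the usual Debian file names vmlinuz-* and initrd.img-* under /boot of the standard node root, and sync_boot_files is an invented name.

```shell
# Copy kernels and initramdisks that are new or changed to the TFTP area.
# $1 = boot directory of the standard node root
#      (default: /srv/nfs4/clusternoderoot/boot)
# $2 = destination (default: /var/lib/tftpboot/cluster_node)
# sync_boot_files is a hypothetical helper for illustration only.
sync_boot_files() {
    boot=${1:-/srv/nfs4/clusternoderoot/boot}
    dest=${2:-/var/lib/tftpboot/cluster_node}
    for f in "$boot"/vmlinuz-* "$boot"/initrd.img-*; do
        [ -e "$f" ] || continue          # glob matched nothing
        # Skip files whose copy in $dest is already identical
        cmp -s "$f" "$dest/${f##*/}" || cp -p "$f" "$dest/"
    done
}
```

Run it after every update-initramfs in the standard node root. Make sure the file names referenced in pxelinux-common match what ends up in the destination directory.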

Network interfaces

  • It is best to disable the udev 75-persistent-net-generator.rules on the nodes. Prepend
    GOTO="persistent_net_generator_end"
    to /etc/udev/rules.d/75-persistent-net-generator.rules. If you do not have this file, first copy /lib/udev/rules.d/75-persistent-net-generator.rules to that path.
  • Look at my /etc/network/interfaces. The interface that is configured in the initramdisk needs the 'manual' method; otherwise services such as NFS do not work properly after a node reboot. This works both for the update node and for nodes in regular mode. Requesting 'dhcp' configuration would break the update node.
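
The udev change from the first item can be applied with a small helper like the one sketched here. The function name disable_net_generator is invented; the paths default to the ones named above and would normally be prefixed with the standard node root when you work on the master.

```shell
# Prepend the GOTO line that disables the persistent-net generator.
# $1 = rule file to edit (default: /etc/udev/rules.d/...)
# $2 = stock rule file to copy from if $1 does not exist
# disable_net_generator is a hypothetical helper for illustration only.
disable_net_generator() {
    rule=${1:-/etc/udev/rules.d/75-persistent-net-generator.rules}
    stock=${2:-/lib/udev/rules.d/75-persistent-net-generator.rules}
    skip='GOTO="persistent_net_generator_end"'
    [ -e "$rule" ] || cp "$stock" "$rule" || return 1
    # Nothing to do if the line is already the first one
    [ "$(head -n 1 "$rule")" = "$skip" ] && return 0
    tmp=$(mktemp) || return 1
    { printf '%s\n' "$skip"; cat "$rule"; } > "$tmp" && cat "$tmp" > "$rule"
    rm -f "$tmp"
}
```

The helper is safe to run more than once: it adds the line only when it is not already at the top of the file.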

/etc/fstab

  • Comment out or remove the line for the root file system in /etc/fstab on the standard node root. This line is invalid when you use nfsroot, and it would confuse fsck in regular operation. The root gets mounted in the initrd, which does not depend on /etc/fstab at all.
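
Commenting out the root entry can be automated as sketched below. The function name comment_root_fstab is made up, and the default path assumes the standard node root location used on this page.

```shell
# Comment out the entry whose mount point (second field) is "/".
# $1 = fstab to edit (default: /srv/nfs4/clusternoderoot/etc/fstab)
# comment_root_fstab is a hypothetical helper for illustration only.
comment_root_fstab() {
    fstab=${1:-/srv/nfs4/clusternoderoot/etc/fstab}
    tmp=$(mktemp) || return 1
    awk '$1 !~ /^#/ && $2 == "/" { print "#" $0; next } { print }' \
        "$fstab" > "$tmp" && cat "$tmp" > "$fstab"
    rm -f "$tmp"
}
```

Lines that are already commented, and entries for other mount points such as /home, are left unchanged.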

BIOS

  • Configure network booting on the nodes.

Boot

GRUB

You do not need GRUB on the nodes. Remove it. During a kernel update, dpkg may complain that update-grub is missing. Use this as /usr/sbin/update-grub:

#!/bin/sh
# lkajan: we do not want to use grub on the nodes - we use pxelinux and syslinux.
exit 0;

You are done. Boot the nodes.

Update node

The purpose of this node is to make it easy to update the standard node root. Upon booting it has an NFS root that mounts the standard node root. Any change on this node directly affects the standard node root and gets rsync'd to nodes upon reboot.

You can control the existence and identity of the update node in your dhcpd.conf file. Whichever node gets the IP address 192.168.0.253 (or C0A800FD in hexadecimal; updatenode.rostclust resolves to this address) boots as the update node. That is because PXE loads the configuration file pxelinux.cfg/C0A800FD for a client with the above IP address and that configuration file in turn directs PXE to boot 'linuxnfsroot' by default. 'linuxnfsroot' in pxelinux-common is parametrized (boot=nfs) to have NFS root.
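
The per-client configuration file name is simply the client's IPv4 address written as eight uppercase hexadecimal digits. If you choose a different address for the update node, the name can be computed as in this sketch; ip_to_hex is a name made up for the example.

```shell
# Print a dotted-quad IPv4 address as 8 uppercase hex digits,
# the form PXELINUX uses for per-client configuration file names.
# ip_to_hex is a hypothetical helper for illustration only.
ip_to_hex() {
    oldifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$oldifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}
```

For 192.168.0.253 this gives C0A800FD, the file name used above.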

Extras

Booting a node from its own hard drive

You can make the nodes' own hard drives bootable like this:

  1. Add this to the /etc/rc.local of the nodes.

Advanced

This system allows you to have multiple standard node roots for different kinds of nodes.