Highly available cluster with Wackamole and Spread

Wackamole is an application that helps make a cluster highly available. It was developed at the Center for Networking and Distributed Systems at The Johns Hopkins University, and it works by managing a pool of virtual IP addresses, ensuring that all the IPs are evenly assigned to the available machines in the cluster. When one machine fails, Wackamole will, almost instantaneously, re-assign its virtual IPs to the remaining machines. This allows one to publish a list of IP addresses through DNS RR (Round Robin) records, and Wackamole will make sure those IPs are always available, even if one or more machines crash. Wackamole is licensed under the CNDS Open Source License.

Wackamole runs on top of Spread, which is a resilient and fault-tolerant network messaging system:

Spread is an open source toolkit that provides a high performance messaging service that is resilient to faults across local and wide area networks. Spread functions as a unified message bus for distributed applications, and provides highly tuned application-level multicast, group communication, and point to point support. Spread services range from reliable messaging to fully ordered messages with delivery guarantees. http://www.spread.org/

This sounds cool, right? Well, it gets even better, as we will see how easy it is to configure a cluster to run Wackamole. For this example, I’ve used just a couple of machines running Ubuntu (8.04) because that makes Wackamole fairly easy to deploy, but if you are familiar with ./configure, make and make install, this should be a piece of cake!

1. Install Wackamole

The first step is to get Wackamole installed on all the machines in the cluster. If you like apt-get, you will love this step:
$ sudo apt-get install wackamole
This will install Spread and Wackamole on your machine. Alternatively you can download the source code from http://www.cnds.jhu.edu/download/download_wackamole.cgi

Obviously, we need to repeat this step on all the machines in the cluster :)

2. Configure Spread

Now we need to tell Spread which machines are available inside the cluster. For this example, let’s use a simple two-machine cluster (www1 and www2) available under the IP addresses 192.168.1.10 and 192.168.1.11.

$ sudo vi /etc/default/spread
# Change to enable spread
ENABLED=1
# Options, see spread.1 for list
OPTIONS="-n www1" # www1 or www2 depending on the machine


$ sudo vi /etc/spread/spread.conf

Spread_Segment 192.168.1.255:4803 {
www1 192.168.1.10
www2 192.168.1.11
}
DebugFlags = { PRINT EXIT }
EventLogFile = /var/log/spread.log
EventTimeStamp
DangerousMonitor = false
DaemonUser = spread
DaemonGroup = spread
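
The address on the Spread_Segment line appears to be the broadcast address of the network the two machines share (assumed here to be 192.168.1.0/24). If your machines live on a different subnet, a small helper along these lines can work the address out. This is only a sketch: broadcast_address is a made-up name, not part of Spread, and it assumes a shell with 64-bit arithmetic.

```shell
# Hypothetical helper (not part of Spread): print the broadcast address
# for an IPv4 address and a prefix length, e.g. 192.168.1.10 and 24.
broadcast_address() {
    ip=$1
    prefix=$2

    # Split the dotted address into its four octets ($1..$4).
    old_ifs=$IFS
    IFS=.
    set -- $ip
    IFS=$old_ifs

    # Pack the octets into one integer, then set every host bit to 1.
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    hostmask=$(( (1 << (32 - prefix)) - 1 ))
    bcast=$(( addr | hostmask ))

    # Unpack the integer back into dotted notation.
    echo "$(( (bcast >> 24) & 255 )).$(( (bcast >> 16) & 255 )).$(( (bcast >> 8) & 255 )).$(( bcast & 255 ))"
}
```

For the example network, broadcast_address 192.168.1.10 24 should give the same address that is used in the Spread_Segment line.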

Once we’ve changed the configuration on one machine, we need to copy it over to the other machines and restart spread:
$ sudo /etc/init.d/spread restart
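
If you have more than a couple of machines, copying the file by hand gets old quickly. Here is a rough sketch of how the copy-and-restart step could be scripted over SSH. The host list, the /tmp staging path and the RUN switch are all assumptions of mine, and it presumes you have SSH access and sudo rights on every machine. By default it only prints the commands it would run.

```shell
# Hypothetical list of the *other* machines in the cluster.
OTHER_HOSTS="www2"
CONF=/etc/spread/spread.conf

# With RUN unset, the commands are only printed (dry run).
# Set RUN to the empty string (RUN= push_spread_conf) to really execute them.
push_spread_conf() {
    for host in $OTHER_HOSTS; do
        # Stage the file in /tmp first, since /etc/spread is owned by root.
        ${RUN-echo} scp "$CONF" "$host:/tmp/spread.conf"
        # -t gives sudo a terminal in case it needs to ask for a password.
        ${RUN-echo} ssh -t "$host" "sudo cp /tmp/spread.conf $CONF && sudo /etc/init.d/spread restart"
    done
}
```

Run push_spread_conf once to check the printed commands, then RUN= push_spread_conf to execute them. Note that /etc/default/spread still has to be edited on each machine, because the -n option is different on every host.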

3. Test Spread configuration

At this point, we should be able to confirm that Spread is up and running and that both machines are able to communicate. Spread comes installed with a command line utility (spuser) that can be used to test it. When starting it (on any of the machines in the cluster) you should get the following screen:

$ spuser

==========
User Menu:
----------

j <group> -- join a group
l <group> -- leave a group

s <group> -- send a message
b <group> -- send a burst of messages

r -- receive a message (stuck)
p -- poll for a message
e -- enable asynchonous read (default)
d -- disable asynchronous read

q -- quit

User>

Now try starting spuser on both machines in the cluster and type the following commands at the spuser prompt:

User>j test
User>s test
enter message: Hello World!

After joining the test group on both machines, we should now be able to send messages to the group and see them showing up instantaneously on all Spread clients. Pretty awesome :)

4. Configure Wackamole

The next step is to configure Wackamole on all machines. As we did with Spread, we start by editing /etc/default/wackamole to make sure that Wackamole is enabled and that it will run when the machine starts:

$ sudo vi /etc/default/wackamole
# Change to enable wackamole
ENABLED=1
# Options
OPTIONS=""

The next configuration file (/etc/wackamole.conf) is, luckily, very straightforward, and a few changes to the default configuration are enough to get us on the right track.

Spread - Port where Spread is listening
SpreadRetryInterval - Amount of time between attempts to connect to Spread
Group - Spread group to join
Control - Location of Wackamole control socket
VirtualInterfaces - List of Virtual IP addresses that we want to make available
Arp-Cache - Collect and broadcast the IPs in our ARP table every N seconds
Notify - Define which machines to send ARP-spoofs when an IP is acquired
balance - Describes how to balance the IP addresses across the available machines
mature - Amount of time for an instance to join a group and be ready to assume virtual IPs

$ sudo vi /etc/wackamole.conf
Spread = 4803
SpreadRetryInterval = 5s
Group = www
Control = /var/run/wackamole/wackamole.it
VirtualInterfaces {
{ eth4:192.168.1.20/24 }
{ eth4:192.168.1.21/24 }
}
Arp-Cache = 90s
Notify {
eth4:192.168.1.1/32
arp-cache
}
balance {
AcquisitionsPerRound = all
interval = 4s
}
mature = 5s

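Wondering what the /24 at the end of the IP addresses is? Wackamole uses CIDR notation for IP addresses: /24 means that the first 24 bits are the network part, which is the same as a netmask of 255.255.255.0. Please look at http://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing for more details.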
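
If you want to double check which netmask a given prefix length stands for, the conversion is easy to do in the shell. This is just a small sketch; prefix_to_netmask is a name I made up, not a Wackamole tool.

```shell
# Hypothetical helper: convert a CIDR prefix length (0-32) into a
# dotted-decimal netmask, e.g. 24 -> 255.255.255.0.
prefix_to_netmask() {
    prefix=$1
    mask=""
    for i in 1 2 3 4; do
        if [ "$prefix" -ge 8 ]; then
            # All eight bits of this octet belong to the network part.
            octet=255
            prefix=$((prefix - 8))
        else
            # Only the top $prefix bits of this octet are set.
            octet=$((256 - (1 << (8 - prefix))))
            prefix=0
        fi
        # Join the octets with dots (no leading dot on the first one).
        mask="${mask}${mask:+.}${octet}"
    done
    echo "$mask"
}
```

Calling prefix_to_netmask 24 prints 255.255.255.0, which matches the Mask value that ifconfig reports for the virtual interfaces.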

5. Testing the configuration

After editing the configuration files, just restart Wackamole and check ifconfig. After a few seconds you should see the virtual IP addresses assigned.

$ sudo /etc/init.d/wackamole restart
Restarting Wackamole Virtual IP Daemon: wackamole.
$ ifconfig
...
eth4:1 Link encap:Ethernet HWaddr 00:15:60:c4:bc:d4
inet addr:192.168.1.20 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:16

eth4:2 Link encap:Ethernet HWaddr 00:15:60:c4:bc:d4
inet addr:192.168.1.21 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:16

We can also see which IP addresses are assigned to which machines by running the wackatrl command:

$ sudo wackatrl -l
Owner: 192.168.1.10
* eth4:192.168.1.20/32
Owner: 192.168.1.10
* eth4:192.168.1.21/32

Since we’ve only started Wackamole on one of the machines in the cluster, that machine gets both IP addresses assigned to it. However, if we start Wackamole on the second machine, we should see the assignments change almost instantaneously to this:
$ sudo wackatrl -l
Owner: 192.168.1.10
* eth4:192.168.1.20/32
Owner: 192.168.1.11
* eth4:192.168.1.21/32
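
With more virtual IPs the listing gets long, so it can be handy to boil it down to one line per address. The sketch below assumes the output always has the Owner / * layout shown above; summarize_owners is a hypothetical helper, not something that ships with Wackamole.

```shell
# Hypothetical helper: read "wackatrl -l" style output on stdin and
# print one "<virtual ip> -> <owner>" line per virtual IP.
summarize_owners() {
    awk '
        # Remember the most recent owner line.
        $1 == "Owner:" { owner = $2; next }
        # "* eth4:192.168.1.20/32": strip the interface name and the prefix.
        $1 == "*"      { split($2, parts, "[:/]"); print parts[2] " -> " owner }
    '
}
```

Usage would be something like: sudo wackatrl -l | summarize_owners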

6. All done

For the final test, we can run ping 192.168.1.20 and ping 192.168.1.21 in two separate shells and restart one of the machines.
We should be able to see one of the IPs dropping and coming back up within a couple of seconds. Pretty impressive!
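
If you prefer a single window over two pings, a little loop can watch both virtual IPs and timestamp every check, which makes it easier to see how long the failover takes. This is a sketch under a few assumptions: the ping flags are the Linux (iputils) ones, and VIPS, PING_CMD, check_vip and watch_vips are names I made up for the example.

```shell
# Virtual IPs from the example configuration in this post.
VIPS="192.168.1.20 192.168.1.21"

# Print a timestamped "up" or "down" line for one address.
# PING_CMD is a hypothetical override hook; by default a single ping
# with a one second timeout is sent (Linux iputils flags).
check_vip() {
    if ${PING_CMD:-ping -c 1 -W 1} "$1" >/dev/null 2>&1; then
        echo "$(date +%T) $1 up"
    else
        echo "$(date +%T) $1 down"
    fi
}

# Check every virtual IP roughly once per second until interrupted (Ctrl-C).
watch_vips() {
    while :; do
        for vip in $VIPS; do
            check_vip "$vip"
        done
        sleep 1
    done
}
```

Start watch_vips on a machine outside the cluster, restart one of the cluster machines, and look for the "down" lines in between the "up" ones.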
