This article is a guide to setting up an Apache Cassandra cluster. The cluster runs on local CentOS virtual machines using VirtualBox. I use it as a local environment for development and testing.
Prerequisites
This guide assumes you are using the following software versions.
- MacOS 10.11.3
- Vagrant 1.8.5
- Java 1.8.0
- Apache Cassandra 3.9
Here are the steps I used:
-
First, create a workspace.
mkdir -p ~/vagrant_boxes/cassandra
cd ~/vagrant_boxes/cassandra
-
Next, create a new vagrant box. I’m using a minimal CentOS vagrant box.
vagrant box add "CentOS 6.5 x86_64" https://github.com/2creatives/vagrant-centos/releases/download/v6.5.3/centos65-x86_64-20140116.box
-
I need a basic VM to install the packages. This command creates one.
vagrant init -m "CentOS 6.5 x86_64" cassandra_base
-
Next, change the Vagrantfile to the following:
Vagrant.configure(2) do |config|
  config.vm.box = "CentOS 6.5 x86_64"
  config.vm.box_url = "cassandra_base"
  config.ssh.insert_key = false
end
-
Now, install Cassandra and its dependencies.
vagrant up
vagrant ssh
sudo yum install java-1.8.0-openjdk-devel
sudo yum install wget
wget -P ~ http://www-us.apache.org/dist/cassandra/3.9/apache-cassandra-3.9-bin.tar.gz
gunzip -c *gz | tar xvf -
-
Open up your ~/.bash_profile and append the following lines.
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk.x86_64
export PATH=$PATH:$JAVA_HOME/bin
export CASSANDRA_HOME=~/apache-cassandra-3.9
export PATH=$PATH:$CASSANDRA_HOME/bin
export CASSANDRA_CONF_DIR=$CASSANDRA_HOME/conf
-
Source the profile.
source ~/.bash_profile
-
Create a ~/.ssh/config file to avoid host key checking for SSH. Since these are development servers, this is OK. Note that StrictHostKeyChecking goes on its own line under Host *; I indent it with a tab.
Host *
	StrictHostKeyChecking no
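Writing the file with printf avoids editor settings that turn the tab into spaces or run the two directives together. This is a minimal sketch: `write_ssh_config` is a hypothetical helper name of my own, and it takes the target directory as an argument so it can be tried on a scratch directory before pointing it at ~/.ssh. Note that it overwrites any existing config file in that directory.

```shell
# write_ssh_config DIR: create DIR/config with host key checking disabled
# for every host. write_ssh_config is a hypothetical helper, not an ssh tool.
write_ssh_config() {
  mkdir -p "$1"
  # \n separates the two directives; \t is the tab before StrictHostKeyChecking.
  printf 'Host *\n\tStrictHostKeyChecking no\n' > "$1/config"
  chmod 600 "$1/config"
}

# Try it on a scratch directory first; on the VM, pass "$HOME/.ssh" instead.
scratch="$(mktemp -d)"
write_ssh_config "$scratch"
cat "$scratch/config"
```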
-
Now run these commands to finish the password-less authentication.
chmod 600 ~/.ssh/config
ssh-keygen -f ~/.ssh/id_rsa -t rsa -P ""
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
-
In /etc/hosts, add the following lines.
192.168.50.41 cassandra1.example.com
192.168.50.42 cassandra2.example.com
192.168.50.43 cassandra3.example.com
192.168.50.44 cassandra4.example.com
192.168.50.45 cassandra5.example.com
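Since the entries follow one pattern, they can be generated instead of typed. The sketch below assumes the same 192.168.50.4N addressing as above, so it only makes sense for up to nine nodes; `cassandra_hosts` is a hypothetical helper name, not part of any tool.

```shell
# cassandra_hosts [COUNT]: print one /etc/hosts entry per node,
# following the 192.168.50.4N / cassandraN.example.com pattern.
# Hypothetical helper; COUNT defaults to 5 and should stay at 9 or below.
cassandra_hosts() {
  n=1
  while [ "$n" -le "${1:-5}" ]; do
    printf '192.168.50.4%d cassandra%d.example.com\n' "$n" "$n"
    n=$((n + 1))
  done
}

# Preview the entries. To append them for real on the VM:
#   cassandra_hosts 5 | sudo tee -a /etc/hosts
cassandra_hosts 5
```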
-
In ~/apache-cassandra-3.9/conf/cassandra.yaml, comment out the line below. This tells Cassandra to bind to the node's IP address rather than localhost, which allows our host machine to connect to it.
rpc_address: localhost
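If you prefer not to open an editor, the same change can be made with sed. This is a sketch under the assumption that the active line starts at column one with `rpc_address:`; `comment_out_rpc_address` is a hypothetical helper name. It writes through a temporary file rather than relying on `sed -i`, and the demo runs against a throwaway file rather than the real cassandra.yaml.

```shell
# comment_out_rpc_address FILE: prefix the active rpc_address line with "# ".
# Hypothetical helper; lines that are already commented are left alone.
comment_out_rpc_address() {
  sed 's/^rpc_address:/# rpc_address:/' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Demo on a throwaway file. On the VM, pass
# "$HOME/apache-cassandra-3.9/conf/cassandra.yaml" instead.
demo_yaml="$(mktemp)"
printf 'rpc_address: localhost\nrpc_port: 9160\n' > "$demo_yaml"
comment_out_rpc_address "$demo_yaml"
grep 'rpc_address' "$demo_yaml"
```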
-
It’s a good idea to install Python 2.7 here, since the CQL shell (cqlsh) command requires that version. The image of CentOS we are using has Python 2.6. Follow the instructions here to update Python.
-
Exit the SSH session and copy the VM for the other Cassandra nodes.
exit
vagrant halt
vagrant package
vagrant box add cassandra ~/vagrant_boxes/cassandra/package.box
-
Edit the Vagrantfile to look like the following. This will create 5 Cassandra nodes.
Vagrant.configure("2") do |config|
  (1..5).each do |i|
    config.vm.define "cassandra#{i}" do |node|
      node.vm.box = "cassandra"
      node.vm.box_url = "cassandra#{i}.example.com"
      node.vm.hostname = "cassandra#{i}.example.com"
      node.vm.network :private_network, ip: "192.168.50.4#{i}"
      node.ssh.insert_key = false
      # Set cluster_name, the seed node, and listen_address in the conf file.
      node.vm.provision "shell", inline: "sed -i 's/^cluster_name: .*/cluster_name: \"My Cluster\"/' ~/apache-cassandra-3.9/conf/cassandra.yaml", privileged: false
      node.vm.provision "shell", inline: "sed -i 's/- seeds: .*/- seeds: \"192.168.50.41\"/' ~/apache-cassandra-3.9/conf/cassandra.yaml", privileged: false
      node.vm.provision "shell", inline: "sed -i 's/listen_address: .*/listen_address: \"192.168.50.4#{i}\"/' ~/apache-cassandra-3.9/conf/cassandra.yaml", privileged: false
    end
  end
end
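To see what the three sed provisioners in the Vagrantfile do before running them on a VM, they can be tried on a small sample file. The sketch below reuses the same substitution patterns; `configure_node` is a hypothetical helper name, and the sample lines are my assumption of what the relevant parts of a stock cassandra.yaml look like, not a copy of the real file.

```shell
# configure_node FILE INDEX: apply the same three substitutions as the
# Vagrantfile provisioners, for node number INDEX. Hypothetical helper.
configure_node() {
  file="$1"
  index="$2"
  sed \
    -e 's/^cluster_name: .*/cluster_name: "My Cluster"/' \
    -e 's/- seeds: .*/- seeds: "192.168.50.41"/' \
    -e "s/listen_address: .*/listen_address: \"192.168.50.4${index}\"/" \
    "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Sample fragment (assumed layout) standing in for cassandra.yaml.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
cluster_name: 'Test Cluster'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "127.0.0.1"
listen_address: localhost
EOF

# Configure the sample as if it belonged to cassandra3, then show the result.
configure_node "$sample" 3
cat "$sample"
```

Every node gets the same cluster name and the same seed (cassandra1); only listen_address differs per node.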
-
Bring the new Vagrant VMs up.
vagrant up --no-provision
vagrant provision
-
Start Cassandra. It's tricky because each node has to start up completely before it can join the cluster. I SSH into each VM individually, start Cassandra, and wait about 60 seconds until it has fully started.
vagrant ssh cassandra1
~/apache-cassandra-3.9/bin/cassandra
# Wait 60 seconds
exit
vagrant ssh cassandra2
~/apache-cassandra-3.9/bin/cassandra
# Wait 60 seconds
exit
vagrant ssh cassandra3
~/apache-cassandra-3.9/bin/cassandra
# Wait 60 seconds
exit
vagrant ssh cassandra4
~/apache-cassandra-3.9/bin/cassandra
# Wait 60 seconds
exit
vagrant ssh cassandra5
~/apache-cassandra-3.9/bin/cassandra
# Wait 60 seconds
exit
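A fixed 60-second wait can be swapped for polling. The sketch below is one way to do that, not something I have tuned: `wait_until` and `node_is_up` are hypothetical helper names, and `node_is_up` assumes that `nodetool status` prints a row starting with `UN` followed by the node's address once the node is up, as in the sample output in this article.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND...: run COMMAND until it succeeds,
# at most ATTEMPTS times, sleeping DELAY_SECONDS between tries.
# Returns 0 on success and 1 if every attempt failed. Hypothetical helper.
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  while [ "$attempts" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    if [ "$attempts" -gt 0 ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# node_is_up ADDRESS: succeed when nodetool lists ADDRESS with state UN.
# Assumes the Cassandra install location used in this article.
node_is_up() {
  "${CASSANDRA_HOME:-$HOME/apache-cassandra-3.9}/bin/nodetool" status 2>/dev/null |
    grep -q "^UN  *$1 "
}

# On cassandra2, after launching Cassandra (up to 24 tries, 5 seconds apart):
#   wait_until 24 5 node_is_up 192.168.50.42

# Trivial local check that the helper returns as soon as the command succeeds.
wait_until 3 0 true && echo "ready"
```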
To verify Cassandra
-
Verify that Cassandra is running correctly.
vagrant ssh cassandra1
~/apache-cassandra-3.9/bin/nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.50.41  108.49 KiB  256     38.0%             74b9bb75-e941-4e4c-9900-0b30c7b204b4  rack1
UN  192.168.50.42  108.2 KiB   256     41.8%             8ee92d98-2a83-491d-b39f-fc3263217dc7  rack1
UN  192.168.50.43  84.1 KiB    256     41.0%             f30a37fd-8763-4f11-a4be-7f413e062bd9  rack1
UN  192.168.50.44  115.62 KiB  256     38.8%             15356519-30e0-435e-a865-47abfbef5a81  rack1
UN  192.168.50.45  96.62 KiB   256     40.4%             213e082a-92fd-4948-84be-fe941a0801b1  rack1
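All five nodes should show UN (Up/Normal). If you would rather check that in a script than by eye, the rows can be counted with awk. A minimal sketch: `count_up_nodes` is a hypothetical helper name, and the sample input is a shortened copy of the output above.

```shell
# count_up_nodes: read `nodetool status` output on stdin and print how many
# rows have UN (Up/Normal) in the first column. Hypothetical helper.
count_up_nodes() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Demo on a shortened copy of the output above; prints 5.
count_up_nodes <<'EOF'
Datacenter: datacenter1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns (effective)  Rack
UN  192.168.50.41  108.49 KiB  256     38.0%             rack1
UN  192.168.50.42  108.2 KiB   256     41.8%             rack1
UN  192.168.50.43  84.1 KiB    256     41.0%             rack1
UN  192.168.50.44  115.62 KiB  256     38.8%             rack1
UN  192.168.50.45  96.62 KiB   256     40.4%             rack1
EOF
```

On a node, the equivalent is `~/apache-cassandra-3.9/bin/nodetool status | count_up_nodes`.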