How to Set Up an Apache Cassandra Cluster

This article is a guide to setting up an Apache Cassandra cluster. The cluster runs on local CentOS virtual machines using VirtualBox. I use it as a local environment for development and testing.


It assumes you are using the following software versions.

  • MacOS 10.11.3
  • Vagrant 1.8.5
  • Java 1.8.0
  • Apache Cassandra 3.9

Here are the steps I used:

  1. First, create a workspace.

    mkdir -p ~/vagrant_boxes/cassandra

    cd ~/vagrant_boxes/cassandra

  2. Next, add a Vagrant box. I'm using a minimal CentOS box.

    vagrant box add "CentOS 6.5 x86_64"

  3. I need a basic VM to install the packages. This command creates one.

    vagrant init -m "CentOS 6.5 x86_64" cassandra_base

  4. Next, change the Vagrantfile to the following:

      Vagrant.configure(2) do |config|
        config.vm.box = "CentOS 6.5 x86_64"
        config.vm.box_url = "cassandra_base"
        config.ssh.insert_key = false
      end

  5. Now, install Cassandra and its dependencies.

    vagrant up

    vagrant ssh

    sudo yum install java-1.8.0-openjdk-devel

    sudo yum install wget

    wget ~

    gunzip -c *gz | tar xvf -

  6. Open up your ~/.bash_profile and append the following lines.

      export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk.x86_64
      export PATH=$PATH:$JAVA_HOME/bin
      export CASSANDRA_HOME=~/apache-cassandra-3.9
      export PATH=$PATH:$CASSANDRA_HOME/bin

  7. Source the profile.

    source ~/.bash_profile
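
After sourcing the profile, it can help to confirm that the two variables really point at directories on the VM. The following is a minimal sketch; check_env_dir is a hypothetical helper name of my own, not something shipped with Java or Cassandra, and it relies on bash's indirect expansion.

```shell
#!/usr/bin/env bash
# check_env_dir is a hypothetical helper: report whether the environment
# variable whose NAME is passed in holds the path of an existing directory.
check_env_dir() {
  local name="$1"
  local value="${!name:-}"   # bash indirect expansion: value of the variable called $name
  if [ -n "$value" ] && [ -d "$value" ]; then
    echo "$name ok: $value"
  else
    echo "$name missing or not a directory: ${value:-<unset>}"
    return 1
  fi
}
```

Inside the VM, running check_env_dir JAVA_HOME and then check_env_dir CASSANDRA_HOME should report both as ok if steps 5 through 7 worked.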

  8. Create a ~/.ssh/config file to disable host key checking for SSH. Since these are development servers, this is OK. The indentation before StrictHostKeyChecking is only a convention; ssh ignores leading whitespace, so a tab or spaces both work.

      Host *
            StrictHostKeyChecking no

  9. Now run these commands to finish the password-less authentication.

    chmod 600 ~/.ssh/config

    ssh-keygen -f ~/.ssh/id_rsa -t rsa -P ""

    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
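
Running the append more than once adds the same key to authorized_keys again each time. A minimal sketch of an idempotent variant follows; authorize_key is a hypothetical helper name, and it assumes the public key file holds a single line, as the one written by ssh-keygen does.

```shell
#!/usr/bin/env bash
# authorize_key is a hypothetical helper: append a public key to an
# authorized_keys file only if that exact line is not already present.
authorize_key() {
  local pub_key_file="$1" auth_file="$2"
  touch "$auth_file"
  chmod 600 "$auth_file"   # keep the file private to its owner
  # -x whole-line match, -F fixed string, -f read the pattern from the key file
  if ! grep -qxFf "$pub_key_file" "$auth_file"; then
    cat "$pub_key_file" >> "$auth_file"
  fi
}
```

For the key generated in this step, the call would be authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys.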

  10. In /etc/hosts, add the following lines.

  11. In ~/apache-cassandra-3.9/conf/cassandra.yaml, comment out the line below. With rpc_address unset, Cassandra binds client connections to the address its hostname resolves to instead of localhost, which allows our host machine to connect to it.

    rpc_address: localhost
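
If you would rather not open an editor, the same change can be scripted with sed. This is a minimal sketch; comment_out_rpc_address is a hypothetical helper name, and it assumes the line reads exactly "rpc_address: localhost" (optionally followed by trailing whitespace).

```shell
#!/usr/bin/env bash
# comment_out_rpc_address is a hypothetical helper; pass it the path to
# cassandra.yaml. It prefixes the "rpc_address: localhost" line with "# ".
comment_out_rpc_address() {
  local yaml="$1"
  # -i.bak edits the file in place and keeps the original as <file>.bak
  sed -i.bak 's/^rpc_address: localhost[[:space:]]*$/# &/' "$yaml"
}
```

Inside the VM this would be called as comment_out_rpc_address ~/apache-cassandra-3.9/conf/cassandra.yaml. The .bak copy makes it easy to undo the change.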

  12. It's a good idea to install Python 2.7 here, since the CQL shell (cqlsh) command requires that version. The image of CentOS we are using has Python 2.6. Follow the instructions here to update Python.

  13. Exit the SSH session and copy the VM for the other Cassandra nodes.


    vagrant halt

    vagrant package

    vagrant box add cassandra ~/vagrant_boxes/cassandra/package.box

  14. Edit the Vagrantfile to look like the following. This will create 5 Cassandra nodes.

      Vagrant.configure("2") do |config|
        (1..5).each do |i|
          config.vm.define "cassandra#{i}" do |node|
            node.vm.box = "cassandra"
            node.vm.box_url = "cassandra#{i}"
            node.vm.hostname = "cassandra#{i}"
            node.vm.network :private_network, ip: "{i}"
            node.ssh.insert_key = false
            # Set cluster_name, the seed list, and listen_address in the conf file.
            node.vm.provision "shell", inline: "sed -i 's/^cluster_name: .*/cluster_name: \"My Cluster\"/' ~/apache-cassandra-3.9/conf/cassandra.yaml", privileged: false
            node.vm.provision "shell", inline: "sed -i 's/- seeds: .*/- seeds: \"\"/' ~/apache-cassandra-3.9/conf/cassandra.yaml", privileged: false
            node.vm.provision "shell", inline: "sed -i 's/listen_address: .*/listen_address: \"{i}\"/' ~/apache-cassandra-3.9/conf/cassandra.yaml", privileged: false
          end
        end
      end
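
To see what the three sed provisioners do before running them on a VM, the same substitutions can be tried on a copy of cassandra.yaml. This is a minimal sketch; apply_node_settings is a hypothetical helper name, and the cluster name, seed list, and listen address are parameters you supply yourself.

```shell
#!/usr/bin/env bash
# apply_node_settings is a hypothetical helper that applies the same three
# substitutions as the provisioners above to the YAML file given as $1.
# The values must not contain "/" or "&", which are special to sed here.
apply_node_settings() {
  local yaml="$1" cluster="$2" seeds="$3" listen="$4"
  # -i.bak edits in place and keeps the original as <file>.bak
  sed -i.bak \
    -e "s/^cluster_name: .*/cluster_name: \"${cluster}\"/" \
    -e "s/- seeds: .*/- seeds: \"${seeds}\"/" \
    -e "s/listen_address: .*/listen_address: \"${listen}\"/" \
    "$yaml"
}
```

Run it against a scratch copy first and compare the result with the .bak file to confirm that only the three intended lines changed.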

  15. Bring the new Vagrant VMs up.

    vagrant up --no-provision

    vagrant provision

  16. Start Cassandra. This is the tricky part, because each node has to start up completely and join the cluster before the next one is started. I SSH into each VM individually, start Cassandra, and wait about 60 seconds until it has completely started.

    vagrant ssh cassandra1


    # Wait 60 seconds


    vagrant ssh cassandra2


    # Wait 60 seconds


    vagrant ssh cassandra3


    # Wait 60 seconds


    vagrant ssh cassandra4


    # Wait 60 seconds


    vagrant ssh cassandra5


    # Wait 60 seconds
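
The start-and-wait routine can also be scripted from the host. The following is a rough sketch under several assumptions: start_cluster and the VAGRANT_CMD override are hypothetical names of my own, the VMs are named cassandra1 through cassandraN as in the Vagrantfile, and the launcher is assumed to live at ~/apache-cassandra-3.9/bin/cassandra inside each VM. A fixed sleep is a blunt way to wait, so treat the delay as a starting point.

```shell
#!/usr/bin/env bash
# start_cluster is a hypothetical helper: start Cassandra on each VM in turn,
# pausing between nodes so each one can finish joining the cluster.
start_cluster() {
  local count="$1" wait_seconds="$2" i
  for i in $(seq 1 "$count"); do
    echo "Starting Cassandra on cassandra${i}"
    # VAGRANT_CMD defaults to vagrant; overriding it allows a dry run.
    # Assumption: the launcher path below matches the install from step 5.
    "${VAGRANT_CMD:-vagrant}" ssh "cassandra${i}" -c '~/apache-cassandra-3.9/bin/cassandra' || return 1
    sleep "$wait_seconds"
  done
}
```

For the five nodes in this article, the call would be start_cluster 5 60, run from the directory that holds the Vagrantfile. If the SSH session does not return after the launcher starts, redirecting its output inside the remote command may help.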


To verify Cassandra

  1. Verify that Cassandra is running correctly.

    vagrant ssh cassandra1

    ~/apache-cassandra-3.9/bin/nodetool status
    Datacenter: datacenter1
    |/ State=Normal/Leaving/Joining/Moving
    -- Address Load Tokens Owns (effective) Host ID Rack
    UN 108.49 KiB 256 38.0% 74b9bb75-e941-4e4c-9900-0b30c7b204b4 rack1
    UN 108.2 KiB 256 41.8% 8ee92d98-2a83-491d-b39f-fc3263217dc7 rack1
    UN 84.1 KiB 256 41.0% f30a37fd-8763-4f11-a4be-7f413e062bd9 rack1
    UN 115.62 KiB 256 38.8% 15356519-30e0-435e-a865-47abfbef5a81 rack1
    UN 96.62 KiB 256 40.4% 213e082a-92fd-4948-84be-fe941a0801b1 rack1
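
Each row that starts with UN is a node that is Up and in the Normal state. For a quick check, the rows can be counted instead of read by eye. This is a minimal sketch; count_up_nodes is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# count_up_nodes is a hypothetical helper: read `nodetool status` output on
# stdin and print how many rows have UN (Up/Normal) in the first column.
count_up_nodes() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}
```

Piping the status output through it, as in ~/apache-cassandra-3.9/bin/nodetool status | count_up_nodes, should print 5 once all five nodes have joined.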
