In this post we are sharing the Cassandra backup script on linux system. Backup is always important part of any system. Backup will help you to get the data back in case data is removed due to some reason.

The script includes two section as per the Cassandra backup requirement. These are given below

1. SCHEMA Backup
2. Taking SNAPSHOT Backup

NOTE: The script you have to run in each cassandra node server.
You can run this script manually as well set in cronjob to make it run periodically.

You can download the backup script from my github link.

Create Cassandra Backup Script :

1. Create the file called cassandra_backup.sh .
2. Make it executable chmod +x cassandra_backup.sh
3. And copy paste below given content in file cassandra_backup.sh.

#!/bin/bash
#
# Author: Sharad Kumar Chhetri
# Date : 27-April-2015
# Description : The backup script will complete the backup in 2 phases -
#  1. First Phase: Taking backup of Keyspace SCHEMA
#  2. Seconf Phase: Taking snapshot of keyspaces
#

## In below given variables - require information to be feed by system admin##
# For _NODETOOL , you can replace $(which nodetool) with  absolute path of nodetool command.
#

_BACKUP_DIR=/backup
_DATA_DIR=/var/lib/cassandra/data
_NODETOOL=$(which nodetool)

## Do not edit below given variable ##

_TODAY_DATE=$(date +%F)
_BACKUP_SNAPSHOT_DIR="$_BACKUP_DIR/$_TODAY_DATE/SNAPSHOTS"
_BACKUP_SCHEMA_DIR="$_BACKUP_DIR/$_TODAY_DATE/SCHEMA"
_SNAPSHOT_DIR=$(find $_DATA_DIR -type d -name snapshots)
_SNAPSHOT_NAME=snp-$(date +%F-%H%M-%S)
_DATE_SCHEMA=$(date +%F-%H%M-%S)

###### Create / check backup Directory ####

if [ -d  "$_BACKUP_SCHEMA_DIR" ]
then
echo "$_BACKUP_SCHEMA_DIR already exist"
else
mkdir -p "$_BACKUP_SCHEMA_DIR"
fi

if [ -d  "$_BACKUP_SNAPSHOT_DIR" ]
then
echo "$_BACKUP_SNAPSHOT_DIR already exist"
else
mkdir -p "$_BACKUP_SNAPSHOT_DIR"
fi



##################### SECTION 1 : SCHEMA BACKUP ############################################ 

## List All Keyspaces
cqlsh -e "DESC KEYSPACES" |perl -pe 's/e([^[]]|[.*?[a-zA-Z]|].*?a)//g' | sed '/^$/d' > Keyspace_name_schema.cql

#_KEYSPACE_NAME=$(cat Keyspace_name_schema.cql)

## Create directory inside backup SCHEMA directory. As per keyspace name.
for i in $(cat Keyspace_name_schema.cql)
do
if [ -d $i ]
then
echo "$i directory exist"
else
mkdir -p $_BACKUP_SCHEMA_DIR/$i
fi
done

## Take SCHEMA Backup - All Keyspace and All tables
for VAR_KEYSPACE in $(cat Keyspace_name_schema.cql)
do
cqlsh -e "DESC KEYSPACE  $VAR_KEYSPACE" > "$_BACKUP_SCHEMA_DIR/$VAR_KEYSPACE/$VAR_KEYSPACE"_schema-"$_DATE_SCHEMA".cql 
done


##################### END OF LINE ---- SECTION 1 : SCHEMA BACKUP #####################

###### Create snapshots for all keyspaces
echo "creating snapshots for all keyspaces ....."
$_NODETOOL snapshot -t $_SNAPSHOT_NAME

###### Get Snapshot directory path
_SNAPSHOT_DIR_LIST=`find $_DATA_DIR -type d -name snapshots|awk '{gsub("'$_DATA_DIR'", "");print}' > snapshot_dir_list`

#echo $_SNAPSHOT_DIR_LIST > snapshot_dir_list

## Create directory inside backup directory. As per keyspace name.
for i in `cat snapshot_dir_list`
do
if [ -d $_BACKUP_SNAPSHOT_DIR/$i ]
then
echo "$i directory exist"
else
mkdir -p $_BACKUP_SNAPSHOT_DIR/$i
echo $i Directory is created
fi
done

### Copy default Snapshot dir to backup dir

find $_DATA_DIR -type d -name $_SNAPSHOT_NAME > snp_dir_list

for SNP_VAR in `cat snp_dir_list`;
do
## Triming _DATA_DIR
_SNP_PATH_TRIM=`echo $SNP_VAR|awk '{gsub("'$_DATA_DIR'", "");print}'`

cp -prvf "$SNP_VAR" "$_BACKUP_SNAPSHOT_DIR$_SNP_PATH_TRIM";

done

Your question and comments will help me to improve the script.

Read Some More Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.