A little note on how I set up and bootstrap a local Cassandra cluster on macOS machines for development.
The instructions below were tested on macOS Sierra, and aim to spawn a 3-node cluster running the 2.1.x series.
Install ccm and its dependencies:
$ brew install ant
$ brew cask install java
$ pip install --upgrade ccm
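To double-check the installation, ccm should now be on the PATH and able to list clusters (the list will still be empty at this point):
$ ccm list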
Just in case we messed up a previous installation, let’s clean things up:
$ ccm switch test21
$ ccm stop test21
$ killall java
$ ccm remove
$ rm -rf "${HOME}/.ccm/test21"
Create a new 3-node cluster named test21 with the latest Cassandra release of the 2.1.x series:
$ ccm create test21 -v 2.1 -n 3
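Note that -v also accepts an exact release if you want the setup to be fully reproducible, for instance 2.1.12, the version the cqlsh banner reports further down:
$ ccm create test21 -v 2.1.12 -n 3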
Here is an example of how to alter the configuration shared by all nodes of the cluster, in this case to bump all timeouts tenfold:
$ tee -a ~/.ccm/test21/cluster.conf <<-EOF
config_options: {
  read_request_timeout_in_ms: 50000,
  range_request_timeout_in_ms: 100000,
  write_request_timeout_in_ms: 20000,
  request_timeout_in_ms: 100000,
  tombstone_failure_threshold: 10000000}
EOF
$ ccm updateconf
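If I recall correctly, ccm updateconf also accepts 'key: value' pairs directly, which avoids touching cluster.conf by hand; a sketch of the same timeout bump:
$ ccm updateconf 'read_request_timeout_in_ms: 50000' 'request_timeout_in_ms: 100000'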
I also sometimes had to increase Java's heap size, for instance to accommodate large data imports:
$ export CCM_MAX_HEAP_SIZE="12G"
$ export CCM_HEAP_NEWSIZE="2400M"
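As far as I can tell, ccm only picks these variables up when (re)starting nodes, so they can also be scoped to a single invocation instead of exported:
$ CCM_MAX_HEAP_SIZE="12G" CCM_HEAP_NEWSIZE="2400M" ccm start test21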
Before starting the cluster, we need to create the missing loopback aliases, one for each node to bind to:
$ sudo ifconfig lo0 alias 127.0.0.1 up
$ sudo ifconfig lo0 alias 127.0.0.2 up
$ sudo ifconfig lo0 alias 127.0.0.3 up
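These aliases do not survive a reboot, so expect to recreate them from time to time; a loop keeps that short, and ifconfig confirms they are up:
$ for i in 1 2 3; do sudo ifconfig lo0 alias "127.0.0.${i}" up; done
$ ifconfig lo0 | grep 'inet 127'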
We can now start the cluster:
$ ccm start test21
To get the state of the cluster:
$ ccm status
Cluster: 'test21'
-----------------
node1: UP
node3: UP
node2: UP
Or a much more detailed status:
$ ccm status -v
Cluster: 'test21'
-----------------
node1: UP
       auto_bootstrap=False
       thrift=('127.0.0.1', 9160)
       binary=('127.0.0.1', 9042)
       storage=('127.0.0.1', 7000)
       jmx_port=7100
       remote_debug_port=0
       byteman_port=0
       initial_token=-9223372036854775808
       pid=81379

node3: UP
       auto_bootstrap=False
       thrift=('127.0.0.3', 9160)
       binary=('127.0.0.3', 9042)
       storage=('127.0.0.3', 7000)
       jmx_port=7300
       remote_debug_port=0
       byteman_port=0
       initial_token=3074457345618258602
       pid=81381

node2: UP
       auto_bootstrap=False
       thrift=('127.0.0.2', 9160)
       binary=('127.0.0.2', 9042)
       storage=('127.0.0.2', 7000)
       jmx_port=7200
       remote_debug_port=0
       byteman_port=0
       initial_token=-3074457345618258603
       pid=81380
To get the detailed data ownership status, you need to go through a node and point nodetool at an existing keyspace (here named my_column_family):
$ ccm node1 status my_column_family

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  6.08 GB  1       100.0%            25e0440b-3ac9-490e-b0b0-260e96395f15  rack1
UN  127.0.0.2  6.22 GB  1       100.0%            848edc79-db1c-49bf-bdd8-3768b588460f  rack1
UN  127.0.0.3  6.14 GB  1       100.0%            75acd6c7-61c5-4ae7-9008-63d6426d1468  rack1
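For a token-centric view of the same ownership information, ccm can also dump the ring through any node:
$ ccm node1 ring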
For debugging, a node’s log is available through ccm:
$ ccm node1 showlog
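If you prefer tailing the file directly, the logs should live under the cluster's ccm directory, assuming ccm's default layout:
$ tail -f "${HOME}/.ccm/test21/node1/logs/system.log"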
And you can run queries directly through that node:
$ TZ=UTC cqlsh --cqlversion=3.2.1 127.0.0.1
Connected to test21 at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.1.12 | CQL spec 3.2.1 | Native protocol v3]
Use HELP for help.
cqlsh> CONSISTENCY QUORUM;
Consistency level set to QUORUM.
cqlsh>
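On a 3-node cluster, a keyspace with a replication factor of 3 makes QUORUM reads and writes require 2 replicas out of 3, which is handy to exercise consistency behavior. A minimal sketch, with a made-up keyspace name:
cqlsh> CREATE KEYSPACE IF NOT EXISTS sandbox
   ... WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};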
Finally, to restore a bunch of table snapshots from your production cluster:
$ TABLES="table1 table2 table3"
$ DUMP_FOLDER="${HOME}/dump/2016-09-12/"
$ for host_folder in $(ls "${DUMP_FOLDER}"); do
>     for table in ${TABLES}; do
>         SSTABLE_FOLDER="${DUMP_FOLDER}/${host_folder}/my_column_family/${table}";
>         echo "Importing: ${SSTABLE_FOLDER} ...";
>         ccm bulkload "${SSTABLE_FOLDER}";
>     done
> done
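A quick, rough sanity check that data actually landed is to count rows in each table through cqlsh (counts are slow and capped by the LIMIT, but non-zero numbers are reassuring):
$ for table in ${TABLES}; do
>     echo "SELECT COUNT(*) FROM my_column_family.${table} LIMIT 1000000;" | cqlsh 127.0.0.1;
> done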
Forcing a repair on each table after a massive import doesn't hurt:
$ for table in ${TABLES}; do
>     ccm node1 nodetool repair my_column_family ${table};
> done
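To eyeball the result, nodetool cfstats (the 2.1-era name of what later became tablestats) reports per-table statistics such as SSTable counts and estimated keys:
$ ccm node1 nodetool cfstats my_column_family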