How to check cluster health
This guide shows you how to inspect whether a running cluster is behaving normally using the output of dqlite-utils.
Steps
- Type and run:
    cd /path/to/dqlite/dir
- Type and run:
    watch -n 1 'dqlite-utils -c ".status;.log --compact" | head -n25'
- Observe the term output for a few moments; it should be stable. If this value increases frequently, it indicates that the nodes are failing to elect a leader or that the elected leader is frequently becoming unavailable.
- Observe the current_index for a few moments; it should be increasing at the rate new data is written by the leader. If current_index and term are both constant, it indicates that the leader is not being asked to write data.
- Observe the log output (re-run the watch command above without --compact for more information). This should corroborate your observations of term and current_index.
- Type and run:
    watch -n 1 'dqlite-utils -c .config'
  and check the set of nodes this server believes are in the cluster.
- If the problem persists, read how-to-check-data-integrity
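
The rules for reading term and current_index can be sketched as a small shell helper. This is only an illustration under stated assumptions: classify_samples is a hypothetical function, not part of dqlite-utils, and it expects you to have already collected one "term current_index" pair per line (oldest first). How to extract those two numbers from the .status output is left out here, because this guide does not show the exact output format.

```shell
#!/bin/sh
# classify_samples (hypothetical helper): reads "term current_index" pairs
# from stdin, one sample per line, oldest first, and prints a diagnosis.
classify_samples() {
    awk '
        # Remember the first sample as the baseline.
        NR == 1 { first_index = $2 + 0; prev_term = $1 + 0 }
        # Count every time the term differs from the previous sample.
        NR > 1 && ($1 + 0) != prev_term { changes++; prev_term = $1 + 0 }
        # Track the most recent index.
        { last_index = $2 + 0 }
        END {
            if (NR < 2) { print "insufficient-data"; exit }
            if (changes > 0)
                print "term-changed " changes   # leadership changed during sampling
            else if (last_index > first_index)
                print "writing"                 # stable term, index advancing
            else
                print "idle"                    # both constant: no writes requested
        }
    '
}

# Example: three samples with a stable term and a growing index.
printf '%s\n' '5 100' '5 104' '5 109' | classify_samples   # prints "writing"
```

A single term change may just be one ordinary election; repeated changes within a short sampling window are what match the warning in the steps above.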