Redis in HA with Sentinel
Redis (REmote DIctionary Server) is a high-performance, open-source, non-relational in-memory database that can be used as a store of key-value data structures. It can be run as a single instance on a single server, or configured for High Availability (HA). There are two ways to build an HA Redis architecture: activate Redis Cluster, which needs at least 6 nodes (3 masters + 3 slaves), or use Sentinel, which needs 3 nodes with Sentinel installed, in addition to the Redis master and slave nodes.
The replication mechanism works so that data written to the master is asynchronously replicated to the slave servers. In case of connectivity issues between the master and the slave, the synchronization stops, but as soon as the communication channel becomes available again, the slave reconnects to the master and replication resumes. Read/write operations are always possible on the master node, while only read operations are allowed on a slave node, at least until the slave is elected as master. This way, clients can read data even if the master server is unavailable, assuming that at least one of the slave servers is available and contains the requested data.
Sentinel is responsible for monitoring the Redis servers in the cluster (master and slaves) and coordinating the automatic failover of the master to one of its available slaves, if a master failure occurs. When a master is no longer available, Sentinel chooses one of its available slaves and promotes it to master, so the cluster continues to run uninterrupted.
In this post we will see how to create a simple Redis HA architecture with 3 nodes, i.e. 2 nodes hosting the Redis engine and Sentinel, and one node running Sentinel only. The example schema is shown below.
Redis setup
I used CentOS 7 virtual machines; only a few changes are required if you use another operating system.
Since Redis is not available as an RPM in the CentOS repositories, we enable the Remi repository and proceed with the installation on all 3 nodes. Sentinel is installed together with Redis as an additional tool, so we do the basic installation everywhere; later, on one of the nodes we will start Sentinel only, while on the other two we will start both Sentinel and Redis.
yum install epel-release yum-utils
yum install http://rpms.remirepo.net/enterprise/remi-release-7.rpm
yum-config-manager --enable remi
yum install redis
Replica configuration
We now configure the 2 Redis servers for replication, so that we have one master node and one slave replicating from it.
The configuration file to edit is /etc/redis/redis.conf.
In the master node (IP 172.31.6.218):
bind 127.0.0.1 172.31.6.218
protected-mode no
supervised systemd
masterauth P@ssw0rd
masteruser mymasteruser
user default off
user mymasteruser +@all on >P@ssw0rd
user superuser +@all ~* on >P@ssw0rd
user sentineluser on >P@ssw0rd allchannels +multi +slaveof +ping +exec +subscribe +config|rewrite +role +publish +info +client|setname +client|kill +script|kill
In the slave node (IP 172.31.94.134):
bind 127.0.0.1 172.31.94.134
protected-mode no
supervised systemd
replicaof 172.31.6.218 6379
masterauth P@ssw0rd
masteruser mymasteruser
user default off
user mymasteruser +@all on >P@ssw0rd
user superuser +@all ~* on >P@ssw0rd
user sentineluser on >P@ssw0rd allchannels +multi +slaveof +ping +exec +subscribe +config|rewrite +role +publish +info +client|setname +client|kill +script|kill
The official Redis documentation covers the configuration in detail, and every parameter is also well described in the configuration file itself. Here we briefly go over the meaning of the parameters used.
- bind: the addresses Redis listens on; the machine's IP must be added alongside localhost so that the slave and Sentinel nodes can connect.
- protected-mode: set to no so that Redis accepts remote connections.
- supervised: how the service is managed; in our case systemd.
- replicaof: tells the slave node which Redis server to replicate from.
- masterauth: the password of the user named in masteruser, used for replication operations.
- masteruser: the username used for replication operations. Both masteruser and masterauth are set on the master too, because after a failover it can come back as a slave.
- user: configures different users for different operations on Redis, according to the ACL logic; in my case I configured the mymasteruser user to handle replication, the superuser user who can execute all commands on the Redis server, and the sentineluser user for the operations Sentinel needs.
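The parameters above depend on each other: masteruser must name a user defined by a user line, and bind must include an address the other nodes can reach. As a minimal sketch of a sanity check for a redis.conf file, assuming a simplified parser that ignores quoting and include files (the names parse_conf and check_replication_conf are my own, not part of Redis):

```python
LOOPBACK = {"127.0.0.1", "::1"}


def parse_conf(text: str) -> dict:
    """Collect directives; a directive may repeat, so each maps to a list."""
    conf = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        conf.setdefault(parts[0], []).append(parts[1:])
    return conf


def check_replication_conf(text: str) -> list:
    """Return a list of problems found in the replication settings."""
    conf = parse_conf(text)
    problems = []
    for addrs in conf.get("bind", []):
        if all(addr in LOOPBACK for addr in addrs):
            problems.append("bind lists only loopback addresses")
    acl_users = {args[0] for args in conf.get("user", []) if args}
    for args in conf.get("masteruser", []):
        if args and args[0] not in acl_users:
            problems.append(f"masteruser {args[0]} has no matching user line")
    if "masteruser" in conf and "masterauth" not in conf:
        problems.append("masteruser is set but masterauth is missing")
    return problems


# Slave configuration from this post (Sentinel user omitted for brevity).
slave_conf = """\
bind 127.0.0.1 172.31.94.134
protected-mode no
supervised systemd
replicaof 172.31.6.218 6379
masterauth P@ssw0rd
masteruser mymasteruser
user default off
user mymasteruser +@all on >P@ssw0rd
user superuser +@all ~* on >P@ssw0rd
"""

broken_conf = "bind 127.0.0.1\nmasteruser replicator\n"

print(check_replication_conf(slave_conf))
print(check_replication_conf(broken_conf))
```

The check is deliberately narrow: it only looks at the settings discussed in this post and says nothing about whether the ACL rules themselves are sufficient.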
Start Redis on both servers.
systemctl start redis
systemctl enable redis
Once started, the status of the servers can be checked with the redis-cli tool. Since we configured ACLs for the users, we must authenticate before running commands: after starting redis-cli, run the AUTH command with a username and password.
On the master node we have:
127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=172.31.94.134,port=6379,state=wait_bgsave,offset=0,lag=0
master_failover_state:no-failover
master_replid:7483bc9c74537c06dad78ec10bae247ac26bea52
master_replid2:a9dbab0000bbf9b7306e617a3bb07b7070721a03
master_repl_offset:2098991
second_repl_offset:2086135
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1166345
repl_backlog_histlen:932647
While on the slave node we have:
127.0.0.1:6379> info replication
# Replication
role:slave
master_host:172.31.6.218
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_read_repl_offset:2122994
slave_repl_offset:2122994
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:7483bc9c74537c06dad78ec10bae247ac26bea52
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:2122994
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2098992
repl_backlog_histlen:24003
In particular, role indicates the role of the server; slave0:ip=172.31.94.134,port=6379,state=wait_bgsave,offset=0,lag=0 on the master describes the connected slave; and master_host:172.31.6.218 on the slave indicates the master to which the slave is connected.
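To read these fields from a script rather than by eye, the output can be parsed as plain key:value lines. The following Python sketch assumes the text format shown above; the helper names parse_info and parse_replica_entry are hypothetical. Note that the slave entry in this sample reports state=wait_bgsave; as far as I know, a slave that has completed its initial synchronization normally reports state=online, so that field is worth checking.

```python
def parse_info(text: str) -> dict:
    """Parse 'key:value' lines of INFO output into a dict of strings."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def parse_replica_entry(value: str) -> dict:
    """Split 'ip=...,port=...,state=...' into a dict of strings."""
    return dict(item.split("=", 1) for item in value.split(","))


# Excerpt of the master output shown above.
sample = """\
# Replication
role:master
connected_slaves:1
slave0:ip=172.31.94.134,port=6379,state=wait_bgsave,offset=0,lag=0
"""

info = parse_info(sample)
replica = parse_replica_entry(info["slave0"])
print(info["role"], replica["ip"], replica["state"])
```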
Sentinel configuration
Well, at this point we have 2 Redis nodes in replica: one is the master and the other is the slave. In case of problems on the master, we could manually promote the slave to master and still do all the read/write operations on our data.
The role of Sentinel is to constantly monitor the status of the nodes and automatically migrate the role of the servers in case of problems.
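The decision to fail over is not taken by a single Sentinel. As I understand the Sentinel documentation, the quorum (2 in this post's sentinel monitor lines) is the number of Sentinels that must agree the master is down, and the Sentinel leading the failover additionally needs the votes of a majority of all Sentinels. The Python sketch below models only that arithmetic, under the simplifying assumption that every reachable Sentinel sees the master as down and votes for the same leader; the function names are my own.

```python
def is_objectively_down(down_reports: int, quorum: int) -> bool:
    """True when enough Sentinels report the master as down."""
    return down_reports >= quorum


def votes_needed(total_sentinels: int, quorum: int) -> int:
    """Votes required to lead a failover: the majority of all known
    Sentinels, or the quorum if that is larger."""
    majority = total_sentinels // 2 + 1
    return max(majority, quorum)


def failover_possible(total_sentinels: int, reachable_sentinels: int,
                      quorum: int) -> bool:
    """Simplified model: all reachable Sentinels agree and vote alike."""
    return (is_objectively_down(reachable_sentinels, quorum)
            and reachable_sentinels >= votes_needed(total_sentinels, quorum))


# 3 Sentinels with quorum 2, as in this post.
all_alive = failover_possible(3, 3, 2)       # only Redis on the master stopped
one_lost = failover_possible(3, 2, 2)        # the whole master node is gone
only_one_left = failover_possible(3, 1, 2)   # two Sentinel nodes are gone
print(all_alive, one_lost, only_one_left)
```

This is why the third, Sentinel-only node matters: with only two Sentinels, losing the master's machine would leave a single Sentinel that can neither reach the quorum nor obtain a majority.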
Proceed with Sentinel configuration on our 3 nodes. The Sentinel configuration file is /etc/redis/sentinel.conf.
Node redis 1 (IP 172.31.6.218)
bind 172.31.6.218
port 26379
sentinel announce-ip 172.31.6.218
sentinel monitor mymaster 172.31.6.218 6379 2
sentinel auth-pass mymaster P@ssw0rd
sentinel auth-user mymaster sentineluser
sentinel down-after-milliseconds mymaster 3000
sentinel failover-timeout mymaster 120000
Node redis 2 (IP 172.31.94.134)
bind 172.31.94.134
port 26379
sentinel announce-ip 172.31.94.134
sentinel monitor mymaster 172.31.6.218 6379 2
sentinel auth-pass mymaster P@ssw0rd
sentinel auth-user mymaster sentineluser
sentinel down-after-milliseconds mymaster 3000
sentinel failover-timeout mymaster 120000
Node sentinel (IP 172.31.13.69)
bind 172.31.13.69
port 26379
sentinel announce-ip 172.31.13.69
sentinel monitor mymaster 172.31.6.218 6379 2
sentinel auth-pass mymaster P@ssw0rd
sentinel auth-user mymaster sentineluser
sentinel down-after-milliseconds mymaster 3000
sentinel failover-timeout mymaster 120000
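The three sentinel.conf files differ only in the bind and announce-ip addresses. As a small sketch of how they could be produced from one template (render_sentinel_conf and its parameters are hypothetical, and the values are the ones used in this post):

```python
def render_sentinel_conf(node_ip: str, master_ip: str, password: str,
                         quorum: int = 2,
                         master_name: str = "mymaster") -> str:
    """Return the sentinel.conf content used in this post for one node."""
    lines = [
        f"bind {node_ip}",
        "port 26379",
        f"sentinel announce-ip {node_ip}",
        # Every Sentinel monitors the same initial master.
        f"sentinel monitor {master_name} {master_ip} 6379 {quorum}",
        f"sentinel auth-pass {master_name} {password}",
        f"sentinel auth-user {master_name} sentineluser",
        f"sentinel down-after-milliseconds {master_name} 3000",
        f"sentinel failover-timeout {master_name} 120000",
    ]
    return "\n".join(lines) + "\n"


NODES = ["172.31.6.218", "172.31.94.134", "172.31.13.69"]
confs = {ip: render_sentinel_conf(ip, "172.31.6.218", "P@ssw0rd")
         for ip in NODES}
print(confs["172.31.13.69"])
```

Keep in mind that Sentinel rewrites its own configuration file at runtime (the log reports "Sentinel new configuration saved on disk"), so a template like this is only useful for the initial setup.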
Start Sentinel on all the servers:
systemctl start redis-sentinel
systemctl enable redis-sentinel
Now we use redis-cli again, but this time to connect to Sentinel and not the Redis server.
[root@ip-172-31-13-69 redis]# redis-cli -h 172.31.13.69 -p 26379
172.31.13.69:26379> sentinel masters
1) 1) "name"
2) "mymaster"
3) "ip"
4) "172.31.6.218"
5) "port"
6) "6379"
7) "runid"
8) "b9b82fc75dd0bab18bb26ac8d38c6867a04626ca"
9) "flags"
10) "master"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "321"
19) "last-ping-reply"
20) "321"
21) "down-after-milliseconds"
22) "3000"
23) "info-refresh"
24) "7559"
25) "role-reported"
26) "master"
27) "role-reported-time"
28) "3295851"
29) "config-epoch"
30) "5"
31) "num-slaves"
32) "1"
33) "num-other-sentinels"
34) "2"
35) "quorum"
36) "2"
37) "failover-timeout"
38) "120000"
39) "parallel-syncs"
40) "1"
Among the parameters in the output, the most important are ip, the address of the master node (172.31.6.218 in this case); num-slaves, which is 1 in our case (we have only 1 slave node); and num-other-sentinels, which indicates the number of additional Sentinel nodes. If this last value were 0, it would mean that this Sentinel is not communicating correctly with the other Sentinel nodes.
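The same checks can be automated. The sketch below assumes the reply is available as a flat list of alternating field names and values, which is how the numbered redis-cli output above is laid out; pairs_to_dict and check_master are hypothetical helpers, and the sample reply is shortened to the fields that are used.

```python
def pairs_to_dict(flat: list) -> dict:
    """Turn [field, value, field, value, ...] into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


def check_master(entry: dict, total_sentinels: int = 3) -> list:
    """Return warnings for one master entry reported by Sentinel."""
    warnings = []
    flags = entry.get("flags", "").split(",")
    if "s_down" in flags or "o_down" in flags:
        warnings.append("master is flagged as down")
    if int(entry.get("num-slaves", 0)) == 0:
        warnings.append("no slave available for failover")
    # Each Sentinel should see all the others.
    if int(entry.get("num-other-sentinels", 0)) < total_sentinels - 1:
        warnings.append("some Sentinels are not visible")
    return warnings


reply = ["name", "mymaster", "ip", "172.31.6.218", "port", "6379",
         "flags", "master", "num-slaves", "1",
         "num-other-sentinels", "2", "quorum", "2"]

master = pairs_to_dict(reply)
print(master["ip"], check_master(master))
```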
Failover test
To test whether the failover mechanism works, we shut down Redis on the first node, which is currently the master:
systemctl stop redis
After the 3 seconds that we set in the down-after-milliseconds parameter, the log /var/log/redis/sentinel.log shows this:
16268:X 22 Mar 2023 08:25:02.864 # +sdown master mymaster 172.31.6.218 6379
16268:X 22 Mar 2023 08:25:02.908 * Sentinel new configuration saved on disk
16268:X 22 Mar 2023 08:25:02.908 # +new-epoch 6
16268:X 22 Mar 2023 08:25:02.912 * Sentinel new configuration saved on disk
16268:X 22 Mar 2023 08:25:02.912 # +vote-for-leader adecd88892552108121902d37e2d279ab4a1dc1c 6
16268:X 22 Mar 2023 08:25:02.936 # +odown master mymaster 172.31.6.218 6379 #quorum 3/2
16268:X 22 Mar 2023 08:25:02.936 # Next failover delay: I will not start a failover before Thu Mar 22 08:29:03 2023
16268:X 22 Mar 2023 08:25:04.061 # +config-update-from sentinel adecd88892552108121902d37e2d279ab4a1dc1c 172.31.94.134 26379 @ mymaster 172.31.6.218 6379
16268:X 22 Mar 2023 08:25:04.061 # +switch-master mymaster 172.31.6.218 6379 172.31.94.134 6379
16268:X 22 Mar 2023 08:25:04.061 * +slave slave 172.31.6.218:6379 172.31.6.218 6379 @ mymaster 172.31.94.134 6379
16268:X 22 Mar 2023 08:25:04.064 * Sentinel new configuration saved on disk
16268:X 22 Mar 2023 08:25:07.107 # +sdown slave 172.31.6.218:6379 172.31.6.218 6379 @ mymaster 172.31.94.134 6379
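The key line in this log is +switch-master, which carries the old and the new master address. As a rough sketch, a script could extract it as follows; the line layout (pid:role, timestamp, level character, message) is inferred from the excerpt above rather than from a formal specification, and the function names are my own.

```python
import re
from datetime import datetime

LINE_RE = re.compile(
    r"^(?P<pid>\d+):(?P<role>\w) "
    r"(?P<ts>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>\S) (?P<msg>.*)$"
)


def parse_line(line: str):
    """Return (timestamp, event, args) for an event line, else None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    parts = match.group("msg").split()
    if not parts or parts[0][0] not in "+-":
        return None  # plain message, e.g. "Sentinel new configuration ..."
    ts = datetime.strptime(match.group("ts"), "%d %b %Y %H:%M:%S.%f")
    return ts, parts[0], parts[1:]


def find_switches(lines):
    """Yield (timestamp, master_name, old_addr, new_addr) per failover."""
    for line in lines:
        parsed = parse_line(line)
        if parsed and parsed[1] == "+switch-master":
            ts, _, (name, old_ip, old_port, new_ip, new_port) = parsed
            yield ts, name, f"{old_ip}:{old_port}", f"{new_ip}:{new_port}"


# Four lines taken from the log above.
log = """\
16268:X 22 Mar 2023 08:25:02.864 # +sdown master mymaster 172.31.6.218 6379
16268:X 22 Mar 2023 08:25:02.908 * Sentinel new configuration saved on disk
16268:X 22 Mar 2023 08:25:02.936 # +odown master mymaster 172.31.6.218 6379 #quorum 3/2
16268:X 22 Mar 2023 08:25:04.061 # +switch-master mymaster 172.31.6.218 6379 172.31.94.134 6379
"""

switches = list(find_switches(log.splitlines()))
for ts, name, old, new in switches:
    print(f"{ts} {name}: {old} -> {new}")
```

In this excerpt about 1.2 seconds pass between +sdown and +switch-master, on top of the 3 seconds Sentinel waits before declaring the master down.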
If we check the status of the master again on Sentinel we see that now the master node is 172.31.94.134, the one that was previously slave.
[root@ip-172-31-13-69 redis]# redis-cli -h 172.31.13.69 -p 26379
172.31.13.69:26379> sentinel masters
1) 1) "name"
2) "mymaster"
3) "ip"
4) "172.31.94.134"
5) "port"
6) "6379"
7) "runid"
8) "6a6c7352bd37ad247e40c6b498ceb688a8e2ae7e"
9) "flags"
10) "master"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "390"
19) "last-ping-reply"
20) "390"
21) "down-after-milliseconds"
22) "3000"
23) "info-refresh"
24) "7036"
25) "role-reported"
26) "master"
27) "role-reported-time"
28) "359407"
29) "config-epoch"
30) "6"
31) "num-slaves"
32) "1"
33) "num-other-sentinels"
34) "2"
35) "quorum"
36) "2"
37) "failover-timeout"
38) "120000"
39) "parallel-syncs"
40) "1"
Note that master/slave management is now Sentinel's responsibility: even if we restart Redis on the node where we stopped it, the master role will not migrate back to it. That node rejoins as a slave, and the master remains on the second node until a problem arises (Redis stopped or unreachable) or it is moved manually.