Balancing traffic to Galera Cluster with HAProxy

We described here Galera Cluster with its features, which allows you to create a multi-master active-active database cluster with synchronous data replication, enabling read and write operations on all the cluster nodes.

To ensure high availability and greater scalability of the architecture it is useful to add a load balancer element between the application part (Application Servers) and the database cluster. HAproxy offers all the features we need to balance data traffic between applications and database nodes, also ensuring that the database node is always available.

The figure shows a possible architecture with HAProxy as a load balancer between applications and databases.

haproxy-galera

We assume that HAProxy is installed on a separate server, and all App Servers communicate with the database via the single HAProxy. Another possibility is to install it locally on the App Servers, in this case each application sends requests to the local HAProxy that balances on the database nodes. Another possibility is to install HAProxy on the database nodes, in this case an App Server could send requests to the HAProxy of a single node, that can balance towards the other database nodes.

HAProxy configuration

HAProxy is a package usually included in Linux distributions, so you can proceed with a simple yum install haproxy.

Considering our scenario, the purpose of HAProxy is to represent a single access point to the Galera cluster, balancing traffic to the different nodes and ensuring that client requests are correctly routed to available nodes. Therefore, during the configuration phase you will have to consider both the balancing method you want to use and the mechanisms for controlling the availability of the nodes.

HAProxy provides several traffic balancing modes, including round-robin mode and backup mode, with one or more nodes used exclusively as backup. In this post we will briefly look at this part, while we will go into detail about the control mechanisms of database nodes. We can consider three different logics for monitoring the availability of database nodes.

  • TCP health check used to enable a basic check, HAProxy will send a TCP packet to the backend server to check if it is listening on the specified port and responding correctly
  • MySQL health check used to enable a connection level check on the database server, HAProxy will attempt a connection to the database to verify that the server is active
  • HTTP health check allows you to enable a more advanced control over server availability, HAProxy sends an HTTP request to the server which can be used to carry out specific checks on server functionality

Suppose that the IP of our HAProxy is 192.168.1.5 and that our DBMS servers are 192.168.1.10, 192.168.1.11 and 192.168.1.12 and let’s see how to configure the different health check methods.

TCP health check

In this case HAProxy checks whether a TCP connection can be established with the servers.

	#---------------------------------------------------------------------
	# galera cluster
	#---------------------------------------------------------------------
	listen galeracluster
     bind 192.168.1.5:3306
     log global
     balance source
     mode tcp
     option tcpka
     server node1 192.168.1.10:3306 inter 2s downinter 2s rise 3 fall 2
     server node2 192.168.1.11:3306 inter 2s downinter 2s rise 3 fall 2 backup
     server node3 192.168.1.12:3306 inter 2s downinter 2s rise 3 fall 2 backup

With this configuration HAProxy tries to establish a TCP connection on the servers listening port. The option tcpka option used to enable the TCP keep-alive functionality allows you to keep the TCP connection open with the backend server even after completing its availability check.

Here the description of the other parameters used with the server option:

  • inter is the interval between 2 consecutive health checks
  • downinter is the minimum amount of time that must elapse after a backend server is marked DOWN before it begins to be tested again
  • rise is the number of consecutive successful health checks required for a server previously in the DOWN state to be considered UP again
  • fall is the number of consecutive failed health checks for a server to be considered in DOWN state
  • backup marks a server as a backup server, i.e. it will only be used if the main server (or servers) are in DOWN state

MySQL health check

In this case HAProxy tries to establish a MySQL connection with the DBMS active on the servers.

	#---------------------------------------------------------------------
	# galera cluster
	#---------------------------------------------------------------------
	listen galeracluster
     bind 192.168.1.5:3306
     log global
     balance source
     mode tcp
     option mysql-check
     server node1 192.168.1.10:3306 inter 2s downinter 2s rise 3 fall 2
     server node2 192.168.1.11:3306 inter 2s downinter 2s rise 3 fall 2 backup
     server node3 192.168.1.12:3306 inter 2s downinter 2s rise 3 fall 2 backup

With this configuration HAProxy tries to establish a connection with the MySQL server. Using the option mysql-check option, HAProxy sends a connection request to the MySQL server and checks the response packet whether it is a “Mysql Handshake Initialization” packet or whether it is an “Error” packet. You can also use the option by specifying a user, option mysql-check user , in this case HAProxy will try to connect to the MySQL server using the specified user, sending a “Client Authentication” packet. The user must be configured on the DBMS to accept connections without a password.

HTTP health check

Using the 2 check methods just described, HAProxy is able to check that the MySQL server is active and that it responds to connection requests, but this does not guarantee that the DBMS is fully operational. In a Galera cluster, for example, in split brain conditions, the server accepts connections but the database is not operational and will therefore respond with an error to any query request following login.
To check the real availability of the DBMS we can use a more intelligent check mode with HAProxy combined with Linux application socket. To do this we need:

  1. Configure HAProxy to use the httpchk option which performs the health check by sending an HTTP request and checking the response
  2. Enable an application socket on the server listening on a specific TCP port
  3. Prepare a script on the server that is executed upon receiving a connection on the socket which carries out the appropriate checks on the DBMS

The configuration of HAProxy for the HTTP check is as follows.

	#---------------------------------------------------------------------
	# galera cluster
	#---------------------------------------------------------------------
	listen galeracluster
     bind 192.168.1.5:3306
     log global
     balance source
     mode tcp
     option httpchk OPTIONS * HTTP/1.1
     server node1 192.168.1.10:3334 inter 2s downinter 2s rise 3 fall 2
     server node2 192.168.1.11:3334 inter 2s downinter 2s rise 3 fall 2 backup
     server node3 192.168.1.12:3334 inter 2s downinter 2s rise 3 fall 2 backup

In this way HAProxy sends an HTTP OPTIONS on port 3334 of the servers and will consider the server UP if the response is a 2xx or a 3xx, otherwise it will consider the server in the DOWN state.

We have to activate these 2 services on the server.

	cd /etc/systemd/system
	cat mysqlcheck.socket
	
	[Unit]
	Description=mysqlcheck Socket

	[Socket]
	ListenStream=192.168.1.10:3334
	Accept=yes

	[Install]
	WantedBy=sockets.target


	cat mysqlcheck@.service
	
	[Unit]
	Description=mysqlcheck Per-Connection Server

	[Service]
	ExecStart=-/opt/mysql/mysqlcheck.sh     # the binary to start
	User=root
	Group=root
	StandardInput=socket

In this way, a socket on TCP port 3334 is active on the server which, once received a connection, executes the /opt/mysql/mysqlcheck.sh service. The script could be the following.

	#!/bin/bash

	# MySQL Health Monitoring Script

	MYSQL_HOST="localhost"
	MYSQL_PORT="3306"
	MYSQL_USER="hauser"
	MYSQL_PASS="hapassword"

	SUCCESS_LOG="/opt/mysql/mysql-scs.log"
	ERR_LOG="/opt/mysql/mysql-err.log"

	# MySQL query

	/usr/bin/mysql --host=$MYSQL_HOST --port=$MYSQL_PORT --user=$MYSQL_USER --password=$MYSQL_PASS -e "select * from haproxy.status;" > $SUCCESS_LOG 2> $ERR_LOG


	# Checking parameters -- if MySQL answers the query, it writes success to SUCCESS_LOG else file will be empty then it'll jump to error(else) part of the script

	if [ "$(/bin/cat $SUCCESS_LOG)" != "" ]
	then
		# mysql is fine, return http 200
		/bin/echo -e "HTTP/1.1 200 OK\r\n"
	else
		# Problem with database, return http 503
		/bin/echo -e "HTTP/1.1 503 Service Unavailable\r\n"
	fi

So trying to summarize, HAProxy sends an HTTP OPTIONS on port 3334 of the server; a socket on port 3334 is active on the server which executes a script when the connection is received; the script connects to the database with an appropriately configured user and executes a query; if the query is executed successfully the server responds with HTTP 200 OK, if the query returns an error the server responds with an HTTP 500 error; HAProxy receives the response and interprets the response code, setting the server status as UP or DOWN.