In this post, I am going to show you how I built a highly available MySQL setup with load balancing, using HAProxy, a Galera cluster, garbd and a virtual IP managed by keepalived.

The process is similar to my previous post here, with some added steps to configure HAProxy and garbd. The architecture is described in the image below:

 

Variables used:

OS: CentOS 6.0 64bit
MySQL server1: 192.168.0.171
MySQL server2: 192.168.0.172
HAProxy server1: 192.168.0.151
HAProxy server2: 192.168.0.152
Virtual IP to be shared among HAProxy: 192.168.0.170
MySQL root password: q1w2e3!@#
Cluster root username: clusteroot
Cluster root password: q1w2e3!@#
Galera SST user: sst
Galera SST password: sstpass123

Server hostnames are important in a cluster. The following entries have been set up in every server's /etc/hosts file. All configurations shown below assume that the firewall is turned OFF and SELinux is DISABLED:

192.168.0.151 haproxy1.cluster.local haproxy1
192.168.0.152 haproxy2.cluster.local haproxy2
192.168.0.171 galera1.cluster.local galera1
192.168.0.172 galera2.cluster.local galera2
127.0.0.1     localhost.localdomain localhost
::1           localhost6 localhost6.localdomain
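
To confirm that name resolution works on each server, a quick check (getent consults /etc/hosts as well as DNS) should return something like:

$ getent hosts galera1
192.168.0.171   galera1.cluster.local galera1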

MySQL Cluster with Galera

1. The following steps are similar to my previous post, but I am rewriting them here for this setup with the latest versions of Galera and MySQL. No MySQL server is installed on these servers at the moment. Download the latest Galera library, MySQL server with the wsrep patch, MySQL client and MySQL shared packages from the MySQL download page:

$ mkdir -p /usr/local/src/galera
$ cd /usr/local/src/galera
$ wget https://launchpad.net/galera/2.x/23.2.0/+download/galera-23.2.0-1.rhel5.x86_64.rpm
$ wget https://launchpad.net/codership-mysql/5.5/5.5.20-23.4/+download/MySQL-server-5.5.20_wsrep_23.4-1.rhel5.x86_64.rpm
$ wget http://dev.mysql.com/get/Downloads/MySQL-5.5/MySQL-client-5.5.20-1.el6.x86_64.rpm/from/http://ftp.jaist.ac.jp/pub/mysql/
$ wget http://dev.mysql.com/get/Downloads/MySQL-5.5/MySQL-shared-5.5.20-1.el6.x86_64.rpm/from/http://ftp.jaist.ac.jp/pub/mysql/

2. Remove the conflicting MySQL library and install all the packages in the following sequence:

$ rpm -e --nodeps mysql-libs
$ rpm -Uhv galera-23.2.0-1.rhel5.x86_64.rpm
$ rpm -Uhv MySQL-client-5.5.20-1.el6.x86_64.rpm
$ rpm -Uhv MySQL-shared-5.5.20-1.el6.x86_64.rpm
$ rpm -Uhv MySQL-server-5.5.20_wsrep_23.4-1.rhel5.x86_64.rpm

3. Start the MySQL service and make sure it starts on boot:

$ chkconfig mysql on
$ service mysql start

4. Set up the MySQL root password:

$ /usr/bin/mysqladmin -u root password 'q1w2e3!@#'

5. Set up the MySQL client defaults for root. Create a new text file /root/.my.cnf using a text editor and add the following lines:

[client]
user=root
password='q1w2e3!@#'

6. Change the permissions so the file is not readable by others:

$ chmod 600 /root/.my.cnf
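
As a quick sanity check, with /root/.my.cnf in place root should be able to run queries without being prompted for the password, for example:

$ mysql -e "SELECT VERSION();"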

7. Log in to the MySQL server by running the “mysql” command and execute the following statements. We also need to create another root-level user called clusteroot (with a password) and a haproxy user without a password (for HAProxy monitoring), as stated in the variables above:

mysql> DELETE FROM mysql.user WHERE user='';
mysql> GRANT USAGE ON *.* TO root@'%' IDENTIFIED BY 'q1w2e3!@#';
mysql> UPDATE mysql.user SET Password=PASSWORD('q1w2e3!@#') WHERE User='root';
mysql> GRANT USAGE ON *.* to sst@'%' IDENTIFIED BY 'sstpass123';
mysql> GRANT ALL PRIVILEGES on *.* to sst@'%';
mysql> GRANT USAGE on *.* to clusteroot@'%' IDENTIFIED BY 'q1w2e3!@#';
mysql> GRANT ALL PRIVILEGES on *.* to clusteroot@'%';
mysql> INSERT INTO mysql.user (host,user) values ('%','haproxy');
mysql> FLUSH PRIVILEGES;
mysql> quit
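
To verify the haproxy monitoring user, you can try connecting with it from another host; it has no password and no privileges beyond connecting, which is all the HAProxy check needs, and a statement like SELECT 1 requires nothing more than the ability to connect:

$ mysql -h 192.168.0.171 -u haproxy -e "SELECT 1;"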

8. Create the configuration directory, copy the example wsrep configuration into it and create an /etc/my.cnf that includes that directory:

$ mkdir -p /etc/mysql/conf.d/
$ cp /usr/share/mysql/wsrep.cnf /etc/mysql/conf.d/
$ touch /etc/my.cnf
$ echo '!includedir /etc/mysql/conf.d/' >> /etc/my.cnf

9. Configure MySQL wsrep to use the Galera library. Open /etc/mysql/conf.d/wsrep.cnf in a text editor and edit the following lines:

For galera1.cluster.local:

wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://"
wsrep_sst_auth=sst:sstpass123

For galera2.cluster.local:

wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.0.171:4567"
wsrep_sst_auth=sst:sstpass123
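
Besides the lines above, Galera expects row-based replication and InnoDB tables. The stock wsrep.cnf example already ships with sensible values, but it is worth confirming settings along these lines (a reference sketch; the SST method shown is an assumption, pick whichever you actually use, and the cluster name must match the -g option passed to garbd later):

binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_cluster_name="my_wsrep_cluster"
wsrep_sst_method=mysqldump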

10. Restart the MySQL service on both servers (restart galera1 first, since the empty gcomm:// address bootstraps the cluster that galera2 then joins):

$ service mysql restart

11. Check whether Galera replication is running fine:

$ mysql -e "show status like 'wsrep%'"

If the cluster is working, you should see the following value on both servers:

wsrep_ready = ON
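
A couple of other status variables are worth checking at this point; with both database nodes joined you should see wsrep_cluster_size = 2, wsrep_connected = ON and wsrep_local_state_comment = Synced:

$ mysql -e "show status like 'wsrep_cluster_size'"
$ mysql -e "show status like 'wsrep_local_state_comment'"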

HAProxy servers

1. On both haproxy1 and haproxy2, we will start by setting up the virtual IP to make sure the HAProxy IP is always available. Let's download and install keepalived from here. The OpenSSL headers and the popt library are required, so we install them first using yum:

$ yum install -y openssl openssl-devel popt*
$ cd /usr/local/src
$ wget http://www.keepalived.org/software/keepalived-1.2.2.tar.gz
$ tar -xzf keepalived-1.2.2.tar.gz
$ cd keepalived-*
$ ./configure
$ make
$ make install

2. Since the virtual IP is shared between these 2 servers, we need to tell the kernel to allow binding to a non-local IP so HAProxy can bind to it later. Add the following line to /etc/sysctl.conf:

net.ipv4.ip_nonlocal_bind = 1

Run the following command to apply the change:

$ sysctl -p
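
You can confirm the setting took effect with:

$ sysctl net.ipv4.ip_nonlocal_bind
net.ipv4.ip_nonlocal_bind = 1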

3. By default, the keepalived configuration file lives at /usr/local/etc/keepalived/keepalived.conf. We will make things easier by symlinking it into the /etc directory. We also need to clear the example configuration inside it:

$ ln -s /usr/local/etc/keepalived/keepalived.conf /etc/keepalived.conf
$ cat /dev/null > /etc/keepalived.conf

4. The keepalived configuration in this step differs between the two servers.

For haproxy1, add the following lines to /etc/keepalived.conf:

vrrp_script chk_haproxy {
        script "killall -0 haproxy"     # check that the haproxy process exists
        interval 2                      # check every 2 seconds
        weight 2                        # add 2 points of priority if OK
}

vrrp_instance VI_1 {
        interface eth0                  # interface to monitor
        state MASTER
        virtual_router_id 51            # assign one ID for this VRRP instance
        priority 101                    # 101 on master, 100 on backup
        virtual_ipaddress {
            192.168.0.170               # the virtual IP
        }
        track_script {
            chk_haproxy
        }
}

For haproxy2, add the following lines to /etc/keepalived.conf:

vrrp_script chk_haproxy {
        script "killall -0 haproxy"     # check that the haproxy process exists
        interval 2                      # check every 2 seconds
        weight 2                        # add 2 points of priority if OK
}

vrrp_instance VI_1 {
        interface eth0                  # interface to monitor
        state MASTER
        virtual_router_id 51            # assign one ID for this VRRP instance
        priority 100                    # 101 on master, 100 on backup
        virtual_ipaddress {
            192.168.0.170               # the virtual IP
        }
        track_script {
            chk_haproxy
        }
}

5. Download and install HAProxy. Get the source from http://haproxy.1wt.eu/#down. We also need to install some required libraries using yum:

$ yum install pcre* -y
$ cd /usr/local/src
$ wget http://haproxy.1wt.eu/download/1.4/src/haproxy-1.4.19.tar.gz
$ tar -xzf haproxy-1.4.19.tar.gz
$ cd haproxy-*
$ make TARGET=linux26 ARCH=x86_64 USE_PCRE=1
$ make install
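
To confirm the build options and the install location (make install places the binary under /usr/local/sbin, which is the path used in /etc/rc.local later), you can run:

$ /usr/local/sbin/haproxy -vv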

6. Create the HAProxy configuration file and paste in the following configuration. It tells HAProxy to act as a reverse proxy on port 3306 and forward requests to the backend servers. The MySQL health check is done with the haproxy user:

$ mkdir -p /etc/haproxy
$ touch /etc/haproxy/haproxy.cfg

Add the following lines to /etc/haproxy/haproxy.cfg:

defaults
        log global
        mode http
        retries 3
        option redispatch
        maxconn 4096
        contimeout 50000
        clitimeout 50000
        srvtimeout 50000
 
listen mysql_proxy 0.0.0.0:3306
        mode tcp
        balance roundrobin
        option tcpka
        option mysql-check user haproxy
        server galera1 192.168.0.171:3306 weight 1
        server galera2 192.168.0.172:3306 weight 1
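
Note that the defaults section references log global, but no global section is shown above. A minimal sketch of one (the syslog target and connection limit here are assumptions; adjust or drop them as you prefer) can go at the top of the file:

global
        log 127.0.0.1 local0
        maxconn 4096

You can then syntax-check the file before starting HAProxy:

$ haproxy -c -f /etc/haproxy/haproxy.cfg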

7. Since we only have 2 database servers, we have only 2 members in the cluster. Even though this works, it is not a good idea for failover because it can cause “split brain”. Split brain refers to a state in which each database server does not know the high availability (HA) role of its redundant peer and cannot determine which server currently holds the primary HA role. So we will use both HAProxy servers as the 3rd and 4th members; these are called arbitrators. Galera provides a binary called garbd for this purpose. Download and install the Galera library on both HAProxy servers:

$ cd /usr/local/src
$ wget https://launchpad.net/galera/2.x/23.2.0/+download/galera-23.2.0-1.rhel5.x86_64.rpm
$ rpm -Uhv galera-23.2.0-1.rhel5.x86_64.rpm

8. Run the following command to start garbd as a daemon and join the my_wsrep_cluster group:

$ garbd -a gcomm://192.168.0.171:4567 -g my_wsrep_cluster -d
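
Once garbd is running on both HAProxy servers, you can confirm from any Galera node that the arbitrators have joined; with 2 database nodes and 2 arbitrators the cluster size should report 4:

$ mysql -e "show status like 'wsrep_cluster_size'"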

9. Now let's start keepalived and HAProxy and test whether IP failover, database failover and load balancing are working. Run the following commands on both servers:

$ keepalived -f /etc/keepalived.conf
$ haproxy -D -f /etc/haproxy/haproxy.cfg

Ping 192.168.0.170 from another host. Then, on haproxy1, stop the network:

$ service network stop

You will notice that the IP goes down for about 2 seconds and then comes back up. This means haproxy2 has taken over 192.168.0.170 from haproxy1. If you bring the network on haproxy1 back up, you will see the same thing happen again because haproxy1 takes the IP back from haproxy2, as configured in /etc/keepalived.conf. You can also try killing the haproxy process and you will see the virtual IP being taken over by haproxy2 again.
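
To see which server currently holds the virtual IP, check the interface on each HAProxy server (keepalived adds the VIP as a secondary address, so use ip rather than ifconfig):

$ ip addr show eth0 | grep 192.168.0.170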

You can also try stopping the mysql process on galera2 and creating a new database on galera1. After a while, start the mysql process on galera2 again and you should see galera2 synchronize from galera1, the reference node, to catch up on the data.
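
To confirm load balancing through the virtual IP, query it repeatedly and note which backend answers; assuming wsrep_node_name defaults to each server's hostname, the roundrobin balancing should alternate between galera1 and galera2:

$ mysql -h 192.168.0.170 -u clusteroot -p -e "show variables like 'wsrep_node_name'"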

If everything works as expected, add the following lines to /etc/rc.local so the services start automatically after boot:

/usr/local/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg
/usr/local/sbin/keepalived -f /etc/keepalived.conf
/usr/bin/garbd -a gcomm://192.168.0.171:4567 -g my_wsrep_cluster -d

Now your MySQL is running in high-availability mode. MySQL clients just need to use 192.168.0.170 as the MySQL database server host.

*Note: All the steps above need to be done on all respective servers unless otherwise specified.

23 Responses to High Availability: MySQL Cluster with Galera + HAProxy

  1. Everton says:

    Thank you so much, it works perfectly
    I had already a 3 nodes Galera cluster installed but I did not have a load balance. HAProxy solved the problem!

  2. Everton says:

    You can delete my last comment, I just tried disabling SELinux and it works fine… :-)
    http://www.centos.org/docs/5/html/5.2/Deployment_Guide/sec-sel-enable-disable.html

  3. Edy says:

    What hapen if galera1 down..??

    Can I have another arbitrator on different data center?? and how to do that..

    thx…

    • SecaGuy says:

      Please refer my reply to you on other comment.

      You can have arbitrator in different data center but it is not recommended due to connectivity issue. Group of servers in a cluster should be working together under one internal network to get best performance and availability.

  4. Edy says:

    Hi SecaGuy.

    What will hapen if galera1 Down??

    • SecaGuy says:

      if galera1 down, mysql still accessible on galera2. if galera1 then up, then it will sync to the cluster team and data from galera2 will be sync to galera1. But make sure, galera must be run on at least 3 nodes and more which is in this case we have 4 galera nodes, 2 garbd and 2 real galera cluster to avoid split brain.

      • Edy says:

        i have try your tutor step by step.. as you said in this tutotial.

        you’re corect if galera2 goes down.. and then up again it will sync to galera1..

        i try shutdown galera1 and then start it again. but when it come up, galera1 not syncing with other.. maybe because this setting on galera1 ??

        wsrep_cluster_address=”gcomm://”

        right??

        • SecaGuy says:

          Since you know that galera2 has the latest data which can be refer to, you can use galera2 as reference node. So, login to galera2 and change the gcomm URL to gcomm:// and in galera1, change the URL to gcomm://192.168.0.172:4567. Now onwards, galera2 will be the reference point. Dont forget to change garbd gcomm address as well to the new reference node.

          • Sébastien says:

            Hi SecaGuy!

            Here is a quote from the codership wiki:
            “Only use empty gcomm:// address when you want create a NEW cluster. Never use it when you want to reconnect to the existing one.”

            I finally tried the galera but only the galera replication I didn’t setup the HAproxy. I just want to play a little with galera and after that I will install HAproxy.

            This is what I tried:
            -I shutted down the node 1, the one with the gcomm:// address.
            -I wrote some change in the database with my node 2.
            -I started the node 1, it didn’t replicate with the other nodes but that’s normal
            -I configured the node 1 as a joiner, so I changed the gcomm url with gcomm://ip_node2
            -restart mysql for the change and node 1 up -> replication works!

            At the end, I didn’t change the gcomm address in the node 2, still gcomm://ip_node1. And that works. We only have to use this “gcomm://” during the first setup. After that Galera doesn’t care about which node is the referent. The only thing we have to do is to select a node with the latest data, no matter which one.

            Are you agree with those statements?

            Cheers!

          • Sébastien says:

            Sorry for the flood.

            I did a mistake with my previous comment. I just forgot that my topology has 3 nodes, this is why I don’t need to switch with this gcomm:// address. Actually the third node tells to the cluster that there is something wrong with the cluster and avoid the split-brain.

            My mistake :(

        • SecaGuy says:

          One more thing, if galera1 network is down but the mysql is still up, you do not need to change the reference point. Whenever the network available, it will sync to the cluster. That is what I meant on previous comment. So in case of some network congestion happened between cluster team, you got nothing to do.

          If mysql on galera1 is down, once it comes up you can't use it as the reference point anymore. That's why you need to change the reference node.

  5. Sébastien says:

    Thank you for this useful post.

    Why don’t you use tungsten for the replication?
    Any opinion on tungsten vs galera?

    Cheers!

    • SecaGuy says:

      For me, the main reason I am using galera is because easy to expand. You just need to install a new server and point the gcomm URL to the reference node and the new node will know the role automatically. For Tungsten, you need to specify what role it will be on the new node, then it will work accordingly.

  6. Everton says:

    Some contribution:
    I finished my cluster project as shown on this diagram: http://i.imgur.com/m0BQb.png

    This cluster avoids split-brain too and uses only 3 machines.
    I don’t know if the performance is better or worse than post’s approach.

  7. Marcelo Dieder says:

    Congratulations on the documentation!

    It is also necessary to install rsync and open the default port 444 in the firewall for replication to new nodes.

    Marcelo Dieder

  8. joe says:

    bro, i tried to simulate a HA test with this galera mysql cluster, what i do is when i stop the mysql services on galera1 node it’s still be able to connect to the db but when i started back the services in galera1 it doesn’t sync with the galera2. when i tried to do this simulation on galera2 it works and sync with galera1 when i started the services on galera2. is there any part that i have done wrong on the configuration on the test simulation? thanks bro

    • SecaGuy says:

      This is because galera1 is the reference node. The reference node data cannot be outdated from the cluster group communication.

      Galera2 down, no problem because reference node is galera1 so once up, galera2 will get the latest schema update from galera1. If galera1 down, you need to change the galera1 gcomm URL to galera2. Now, galera2 is the reference point so galera1 can get the latest update from galera2.

  9. Allen says:

    Is it possible to load keepalived and HAproxy all on each of the 3 galera nodes instead of having two separate servers?
