MySQL: Advantages and Disadvantages of Galera

Galera cluster replication is now supported by MySQL InnoDB and Percona XtraDB. Although it is still new, this technology has a bright future and is becoming a popular way to achieve high availability and scalability on a database server.

Advantages:

  • No need to learn a new storage engine technology like NDB Cluster, which would take time. Galera behaves just like InnoDB, with cluster functionality added on top.
  • All nodes are equal on read and write at all times. It provides synchronous replication, and can guarantee that replicas are up-to-date and ready for reads or to be promoted to master.
  • It protects you against data loss when a node fails. In fact, because all nodes have all the data, you can lose every node but one and still not lose the data (even if the cluster has a split brain and stops working). This is different from NDB, where the data is partitioned across node groups and some data can be lost if all servers in a node group are lost.
  • Since it is using synchronous replication, replicas cannot fall behind. The write sets are propagated to and certified on every node in the cluster before the transaction commits.
  • Easy to scale, with a simpler procedure for adding and removing nodes, joining the cluster and monitoring cluster status. There is no need for a management node as in MySQL Cluster (see the configuration sketch after this list).
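As an illustration of this simplicity, a minimal Galera-related section of my.cnf on one node might look like the sketch below. The IP addresses, cluster name and provider library path are placeholders and may differ between distributions and Galera versions:

[mysqld]
binlog_format            = ROW
default_storage_engine   = InnoDB
innodb_autoinc_lock_mode = 2
wsrep_provider           = /usr/lib64/libgalera_smm.so     # path is distribution-dependent
wsrep_cluster_name       = "my_galera_cluster"
wsrep_cluster_address    = "gcomm://192.168.0.101,192.168.0.102,192.168.0.103"
wsrep_node_address       = "192.168.0.101"                 # this node's own address
wsrep_sst_method         = rsync

Joining a new node is then essentially a matter of pointing wsrep_cluster_address at the existing members and starting MySQL; the state snapshot transfer (rsync in this sketch) copies the data automatically.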

Disadvantages:

  • It’s new. There isn’t a huge body of experience with its strengths, weaknesses, and appropriate use cases.
  • The whole cluster performs writes as slowly as the weakest node. Thus, all nodes need similar hardware, and if one node slows down (e.g., because the RAID card does a battery-learn cycle), all of them slow down. If one node has probability P of being slow to accept writes, a three-node cluster has probability 3P of being slow.
  • It isn’t as space-efficient as NDB, because every node has all the data, not just a portion. On the other hand, it is based on Percona XtraDB (which is an enhanced version of InnoDB), so it doesn’t have NDB’s limitations regarding on-disk data.
  • It currently disallows some operational tricks that are possible with asynchronous replication, such as making schema changes offline on a replica and promoting it to be master so you can repeat the changes on the other nodes offline. The current alternative is to use a technique such as Percona Toolkit’s online schema change tool (see the example after this list). Rolling schema upgrades are nearly ready for release at the time of writing, however.
  • Adding a new node to a cluster requires copying data to it, plus the ability to keep up with ongoing writes, so a big cluster with lots of writes could be hard to grow. This will put a practical limit on the cluster’s data size. We aren’t sure how large this is, but a pessimistic estimate is that it could be as low as 100 GB or so. It could be much larger; time and experience will tell.
  • The replication protocol seems to be somewhat sensitive to network hiccups at the time of writing, and that can cause nodes to stop themselves and drop out of the cluster, so we recommend a high-performance network with good redundancy. If you don’t have a reliable network, you might end up adding nodes back to the cluster too often. This requires a resynchronization of the data. At the time of writing, incremental state transfer to avoid a full copy of the dataset is almost ready to use, so this should not be as much of a problem in the future. It’s also possible to configure Galera to be more tolerant of network timeouts (at the cost of delayed failure detection), and more reliable algorithms are planned for future releases.
  • If you aren’t watching carefully, your cluster could grow too big to restart nodes that fail, just as backups can get too big to restore in a reasonable amount of time if you don’t practice it routinely. We need more practical experience to know how this will work in reality.
  • Because of the cross-node communication required at transaction commit, writes will get slower, and deadlocks and  rollbacks will get more frequent, as you add nodes to the cluster.
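As an illustration of the online schema change alternative mentioned above, Percona Toolkit can rewrite a table in place while it remains writable; the database, table and column names below are placeholders:

$ pt-online-schema-change --alter "ADD COLUMN created_at DATETIME" D=mydb,t=mytable --execute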

Reference: High Performance MySQL 3rd Edition

CentOS: Using XFS File System for MySQL

MySQL is often preferred to run on the XFS file system because of its direct I/O performance. Even though many benchmarks comparing the current default Linux file system, Ext4, against XFS have already been published, XFS remains a convenient choice for the MySQL data directory.
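For context, the direct I/O path is normally exercised by telling InnoDB to bypass the OS page cache; a minimal my.cnf fragment (assuming InnoDB tables) would be:

[mysqld]
# Let InnoDB do its own caching and use direct I/O against the data files,
# which is where XFS performs well
innodb_flush_method = O_DIRECT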

XFS does not come by default in CentOS, so we need to install the required utilities to manage it. I will use a separate virtual hard disk specifically for this purpose and map it to the MySQL data directory.

OS: CentOS 6.2 64bit
Device: /dev/sdb
Old MySQL data directory: /var/lib/mysql
New MySQL data directory:  /mysql

1. Since this is a new hard disk without any partition table, we need to create the partition table first:

$ fdisk /dev/sdb

Key sequence in fdisk: n > p > 1 > Enter > Enter > w (create a new primary partition number 1 spanning the whole disk, then write the partition table).

2. You should see that the partition has been created as /dev/sdb1, as below:

$ fdisk -l /dev/sdb
 
Disk /dev/sdb: 11.8 GB, 11811160064 bytes
255 heads, 63 sectors/track, 1435 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xaa7ca5e3
 
Device Boot  Start      End        Blocks         Id     System
/dev/sdb1        1     1435      11526606         83     Linux

3. Install XFS utilities via yum:

$ yum install -y xfs*

4. Format the drive with the XFS file system:

$ mkfs.xfs /dev/sdb1

5. Check the file system, create mount point and mount the file system:

$ xfs_check /dev/sdb1
$ mkdir /mysql
$ mount /dev/sdb1 /mysql

6. To mount it automatically after boot, add an entry to /etc/fstab. Open the file in a text editor and add the following line:

/dev/sdb1    /mysql     xfs    defaults     0 0

7. Stop MySQL, copy over the data using rsync (to retain permissions, ownership and timestamps) and assign the correct ownership to the directory:

$ service mysql stop
$ rsync -avz /var/lib/mysql/ /mysql
$ chown -R mysql.mysql /mysql

8. Point the MySQL configuration at the new directory. Open /etc/my.cnf and change or add the following line under the [mysqld] section:

datadir = /mysql

9. Start MySQL service:

$ service mysql start

Done. You can verify this in MySQL by executing the following command:

mysql> SHOW VARIABLES LIKE 'datadir';
 
+---------------+---------+
| Variable_name | Value   |
+---------------+---------+
| datadir       | /mysql/ |
+---------------+---------+
1 row in set (0.00 sec)

To check the mount status:

$ mount | grep xfs
/dev/sdb1 on /mysql type xfs (rw)

To repair the XFS file system, we need to unmount the file system first:

$ umount /mysql
$ xfs_repair /dev/sdb1

High Availability: Web Server Cluster using Apache + Pound + Unison

To achieve high availability, what we really need is to eliminate as many single points of failure as possible, but doing so is usually expensive. What if we have only 2 servers and want the highest possible web service availability at the lowest cost?

The most important part of this high availability setup is that the data must stay in sync between the servers. We need several tools to help us achieve this:

  • Apache – Web server
  • Pound – HTTP load balancer/failover
  • Keepalived – IP failover
  • Unison – Two-way file synchronization
  • Fsniper – Monitors files and triggers the file synchronization

The architecture we will set up is as follows: Keepalived floats the public IP (210.48.111.20) between the two servers, Pound listens on that IP on port 80 and load balances to Apache running on port 88 on both servers, while Unison and Fsniper keep the web directories synchronized.

In this setup, SELinux and iptables have been turned off, and root privileges are required. I am using the following variables:

OS: CentOS 6.2 64bit
Web server #1 IP: 210.48.111.21
Web server #2 IP: 210.48.111.22
Domain: icreateweb.net
Web site public IP: 210.48.111.20
Web directory: /home/icreate/public_html

The steps are the same on both servers unless specified otherwise. To make things easier, we will enable the RPMforge repository, because almost all the applications we need are available there:

$ cd /usr/local/src
$ rpm --import http://apt.sw.be/RPM-GPG-KEY.dag.txt
$ wget http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.2-2.el6.rf.x86_64.rpm
$ rpm -Uhv rpmforge-release-0.5.2-2.el6.rf.x86_64.rpm

The following lines have been added to /etc/hosts on both servers to ease SSH communication:

210.48.111.21 webserver1
210.48.111.22 webserver2

We also need to allow passwordless SSH between the nodes for automatic file synchronization. Execute the following command:

$ ssh-keygen -t dsa

Press ‘Enter’ until it finishes. Then copy the public key to the other server (webserver2):

$ ssh-copy-id -i ~/.ssh/id_dsa root@webserver2

Do this on the other server as well, but copy the public key to webserver1 in the ssh-copy-id command above. This step is critical and should not be skipped.
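To confirm that passwordless SSH works in both directions before continuing, a quick test using the host names from /etc/hosts above:

$ ssh root@webserver2 hostname    # run on web server #1
$ ssh root@webserver1 hostname    # run on web server #2

Each command should print the remote server's hostname without asking for a password.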

Web servers

1. Install all required applications using yum. We will skip RPMforge for the moment because we only need the standard Apache and PHP packages:

$ yum install httpd* php* -y --disablerepo=rpmforge

2. Create the web and logs directories for user icreate:

$ useradd -m icreate
$ chmod 755 /home/icreate
$ mkdir /home/icreate/public_html
$ mkdir /home/icreate/logs
$ touch /home/icreate/logs/access_log
$ touch /home/icreate/logs/error_log

3. We will use Pound as the reverse proxy and load balancer on port 80, so Apache needs to run on a different port; we will use port 88. Open /etc/httpd/conf/httpd.conf in a text editor and make sure the following line is set as below:

Listen 88

4. Create vhosts.conf under the /etc/httpd/conf.d directory and paste the following lines:

Web server #1:

NameVirtualHost 210.48.111.21:88
 
# Default host
<VirtualHost 210.48.111.21:88>
    ServerName localhost
    ServerAdmin admin@localhost
    DocumentRoot /var/www/html
</VirtualHost>
 
# Virtual host for domain icreateweb.net
<VirtualHost 210.48.111.21:88>
    ServerName icreateweb.net
    ServerAlias www.icreateweb.net
    ServerAdmin webmaster@icreateweb.net
    DocumentRoot /home/icreate/public_html
    ErrorLog /home/icreate/logs/error_log
    CustomLog /home/icreate/logs/access_log combined
</VirtualHost>

Web server #2:

NameVirtualHost 210.48.111.22:88
 
# Default host
<VirtualHost 210.48.111.22:88>
    ServerName localhost
    ServerAdmin admin@localhost
    DocumentRoot /var/www/html
</VirtualHost>
 
# Virtual host for domain icreateweb.net
<VirtualHost 210.48.111.22:88>
    ServerName icreateweb.net
    ServerAlias www.icreateweb.net
    ServerAdmin webmaster@icreateweb.net
    DocumentRoot /home/icreate/public_html
    ErrorLog /home/icreate/logs/error_log
    CustomLog /home/icreate/logs/access_log combined
</VirtualHost>

5. Restart and enable Apache service:

$ chkconfig httpd on
$ service httpd restart

6. Let’s create an HTML test file to differentiate between the two web servers:

Web server #1:

$ echo "web server 1" > /home/icreate/public_html/server.html

Web server #2:

$ echo "web server 2" > /home/icreate/public_html/server.html

The website should now be reachable on the local IP of both servers. Next, we need to install and configure the other applications that help us achieve high availability.

Unison & Fsniper

1. To keep the files on both servers correctly in sync, we will use Unison to perform the file synchronization. Install Unison via yum:

$ yum install unison -y

2. Type the following command to initialize the Unison profile directory:

$ unison

3. Let’s create the Unison configuration file. Using a text editor, open /root/.unison/default.prf and add the following lines. We also ignore server.html, which we will use to tell whether Pound sent the HTTP connection to web server #1 or web server #2:

Web server #1:

root=/home/icreate/public_html
root=ssh://webserver2//home/icreate/public_html
batch=true
ignore=Name{server.html}

Web server #2:

root=/home/icreate/public_html
root=ssh://webserver1//home/icreate/public_html
batch=true
ignore=Name{server.html}

4. Now let’s run the first synchronization, which is important. Run the following command on either server; in this case, I will run it on web server #1:

$ unison default

5. Once completed, your files should be in sync on both servers. Next, download and install Fsniper:

$ cd /usr/local/src
$ yum install pcre* file-libs file-devel -y
$ wget http://projects.l3ib.org/fsniper/files/fsniper-1.3.1.tar.gz
$ tar -xzf fsniper-1.3.1.tar.gz
$ cd fsniper-*
$ ./configure
$ make
$ make install

6. Create the Fsniper configuration file to watch the directory and trigger the synchronization script. Open /root/.config/fsniper/config and add the following lines:

watch {
      /home/icreate/public_html {
      recurse = true
      * {
        handler = echo %%; /root/scripts/file_sync
        }
      }
}

7. Let’s create the file_sync script that triggers Unison and checks for a running process. Unison replication is two-way, so only one process needs to be running across both servers at any one time:

$ mkdir -p /root/scripts
$ vim /root/scripts/file_sync

Web server #1:

#!/bin/bash
# Trigger Unison to do 2 way synchronization
 
# Check if Unison is running on both servers
if [ "$(pidof unison)" ] || [ "$(ssh [email protected] pidof unison)" ]
then
    echo "Unison is running. Exiting"
    exit 0
else
    /usr/bin/unison default
fi

Web server #2:

#!/bin/bash
# Trigger Unison to do 2 way synchronization
 
# Check if Unison is running on both servers
if [ "$(pidof unison)" ] || [ "$(ssh [email protected] pidof unison)" ]
then
    echo "Unison is running. Exiting"
    exit 0
else
    /usr/bin/unison default
fi
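
Since Fsniper will invoke this script as its handler, make it executable on both servers:

$ chmod 755 /root/scripts/file_sync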

8. Now let’s start the Fsniper process and allow it to start on boot:

$ /usr/local/bin/fsniper --daemon
$ echo "/usr/local/bin/fsniper --daemon" >> /etc/rc.local

Keepalived

1. Download and install Keepalived. This application will allow web server #1 and web server #2 to share the public IP (210.48.111.20) between them:

$ yum install -y openssl openssl-devel popt*
$ cd /usr/local/src
$ wget http://www.keepalived.org/software/keepalived-1.2.2.tar.gz
$ tar -xzf keepalived-1.2.2.tar.gz
$ cd keepalived-*
$ ./configure
$ make
$ make install

2. Since we have a virtual IP shared between these 2 servers, we need to tell the kernel that we will bind Pound to a non-local IP later. Add the following line into /etc/sysctl.conf:

net.ipv4.ip_nonlocal_bind = 1

Run the following command to apply the changes:

$ sysctl -p

3. By default, the Keepalived configuration file is located at /usr/local/etc/keepalived/keepalived.conf. We will make things easier by symlinking it into the /etc directory. We also need to clear the example configuration inside it:

$ ln -s /usr/local/etc/keepalived/keepalived.conf /etc/keepalived.conf
$ cat /dev/null > /etc/keepalived.conf

4. Let’s configure Keepalived:

For web server #1, add the following lines into /etc/keepalived.conf:

vrrp_script chk_pound {
        script "killall -0 pound"       # verify the pid is exist or not
        interval 2                      # check every 2 seconds
        weight 2                        # add 2 points of prio if OK
}
 
vrrp_instance VI_1 {
        interface eth0			# interface to monitor
        state MASTER
        virtual_router_id 51		# Assign one ID for this route
        priority 101                    # 101 on master, 100 on backup
        virtual_ipaddress {
            210.48.111.20		# the virtual IP
        }
        track_script {
            chk_pound
        }
}

For web server #2, add the following lines into /etc/keepalived.conf:

vrrp_script chk_pound {
        script "killall -0 pound"       # verify the pid is exist or not
        interval 2                      # check every 2 seconds
        weight 2                        # add 2 points of prio if OK
}
 
vrrp_instance VI_1 {
        interface eth0			# interface to monitor
        state MASTER
        virtual_router_id 51		# Assign one ID for this route
        priority 100                    # 101 on master, 100 on backup
        virtual_ipaddress {
            210.48.111.20		# the virtual IP
        }
        track_script {
            chk_pound
        }
}

5. Start Keepalived and make it auto start after boot:

$ keepalived -f /etc/keepalived.conf
$ echo "/usr/local/sbin/keepalived -f /etc/keepalived.conf" >> /etc/rc.local

Pound

1. Install Pound:

$ rpm -Uhv http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/pound-2.4.3-1.el5.rf.x86_64.rpm

2. Configure Pound by editing the configuration file located at /etc/pound.cfg and paste the following lines:

User            "nobody"
Group           "nobody"
 
LogLevel        1
Alive           2
 
ListenHTTP
        Address 0.0.0.0
        Port    80
End
 
Service
        HeadRequire "Host: .*icreateweb.net.*"
 
        BackEnd
                Address 210.48.111.21
                Port    88
		TimeOut 300
        End
 
        BackEnd
                Address 210.48.111.22
                Port 88
		TimeOut 300
        End
 
        Session
                Type Cookie
                ID   "JSESSIONID"
                TTL  300
        End
End
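
Before starting the service, you can optionally have Pound parse the configuration and exit, which catches syntax errors early (the -c check-only flag, where your Pound build supports it):

$ pound -f /etc/pound.cfg -c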

3. Allow the service to auto start after boot and start Pound:

$ chkconfig pound on
$ service pound start

4. Check that Pound is running correctly and listening on port 80:

$ netstat -tulpn | grep pound
tcp        0      0 0.0.0.0:80                  0.0.0.0:*                   LISTEN      6175/pound


Done! Now you can check the website’s availability by bringing down the HTTP service or even the server itself; a simple test is sketched below. Using only 2 servers, it is possible to push the web service’s uptime close to the highest possible level. Cheers!
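For example, assuming icreateweb.net resolves to the virtual IP (or by forcing the Host header with curl):

$ curl -H "Host: icreateweb.net" http://210.48.111.20/server.html
$ service httpd stop     # run on the web server that answered the request
$ curl -H "Host: icreateweb.net" http://210.48.111.20/server.html

The second request should be answered by the surviving web server.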

FreeBSD 9: Shared Object “libutil.so.8” not Found

Problem

After upgrading to FreeBSD 9, whenever I try to use ports to install something, I get the following error:

$ cd /usr/ports
$ make search name=nano
The search target requires INDEX-9. Please run make index or make fetchindex.

Then, whenever I run the make index command, I get the following error:

$ cd /usr/ports
$ make index
Generating INDEX-9 - please wait.. Shared object "libutil.so.8" not found, required by "perl"
"Makefile", line 29: warning: "/usr/local/bin/perl -V::usethreads" returned non-zero status

What happened?

During the FreeBSD upgrade from version 8.2 to the new 9.0 release, it seems that FreeBSD deleted the old library after the second run of the freebsd-update install command. This usually happens when you are doing a major release upgrade.

Solution

We need to create symlink to the new libutil.so for FreeBSD 9 under /lib directory:

$ cd /lib
$ ln -s libutil.so.9 libutil.so.8
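
You can confirm the compatibility link is in place before retrying (the libutil.so.9 name assumes a stock FreeBSD 9.0 base system):

$ ls -l /lib/libutil.so*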

Then we need to run the “make index” command again. make index builds the index (which is then used to look up the ports collection) by scanning your current ports tree:

$ cd /usr/ports
$ make index
Generating INDEX-9 - please wait.. Done.

Now you should be able to use ports as usual. Cheers!

MySQL: Generate Random Data using Stored Procedure

If you have set up your MySQL cluster or standalone database server correctly, you might want to stress test the server. There are many ways to achieve this; in my case, I will use the simplest one, which is to generate fake data and insert it into a table using a stored procedure. You could also use any other script, such as Bash, Perl or PHP, as long as it supports looping.

We need to loop a certain SQL statement in order to generate more and more data. Before we start, make sure you have prepared the database and table to insert into. The following are the variables I used:

MySQL: Version 5.5.20 Community Server
Database: dbTest
Table: tbl_data1

First, we need to create the database and table. Log into the MySQL console or client as an appropriate database user (in this case I will just use root) and execute the following statements:

CREATE DATABASE dbTest;
USE dbTest;
CREATE TABLE `tbl_data1` (
`a` INT(10) NULL,
`b` INT(10) NULL,
`c` INT(10) NULL
)
ENGINE=InnoDB;

Our aim is to insert 100,000 rows of random numbers between 0 and 999 into the table. The following stored procedure does the looping:

DELIMITER //
CREATE PROCEDURE GenerateFakeData()
BEGIN
DECLARE i INT DEFAULT 1;
WHILE i <= 100000 DO
INSERT INTO tbl_data1 (a, b, c) VALUES (FLOOR(RAND() * 1000), FLOOR(RAND() * 1000), FLOOR(RAND() * 1000));
SET i = i + 1;
END WHILE;
END //
DELIMITER ;

To call the procedure above, execute the following commands:

USE dbTest;
CALL GenerateFakeData();

This will execute the stored procedure, and the table will then contain 100,000 rows of data.

SELECT * FROM dbTest.tbl_data1 LIMIT 1,10;

Example results:

+------+------+------+
| a    | b    | c    |
+------+------+------+
|  521 |  662 |  748 |
|  753 |  523 |  356 |
|  212 |  994 |  334 |
|  689 |  441 |  139 |
|  374 |  453 |  145 |
|  364 |  386 |  837 |
|   30 |  637 |   96 |
|  571 |  569 |  129 |
|  940 |  312 |  741 |
|  770 |  628 |  828 |
+------+------+------+
10 rows in set (0.00 sec)
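
A quick way to confirm the full data set was inserted (assuming the table was empty beforehand):

SELECT COUNT(*) FROM dbTest.tbl_data1;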

You can now start to play around with this table: run some heavy queries, experiment with indexes and much more.
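
For instance, an unindexed aggregation forces a full table scan, while adding an index changes the execution plan; the index name below is just an example:

SELECT a, COUNT(*) AS total FROM dbTest.tbl_data1 GROUP BY a ORDER BY total DESC LIMIT 10;
ALTER TABLE dbTest.tbl_data1 ADD INDEX idx_a (a);
EXPLAIN SELECT * FROM dbTest.tbl_data1 WHERE a = 500;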

MySQL User Privileges Explained

If you are familiar with MySQL, the following are the user privileges that are available (a GRANT usage example follows the list):

Data

SELECT – Allows reading data
INSERT – Allows inserting and replacing data
UPDATE – Allows changing data
DELETE – Allows deleting data
FILE – Allows importing data from and exporting data into files

Structure

CREATE – Allows creating new databases and tables
ALTER – Allows altering the structure of existing tables
INDEX – Allows creating and dropping indexes
DROP – Allows dropping databases and tables
CREATE TEMPORARY TABLES – Allows creating temporary tables
SHOW VIEW – Allows performing SHOW CREATE VIEW queries
CREATE ROUTINE – Allows creating stored routines
ALTER ROUTINE – Allows altering and dropping stored routines
EXECUTE – Allows executing stored routines
CREATE VIEW – Allows creating new views
EVENT – Allows setting up events for the event scheduler
TRIGGER – Allows creating and dropping triggers

Administration

GRANT – Allows adding users and privileges without reloading the privilege tables
SUPER – Allows connecting even if the maximum number of connections has been reached; required for most administrative operations such as setting global variables or killing other users’ threads
PROCESS – Allows viewing processes for all users
RELOAD – Allows reloading the server settings and flushing the server’s cache
SHUTDOWN – Allows shutting down the MySQL server
SHOW DATABASES – Gives access to the complete list of databases
LOCK TABLES – Allows locking tables for the current thread
REFERENCES – Currently unused
REPLICATION CLIENT – Allows the user to ask where the slaves/masters are
REPLICATION SLAVE – Needed for replication slaves
CREATE USER – Allows creating, dropping and renaming user accounts
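
As a usage example, privileges are handed out with GRANT. A typical application account might receive only the data privileges on a single database; the user name and password below are placeholders (syntax as in MySQL 5.x):

GRANT SELECT, INSERT, UPDATE, DELETE ON dbTest.* TO 'appuser'@'localhost' IDENTIFIED BY 'SecretPassword';
SHOW GRANTS FOR 'appuser'@'localhost';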