Galera Cluster – Simpler way to view GRA file content

Galera Cluster (MySQL from Codership, Percona XtraDB Cluster, MariaDB Galera Cluster) generates GRA log files whenever it fails to apply a writeset on the target node. These files live in the MySQL data directory. You can get an overview of the files (if any exist) by listing your MySQL data directory (in my case, the data directory is /var/lib/mysql):

$ ls -1 /var/lib/mysql | grep GRA
GRA_10_104865779.log
GRA_13_104865781.log
GRA_5_104865780.log

MySQL Performance Blog has covered this topic in a well-explained post. I'm going to make it even simpler. Download the script here and copy it to your /usr/bin directory:

wget http://blog.secaserver.com/files/grareader -P /usr/bin/
chmod 755 /usr/bin/grareader

Just run the following command to convert a GRA log file into human-readable output:

grareader [gra_log_file]

Here is the example output:

/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @[email protected]@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#140114 3:12:42 server id 3 end_log_pos 120 Start: binlog v 4, server v 5.6.15-log created 140114 3:12:42 at startup
ROLLBACK/*!*/;
BINLOG '
qjrUUg8DAAAAdAAAAHgAAAAAAAQANS42LjE1LWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAACqOtRSEzgNAAgAEgAEBAQEEgAAXAAEGggAAAAICAgCAAAACgoKGRkAAf73
8eY=
'/*!*/;
# at 120
#140114 3:12:43 server id 3 end_log_pos 143 Stop
# at 143
#140507 14:55:42 server id 4 end_log_pos 126 Query thread_id=3173489 exec_time=0 error_code=0
use `test_shop`/*!*/;
SET TIMESTAMP=1399445742/*!*/;
SET @@session.pseudo_thread_id=3173489/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=0/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
ALTER TABLE `tblreshipment_header` DROP `ShipmentStatus`
/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET [email protected]_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

You can download the script here or copy and paste the code below:

#!/bin/bash
# Convert Galera GRA_* files to human-readable output
# Usage: grareader [gra_log_file]
# Example: grareader /var/lib/mysql/GRA_1_1.log

##
## GRA header file path
##
path=/tmp
gra_header_path=$path/GRA-Header
tmp_path=$path/grareader.tmp

input=$1
[ ! -e "$input" ] && echo 'Error: File does not exist' && exit 1

get_gra_header()
{
        download_url='http://blog.secaserver.com/files/GRA-Header'
        wget_bin=`command -v wget`
        [ -z "$wget_bin" ] && echo 'Error: Unable to locate wget. Please install it first' && exit 1
        echo "Downloading GRA-Header file into $path"
        $wget_bin --quiet $download_url -P $path
        [ $? -ne 0 ] && echo 'Error: Download failed' && exit 1
}

locate_files()
{
        mysqlbinlog_bin=`command -v mysqlbinlog`
        [ -z "$mysqlbinlog_bin" ] && echo 'Error: Unable to locate mysqlbinlog binary. Please install it first' && exit 1
        [ ! -e "$gra_header_path" ] && echo 'GRA header file not found. Downloading it now' && get_gra_header
}

locate_files

# Prepend the binlog format header to the GRA file so mysqlbinlog can parse it
cat "$gra_header_path" > "$tmp_path"
cat "$input" >> "$tmp_path"

echo ''
clear
$mysqlbinlog_bin -v -v -v "$tmp_path"
echo ''
rm -f "$tmp_path"
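
For completeness, this is what the script effectively does, in case you prefer to run the steps by hand. A minimal sketch: it assumes the GRA-Header file from the URL above has already been saved to /tmp, uses one of the GRA files listed earlier as an example, and the output file name is arbitrary:

$ cat /tmp/GRA-Header /var/lib/mysql/GRA_10_104865779.log > /tmp/GRA_10_full.log
$ mysqlbinlog -v -v -v /tmp/GRA_10_full.log

The GRA file is just the raw replication event without a binlog format description header, so prepending a valid header is enough for mysqlbinlog to decode it.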

 

Hope this helps make your Galera administrative tasks simpler!

Percona Server Installation Error – libssl.so.10 and libcrypto.so.10

I stumbled upon an error when installing Percona Server and socat via the yum repository; it failed with the following:

--> Processing Dependency: libssl.so.10(libssl.so.10)(64bit) for package: socat-1.7.2.3-1.el6.x86_64
--> Processing Dependency: libcrypto.so.10(libcrypto.so.10)(64bit) for package: socat-1.7.2.3-1.el6.x86_64

It turns out that:

“Red Hat upgraded the version of OpenSSL in EL6 from 1.0.0 to 1.0.1 during the 6.4-6.5 cycle, in order to resolve a years-old feature request. This package is no longer binary compatible, and programs that were built against OpenSSL 1.0.0 must be rebuilt from source against 1.0.1.”

What do we need to do then?

Luckily the package is available in the IUS repository:

rpm -Uhv http://dl.iuscommunity.org/pub/ius/stable/Redhat/6/x86_64/epel-release-6-5.noarch.rpm
rpm -Uhv http://dl.iuscommunity.org/pub/ius/stable/Redhat/6/x86_64/ius-release-1.0-13.ius.el6.noarch.rpm
yum install yum-plugin-replace
yum replace --enablerepo=ius-archive openssl --replace-with openssl10

(Answer y to any prompts)

Verify that the dependent libraries (libcrypto.so.10 and libssl.so.10) now exist:

ls /usr/lib64/ | grep -e ssl.so -e crypto.so
libcrypto.so.10
libcrypto.so.1.0.1e
libssl3.so
libssl.so.10
libssl.so.1.0.1e

Then, try the Percona Server installation command again. In some cases, you might also need to remove the installed epel-release package, since the one installed above is a little outdated compared to the current 6.8 release.
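
For reference, a rough sketch of that cleanup and retry; the exact epel-release and Percona package names depend on which versions you are installing, so treat these as placeholders:

yum remove epel-release
# reinstall the current epel-release package for EL6 from your mirror, then retry:
yum install Percona-Server-server-56 socat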

Hope the solution will help you guys out there!

 

 

Converting Magento to Work Well on Galera Cluster

I have a Magento data set which runs on MySQL-wsrep with Galera. Galera has its known limitations, and one of them is:

DELETE operation is unsupported on tables without primary key. Also rows in tables without primary key may appear in different order on different nodes. Don’t use tables without primary key.

Basically, if you want your database to be served by a Galera cluster, use the InnoDB storage engine and define a primary key for every table. That's all. Since the Magento data set is unaware of this limitation, you will see that many tables do not meet the criteria.

You can use the following query to identify objects that Galera does not support (thanks to Giuseppe Maxia for this):

SELECT DISTINCT Concat(t.table_schema, '.', t.table_name)     AS tbl,
                t.engine,
                IF(Isnull(c.constraint_name), 'NOPK', '')     AS nopk,
                IF(s.index_type = 'FULLTEXT', 'FULLTEXT', '') AS ftidx,
                IF(s.index_type = 'SPATIAL', 'SPATIAL', '')   AS gisidx
FROM   information_schema.tables AS t
       LEFT JOIN information_schema.key_column_usage AS c
              ON ( t.table_schema = c.constraint_schema
                   AND t.table_name = c.table_name
                   AND c.constraint_name = 'PRIMARY' )
       LEFT JOIN information_schema.statistics AS s
              ON ( t.table_schema = s.table_schema
                   AND t.table_name = s.table_name
                   AND s.index_type IN ( 'FULLTEXT', 'SPATIAL' ) )
WHERE  t.table_schema NOT IN ( 'information_schema', 'performance_schema','mysql' )
       AND t.table_type = 'BASE TABLE'
       AND ( t.engine <> 'InnoDB'
              OR c.constraint_name IS NULL
              OR s.index_type IN ( 'FULLTEXT', 'SPATIAL' ) )
ORDER  BY t.table_schema,
          t.table_name;

Example:

mysql> SELECT DISTINCT
    ->        CONCAT(t.table_schema,'.',t.table_name) as tbl,
    ->        t.engine,
    ->        IF(ISNULL(c.constraint_name),'NOPK','') AS nopk,
    ->        IF(s.index_type = 'FULLTEXT','FULLTEXT','') as ftidx,
    ->        IF(s.index_type = 'SPATIAL','SPATIAL','') as gisidx
    ->   FROM information_schema.tables AS t
    ->   LEFT JOIN information_schema.key_column_usage AS c
    ->     ON (t.table_schema = c.constraint_schema AND t.table_name = c.table_name
    ->         AND c.constraint_name = 'PRIMARY')
    ->   LEFT JOIN information_schema.statistics AS s
    ->     ON (t.table_schema = s.table_schema AND t.table_name = s.table_name
    ->         AND s.index_type IN ('FULLTEXT','SPATIAL'))
    ->   WHERE t.table_schema NOT IN ('information_schema','performance_schema','mysql')
    ->     AND t.table_type = 'BASE TABLE'
    ->     AND (t.engine <> 'InnoDB' OR c.constraint_name IS NULL OR s.index_type IN ('FULLTEXT','SPATIAL'))
    ->   ORDER BY t.table_schema,t.table_name;
+-------------------------------------------------+--------+------+----------+--------+
| tbl                                             | engine | nopk | ftidx    | gisidx |
+-------------------------------------------------+--------+------+----------+--------+
| magento.api2_acl_user                           | InnoDB | NOPK |          |        |
| magento.api_session                             | InnoDB | NOPK |          |        |
| magento.catalogsearch_fulltext                  | MyISAM |      | FULLTEXT |        |
| magento.catalog_category_anc_categs_index_idx   | InnoDB | NOPK |          |        |
| magento.catalog_category_anc_categs_index_tmp   | InnoDB | NOPK |          |        |
| magento.catalog_category_anc_products_index_idx | InnoDB | NOPK |          |        |
| magento.catalog_category_anc_products_index_tmp | InnoDB | NOPK |          |        |
| magento.catalog_category_product_index_enbl_idx | InnoDB | NOPK |          |        |
| magento.catalog_category_product_index_enbl_tmp | InnoDB | NOPK |          |        |
| magento.catalog_category_product_index_idx      | InnoDB | NOPK |          |        |
| magento.catalog_category_product_index_tmp      | InnoDB | NOPK |          |        |
| magento.catalog_product_index_price_downlod_tmp | MEMORY |      |          |        |
| magento.oauth_nonce                             | MyISAM | NOPK |          |        |
| magento.weee_discount                           | InnoDB | NOPK |          |        |
| magento.widget_instance_page_layout             | InnoDB | NOPK |          |        |
| magento.xmlconnect_config_data                  | InnoDB | NOPK |          |        |
+-------------------------------------------------+--------+------+----------+--------+

 

I do not know much about the Magento data set and structure, so I am assuming the tables in the output above will sooner or later cause problems because of the Galera limitations. It seems wise to comply with them and alter whatever is necessary on those tables.

I start by adding a simple auto-increment primary key to the tables labeled as NOPK:

mysql> ALTER TABLE magento.api2_acl_user ADD id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY FIRST;
mysql> ALTER TABLE magento.api_session ADD id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY FIRST;
mysql> ALTER TABLE magento.weee_discount ADD id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY FIRST;
mysql> ALTER TABLE magento.widget_instance_page_layout ADD id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY FIRST;
mysql> ALTER TABLE magento.xmlconnect_config_data ADD id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY FIRST;

Next, add a primary key and convert the storage engine to InnoDB:

mysql> ALTER TABLE magento.oauth_nonce ADD id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY FIRST, ENGINE='InnoDB';

Then, remove the full-text index and convert the storage engine to InnoDB:

mysql> ALTER TABLE magento.catalogsearch_fulltext DROP INDEX FTI_CATALOGSEARCH_FULLTEXT_DATA_INDEX;
mysql> ALTER TABLE magento.catalogsearch_fulltext ENGINE='InnoDB';
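
Once the changes are in, the audit query above can simply be re-run to see what still needs attention. For a quick per-table spot check from the shell, something like this also works (a minimal sketch assuming local root access; adjust the table name as needed):

$ mysql -e "SHOW CREATE TABLE magento.oauth_nonce\G" | grep -iE "ENGINE|PRIMARY KEY"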

I am quite sure the DROP INDEX statement above will have some performance impact. Maybe Magento does not really fit a Galera multi-master environment, but it is worth a try. I will keep updating this post to share how it goes.

Am going to sleep now. Cheers!

Ubuntu: Error Installing MySQL Server

I encountered the following error when trying to upgrade the MySQL server on Ubuntu 12.04:

dpkg: error processing mysql-server-5.5 (--configure):
 subprocess installed post-installation script returned error exit status 1
No apport report written because MaxReports is reached already
                                                              Setting up mysql-client (5.5.22-0ubuntu1) ...
dpkg: dependency problems prevent configuration of mysql-server:
 mysql-server depends on mysql-server-5.5; however:
  Package mysql-server-5.5 is not configured yet.
dpkg: error processing mysql-server (--configure):
 dependency problems - leaving unconfigured
No apport report written because MaxReports is reached already
                                                              Processing triggers for libc-bin ...
ldconfig deferred processing now taking place
Errors were encountered while processing:
 mysql-server-5.5
 mysql-server
E: Sub-process /usr/bin/dpkg returned an error code (1)
A package failed to install.  Trying to recover:
Setting up mysql-server-5.5 (5.5.22-0ubuntu1) ...
start: Job failed to start
invoke-rc.d: initscript mysql, action "start" failed.
dpkg: error processing mysql-server-5.5 (--configure):
 subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of mysql-server:
 mysql-server depends on mysql-server-5.5; however:
  Package mysql-server-5.5 is not configured yet.
dpkg: error processing mysql-server (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 mysql-server-5.5
 mysql-server

The easiest way to overcome this is to remove all MySQL-related packages from the server:

$ sudo apt-get purge 'mysql*'
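
If you want to double-check which MySQL-related packages are actually installed (before or after the purge), a quick listing helps:

$ dpkg -l | grep -i mysql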

Then, try to reinstall the mysql-server package as follows:

$ sudo apt-get install -y  mysql-client mysql-server

Make sure to back up your data files before performing the upgrade! Hope this workaround helps others!
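
For reference, a minimal backup sketch before purging, assuming the default data directory /var/lib/mysql (stop MySQL first so the copy is consistent):

$ sudo service mysql stop
$ sudo cp -a /var/lib/mysql /var/lib/mysql.backup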

How to Fix ‘Too many open files’ Problem

I faced the following problem when executing Percona XtraBackup on my CentOS 6.3 box:

xtrabackup_55 version 2.1.3 for Percona Server 5.5.16 Linux (x86_64) 
(revision id: 608) 
xtrabackup: uses posix_fadvise(). 
xtrabackup: cd to /var/lib/mysql 
xtrabackup: Target instance is assumed as followings. 
xtrabackup: innodb_data_home_dir = ./ 
xtrabackup: innodb_data_file_path = ibdata1:100M:autoextend 
xtrabackup: innodb_log_group_home_dir = ./ 
xtrabackup: innodb_log_files_in_group = 2 
xtrabackup: innodb_log_file_size = 67108864 
xtrabackup: using O_DIRECT 
130619 12:57:36 InnoDB: Warning: allocated tablespace 2405, old maximum 
was 9 
130619 12:57:37 InnoDB: Operating system error number 24 in a file 
operation. 
InnoDB: Error number 24 means 'Too many open files'. 
InnoDB: Some operating system error numbers are described at 
InnoDB: 
http://dev.mysql.com/doc/refman/5.5/en/operating-system-error-codes.html 
InnoDB: Error: could not open single-table tablespace file 
InnoDB: We do not continue the crash recovery, because the table may become 
InnoDB: corrupt if we cannot apply the log records in the InnoDB log to it. 
InnoDB: To fix the problem and start mysqld:

Linux/UNIX sets soft and hard limits on the number of file handles and open files. By default the value is quite low, as you can check using the following command:

$ ulimit -n
1024

To increase the open files limit, there are several ways:

1. Set the limit using the ulimit command

$ ulimit -n 8192

This is a temporary solution, as it only raises the limit for the current login session. Once you log out and log in again, the value reverts to the default.

2. Permanently define it in /etc/security/limits.conf

To make it permanent, define the values (soft and hard limits) in /etc/security/limits.conf by adding the following lines:

* soft nofile 8192
* hard nofile 8192

The soft limit is the value that the kernel enforces for the corresponding resource. The hard limit acts as a ceiling for the soft limit. Reboot the server to apply the changes. Or, if you do not want to reboot, add the following line to the respective user's .bashrc file (root in my case):

$ echo "ulimit -n 8192" >> ~/.bashrc

You will then need to log in again for the change to take effect.
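
To confirm the new limits took effect in the fresh session, and to inspect an already-running process such as mysqld, something like this works:

$ ulimit -Sn    # soft limit
8192
$ ulimit -Hn    # hard limit
8192
$ cat /proc/$(pidof mysqld)/limits | grep 'open files'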

If the problem still persists, you might need to increase the limit further and retry the failed process.

Warning

Do not set the value to unlimited, as it can cause PAM to fail and you will not be able to SSH or console into the box; you would see the following error:

Apr 19 09:22:15 rh02 sshd[5679]: error: PAM: pam_open_session(): Permission denied

This issue has been reported in this bug report.

Further reading:

http://ss64.com/bash/ulimit.html
http://ss64.com/bash/limits.conf.html

Fixing Auto Start and Auto Shutdown Issue in VMware ESXi 5.0

I am a free VMware ESXi 5.0.0 user. The biggest problem with this release is that auto-start and auto-shutdown of VMs around a hardware node reboot do not work. Whenever you start or restart the ESXi node, you need to manually power on every single virtual machine inside it. This caused a lot of inconvenience, especially when the ESXi host is serving production.

VMware has described this bug in detail; refer to this link: http://blogs.vmware.com/vsphere/2012/07/clarification-on-the-auto-start-issues-in-vsphere-51-update-1.html.

1. As for me, I need to download the patch for ESXi 5.0.0. Go to this page: http://www.vmware.com/patchmgr/findPatch.portal, log in to your VMware account and search with the following criteria on that page:

 

I will download it to my Windows 7 PC and use SSH to apply the patch.

 

2. Enable SSH. Go to vSphere Client > ESXi host > Configuration > Security Profile > Services > Properties and make sure SSH is running, as in the screenshot below:

 

3. Go to vSphere Client > ESXi host > Configuration > right-click the storage > Browse Datastore. Create a new folder called 'update' inside the datastore and upload the patch, as in the screenshot below:

 

4. Log in to the ESXi server using SSH. Run the following command to list the image profiles in the patch bundle:

$ esxcli software sources profile list --depot=[datastore1]/update/ESXi500-201207001.zip
Name                              Vendor        Acceptance Level
--------------------------------  ------------  ----------------
ESXi-5.0.0-20120701001s-standard  VMware, Inc.  PartnerSupported
ESXi-5.0.0-20120704001-no-tools   VMware, Inc.  PartnerSupported
ESXi-5.0.0-20120701001s-no-tools  VMware, Inc.  PartnerSupported
ESXi-5.0.0-20120704001-standard   VMware, Inc.  PartnerSupported

 

5. Put the host into maintenance mode. Go to vSphere Client > right-click the ESXi node > Enter Maintenance Mode > Yes.

 

6. Start the update process by running the following command. We will use the profile ESXi-5.0.0-20120701001s-standard for this update:

$ esxcli software profile update --depot=[datastore1]/update/ESXi500-201207001.zip --profile=ESXi-5.0.0-20120701001s-standard
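
Once the update command finishes (and again after the reboot in the next step), the running ESXi version and build number can be confirmed from the SSH session:

$ vmware -v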

 

7. Reboot the ESXi node from the command line or from the vSphere Client. Once it is up, exit maintenance mode. You can then verify whether the issue is fixed by going to vSphere Client > ESXi node > Configuration > Software > Virtual Machine Startup/Shutdown > Properties and enabling VM auto start, as in the screenshot below:

 

 

Done. Try rebooting the ESXi host and you should see the virtual machines starting automatically in your vSphere Client.

 

FreeBSD 9: Shared Object “libutil.so.8” not Found

Problem

After upgrading to FreeBSD 9, whenever I tried to use ports to install something, I got the following error:

$ cd /usr/ports
$ make search name=nano
The search target requires INDEX-9. Please run make index or make fetchindex.

Then, whenever I ran the make index command, it prompted the following error:

$ cd /usr/ports
$ make index
Generating INDEX-9 - please wait.. Shared object "libutil.so.8" not found, required by "perl"
"Makefile", line 29: warning: "/usr/local/bin/perl -V::usethreads" returned non-zero status

What happened?

During the FreeBSD upgrade from version 8.2 to the new release 9.0, it seems FreeBSD deleted the old libraries after the second run of the freebsd-update install command. This usually happens when you are doing a major release upgrade.

Solution

We need to create a symlink to the new libutil.so for FreeBSD 9 under the /lib directory:

$ cd /lib
$ ln -s libutil.so.9 libutil.so.8
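
A quick sanity check that the symlink is in place (output will vary slightly by system):

$ ls -l /lib/libutil.so*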

Then run the "make index" command again. make index builds the index (which is then used to look up the ports collection) from your current ports tree:

$ cd /usr/ports
$ make index
Generating INDEX-9 - please wait.. Done.

Now you should be able to use ports as usual. Cheers!

FreeBSD: Upgrade from 8.2 to 9.0

If you use this command to upgrade to the latest release, FreeBSD 9.0:

$ freebsd-update -r 9.0-RELEASE upgrade

You might see the following error:

The update metadata is correctly signed, but
failed an integrity check.
Cowardly refusing to proceed any further.

This error indicates that the old freebsd-update cannot accept the % and @ characters that appear in the FreeBSD 9 update metadata. To overcome this, run the following command, which adds those characters to the set it accepts:

$ sed -i '' -e 's/=_/=%@_/' /usr/sbin/freebsd-update
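
To confirm the patch was applied, check that the widened character class is now present in the script (anything greater than zero means the sed took effect):

$ grep -c '%@_' /usr/sbin/freebsd-update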

Now start the upgrade process:

$ freebsd-update -r 9.0-RELEASE upgrade

Accept all prompted values and follow the wizard. This process downloads all the files and patches required for the upgrade, so it takes time. You might need to press 'Enter' once to check the /etc/hosts file. Once complete, run the following command to start installing the updates:

$ freebsd-update install

After a while, you should see the system prompt something like below:

Installing updates...rmdir: ///boot/kernel: Directory not empty
 
Kernel updates have been installed. Please reboot and run "/usr/sbin/freebsd-update install"
again to finish installing updates.

Reboot the server:

$ init 6

Once up, it will boot into FreeBSD 9. Run the installation command again:

$ freebsd-update install

After the process completes, the system will ask you to rebuild all applications that were installed from ports. Once done, rerun the command above once more to complete the upgrade process, and you should see something like below:

$ freebsd-update install
Installing updates... Done

Your update should now be complete. To check the new version, run the following command:

$ uname -r
9.0-RELEASE

Source: http://lists.freebsd.org/pipermail/freebsd-stable/2011-October/064321.html

Linux: Remove Files/Folders Older Than a Certain Time

One of our servers ran into backup problems because there were too many files to back up, close to 200 million in total. What I need to do is remove some files in certain folders, and have it run automatically every day to clean out the unwanted files. On our web server we have one temporary folder that tends to accumulate lots of temporary files. The folder is located under /home/mywebsite/temp_upload/.

I started by checking how much space this folder consumes (du -sk reports the usage in kilobytes):

$ du -sk /home/mywebsite/temp_upload/
10543660 /home/mywebsite/temp_upload/

As you can see, the directory had grown to roughly 10 GB of leftover files. Our developer forgot to remove the unused files, so I need to create a cron job that removes files older than 3 months (90 days) from this directory. The removal command is as below:

$ find /home/mywebsite/temp_upload/ -type f -mtime +90 | xargs rm -Rf

It takes some time to complete and, once done, the usage dropped to about 2 MB:

$ du -sk /home/mywebsite/temp_upload/
2077 /home/mywebsite/temp_upload
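
If you want an actual file count rather than disk usage, and a removal variant that is safer with odd file names, the same idea can be expressed as:

$ find /home/mywebsite/temp_upload/ -type f | wc -l
$ find /home/mywebsite/temp_upload/ -type f -mtime +90 -print0 | xargs -0 rm -f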

To automate this, just add the command to crontab and schedule it to run on a weekly basis (at 6 AM every Sunday):

$ crontab -e

Add the following line:

0 6 * * 0 /bin/find /home/mywebsite/temp_upload/ -type f -mtime +90 | xargs rm -Rf

Restart crond to apply the cron changes:

$ service crond restart

Warning: Make sure you run the command during off-peak hours. This process might overload your server, as happened to me due to a wrong time zone 🙂

Now the file removal is automated and you can focus on other things!

Windows: The ‘Microsoft.ACE.OLEDB.12.0’ provider error

Usually on Windows Server 2008 R2, or any related Windows operating system running 64-bit, you might face the following error when running some applications:

Unhandled exception has occurred in your application. If you click Continue, the application will ignore this error and attempt to continue. If you click Quit, the application will close immediately.
 
Object reference not set to an instance of an object.
The 'Microsoft.ACE.OLEDB.12.0' provider is not registered on the local machine.


What you need to do to solve this problem is:

1. Download the Microsoft Access Database Engine 2010 Redistributable 64-bit here. Download the correct version, which is "AccessDatabaseEngine_x64.exe".

2. Uninstall the older version of the Microsoft Access Database Engine (usually the 2007 version). Go to:

Control Panel > Add/Remove Program > Microsoft Access Database Engine 2007 > Right click > Uninstall

3. Install the one we just downloaded. This installation installs the latest OLE DB and ODBC drivers, version 14.0 at the time of writing, which you can check at:

Control Panel > System and Security > Administrative Tools > Data Sources (ODBC) > Drivers

Done. Even though the version is higher, these ODBC and OLE DB drivers are backward compatible. Re-run the application and the error should be gone. Cheers!

Linux: Log Rotation Customization

Problem Description

Our server has been facing problems due to heavy development on a web application project. As a result, the server hard disk fills up within a week. It is all caused by the error log, which is rotated on a weekly basis yet had grown to 103 GB within a week of the last rotation!

At the same time, CPU usage spikes because the logrotate service automatically compresses the huge log file, as configured by default in /etc/logrotate.conf.

Symptom

The following process was captured when I encountered this problem:

root     13324 38.2  0.0   4028   620 ?        RN   18:59  11:10      \_ gzip -f -

By checking the process's open file descriptors, we can see that it is creating a .gz (gzip) file for the error log in the Apache archive directory:

root@server [~] ll /proc/13324/fd
total 0
dr-x------ 2 root root  0 Jan 11 19:00 ./
dr-xr-xr-x 5 root root  0 Jan 11 19:00 ../
lr-x------ 1 root root 64 Jan 11 19:29 0 -> pipe:[1815813470]
l-wx------ 1 root root 64 Jan 11 19:29 1 -> /usr/local/apache/logs/archive/error_log-01-2012.gz
l-wx------ 1 root root 64 Jan 11 19:29 2 -> /usr/local/cpanel/logs/stats_log

Analysis

The following is the default setup in my /etc/logrotate.d/httpd:

/var/log/httpd/*log {
    missingok
    notifempty
    sharedscripts
    postrotate
        /sbin/service httpd reload > /dev/null 2>/dev/null || true
    endscript
}

This log rotation rule picks up any file ending with 'log' (e.g. access.log or error.log) under the directory /var/log/httpd:

  • missingok – if a log file is missing, go on to the next log file without raising an error
  • notifempty – do not rotate empty log files
  • sharedscripts – run the postrotate script only once after all log files have been rotated. Without this option, logrotate would reload httpd for every single log file it rotates
  • postrotate – shell commands that logrotate runs once the log files are rotated. Apache needs to be reloaded so it writes to the new files after rotation completes
  • endscript – marks the end of the postrotate script

 

Solution

So I need to change my httpd log rotation configuration by adding the following block to /etc/logrotate.d/httpd:

/etc/httpd/logs/error_log {
    rotate 5
    nocompress
    notifempty
    size 1024M
    postrotate
        /sbin/service httpd reload > /dev/null 2>/dev/null || true
    endscript
}

Since I am using cPanel, the Apache error log is located at /etc/httpd/logs/error_log:

  • rotate 5 – keep the last 5 rotated log files and delete the rest -> for backup purposes
  • nocompress – do not compress the rotated log files -> saves CPU
  • notifempty – do not rotate empty log files -> do not waste time processing an empty file
  • size 1024M – rotate when the file size reaches 1 GB -> do not waste disk space
  • postrotate – shell commands that logrotate runs once the log file is rotated. httpd needs to be reloaded so it writes to the new file -> logging continues in the fresh file after rotation
  • endscript – marks the end of the postrotate script
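
To verify the new rule before the next scheduled run, logrotate can be exercised by hand; -d only prints what would happen without touching anything, while -f forces an actual rotation:

$ logrotate -d /etc/logrotate.d/httpd
$ logrotate -f /etc/logrotate.d/httpd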

The Best Way to do Server Documentation

As a server administrator, it is very important to have a knowledge base as our reference point. Forgetting can never be forgiven. You can use any online documentation tool such as Google Docs, Zoho, Evernote or Office 365. I use Microsoft Office OneNote 2010, which comes with my office laptop as part of Microsoft Office Professional Plus 2010.

The key feature I am looking for is transparent synchronization with my online docs. Since I have a Windows Live login, which I previously registered for MSN Messenger, I can sync the local OneNote on my laptop with OneNote on Windows SkyDrive. How cool is that?

The best thing about OneNote is that you can write, copy/paste, erase and draw just like in a notebook. It also has protected sections where you can store your sensitive and confidential information securely.

The following YouTube video is a simple tutorial on how to use OneNote:

Requirement:

OS: Windows 7 Home Premium 64bit
Office application: Microsoft Office OneNote 2010
Windows Live ID: [email protected]
Web browser: Internet Explorer 9 64bit

1. First of all, register a Windows Live ID (if you don't have one) at https://signup.live.com/ so we can sync our notes to SkyDrive.

2. Log in to SkyDrive at https://skydrive.live.com using Internet Explorer. Click the OneNote icon as in the screenshot below and create a new Notebook:

 

3. You should see something like the screenshot below. I will then create my own section called "Linux" and a new page called "My 1st Documentation":

4. Now open Microsoft OneNote on the laptop. Go to Open > Open from the Web > Sign In. Enter your Windows Live ID credentials and click OK.

5. OneNote will then retrieve the online Notebooks associated with the user. Once done, click the notebook "ServerDocumentation" in the list and it will load the content of this notebook into the local OneNote. The screenshot below shows what it looks like when loaded:

Done! You now have a tool to save all your notes, knowledge base articles, clips and much more, accessible from anywhere. You can choose to use the local OneNote on your laptop or the OneNote Web App available on SkyDrive through your web browser (not necessarily IE). The application does not have a "Save" button because changes are saved automatically on both sides. Just like your notes!