Friday, October 29, 2010

The case of the read-only file system

Today, I received a notification from Nagios that one of my servers, running Sphinx search engine, was not OK. The message said the RAID controller didn't work well, but this was not the case. Sphinx itself was still running and it served the requests. It just could not update the indices to reflect the new data. I logged in with SSH and tried to run the Nagios plugin manually and received the error:

$ /usr/lib/nagios/plugins/check_hpacucli
/usr/lib/nagios/plugins/check_hpacucli: line 72: /tmp/hpacucli.txt: Read-only file system
rm: cannot remove `/tmp/hpacucli.txt': No such file or directory

Was it really read-only?! /tmp was not a mounted file system, it was the root FS

$ touch 1.txt
touch: cannot touch `1.txt': Read-only file system

It seemed that the root file system had really turned read-only.

$ mount
/dev/mapper/sphinx-root on / type ext4 (rw,errors=remount-ro)
$ cat /etc/fstab
/dev/mapper/sphinx-root /               ext4    errors=remount-ro 0       1

The meaning of the flag errors=remount-ro must be obvious: it remounts the file system in question in read-only mode when some errors may corrupt the FS. I had to find out what might kind of errors could trigger the flag. But there was nothing special in the system logs:

$ sudo tail -n 2 syslog
Oct 27 12:08:14 sphinx kernel: [10073549.962985] IP Tables: IN=eth1 OUT= MAC=ff:ff:ff:ff:ff:ff:18:a9:05:41:f3:ce:08:00 SRC=212.24.56.5 DST=212.24.56.31 LEN=78 TOS=0x00 PREC=0x00 TTL=128 ID=31736 PROTO=UDP SPT=137 DPT=137 LEN=58
Oct 27 12:08:14 sphinx kernel: [10073550.725397] IP Tables: IN=eth1 OUT= MAC=ff:ff:ff:ff:ff:ff:18:a9:05:41:f3:ce:08:00 SRC=212.24.56.5 DST=212.24.56.31 LEN=78 TOS=0x00 PREC=0x00 TTL=128 ID=32600 PROTO=UDP SPT=137 DPT=137 LEN=58
$ sudo tail -n 3 messages
Oct 27 12:07:22 sphinx kernel: [10073498.248999]       blocks= 879032432 block_size= 512
Oct 27 12:07:22 sphinx kernel: [10073498.249037]       heads=255, sectors=32, cylinders=107725
Oct 27 12:07:22 sphinx kernel: [10073498.249038] 

Of course, nothing could get into the log files as soon as the FS stopped writing into files! However, there is one more log, which is not written into a file. You can browse it using dmesg command:

$ sudo dmesg|tail
[10105513.875139] IP Tables: IN=eth1 OUT= MAC=00:24:81:fb:2b:ad:00:0c:cf:47:11:c0:08:00 SRC=94.55.152.129 DST=212.24.56.25 LEN=48 TOS=0x00 PREC=0x00 TTL=117 ID=27206 DF PROTO=TCP SPT=3128 DPT=445 WINDOW=65535 RES=0x00 SYN URGP=0
[10105565.081765] IP Tables: IN=eth1 OUT= MAC=00:24:81:fb:2b:ad:00:0c:cf:47:11:c0:08:00 SRC=81.198.246.184 DST=212.24.56.25 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=55870 DF PROTO=TCP SPT=3218 DPT=445 WINDOW=65535 RES=0x00 SYN URGP=0
[10105568.037173] IP Tables: IN=eth1 OUT= MAC=00:24:81:fb:2b:ad:00:0c:cf:47:11:c0:08:00 SRC=81.198.246.184 DST=212.24.56.25 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=56107 DF PROTO=TCP SPT=3218 DPT=445 WINDOW=65535 RES=0x00 SYN URGP=0
[10105707.853626] IP Tables: IN=eth1 OUT= MAC=ff:ff:ff:ff:ff:ff:18:a9:05:41:f3:ce:08:00 SRC=212.24.56.5 DST=212.24.56.31 LEN=78 TOS=0x00 PREC=0x00 TTL=128 ID=29692 PROTO=UDP SPT=137 DPT=137 LEN=58
[10105708.615590] IP Tables: IN=eth1 OUT= MAC=ff:ff:ff:ff:ff:ff:18:a9:05:41:f3:ce:08:00 SRC=212.24.56.5 DST=212.24.56.31 LEN=78 TOS=0x00 PREC=0x00 TTL=128 ID=29742 PROTO=UDP SPT=137 DPT=137 LEN=58
[10105709.377978] IP Tables: IN=eth1 OUT= MAC=ff:ff:ff:ff:ff:ff:18:a9:05:41:f3:ce:08:00 SRC=212.24.56.5 DST=212.24.56.31 LEN=78 TOS=0x00 PREC=0x00 TTL=128 ID=30803 PROTO=UDP SPT=137 DPT=137 LEN=58
[10106190.118646] IP Tables: IN=eth1 OUT= MAC=00:24:81:fb:2b:ad:00:0c:cf:47:11:c0:08:00 SRC=92.226.213.205 DST=212.24.56.25 LEN=52 TOS=0x00 PREC=0x00 TTL=119 ID=30620 DF PROTO=TCP SPT=2961 DPT=445 WINDOW=32767 RES=0x00 SYN URGP=0
[10106193.144746] IP Tables: IN=eth1 OUT= MAC=00:24:81:fb:2b:ad:00:0c:cf:47:11:c0:08:00 SRC=92.226.213.205 DST=212.24.56.25 LEN=52 TOS=0x00 PREC=0x00 TTL=119 ID=30905 DF PROTO=TCP SPT=2961 DPT=445 WINDOW=32767 RES=0x00 SYN URGP=0
[10106214.257277] IP Tables: IN=eth1 OUT= MAC=00:24:81:fb:2b:ad:00:0c:cf:47:11:c0:08:00 SRC=109.160.88.133 DST=212.24.56.25 LEN=434 TOS=0x00 PREC=0x00 TTL=50 ID=0 DF PROTO=UDP SPT=21015 DPT=5060 LEN=414
[10106311.722746] IP Tables: IN=eth1 OUT= MAC=00:24:81:fb:2b:ad:00:0c:cf:47:11:c0:08:00 SRC=83.234.62.250 DST=212.24.56.25 LEN=48 TOS=0x00 PREC=0x00 TTL=121 ID=55552 PROTO=TCP SPT=53697 DPT=2222 WINDOW=65535 RES=0x00 SYN URGP=0

In dmesg log, I found some interesting lines:

[10073618.603399] do_get_write_access: OOM for frozen_buffer
[10073618.603451] ext4_reserve_inode_write: aborting transaction: Out of memory in __ext4_journal_get_write_access
[10073618.603541] EXT4-fs error (device dm-0) in ext4_reserve_inode_write: Out of memory
[10073618.603623] Aborting journal on device dm-0:8.
[10073618.603839] EXT4-fs (dm-0): Remounting filesystem read-only
[10073618.611361] EXT4-fs error (device dm-0) in ext4_dirty_inode: Out of memory
[10073618.876932] htop invoked oom-killer: gfp_mask=0x280da, order=0, oomkilladj=0

Out of memory killer is an funny feature of the Linux kernel. When the kernel needs more memory than it has available, it kills in a more or less random fashion one or more running processes. But when did it happen? The timestamps in dmesg are not very informative. The figures in brackets are seconds since boot time. Which means I had to find the uptime in seconds. It is very easy, just read the file /proc/uptime. This file is located on the virtual filesystem and hence it was still writable:

$ cat /proc/uptime
10132406.21 161014358.20

The first number is the number of seconds sine boot time. So, I just had to subtract the dmesg number from the /proc/uptime number, divide it by 60 and I got the number of minutes passed since the process was killed.

Actually, this information was not really important, it was just interesting. Now, I had the explanation of the situation. Sphinx must have allocated too much memory, the kernel killed some process (the message in dmesg said it was htop, but it was not the only one, there were more processes killed). The process must have been writing something to the disk when it was shot down. The file system recognized it as an error and switched to the read-only mode, as prescribed by the fstab option errors=remount-ro.

First of all, I tried to remount the file system back in rw mode. The usual command mount -o remount,rw would not work, because mount attempts to write to /etc/mtab, located on the read-only file system. So, I had to give one more flag to avoid writing to mtab:

$ sudo mount -n -o remount,rw
mount: block device /dev/mapper/sphinx-root is write-protected, mounting read-only

Read-only again... I thought I had to run fsck to fix the errors, but to do so, I had to reboot the server. It was probable that the automatic fsck would not solve the problem and I would need a live CD to run fsck. So, I sighed and moved Sphinx to another server. When it was done and the second server began serving users' requests, I inserted a live CD in the server and rebooted. Fortunately, the automatic fsck fixed the errors, the server rebooted once again and very soon Sphinx was up and running again.

Sunday, October 24, 2010

MySQL reference I had always missed

InnoDB Glossary. These terms are commonly used in information about the InnoDB storage engine. No idea, why I have never seen this before. I found the article today and I'm still reading. It's a very dense explanation of many important construction blocks of InnoDB (and, of course, XtraDB!) engine. Must read.

Friday, October 15, 2010

Nagios notifications

For quite some time, I've been receiving Nagios notifications via e-mail and SMS. Recently, my cellular operator, MTS, turned off the service that allowed me to receive email messages via SMS. Shame upon MTS! Don't use their services. The reliability is terrible and then they just turn it off.

I plan to buy and set up a SMS gateway on the server. In the meanwhile, I was looking for other alternatives.

Firefox extension

Nagios Checker is a Firefox extension that displays Nagios alerts in the browser's status bar. Works fine, but I'm still not certain about the merits of this solution. After all, I can get e-mail notifications in the same browser area. And it's not mobile.

On the other hand, it's much more satisfying to see the green spot down there than to jump up every time an email notifier informs you of another webinar you don't want to attend.

Push e-mail

My cell phone is Nokia E63. It's a Symbian thingy and its messaging application can monitor my Gmail account non-stop. Well, actually, I'm not sure if it's implemented as a real push e-mail, even though Wikipedia is dead sure that it is. Either way, it works, but only as long as I'm within reach of a WiFi network. Of course, it should work over GRPS/Edge/3G, too, but I am still to find how much this permanent connection would cost me.

Social networks

There's a number of articles that recommend using Twitter to deliver the notifications, but I never could grasp the value of this service and do not use it. Besides, I think anyone would be able to read the alerts? Not good.

Instant messengers

A sound idea. But I tend to keep as far from them as possible. Even at work, where Skype is the preferred way of communication, I hardly use it. It's not as comfortable as e-mail, IMHO. It annoys and distracts me. And, finally, there are too many new messages, and the natural reaction is to neglect them.

Other ways

I've found an interesting article, "Notifications and Events in Nagios 3.0" (Part I and Part II). It summarizes some interesting ideas and details of their implementation in Nagios. Still, it does not mention an amazing idea, which I would really like to try out. Have a look at this video:

In the meanwhile, I think I should try push e-mail. Also, I think the server could trigger a desktop notification program (like notify-send, kdialog or, even better, a self-made Tcl/Tk pop-up notifier) remotely, via ssh. It should work well.

It's a pity cell phones don't have their own IP addresses and can't be controlled via ssh...

Thursday, October 7, 2010

Git and SVN

Very soon I will have a chance to compare the two version control systems in practice. I have to admit, though, that I'm rather biased against Git. It might have certain advantages for open-source teams, where members change often and hardly ever see each other. But I fail to understand (at least, a priori) how it could be better than SVN in a traditional development model. If I understand correctly, the only difference between them is one more intermediate repository between every two team members and, consequently, one more synchronization before code gets into trunk. I can believe that Git makes developers commit more often. But the changes are checked into the local repository, while synchronizations between repositories are usually made less often.

Another advantage of Git, local branches, which are easier to create and merge, also has a downside. They encourage the developer to experiment, but they also increase the chances of conflicts when branches are merged. Once again, the freedom of experiments might be good for open source projects, but for the teams where people work in contact it may create extra risks of project fragmentation.