Podcasts are a great way to get educated and entertained. As developers, we are lucky to have a choice of fine podcasts from industry leaders. If you commute, jog or travel, you can easily use that time away from the computer to get better informed and reflect on our field.
(Updated Saturday 10DEC2011.) On 28th of June 2011, Microsoft Office 2010 Service Pack 1 and the Access 2010 Runtime Service Pack 1 were issued.
After upgrading my development machine (Win7 x64) and a few clients (Windows 2008R2 x64) to SP1 (x86), I started to get strange issues:
- I use .Net libraries from my Access application and suddenly, even when not instantiating any .Net objects, Access would crash, usually on startup, but sometimes when opening the VBE.
Decompiling and re-compacting the database would fix it, usually once, but the problem would reappear the next time I restarted the application.
- In the Runtime, I would get strange errors, such as "The setting you entered isn’t valid for this property", "Action Failed Error Number: 2950", or "Runtime Error 3270 Property not found".
The strange thing about these errors is that they would occur in places that had not been modified for many releases of our application, parts that have been running without problem until now.
- Another weird issue was the systematic reset of our custom ribbon to its first tab. This could happen randomly, but it could also be reproduced by simply opening a report as a tab page (one that fills the whole MDI window). When closing it, the first tab of the ribbon would select itself automatically. This wasn’t happening when closing pop-up windows.
After removing the Office and Runtime Service Pack 1, everything went back to normal.
A fix, finally!
In the previous article, I showed how to improve the performance of an existing file server by tweaking ext3 and mount settings.
We can also take advantage of the availability of the now stable ext4 file system to further improve our file server performance.
Some distributions, in particular RedHat/CentOS 5, do not allow us to select ext4 as a formatting option during setup of the machine, so you will initially have to use ext3 as the file system (preferably on top of LVM for easy extensibility).
A small digression on partitioning
Remember to create separate partitions for your file data: do not mix OS files with data files; they should live on different partitions. In an enterprise environment, a minimal partition configuration for a file server could look like:
- 2x 160GB HDD for the OS
- 4x 2TB HDD for the data
The 160GB drives could be used as such:
- 200MB RAID1 partition over the 2 drives for
- 2GB RAID1 partition over the 2 drives for swap
- all remaining space as a RAID1 partition over the 2 drives for
Note though that it is generally recommended to create additional partitions to further contain
The 2TB drives could be used like this:
- all space as RAID6 over all drives (gives us 4TB of usable space) for
- alternatively, all space as RAID5 over all drives (gives us 6TB of usable space)
The point of using RAID6 is that it gives better redundancy than RAID5, so you can safely add more drives later without increasing the risk of failure of the whole array (which is not true of RAID5).
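The usable space figures above follow from how many drives each RAID level gives up to redundancy. A minimal sketch of that arithmetic (the function name is ours, and it assumes all drives in the array are the same size):

```python
def usable_capacity_tb(level, drives, drive_tb):
    """Return the usable capacity, in TB, of a RAID1, RAID5 or RAID6 array."""
    if level == 1:
        # RAID1 mirrors the data, so we only get one drive's worth of space.
        if drives < 2:
            raise ValueError("RAID1 needs at least 2 drives")
        return drive_tb

    # Number of drives' worth of space consumed by parity.
    parity_drives = {5: 1, 6: 2}
    if level not in parity_drives:
        raise ValueError("unsupported RAID level: %r" % level)

    minimum = parity_drives[level] + 2  # 3 drives for RAID5, 4 for RAID6
    if drives < minimum:
        raise ValueError("RAID%d needs at least %d drives" % (level, minimum))
    return (drives - parity_drives[level]) * drive_tb


print(usable_capacity_tb(6, 4, 2))  # 4x 2TB in RAID6
print(usable_capacity_tb(5, 4, 2))  # 4x 2TB in RAID5
```

Adding a fifth 2TB drive raises the RAID6 figure by one drive's worth of space while still tolerating two drive failures.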
Moving to ext4
If you are upgrading an existing system, back up first!
Let’s say that your /data partition is an LVM volume under
First, make sure we have the ext4 tools installed on our machine, then unmount the partition to upgrade:
# yum -y install e4fsprogs
# umount /dev/VolGroup01/LogVol00
For a new system, create a large partition on the disk, then format the volume (this will destroy all data on that volume!).
# mkfs -t ext4 -E stride=32 -m 0 -O extents,uninit_bg,dir_index,filetype,has_journal,sparse_super /dev/VolGroup01/LogVol00
# tune4fs -o journal_data_writeback /dev/VolGroup01/LogVol00
Note: on a RAID array, use the appropriate -E stride,stripe-width options. For instance, on a RAID5 array using 4 disks and 4k blocks, it could be:
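Whatever the exact values, they follow from the array geometry: stride is the RAID chunk size divided by the filesystem block size, and stripe-width is stride multiplied by the number of data-bearing disks. A rough sketch of that arithmetic, where the 64KB chunk size is purely an assumption (check the actual chunk size of your array) and the function name is ours:

```python
def raid_stride_options(chunk_kb, block_kb, disks, level):
    """Build the value for mkfs's -E option from the RAID geometry."""
    # Disks' worth of space used for parity at each RAID level.
    parity_disks = {0: 0, 5: 1, 6: 2}
    if level not in parity_disks:
        raise ValueError("unsupported RAID level: %r" % level)
    if chunk_kb % block_kb != 0:
        raise ValueError("chunk size must be a multiple of the block size")

    # Number of filesystem blocks written to one disk before moving to the next.
    stride = chunk_kb // block_kb
    # Number of filesystem blocks in one full stripe across the data disks.
    stripe_width = stride * (disks - parity_disks[level])
    return "stride=%d,stripe-width=%d" % (stride, stripe_width)


# RAID5 over 4 disks, 4k blocks, assuming a 64KB chunk size.
print(raid_stride_options(chunk_kb=64, block_kb=4, disks=4, level=5))
```

With 4k blocks, the stride=32 used in the mkfs command earlier corresponds to a 128KB chunk size.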
For an existing system, upgrading from ext3 to ext4 without damaging existing data is barely more complicated:
# fsck.ext3 -pf /dev/VolGroup01/LogVol00
# tune2fs -O extents,uninit_bg,dir_index,filetype,has_journal,sparse_super /dev/VolGroup01/LogVol00
# fsck.ext4 -yfD /dev/VolGroup01/LogVol00
We can optionally give our volume a new label to easily reference it later:
# e4label /dev/VolGroup01/LogVol00 data
Then we need to persist the mount options in
/dev/VolGroup01/LogVol00 /data ext4 noatime,data=writeback,barrier=0,nobh,errors=remount-ro 0 0
And now we can remount our volume:
# mount /data
If you upgraded an existing filesystem from ext3, you may want to run the following to ensure that the existing files and directories use extents:
# find /data -xdev -type f -print0 | xargs -0 chattr +e
# find /data -xdev -type d -print0 | xargs -0 chattr +e
The mount options we use are somewhat risky if your system is not adequately protected by a UPS.
If your system crashes due to a power failure, you are more likely to lose data using these options than using the safer defaults.
At any rate, you must have a proper backup strategy in place to safeguard data, regardless of what could damage them (hardware failure or user error).
The barrier=0 option disables write barriers that enforce proper on-disk ordering of journal commits.
data=writeback and nobh go hand in hand and allow the system to write data to disk even after its metadata has been committed to the journal.
noatime ensures that the access time is not updated when we’re reading data, as this is a big performance killer (this one is safe to use in any case).
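To keep track of which options trade safety for speed, a small helper can flag them in a mount options string. This is only a sketch: the function name and the table of risky options are ours, built from the descriptions above, and it is not an exhaustive list.

```python
# Options described above as trading crash safety for performance.
RISKY_OPTIONS = {
    "barrier=0": "write barriers disabled",
    "data=writeback": "data may reach the disk after the journal commit",
    "nobh": "used together with data=writeback",
}


def risky_options(options):
    """Return the risky options found in a comma-separated options string."""
    return [opt for opt in options.split(",") if opt in RISKY_OPTIONS]


for opt in risky_options("noatime,data=writeback,barrier=0,nobh,errors=remount-ro"):
    print("%s: %s" % (opt, RISKY_OPTIONS[opt]))
```

noatime and errors=remount-ro are left out of the table on purpose, since they do not affect what survives a crash.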
Using Linux for an office file server is a no-brainer: it’s cheap, you don’t have to worry about unmanageable license costs and it just works.
Default settings of most Linux distributions are however not optimal: they are meant to be as standards-compliant and as general as possible so that everything works well enough regardless of what you do.
For a file server hosting large numbers of files, these default settings can become a liability: everything slows down as the number of files creeps up, and your once-snappy file server becomes as fast as a sleepy sloth.
There are a few things that we can do to ensure we get the most out of our server.
Checking our configuration
First, a couple of commands that will help us investigate the current state of our configuration.
df will give us a quick overview of the filesystem:
# df -T
Filesystem    Type   1K-blocks      Used Available Use% Mounted on
/dev/md2      ext3    19840804   4616780  14199888  25% /
tmpfs        tmpfs      257580         0    257580   0% /dev/shm
/dev/md0      ext3      194366     17718    166613  10% /boot
/dev/md4      ext3     9920532   5409936   3998532  58% /var
/dev/md3      ext3      194366      7514    176817   5% /tmp
/dev/md5      ext3    46980272  31061676  13493592  70% /data
tune2fs will help us configure the options for each ext3 partition. If we want to check the current configuration of a given partition, say we want to know the current options for our
# tune2fs -l /dev/md5
If I were using LVM as the volume manager, I would type something like:
# tune2fs -l /dev/VolGroup00/LogVol02
This would give lots of information about the partition:
tune2fs 1.40.2 (12-Jul-2007)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          d6850da8-af6f-4c76-98a5-caac2e10ba30
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags:         signed directory hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
....
The interesting options are listed under Filesystem features and Default mount options. For instance, here we know that the partition is using a journal and that it is using the dir_index capability, already a performance booster.
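When checking several partitions, it can be handy to pull these fields out of the tune2fs -l output with a script. A minimal sketch, assuming the "name: value" layout shown above (the function names are ours, and the sample is a shortened copy of that output):

```python
SAMPLE = """\
tune2fs 1.40.2 (12-Jul-2007)
Filesystem volume name:   <none>
Filesystem features:      has_journal resize_inode dir_index filetype needs_recovery sparse_super large_file
Default mount options:    user_xattr acl
Filesystem state:         clean
"""


def parse_tune2fs(output):
    """Turn the 'name: value' lines of tune2fs -l into a dictionary."""
    info = {}
    for line in output.splitlines():
        key, separator, value = line.partition(":")
        if not separator:
            continue  # skip lines without a colon, such as the version banner
        info[key.strip()] = value.strip()
    return info


def has_feature(info, feature):
    """Tell whether a feature is listed under 'Filesystem features'."""
    return feature in info.get("Filesystem features", "").split()


info = parse_tune2fs(SAMPLE)
print("journal:", has_feature(info, "has_journal"))
print("dir_index:", has_feature(info, "dir_index"))
print("default mount options:", info["Default mount options"].split())
```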
cat /proc/mounts is useful to check the mount options of our filesystems (only some interesting ones are listed here):
rootfs / rootfs rw 0 0
/dev/root / ext3 rw,data=ordered 0 0
/dev/md0 /boot ext3 rw,data=ordered 0 0
/dev/md4 /var ext3 rw,data=ordered 0 0
/dev/md3 /tmp ext3 rw,data=ordered 0 0
/dev/md5 /data ext3 rw,data=ordered 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
/dev/md4 /var/named/chroot/var/run/dbus ext3 rw,data=ordered 0 0
The data=ordered mount parameter tells us the journaling configuration for the partition.
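The same check can be scripted. As a rough sketch (the function name is ours, and the sample is a shortened copy of the listing above), this extracts the data= mode of each mount point from text in the /proc/mounts format:

```python
SAMPLE_MOUNTS = """\
rootfs / rootfs rw 0 0
/dev/root / ext3 rw,data=ordered 0 0
/dev/md5 /data ext3 rw,data=ordered 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
"""


def journal_modes(mounts_text):
    """Map each mount point to its data= journaling mode, when it has one."""
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete mount entry
        mount_point, options = fields[1], fields[3].split(",")
        for option in options:
            if option.startswith("data="):
                modes[mount_point] = option[len("data="):]
    return modes


print(journal_modes(SAMPLE_MOUNTS))
# On a live Linux system, the same function can be fed the real file:
# journal_modes(open("/proc/mounts").read())
```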
So what is journaling?
It’s one of the great improvements of ext3: a journal is a special log on the disk that keeps track of changes about to be made. It ensures that, in case of failure, the filesystem can quickly recover without loss of information.
There are 3 settings for the journaling feature:
data=journal is the most secure but also the slowest option, since all data and metadata are written to the journal first: the whole operation needs to be completed before any other operation can be completed. It’s sort of like going to the bank for a deposit, filling in the paperwork and making sure the teller puts the money in the vault before you leave.
data=ordered is usually the default compromise: you fill in the paperwork and remind the teller to put the money in the vault asap.
data=writeback is the fastest, but you can’t be absolutely sure that things will be done in time to prevent any loss if a problem occurs soon after you’ve asked for the data to be written.
In normal circumstances all 3 end up the same way: data is eventually written to disk and everything is fine.
Now, if there is a crash just as the data is being written, only the journal option would guarantee that everything is safe. The ordered option is fairly safe too, because the money should be in the vault soon after you left; most systems use this option by default.
If you want to boost your performance and use writeback, you should make sure that:
- you have a good power-supply backup to minimise the risk of power failure
- you have a good data backup strategy
- you’re ok with the risk of losing the data that was written right before the crash.
To change the journaling option, simply use tune2fs with the appropriate option:
# tune2fs -o journal_data_writeback /dev/md5
Now that we’ve changed the available options for our partition, we need to tell the system to use them.
Edit /etc/fstab and add data=writeback to the options column:
/dev/md5 /data ext3 defaults,data=writeback 1 2
Next time our partition is mounted, it will use the new option. For that we can either reboot or remount the partition:
# mount -o remount /data
There is another option that can have a very dramatic effect on performance, probably even more than the journaling options above.
By default, whenever you read a file the kernel will update its last access time, meaning that we end up with a write operation for every read!
Since this is required for POSIX compliance, almost all Linux distributions leave this setting alone by default.
For a file server, this can have drastic consequences on performance.
To disable this time-consuming and (for a file server) not very useful feature, simply add the noatime option to the fstab mount options:
/dev/md5 /data ext3 defaults,noatime,data=writeback 1 2
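If the same change has to be made on several servers, the options column can be edited with a script rather than by hand. A minimal sketch, where add_mount_options is a hypothetical helper of ours that assumes a complete six-field fstab entry and does not preserve the original column spacing:

```python
def add_mount_options(fstab_line, new_options):
    """Return the fstab entry with new_options merged into its options column."""
    fields = fstab_line.split()
    if len(fields) != 6:
        raise ValueError("expected 6 fields, found %d" % len(fields))

    options = fields[3].split(",")
    for new_option in new_options:
        key = new_option.split("=")[0]
        # Drop any existing setting for the same key (e.g. data=ordered)
        # so that we never end up with two conflicting values.
        options = [opt for opt in options if opt.split("=")[0] != key]
        options.append(new_option)

    fields[3] = ",".join(options)
    return " ".join(fields)


print(add_mount_options("/dev/md5 /data ext3 defaults 1 2",
                        ["noatime", "data=writeback"]))
```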
Note that some software, such as mail clients like mutt, relies on access times being updated. If you properly keep your company data in a dedicated partition, you can enable these mount options only for that partition and keep other options for the root filesystem.
Dealing with errors in fstab
After doing the above on one of the servers, I realized that I made a typo when editing
This resulted in the root filesystem being mounted read-only, making fstab impossible to edit…
To make matters worse, this machine was a few thousand miles away and could not be accessed physically….
Remounting the root filesystem resulted in errors:
# mount -o remount,rw /
mount: / not mounted already, or bad option
After much trial and rebooting, this worked (you need to specify all mounting options, to prevent the wrong defaults from being read from /etc/mtab):
# mount -o rw,remount,defaults /dev/md2 /
After that, I could edit /etc/fstab and correct the typo…
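A quick sanity check before rebooting or remounting can catch this kind of typo. The following is only a sketch under stated assumptions: check_fstab is a hypothetical helper of ours, it expects every entry to have all six fields as in the examples above, and it only catches structural mistakes, not misspelled option names.

```python
def check_fstab(text):
    """Return a list of (line number, problem) for suspicious fstab entries."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # ignore blank lines and comments

        fields = stripped.split()
        if len(fields) != 6:
            # A stray space inside the options column typically ends up here.
            problems.append((number, "expected 6 fields, found %d" % len(fields)))
            continue

        options, dump, passno = fields[3], fields[4], fields[5]
        if "" in options.split(","):
            problems.append((number, "empty mount option (stray comma?)"))
        if not (dump.isdigit() and passno.isdigit()):
            problems.append((number, "dump and pass fields must be numbers"))
    return problems


# A space slipped in after the comma: the entry now has 7 fields.
print(check_fstab("/dev/md5 /data ext3 defaults, noatime 1 2"))
```

Running such a check against a copy of the edited file, before the change goes live, is cheap insurance when the machine cannot be reached physically.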
How much these options will improve performance really depends on how your data is used: the improvements should be perceptible if your directories are filled with large numbers of small files.
Deletion should also be faster.
I often have to test String, Variant or Object variables that have no content and could be considered ‘blank’.
The problem is that testing for “blankness” can mean different things for different types:
- For an Object type, the variable can be
- For a String type, the string can have no content at all:
- For a Variant type, the variable can have any of the following attributes or values:
  - it can be Missing if the variable is an unused optional parameter,
  - it can be Empty if it was never assigned,
  - it can be Null if, for instance, it’s bound to a nullable field or unbound with no value,
  - it can be an empty string
  - it can be
When having to check these variables in code, it can be tiresome to go through testing some of these possibilities just to find out whether your variable does or does not contain something useful, regardless of the type of variable you are using.
To avoid having to do all these tests, make the code a bit more tidy and allow me to move on to more important things, I use this small utility function quite often:
So now I don’t have to worry so much about the type of the variable I’m testing when I want to know if it contains useful data:
IsBlank() doesn’t replace the other tests but I found it to be more straightforward to use in most cases.
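The same idea carries over to other languages. As a rough illustration only, here is what a similar helper could look like in Python. It is not the VBA function discussed above: the name is_blank is ours, None stands in loosely for the "no value" cases, and treating whitespace-only strings as blank is an assumption.

```python
def is_blank(value):
    """Tell whether value holds nothing useful: None or an empty string."""
    if value is None:
        return True
    if isinstance(value, str):
        # Assumption: a string containing only whitespace counts as blank.
        return value.strip() == ""
    # Anything else (numbers, objects, ...) is considered to hold a value.
    return False


for candidate in (None, "", "   ", "some text", 0):
    print(repr(candidate), is_blank(candidate))
```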