File systems

Summary

All Linux disks have been converted from ext2 to ext3.

In the longer term, you should switch the large video drives to a file system that explicitly supports large files. XFS does a great job with large files. Either one of these may be fine -- there's no big hurry.

Software (sourceforge projects)

Guides

How to create FAT32 drives

In Linux, issue "cfdisk /dev/sda" and select Type c, which is W95 FAT32 (LBA). The choices are misleadingly called file systems -- these are partition types. Then issue "mkfs.vfat /dev/sda1" and you're done. This is very fast and the resulting drives are automatically detected and mounted by WinXP.
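The Linux steps above can also be scripted. What follows is a minimal sketch rather than a tested recipe. It assumes sfdisk from util-linux (not mentioned in these notes) as a non-interactive stand-in for cfdisk, and mkfs.vfat from dosfstools. The helper name fat32_plan is made up for illustration. It only prints the commands, so nothing is written to a disk by accident.

```shell
#!/bin/sh
# fat32_plan DISK: print (do not run) the commands that would create one
# FAT32 (LBA) partition spanning DISK and then format it.
# Hypothetical helper; review the output, then run it by hand as root.
fat32_plan() {
    disk=$1
    [ -n "$disk" ] || { echo "usage: fat32_plan /dev/sdX" >&2; return 1; }
    # MBR partition type c = W95 FAT32 (LBA), the same type chosen in cfdisk.
    # Assumption: your sfdisk accepts the named-field form "type=c".
    printf "echo 'type=c' | sfdisk %s\n" "$disk"
    # -F 32 forces FAT32 instead of letting mkfs.vfat pick a FAT size.
    # Appending "1" matches /dev/sdX naming only (not e.g. NVMe devices).
    printf "mkfs.vfat -F 32 %s1\n" "$disk"
}

fat32_plan /dev/sda
```

Printing the plan first is a deliberate choice: both commands destroy whatever is on the target disk, so it is worth checking the device name before running them.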

In Windows XP, right-click on My Computer | Manage | Storage | Disk Management, and then create the partitions -- this is fast and easy. However, formatting is a hassle -- WinXP won't give you the option to use anything other than NTFS, at least not on large partitions, although FAT32 supports them. Open a Command shell:

c:\> format /?
c:\> format f: /fs:FAT32 /v:1

You may discover WinXP spends an hour appearing to format the drive, and then tells you the partition is too big. For details, see MS format parameters.

File System Comparisons

64-bit file systems

JFS comes out the clear winner in a 64-bit file system benchmarking contest -- the fastest, and with very low CPU usage. Ext2 does well; 64-bit ext3 and XFS do poorly (though both do very well on 32-bit). Reiser4 on 64-bit is OK, just not quite as fast. In brief, XFS doesn't seem quite ready for 64-bit systems.

NTFS

On 1 November 2006, ntfs-3g was released in Debian amd64, promising full write support to NTFS through FUSE -- but with a warning that the code is still not fully tested, especially not on amd64, though it is reported to be working fine.

It looks like full write access to NTFS has been achieved by the Captive project -- "You can mount your Microsoft Windows NT, 200x or XP partition as a transparently accessible volume for your GNU/Linux" -- by using the original Microsoft Windows ntfs.sys driver -- but the project was soon abandoned.

The 1.1.5 packages are in Sigillo:/spare/software/debs, downloaded from http://debian.okey.net/debian-uo/dists/sid/misc/pool/captive-ntfs/ and not yet tested.

The 1.1.4 packages are in Knoppix 3.4 (the current version is 1.1.5):

  • ii captive 1.1.4 NTFS filesystem using Microsoft Windows drivers
  • ii captive-install 1.1.4 Instant installer for Microsoft Windows platform filesystem access
  • ii captive-lufs 1.1.4 LUFS module for Microsoft Windows platform filesystem access

There are also ext2fs readers for WinXP. You should now reformat Sigillo's /vx drive to ext2 -- see ToDo.

File system shootout
http://fsbench.netnation.com/

On 2.6.0-test5 in October 2003, JFS and ext2 do well, along with XFS. Ext3 does really poorly, and ReiserFS is worst in several cases.

Comparing large file systems
http://aurora.zemris.fer.hr/filesystems/

In this independent test, XFS comes out on top, but ext3 is such a close second that it's not worth the hassle to change. FAT32 (VFAT) is many times slower and ends up at the bottom.

IBM DeveloperWorks tutorial series
Advanced filesystem implementor's guide
http://www-106.ibm.com/developerworks/library/l-fs11.html

Abstract
With the 2.4 release of Linux come a host of new filesystem possibilities, including ReiserFS, XFS, GFS and others. Sure, these filesystems sound cool, but what exactly can they do, what are they good at and exactly how do you go about safely using them in a Linux production environment? In the advanced filesystem implementor's guide, Daniel Robbins answers these questions by showing you how to set up these new advanced filesystems under Linux 2.4. Along the way, he shares valuable practical implementation advice, performance information and important technical notes so that your new filesystem experience is as pleasant as possible. In this, the first article in the series, he explains the benefits of journalling and ReiserFS.

ext3
http://www-106.ibm.com/developerworks/linux/library/l-fs7/
http://www-106.ibm.com/developerworks/linux/library/l-fs8/

"I think you'll find that ext3 has several surprising and intriguing capabilities. In this article, I'll give you a good understanding of how ext3 compares to the other journaling filesystems currently available."

Ext3 allows you to choose from one of three data journaling modes at filesystem mount time: data=writeback, data=ordered, and data=journal.

To specify a journal mode, you can add the appropriate string (data=journal, for example) to the options section of your /etc/fstab, or specify the -o data=journal command-line option when calling mount directly. If you'd like to specify the data journaling method used for your root filesystem (data=ordered is the default), you can use a special kernel boot option called rootflags. So, if you'd like to put your root filesystem into full data journaling mode, add rootflags=data=journal to your kernel boot options.
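As an illustration of the fstab route described above, here is a small sketch that builds an /etc/fstab entry for a non-root ext3 filesystem with a chosen journaling mode. The function name ext3_fstab_line and the fixed "defaults ... 0 2" fields are assumptions made for the example. It only prints text and rejects anything other than the three modes named in the article.

```shell
#!/bin/sh
# ext3_fstab_line DEVICE MOUNTPOINT MODE
# Print an /etc/fstab entry that selects one of ext3's three data
# journaling modes. Hypothetical helper; it does not touch /etc/fstab.
ext3_fstab_line() {
    dev=$1
    mnt=$2
    mode=$3
    case $mode in
        writeback|ordered|journal) ;;
        *) echo "unknown journaling mode: $mode" >&2; return 1 ;;
    esac
    # Dump field 0, fsck pass 2: the usual values for a non-root filesystem.
    printf '%s %s ext3 defaults,data=%s 0 2\n' "$dev" "$mnt" "$mode"
}

ext3_fstab_line /dev/hdc5 /boot journal

# One-off equivalent when calling mount directly:
#   mount -t ext3 -o data=journal /dev/hdc5 /boot
# For the root filesystem, use the kernel boot option instead:
#   rootflags=data=journal
```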

data=writeback mode

In data=writeback mode, ext3 doesn't do any form of data journaling at all, providing you with similar journaling found in the XFS, JFS, and ReiserFS filesystems (metadata only). As I explained in my previous article, this could allow recently modified files to become corrupted in the event of an unexpected reboot. Despite this drawback, data=writeback mode should give you the best ext3 performance under most conditions.

data=ordered mode

In data=ordered mode, ext3 only officially journals metadata, but it logically groups metadata and data blocks into a single unit called a transaction. When it's time to write the new metadata out to disk, the associated data blocks are written first. data=ordered mode effectively solves the corruption problem found in data=writeback mode and most other journaled filesystems, and it does so without requiring full data journaling. In general, data=ordered ext3 filesystems perform slightly slower than data=writeback filesystems, but significantly faster than their full data journaling counterparts.

When appending data to files, data=ordered mode provides all of the integrity guarantees offered by ext3's full data journaling mode. However, if part of a file is being overwritten and the system crashes, it's possible that the region being written will contain a combination of original blocks interspersed with updated blocks. This is because data=ordered provides no guarantees as to which blocks are overwritten first, so you can't assume that just because overwritten block x was updated, that overwritten block x-1 was updated as well. Instead, data=ordered leaves the write ordering up to the hard drive's write cache. In general, this limitation doesn't end up negatively impacting people very often, since file appends are generally much more common than file overwrites. For this reason, data=ordered mode is a good higher-performance replacement for full data journaling.

data=journal mode

data=journal mode provides full data and metadata journaling. All new data is written to the journal first, and then to its final location. In the event of a crash, the journal can be replayed, bringing both data and metadata into a consistent state. Theoretically, data=journal mode is the slowest journaling mode of all, since data gets written to disk twice rather than once. However, it turns out that in certain situations, data=journal mode can be blazingly fast.

ext3's data=journal mode is incredibly well-suited to situations where data needs to be read from and written to disk at the same time. Therefore, ext3's data=journal mode, which was assumed to be the slowest of all ext3 modes in nearly all conditions, actually turns out to have a major performance advantage in busy environments where interactive IO performance needs to be maximized.

Use the following for instructions on how to convert your file system from ext2 to ext3:
Andrew Morton, Using the ext3 filesystem in 2.4 kernels

XFS
http://www-106.ibm.com/developerworks/linux/library/l-fs9/

XFS is SGI's free, 64-bit high-performance filesystem for Linux. In my tests, I found XFS to be generally quite speedy. XFS consistently won all tests that involved manipulating large files, which should be expected since it has been designed and tuned over the years to do this very well. I also discovered that XFS has a singular performance quirk: it doesn't delete files very quickly. [A patch may be available; this was Jan 02.] Other than that, XFS performance was very close to that of ReiserFS and generally surpasses that of ext3.

My results show that XFS is the best filesystem to use if you need to manipulate large files. For small to medium-sized files, XFS can be competitive with, and sometimes even faster than, ReiserFS if you create and mount your XFS filesystem with some performance-enhancing options.

Deploying XFS
http://www-106.ibm.com/developerworks/linux/library/l-fs10/

SGI's XFS page
http://oss.sgi.com/projects/xfs

Filesystem update (June 2002)
http://www-106.ibm.com/developerworks/linux/library/l-fs11/

According to Hans Reiser and his team of developers, there are some very nice improvements that are scheduled to appear in the 2.4.20_pre1 kernel, including Chris Mason's data journaling (like ext3's "data=journal" mode!) support, new block allocation code that scales much better, and several improvements in large file performance, resulting in an up to 15% performance improvement when reading large files from IDE drives. Beyond these immediate and significant improvements, we are likely to soon see ReiserFS support the equivalent of ext3's "data=ordered" mode. At that point, ReiserFS will offer equivalent data integrity features to those found in the ext3 filesystem.

AFS

OpenAFS -- power filesharing
http://www.openafs.org/

Installation and configuration history

My current systems are all running ext2. This lacks journaling, which can be a hassle. However, ext2 performance is great and I've not had serious problems. In contrast, I lost a whole disk to ReiserFS.

The immediately available option is to convert ext2 to ext3. The "tune2fs -j" command converts the "standard" Linux ext2 file system to the new ext3 journaling file system. The cost may be a slight (?) decrease in speed, which could matter for digital video capture. Or not.

On 28 June I decided to upgrade my file system on the main system disks from ext2 to ext3, following the instructions in Andrew Morton, Using the ext3 filesystem in 2.4 kernels.

I first verified that the ext3 file system was supported in the kernel -- note, however, that the recommended JBD debugger is not. Note that the kernel help screen says,

Other than adding the journal to the file system, the on-disk format of ext3 is identical to ext2. It is possible to freely switch between using the ext3 driver and the ext2 driver, as long as the file system has been cleanly unmounted, or e2fsck is run on the file system.

To add a journal on an existing ext2 file system or change the behavior of ext3 file systems, you can use the tune2fs utility ("man tune2fs"). To modify attributes of files and directories on ext3 file systems, use chattr ("man chattr"). You need to be using e2fsprogs version 1.20 or later in order to create ext3 journals (available at http://sourceforge.net/projects/e2fsprogs/).

In brief, it looks like a piece of cake to convert to ext3.

I appear to have E2fsprogs 1.24a (September 2, 2001) -- this may be worth updating from http://e2fsprogs.sourceforge.net/ -- I got 1.27. I did a ./configure --enable-dynamic-e2fsck --enable-fsck --enable-jfs-debug --with-cc=/usr/local/bin/gcc and it compiled and installed great. You may have been fine with what you had -- the instructions at http://www.zip.com.au/~akpm/linux/ext3/ext3-usage.html only require 1.23. I did the same on cyberspace, just to keep the systems similar.

With e2fsprogs 1.23 and later, you can use file system auto in fstab and the system will autodetect it. However, using filesystem type auto for the root filesystem confuses /bin/df, and causes it to not print out information for the root filesystem. Fix: always specify the root filesystem as ext3 in /etc/fstab.
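The fix above is easy to check mechanically. The sketch below assumes the usual whitespace-separated fstab layout (device, mount point, type, ...). The function names root_fstype and warn_if_root_auto are hypothetical, introduced only for this example.

```shell
#!/bin/sh
# root_fstype [FSTAB]: print the filesystem type listed for "/" in FSTAB.
# Hypothetical helper; FSTAB defaults to /etc/fstab.
root_fstype() {
    # Skip comment lines; field 2 is the mount point, field 3 the type.
    awk '$1 !~ /^#/ && $2 == "/" { print $3; exit }' "${1:-/etc/fstab}"
}

# warn_if_root_auto [FSTAB]: return non-zero if "/" is listed as type auto,
# which is the setting that confuses /bin/df.
warn_if_root_auto() {
    if [ "$(root_fstype "$1")" = "auto" ]; then
        echo "root filesystem is type auto; set it to ext3 in fstab" >&2
        return 1
    fi
}

# Typical use:
#   warn_if_root_auto /etc/fstab && echo "root entry looks fine"
```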

I'm supposed to need a new package of util-linux from http://www.kernel.org/pub/linux/utils/util-linux/ but it doesn't look like there have been serious changes since my 2.11i version -- we're now at version 2.11r.

Now for the scary part: issue the conversion command. I did,

tune2fs -j /dev/hdc8

which is my main partition on gubbio. This took a second or two, saying

tune2fs 1.27 (8-Mar-2002)
Creating journal inode: done
This filesystem will be automatically checked every 28 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override.
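To see or change the check schedule mentioned in that message, something like the following sketch may help. It assumes the field names "Maximum mount count" and "Check interval" as printed by "tune2fs -l" in the e2fsprogs versions I know; yours may differ. The helper name check_schedule is made up for this example.

```shell
#!/bin/sh
# check_schedule: read "tune2fs -l DEVICE" output on stdin and print only
# the two fields that control the periodic fsck. Hypothetical helper.
check_schedule() {
    grep -E '^(Maximum mount count|Check interval):'
}

# Typical use (as root):
#   tune2fs -l /dev/hdc8 | check_schedule
# To change the schedule, for example to every 50 mounts or 90 days:
#   tune2fs -c 50 -i 90d /dev/hdc8
```

The 180-day interval from the message above shows up in seconds in the listing, that is 180 x 86400 = 15552000.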

I did the same for the boot partition, /dev/hdc5. I then changed /etc/fstab to read ext3 for these two partitions. This seemed so painless I'm tempted to do it on /vm too, but let's hold off -- performance might suffer.

I did the same on cyberspace, for /dev/hda1 (/boot) and /dev/hda3 (/). Interestingly, presumably because I used e2fsck 1.24, the response said the filesystem would be checked every 21 and 22 mounts instead of 28 above. I updated e2fsck anyway, but this looks fine. I updated the fstab there too.

Note that df -Th will show you the file system types.
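For use in a script, where parsing df output is awkward, a sketch along these lines reads the type straight from the kernel's mount table. It assumes /proc/mounts is available and uses the standard field order (device, mount point, type, options). The function name fstype_of is hypothetical.

```shell
#!/bin/sh
# fstype_of MOUNTPOINT [MOUNTS_FILE]: print the type of the filesystem
# mounted at MOUNTPOINT. Hypothetical helper; reads /proc/mounts by default.
fstype_of() {
    # The last matching entry wins, since a later mount on the same
    # mount point hides an earlier one (e.g. rootfs followed by the real /).
    awk -v mp="$1" '$2 == mp { t = $3 } END { if (t != "") print t }' \
        "${2:-/proc/mounts}"
}

# Typical use after the conversion and a reboot:
#   [ "$(fstype_of /)" = "ext3" ] && echo "root is running as ext3"
```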

I rebooted both machines; they rebooted into ext3. Everything is working fine; I decided to do the same on the two video drives -- file corruption could be a big hassle.

 

 


Maintained by Francis F. Steen, Communication Studies, University of California Los Angeles


CogWeb