Some code I've written

I started writing code for the Linux kernel in 1998, and my first patches were accepted into mainline in 2000. Here are some of my more interesting contributions.

Relative atime

I invented and implemented relative atime (relatime), now the default mount option for all Linux file systems. It reduces writes to storage by updating a file's access time only when the previous access time is no newer than the file's last modification or change time. That is enough to keep most atime-dependent software working, which made it possible to turn relatime on by default, thereby saving billions of unnecessary writes to storage.
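The rule is easiest to see as a single comparison. The following is a minimal sketch in Python, not the kernel patch itself: the `InodeTimes` class and both function names are hypothetical, and it models only the timestamp comparison described above. Mainline kernels may apply additional conditions.

```python
from dataclasses import dataclass


@dataclass
class InodeTimes:
    """Hypothetical stand-in for the three timestamps kept per inode."""
    atime: float  # last access
    mtime: float  # last data modification
    ctime: float  # last status change


def relatime_should_update(times: InodeTimes) -> bool:
    """True if the stored atime is no newer than mtime or ctime."""
    return times.atime <= times.mtime or times.atime <= times.ctime


def record_read(times: InodeTimes, now: float) -> bool:
    """Apply the relatime rule for one read; return True if atime was updated."""
    if relatime_should_update(times):
        times.atime = now
        return True
    return False
```

In this model, the first read after a write refreshes atime, and later reads leave it alone until the file is modified again, which is where the saved writes come from.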

Here are the original patches:

Relative atime kernel patch against 2.6.18-rc4

Relative atime userspace patch against util-linux 2.13-pre7

Test script to test relatime

Bare bones mount program to remount with either relatime or lazyatime (depending on kernel patch)

Older lazy atime patch, which updates the atime in memory but doesn't bother writing it to disk. The major problem with this is that atime can go backwards if an inode is evicted from memory.

Lazy atime patch against 2.6.18-rc3
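To illustrate why lazy atime can make the access time go backwards, here is a toy model in Python. It is only a sketch under simple assumptions: the `LazyAtimeStore` class and its methods are hypothetical, and the in-memory and on-disk inode copies are reduced to two dictionaries.

```python
class LazyAtimeStore:
    """Toy model: atime is updated in memory and never written back."""

    def __init__(self):
        self.on_disk = {}    # inode number -> atime as stored on disk
        self.in_memory = {}  # inode number -> atime held in the cached inode

    def create(self, ino: int, now: float) -> None:
        self.on_disk[ino] = now

    def read(self, ino: int, now: float) -> None:
        # The lazy part: only the in-memory copy sees the new atime.
        self.in_memory[ino] = now

    def evict(self, ino: int) -> None:
        # The cached inode is dropped without writing its atime to disk.
        self.in_memory.pop(ino, None)

    def atime(self, ino: int) -> float:
        # A cached inode wins; otherwise fall back to the on-disk value.
        return self.in_memory.get(ino, self.on_disk[ino])
```

After a `read()` followed by an `evict()`, `atime()` falls back to the older on-disk value, which is the backwards step described above.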

64-bit support for e2fsprogs

I wrote the majority of the changes needed to support 64-bit ext4 file systems in userland utilities such as e2fsck and mke2fs. These changes are now in mainline.


Chunkfs

Chunkfs is a set of techniques for compartmentalizing file system metadata so that small pieces can be checked and repaired independently. A prototype is available as a concrete example of what this would look like.

e2fsck IO parallelization

I implemented parallel readahead of metadata in e2fsck using a pretty hysterical combination of fadvise(), read(), and blindfolded buffer cache manipulation. It yields roughly a 50% improvement in elapsed time on a RAID5 array.

Current e2fsck readahead patch
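The fadvise() half of that combination can be sketched in a few lines of Python. This is not the e2fsck patch: it shows only the general idea of hinting the kernel about several ranges up front and then reading them, and it leaves out the buffer cache manipulation entirely. The function names and the `(offset, length)` range format are assumptions made for the example.

```python
import os


def prefetch_ranges(fd: int, ranges: list) -> int:
    """Ask the kernel to start reading each (offset, length) range.

    Returns the number of hints issued (0 if the platform lacks posix_fadvise).
    """
    if not hasattr(os, "posix_fadvise"):
        return 0
    issued = 0
    for offset, length in ranges:
        # WILLNEED is advisory: the kernel may begin readahead asynchronously.
        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_WILLNEED)
        issued += 1
    return issued


def read_ranges(fd: int, ranges: list) -> list:
    """Read each (offset, length) range; ideally the data is already cached."""
    return [os.pread(fd, length, offset) for offset, length in ranges]
```

Issuing all the hints before the first read lets the storage layer work on several requests at once, which is where a multi-disk array can help.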

malloc() tuning

The default malloc() threshold for allocating using mmap() versus brk() was designed for 32-bit systems with 2001-era memory capacities, and winds up doing too many allocations using mmap(). This is especially harmful when the application is doing an alloc/write/free cycle, as the memory has to be cleared unnecessarily between each free and alloc. Arjan van de Ven and I wrote a patch to glibc that dynamically adjusts the threshold based on the sizes of the blocks passed to free().
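A simplified model of the idea, written in Python rather than as a glibc change: when an mmap-backed block is freed, the threshold rises to that block's size, so the next request of the same size is served from the heap instead. The class name, the constants (page size, header overhead, starting threshold, ceiling), and the rounding rule are all assumptions chosen for illustration, not values taken from the patch.

```python
PAGE_SIZE = 4096                   # assumed page size
HEADER_SIZE = 16                   # assumed per-block bookkeeping overhead
INITIAL_THRESHOLD = 128 * 1024     # assumed starting threshold
MAX_THRESHOLD = 32 * 1024 * 1024   # assumed ceiling for the threshold


def mapped_size(request: int) -> int:
    """Size of the mapping for a request: header added, rounded up to pages."""
    total = request + HEADER_SIZE
    return (total + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE


class DynamicThreshold:
    """Toy model of an mmap threshold that grows as mapped blocks are freed."""

    def __init__(self, threshold: int = INITIAL_THRESHOLD,
                 ceiling: int = MAX_THRESHOLD):
        self.threshold = threshold
        self.ceiling = ceiling

    def allocate(self, request: int) -> str:
        """Return which mechanism would serve this request: 'mmap' or 'brk'."""
        return "mmap" if request >= self.threshold else "brk"

    def free(self, request: int, source: str) -> None:
        """On freeing an mmap-backed block, raise the threshold to its size."""
        if source != "mmap":
            return
        size = mapped_size(request)
        if self.threshold < size <= self.ceiling:
            self.threshold = size
```

In this model, an alloc/free cycle of 1 MiB blocks pays for mmap() once; after the first free the threshold sits just above 1 MiB and later requests of that size come from the heap. Blocks larger than the ceiling never move the threshold.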

Update: Merged as of August 24, 2006 (with some additional tweaks).

Dynamic mmap threshold patch

ebizzy workload generator

I needed a workload that represented a class of applications fairly common in web application servers (think search and indexing services). Basically, a lot of threads all do the same kind of thing to a big (many GBs) chunk of memory: pick some chunk of memory, make a copy, look up some record in it using binary search, and free that memory again. Microbenchmarks don't cut it, because the performance issues only kick in when everything is combined. ebizzy tries to replicate this for system tuning purposes. Licensed under the GPL.
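The access pattern can be sketched as follows. This is a small Python illustration of the pick/copy/search/free loop, not ebizzy itself; the function names, chunk layout, and default sizes are all assumptions. In CPython the interpreter lock serializes the threads, so treat it as a description of the pattern rather than a benchmark.

```python
import bisect
import random
import threading


def worker(chunks, iterations, seed, results, index):
    """One thread: pick a chunk, copy it, binary-search a record, drop the copy."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(iterations):
        chunk = rng.choice(chunks)            # pick some chunk of memory
        copy = list(chunk)                    # make a private copy
        key = rng.choice(copy)                # a record known to be present
        pos = bisect.bisect_left(copy, key)   # look it up by binary search
        if pos < len(copy) and copy[pos] == key:
            hits += 1
        del copy                              # free that memory again
    results[index] = hits


def run(num_threads=4, num_chunks=8, records_per_chunk=1000, iterations=200):
    """Build sorted chunks, run the workers, and return per-thread hit counts."""
    chunks = [
        sorted(random.Random(i).sample(range(10**6), records_per_chunk))
        for i in range(num_chunks)
    ]
    results = [0] * num_threads
    threads = [
        threading.Thread(target=worker, args=(chunks, iterations, t, results, t))
        for t in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Every lookup searches for a key taken from the copy itself, so each thread's hit count equals its iteration count; the interesting part is the allocation and copy traffic, not the result.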

The source is now hosted on SourceForge.

OLD 0.1 release

Ext2 fsck reduction project

These patches are intended to reduce the average time spent on fscking ext2 file systems. For more info, see the paper (draft).

Latest patch for fswide dirty bit

Port of ext3 reservations to ext2. Martin Bligh and Andrew Morton cleaned this up and it's floating around in one of Andrew's trees somewhere; ask for it.