Saving billions of disk writes with ten lines of code

Summary

I invented and implemented relative atime, a Linux VFS mount option that eliminated billions of unnecessary writes to Linux file systems with fewer than 10 lines of code.

Details

In UNIX file systems, “access time” or atime records the last time a file was read. Atime is stored in the file’s inode, and an updated inode must eventually be written back to the storage device. This means that, by default, every file read generates a tiny write. While these tiny random writes can be batched and optimized, they waste energy and can significantly degrade file system performance when many different files are being read.

Turning off atime updates saves power and improves performance. But we can’t ship a Linux kernel that does it by default because some applications need atime updates to run correctly. For example, the mutt mail reader program uses atime updates to figure out if new mail has been delivered, by comparing the last written time of the file with the last read time (atime).
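To make the new-mail check concrete, here is a minimal Python sketch of the comparison described above. It is not mutt's actual code: the function name `has_unread_mail` and the demo timestamps are hypothetical, and the demo sets the timestamps explicitly with `os.utime` so the result does not depend on how the file system is mounted.

```python
import os
import tempfile


def has_unread_mail(mbox_path):
    """Return True if the file was written more recently than it was read."""
    st = os.stat(mbox_path)
    # Compare integer nanosecond timestamps to avoid float rounding.
    return st.st_mtime_ns > st.st_atime_ns


# Create a scratch file standing in for a mailbox (hypothetical demo data).
with tempfile.NamedTemporaryFile(delete=False) as f:
    demo_path = f.name

SECOND = 10**9  # nanoseconds per second

# Last read at t=1000s, last written at t=2000s: mail arrived after the read.
os.utime(demo_path, ns=(1000 * SECOND, 2000 * SECOND))
new_mail_waiting = has_unread_mail(demo_path)

# Read again at t=3000s with no new write: nothing new to report.
os.utime(demo_path, ns=(3000 * SECOND, 2000 * SECOND))
mail_already_read = not has_unread_mail(demo_path)

os.remove(demo_path)
```

With atime updates turned off entirely, the second state is never reached: the atime stays older than the write time, so a check like this keeps reporting new mail.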

However, most Linux systems did not run any applications that relied on atime updates. Because turning atime updates off took explicit action, many of these systems left them on without needing them, wasting energy and reducing performance. The cost was felt most on laptops and other personal computers that weren’t managed by professional systems administrators.

I invented and implemented relative atime, which eliminated most but not all atime updates in such a way that applications continued to just work. My insight was that the applications that were broken by turning off atime didn’t really want to know the EXACT time a file had last been read. What they actually wanted to know was, “Has anyone read this file since it was last written?” And that required only one atime update after the last file write, instead of an atime update every time it was read.

Think of it like a mailbox flag: if the flag is up, it means there is new mail for the mail carrier to pick up. So the mail carrier opens the mailbox, takes out the mail, and lowers the mailbox flag. If the next day the flag is still down, it means no new mail has been added to the mailbox and the mail carrier does not need to open it to check, saving them time and energy.

I added a few lines of code to only update the atime if the last written time was more recent than the previous atime. It worked perfectly in all the cases where noatime broke applications.
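The rule can be modeled in a few lines of Python. This is a simplified sketch, not the kernel patch: the names `should_update_atime` and `count_atime_writes` are hypothetical, timestamps are plain integers, and treating equal timestamps as "written since the last read" is an assumption made here for files read and written within the same clock tick.

```python
def should_update_atime(atime, mtime):
    """Relative-atime rule: record a read only if the file has been
    written at or after the last recorded read (>= is an assumption)."""
    return mtime >= atime


def count_atime_writes(events, relative):
    """Replay ("write", t) and ("read", t) events; count atime updates."""
    atime = mtime = 0
    atime_writes = 0
    for kind, t in events:
        if kind == "write":
            mtime = t
        elif not relative or should_update_atime(atime, mtime):
            # A read that updates atime dirties the inode: one extra write.
            atime = t
            atime_writes += 1
    return atime_writes


# One write at t=1, followed by 1000 reads at t=2..1001.
events = [("write", 1)] + [("read", t) for t in range(2, 1002)]
strict_writes = count_atime_writes(events, relative=False)
relative_writes = count_atime_writes(events, relative=True)
```

In this toy trace the strict mode records all 1000 reads, while the relative rule records only the first read after the write.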

Relative atime needed one additional tweak to allow it to be turned on by default: some programs delete files when they haven’t been accessed for a while. So I added code to also update the atime if more than 24 hours had passed since the atime was last updated.
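Putting the two conditions together gives a decision function along these lines. Again, this is an illustrative Python model rather than the kernel code: `relatime_needs_update` and `DAY_SECONDS` are names chosen for this sketch, timestamps are whole seconds, and the use of "at least 24 hours" as the threshold is an assumption.

```python
DAY_SECONDS = 24 * 60 * 60


def relatime_needs_update(atime, mtime, now):
    """Decide whether a read at time `now` should update the atime."""
    # Condition 1: the file has been written since the last recorded read.
    if mtime >= atime:
        return True
    # Condition 2: the recorded atime is at least a day old, so programs
    # that expire unused files still see the file as recently accessed.
    if now - atime >= DAY_SECONDS:
        return True
    return False


# A file written once at t=0, then read every hour for three days.
atime = mtime = 0
updates = 0
for hour in range(1, 73):
    now = hour * 3600
    if relatime_needs_update(atime, mtime, now):
        atime = now
        updates += 1
```

In this trace, 72 hourly reads produce three atime updates: the first read after the write, and then one per day after that.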

This simple trick required fewer than 10 lines of kernel code. It has saved billions of writes since it was made the default atime mode in 2008. The alternate proposals were far more complicated, resource-intensive, and error-prone, with different methods of batching and flushing atime updates, responding to memory pressure, and the like. I love finding a simple, elegant fix to a long-standing problem.

Original patch

Tweak to update every 24 hours
