Filesystems Matter – Here’s Why
In the simplest terms, a filesystem is the structure that organizes how information is stored on computer storage media such as hard drives, SSDs, and RAID arrays. Many people don't believe that filesystems have a significant impact on the performance of a system and therefore don't give them much thought. In fact, most end users never need to deal with them unless they become corrupted. I'd like to share a little story of mine that I think showcases the true importance of choosing the right filesystem and the effects it can have.
I have an Ubuntu-powered server that contains 4 separate RAID arrays. Of those 4, 2 are used for data storage, and they are named 'Storage' and 'Backup' in accordance with what they are used for. Every night, the server versions some directories from the Storage array and copies them onto the Backup array. The Backup array also stores filesystem images of other computers, along with other backup-related data. The nightly backup used to take about 1-2 hours, but lately it had been running for over 7-8 hours – still going when I got up in the morning. I was intrigued by this, and so I went digging to see if I could find out why everything was slowing down.

What I found genuinely shocked me: the 2 data partitions were NTFS! For those who don't already see the horror, let me explain. First of all, NTFS is a Windows filesystem and has no place on a Linux system; it is also notoriously slow there, since Linux typically handles it through the userspace ntfs-3g driver. I can't explain why I made the arrays NTFS and not ext4, and up until my investigation, that's what I thought they were. As a small test, I tried to copy a 1.5GB file onto the Backup array and was saddened to tears when I saw that the maximum transfer speed was an abysmal 300KB/s! This was a RAID 10 array with a benchmarked write rate of over 300MB/s, yet it could only transfer at a speed 1000x slower! I got the feeling that the filesystem was badly fragmented, but since it was a Linux machine, I had no practical way of checking or defragmenting it.
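In hindsight, a quick check of what filesystem each array was actually mounted with would have caught this years earlier. Below is a minimal Python sketch of that check: it just parses /proc/mounts, and the mount points listed are hypothetical placeholders rather than the real paths on my server.

```python
#!/usr/bin/env python3
"""Report the filesystem type backing each mount point (Linux only).

A minimal sketch of the check I should have run long ago. The paths in
MOUNTS_TO_CHECK are hypothetical examples, not my server's real mounts.
"""

MOUNTS_TO_CHECK = ["/mnt/storage", "/mnt/backup"]  # hypothetical placeholders

def mounted_filesystems():
    """Parse /proc/mounts into a {mount_point: fstype} mapping."""
    table = {}
    with open("/proc/mounts") as f:
        for line in f:
            fields = line.split()
            if len(fields) >= 3:
                mount_point, fstype = fields[1], fields[2]
                table[mount_point] = fstype
    return table

if __name__ == "__main__":
    fs_table = mounted_filesystems()
    for path in MOUNTS_TO_CHECK:
        fstype = fs_table.get(path, "not mounted")
        # NTFS mounted via ntfs-3g usually shows up as "fuseblk" here.
        suspicious = fstype in ("ntfs", "ntfs3", "fuseblk", "vfat")
        flag = "  <-- not a native Linux filesystem!" if suspicious else ""
        print(f"{path}: {fstype}{flag}")
```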
I decided to reformat the array as ext4 to fix the problem, which was no easy task. I spent an entire day copying the 800GB of data to another array, then proceeded with the reformat. After spending about 3 hours copying the data back, I reviewed the results. The first result actually came from the copying process itself: it took about a day to copy the data off the server, but only 3 hours to get it back on. For the transfers I used Samba over gigabit Ethernet. Copying off the NTFS partition, I averaged about 10MB/s with a maximum of roughly 15MB/s; copying the data back onto the ext4 partition, I averaged a very impressive 70MB/s, with a maximum of over 110MB/s.
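The numbers above came from simply timing the copies. If you want to put a number on how fast an array can absorb sequential writes, independent of the network, a rough timed-write test like the Python sketch below will do it. The target path is a hypothetical placeholder, and the size is chosen to mirror the 1.5GB file from my test.

```python
#!/usr/bin/env python3
"""Rough sequential-write throughput test.

A minimal sketch of a timed write, useful for comparing arrays or
filesystems. TARGET is a hypothetical placeholder path.
"""
import os
import time

TARGET = "/mnt/backup/throughput_test.bin"   # hypothetical test file location
SIZE_MB = 1536                               # roughly 1.5GB, as in the story
CHUNK = b"\0" * (1024 * 1024)                # write in 1 MiB chunks

start = time.monotonic()
with open(TARGET, "wb") as f:
    for _ in range(SIZE_MB):
        f.write(CHUNK)
    f.flush()
    os.fsync(f.fileno())                     # make sure data actually reaches the disk
elapsed = time.monotonic() - start

print(f"Wrote {SIZE_MB} MiB in {elapsed:.1f} s ({SIZE_MB / elapsed:.1f} MiB/s)")

os.remove(TARGET)                            # clean up the test file
```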
As you can see, using the right filesystem can make a huge difference in the performance of a system. I'm not claiming that NTFS was entirely to blame in this situation, but it does show that a poorly maintained filesystem can cause real problems. I also encourage those who can use ext4 to do so, thanks to its amazingly high performance.