Linux File Server Ideas

One of the things that computers are very good at is storing lots of data. Storing files is a fundamental requirement in any I.T. environment. There are a number of options available. This is not a complete list but it gives an outline of the most common concepts.

Network Protocols for File Sharing

Linux is very configurable and can talk one protocol or can talk several protocols at once.

Samba is compatible with MS-Windows and is useful in an environment where you want to continue to support Microsoft platforms but also take advantage of Linux capabilities. The disadvantage of Samba is the complexity of its configuration and the care needed to ensure compatibility with other Microsoft file servers (which may talk various dialects of the same fundamental protocol, depending on how old they are and which versions of MS-Windows they are running). The Samba homepage has details of the history and capabilities of this software.

This link gives an excellent plain-English explanation of how Samba was written to be compatible with MS-Windows.

netatalk is compatible with Apple Macintosh machines. See the netatalk homepage for details. This lets you use Linux as a fileserver for Macs and you can get Macs to share files with other system types as well.

NFS is a standard handed down from Sun Microsystems that has gone through several iterations and has become very common in a range of Unix-related environments. NFS has a long history and is considered very reliable; it is best used behind a firewall.

CODA is the "advanced" file sharing standard that is still somewhat experimental. More information can be found at the CODA homepage, but in a nutshell, CODA has a lot of advanced capabilities that make it better than the others if you are willing to accept that it is still under development.

Filesystem Options

Linux can be configured with various file system structures. Outwardly, they all store your data and they all look the same to the end user. Once we get a little closer, there are some advantages and disadvantages that might make a difference to your individual requirements.

Extended File System v3 (ext3) is the current "standard" Linux file system. It is robust and fast to bootstrap because it keeps a journal of all changes: if the system crashes in the middle of an operation, the journal can be used to ensure the file system comes back to a stable state.

Reiser File System is also popular and tends to be faster for reading and more efficient with drive space. This is especially true if you have lots of small files or many files in the same directory. Reiser also keeps a journal and is quite robust when recovering from a crash. Reiser can be a little slow when writing large files. See the information at Namesys for many details about the Reiser file system.

Backup Options

Surprisingly, many people run without a backup. Given the low cost of storage media and remembering that hard drives can fail because of power fluctuation, heating, physical bump or vibration, or simple defect... living without a backup is an unnecessary risk.

Backing up your data with rsync to an alternate hard drive is one of the fastest and easiest options. Hard drives are cheap and commodity PC hardware is also cheap. This option does not require a high-power machine for a backup server; it merely requires a cheap machine with many drive bays and plenty of large IDE hard drives. The rsync protocol is very efficient because it transfers only what has changed, so even though the first backup may be slow, subsequent data transfers will be much faster. Backups can run automatically across the local network and require only minimal human intervention.
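As a rough sketch of how such an automated backup might be scripted, the Python below builds and runs an rsync command. The directory /srv/files, the host name backuphost and the destination path are hypothetical placeholders rather than part of any particular setup; -a (archive mode) and --delete are standard rsync options.

```python
import subprocess


def build_rsync_command(source, destination, delete=False):
    """Return the rsync argument list for mirroring source to destination."""
    command = ["rsync", "-a"]  # -a: archive mode (recursive, keeps permissions and times)
    if delete:
        # Also remove files from the backup that no longer exist in the source.
        command.append("--delete")
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    command.append(source.rstrip("/") + "/")
    command.append(destination)
    return command


def run_backup(source, destination, delete=False):
    """Run rsync and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_rsync_command(source, destination, delete), check=True)


if __name__ == "__main__":
    # Hypothetical paths and host name, for illustration only.
    # This prints the command instead of running it.
    print(" ".join(build_rsync_command("/srv/files", "backuphost:/backups/files")))
```

A scheduler such as cron could call run_backup each night. Whether to pass delete=True is a judgment call: it keeps the backup an exact mirror, but it also means a file deleted by mistake disappears from the backup on the next run.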

Setting up your file server with mirroring RAID is a good way to protect against hard drive failure. Your file server can make two (or more) copies of every file on separate drives. Presuming that all drives do not fail at the same time, you always have at least one copy of your data. When set up properly, this requires no human intervention at all (other than to check whether a drive has actually failed).
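Checking for a failed drive can itself be automated. The sketch below assumes the mirror is built with Linux software RAID (the md driver), which reports array status in /proc/mdstat; hardware RAID controllers use their own tools and are not covered. The function names are illustrative, and the parser handles only the common layout in which a status line such as "[2/1] [U_]" follows each array's header line.

```python
import re


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status lines end with something like "[2/2] [UU]"; an underscore
        # marks a member that is missing or has failed.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            if "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


def check_system(path="/proc/mdstat"):
    """Read the kernel's RAID status file and list any degraded arrays."""
    with open(path) as handle:
        return degraded_arrays(handle.read())
```

Run periodically, check_system() returns an empty list while every mirror is healthy, so a non-empty result is the cue to send someone to look at the server.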

Backing up to CDROM or DVD is also an easy option but requires someone to remember to change the disc. Also note that your files may be too large to fit onto a single CDROM; even a DVD may not be big enough. This is quite a handy option for small businesses with small data requirements. You might also decide to archive old data onto a CDROM or DVD and remove those files from the server. Using CDROM storage wallets is an excellent way to store old project data on the shelf in a highly compact physical space and is recommended for anything you don't expect to use anytime soon but still want to keep "just in case". Being disciplined with server space helps reduce costs.
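Before burning a disc it helps to know whether the data will fit. The following sketch totals the size of a directory tree and compares it with nominal disc capacities. The figures used (700 MB for a CD, 4.7 GB for a single-layer DVD) are the usual advertised sizes and are assumptions here; usable space is a little lower once the disc's own file system overhead is counted.

```python
import os

# Nominal capacities in bytes (assumed advertised sizes, not exact usable space).
CD_BYTES = 700 * 1024 * 1024   # 700 MB CD
DVD_BYTES = 4_700_000_000      # single-layer 4.7 GB DVD


def directory_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # skip symbolic links
                total += os.path.getsize(full)
    return total


def smallest_disc(size_bytes):
    """Return 'CD', 'DVD', or None if the data fits on neither."""
    if size_bytes <= CD_BYTES:
        return "CD"
    if size_bytes <= DVD_BYTES:
        return "DVD"
    return None
```

For example, smallest_disc(directory_size("/srv/files/old-project")) suggests which medium to reach for, where the path is a hypothetical project directory. A result of None means the archive has to be split across several discs or sent to another medium.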

Backing up to tape can also be a useful direction if you have a large amount of data. These tape backups usually run overnight and may take hours to finish. The advantage is that you can take the tapes home with you to store offsite. The disadvantage is that tape systems are usually expensive and someone must keep changing the tapes as well as keeping a logbook.

