==== How can I maximize storage problems for me, my group, and all other users? ====

Here's a collection of best practices to follow if you prefer
 * high risk of severe data loss
 * lousy performance
 * minimal availability

That's ridiculous? Maybe. But a significant fraction of users must like it that way, or the
following things would not keep happening time and time again:

===== Prefer NFS over AFS =====

NFS is a great source of problems. It does not scale at all, it supports neither replication nor
transparent relocation of data, and its weak caching semantics occasionally produce interesting
and surprising effects. It is very easy to make an NFS server grind to a halt, and often many
clients won't recover from that without being rebooted.

 * All you have to do to render a server unusable is to create O(10000) files in the same
 directory and then access it. Try it, it's fun.
 * Writing to the same filesystem from many clients (farm nodes) simultaneously results in
 nicely fragmented files, causing slow read performance on them ever after.

There is no way to move data to a different NFS server while keeping it available to clients.
During the process, there is always an extended period when it is read-only. Afterwards, you
either have to access your old data under a different path, or all clients need to be
reconfigured (which usually does not go smoothly), or there is some period when the data is
not available at all.
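
For comparison: with AFS, relocating a volume is a single administrative command, and the data
stays accessible under the same path while it runs. A minimal sketch, with server, partition,
and volume names made up for illustration:
{{{
# move volume p.myexp.data from one fileserver/partition to another;
# clients keep accessing it through the unchanged mountpoint while the move runs
vos move p.myexp.data oldserver /vicepa newserver /vicepb
}}}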

In addition, NFS is very insecure. Anyone with (a) a Linux notebook, (b) intermediate
knowledge, and (c) access to any network wall socket on a subnet containing at least one
client to which a filesystem is exported read-write can read, wipe out, or (most
interestingly) modify your data on that filesystem with very little effort.
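
The underlying reason is that a plain NFS export trusts clients solely by their network address.
A typical ''/etc/exports'' entry (addresses made up for illustration) looks like this:
{{{
# /etc/exports -- whoever can claim an address in this subnet is trusted
/export/data   192.0.2.0/24(rw,sync,root_squash)
}}}
Plug a notebook into the right wall socket, pick a suitable address, mount the export, and the
server cannot tell you apart from a legitimate client.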

===== Maximize the size of your AFS volumes =====

Be tidy, keep it all in one place. This very effectively prevents any load balancing, because
it's impossible to distribute the data across multiple servers and RAID sets. It also prevents
load sharing, since it's impractical to replicate such volumes. And because it takes so long to
move such a volume to a different server (even if the required huge chunk of space is
available on one), you run no risk of being offered uninterrupted access to part of your
data when a fileserver needs to be serviced. Finally, you maximize your risk of data loss:
if the volume ever gets corrupted (say, due to a hardware malfunction), the damage will
probably affect all your files.

So, next time you create a repository for some event data or Monte Carlo,
 * do not create one volume per year or run period, even though it may be very tempting,
 especially if the data is to be organized in corresponding subdirectories (the disturbingly
 sensible layout is sketched below)
 * if you've got 15000 numbered files making up 150 GB, do not even consider creating 15
 subdirectories/volumes for them - you could get several times higher throughput that way and
 would also avoid the overhead of very many entries in a single directory
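
For reference, that disturbingly sensible alternative would look roughly like this - volume
names, servers, and paths are made up for illustration:
{{{
# one modest AFS volume per run period, each mounted as its own subdirectory,
# so the volumes can be balanced, moved, and replicated independently
vos create fileserver1 /vicepa p.myexp.data.2004
vos create fileserver2 /vicepb p.myexp.data.2005
fs mkmount /afs/example.org/myexp/data/2004 p.myexp.data.2004
fs mkmount /afs/example.org/myexp/data/2005 p.myexp.data.2005
}}}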

===== Store your most valuable data in scratch space without backup =====

It's most exciting to store the only copy of your source code, your publication, or next week's
conference presentation in /usr1/scratch on your desktop PC.

Next best is AFS or NFS scratch space: Even though this is hosted on RAID5 arrays, we can expect
an average loss of a few hundred gigabytes of such data per year due to multiple disk failures,
defective controllers or cache memory, firmware bugs, and even mistakes made by administrators.

===== Use the best storage with backup for scratch data =====

It's good practice to store huge amounts of easily retrievable data in your home directory
or group space with daily backup. ISO images of CDs or DVDs are a good example; another is
large software packages or tarballs that you can download from the internet again at any time.

Building large applications on remote filesystems is not only a very effective way of wasting
bandwidth and fileserver throughput, it also provides you with well-deserved coffee breaks.
Don't run ''make clean'' afterwards, or you will waste less disk and tape space than you could.
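
Should you, against all this advice, want to spare the fileserver and the backup system, the
usual pattern is to build in local scratch space and install only the results into the
backed-up area. A rough sketch with purely illustrative paths, assuming a standard
configure/make build:
{{{
# build on local disk, keep only the installed results in backed-up space
cd /usr1/scratch/build/myapp
./configure --prefix=/afs/example.org/user/jdoe/myapp
make -j4
make install   # copies just the binaries, libraries, etc. into the AFS prefix
make clean     # throw the object files away instead of backing them up
}}}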

===== Copy and move data around as much as possible =====

Never write data to the correct location in the first place. Instead, just stow it somewhere
temporarily (preferably in a huge AFS volume), and start organizing your data later. Change
directory structures often, occasionally copying or moving data from the old to the new location.
Avoid having related data grouped in dedicated volumes, because that would make it possible to
just change the mountpoint to make it appear in the new location. Instead, have a few very large
general-purpose volumes and reorganize them regularly.
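
For the record, remounting a dedicated volume under a new name takes seconds and copies no data
at all - a sketch with made-up names:
{{{
# relocate a whole volume within the directory tree without moving a single byte
fs rmmount /afs/example.org/myexp/data/oldname
fs mkmount /afs/example.org/myexp/data/newname p.myexp.data.2004
}}}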

Hint: If you want to unpack tar archives into central storage, it's most inefficient to
first copy them into the destination directory and then unpack them there. That way you
generate almost three times the I/O of simply unpacking them from wherever they are
(temporarily) stored.
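
To spell out what you are supposed to avoid - the direct way reads the archive once locally and
writes each file once to the fileserver, while the recommended detour also copies the whole
archive across the network and reads it back again. Paths are made up for illustration:
{{{
# frowned-upon efficient way: unpack straight from the local temporary copy
tar -C /afs/example.org/myexp/software -xzf /tmp/bigpackage.tar.gz

# approved wasteful way: copy the archive to the fileserver first,
# then read it back from there while writing the unpacked files next to it
cp /tmp/bigpackage.tar.gz /afs/example.org/myexp/software/
cd /afs/example.org/myexp/software && tar -xzf bigpackage.tar.gz
}}}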
