Revision 10 as of 2008-05-28 16:11:13


1. General concepts

1.1. Goodies


1.2. Working with an AFS+OSD cell

The following techniques are only useful when using OSD as a frontend for HSM:

}}} is useful for finding all objects stored on a given OSD

}}} allows the user to schedule tape restore operations

{{{osd f io_a}}}


}}} where 12 is the OSD's ID from the OSDDB.

Osd 'io_e' with id=12:


This is how a fileserver and OSD server can share one machine:

The new fs ls subcommand can tell files apart:

{{{
m rwx root 2048 2008-05-08 13:21:01 .
m rwx root 2048 2008-05-08 16:31:48 ..
f rw- bin 44537893 2008-05-08 11:36:40 ascii-file
f rw- bin 1048576 2008-05-08 13:21:14 ascii-file.osd
d rwx bin 2048 2008-05-08 09:06:52 dir
f rw- bin 0 2008-05-08 13:08:46 empty-file
o rw- bin 44537893 2008-05-08 13:19:58 new-after-move
}}}
where m is a mountpoint, f a file, d a directory and o an object. fs ls will also identify files with their objects wiped from on-line object storage (i.e., with archival copies only).
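The type letters make the listing easy to post-process. The following is a minimal Python sketch, assuming the column layout shown above (type letter, mode, owner, size, date, time, name). The `Entry` type and the `parse_fs_ls_line` helper are hypothetical names introduced for illustration and are not part of AFS+OSD. The letter used for wiped files is not given here, so anything unrecognised is reported as "unknown".

```python
from typing import NamedTuple

# Type letters as explained in the text; other letters map to "unknown".
KINDS = {"m": "mountpoint", "f": "file", "d": "directory", "o": "object"}


class Entry(NamedTuple):
    kind: str
    mode: str
    owner: str
    size: int
    mtime: str
    name: str


def parse_fs_ls_line(line: str) -> Entry:
    """Parse one listing line (hypothetical helper, layout assumed)."""
    # Split into at most 7 fields so that names containing spaces survive.
    letter, mode, owner, size, date, time, name = line.split(None, 6)
    kind = KINDS.get(letter, "unknown")
    return Entry(kind, mode, owner, int(size), f"{date} {time}", name)


sample = """\
f rw- bin 44537893 2008-05-08 11:36:40 ascii-file
o rw- bin 44537893 2008-05-08 13:19:58 new-after-move
d rwx bin 2048 2008-05-08 09:06:52 dir
"""

entries = [parse_fs_ls_line(line) for line in sample.splitlines()]
# Names of all entries whose data lives in object storage.
osd_files = [e.name for e in entries if e.kind == "object"]
print(osd_files)  # ['new-after-move']
```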

1.2.1. How to migrate data from an OSD

  1. set a low write priority to stop fileservers from storing data on the OSD in question: {{{osd setosd -wrprior 0}}}
  1. use {{{vos listobj}}} to identify the files (by fid) that have data on the OSD
  1. use {{{fs replaceosd}}} to move each file's data to another OSD

1.3. Priorities and choice of storing OSD


Customizing owner, location:

}}} makes an AFS fileserver (!) known to the OSDDB

Server 'iokaste' with id=141.34.22.101:

{{{
osd addserver: create server entry in osddb
Usage: osd addserver -id <ip address> -name <osd name> [-owner <group name (max 3 char)>] [-location <max 3 characters>] [-cell <cell name>] [-help]
}}}
Note that the name to be specified is not actually an "osd name" but an alias name for the file server you're adding.

1.4. Data held in volumes, DBs etc.


1.5. How to upgrade a cell to AFS+OSD

  1. set up OSDDB on the database servers
  2. set up pristine AFS+OSD fileservers + OSDs
  3. move volumes to the AFS+OSD fileservers
    • volserver is supposed to be armed with a -convertvolumes switch for that purpose

    • otherwise, set the osdflag by hand: {{{vos setfields <volume> -osd 1}}}

2. Policies

2.1. Open questions

2.2. Possible representations for a policy

So far we have thought of three possible notations for policies, each with implications for the overall expressiveness.

2.2.1. Disjunctive Normal Form

A policy consists of an arbitrary number of predicates that are logically ORed. Evaluation stops as soon as one predicate evaluates to true. Each predicate consists of a number of atomic predicates that are logically ANDed:

( suffix(".root") ) or ( size > 1M and size < 20M ) or ( size > 20M ) 

Of course, each case needs to return a definite "answer" to all aspects covered, e.g.

( suffix(".root") )          => OSD, 1 stripe, 1 site
( size > 1M and size < 20M ) => OSD, 1 stripe, 2 sites
( size > 20M )               => OSD, 2 stripes, 1 site
else                         => No Object Storage

the last case being the default. (!) This would need to be set cell-wide.
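A first-match evaluation of such a policy could look roughly like the following Python sketch. It is an illustration under stated assumptions and not the AFS+OSD implementation: `Placement`, `FileInfo`, `POLICY` and `evaluate_dnf` are hypothetical names, and "1M" is assumed to mean 2**20 bytes.

```python
from dataclasses import dataclass
from typing import Optional

M = 1024 * 1024  # assumption: "1M" means 2**20 bytes


@dataclass(frozen=True)
class Placement:
    stripes: int
    sites: int


@dataclass(frozen=True)
class FileInfo:
    name: str
    size: int


# Each case is (list of ANDed atomic predicates, resulting placement).
# The cases themselves are ORed and tried in order.
POLICY = [
    ([lambda f: f.name.endswith(".root")], Placement(stripes=1, sites=1)),
    ([lambda f: f.size > 1 * M, lambda f: f.size < 20 * M],
     Placement(stripes=1, sites=2)),
    ([lambda f: f.size > 20 * M], Placement(stripes=2, sites=1)),
]


def evaluate_dnf(f: FileInfo, policy=POLICY) -> Optional[Placement]:
    """Return the placement of the first matching case, or None."""
    for atoms, placement in policy:
        if all(atom(f) for atom in atoms):
            return placement  # stop at the first case that is true
    return None  # the "else" case: no object storage


print(evaluate_dnf(FileInfo("hist.root", 100)))
print(evaluate_dnf(FileInfo("data.bin", 5 * M)))
print(evaluate_dnf(FileInfo("notes.txt", 10)))
```

Because evaluation stops at the first match, the order of the cases matters whenever they overlap: a 50M file named `hist.root` is handled by the suffix case here, not by the size case.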

Discussion
This data model allows for rather efficient evaluation and might easily be represented to an administrator.

2.2.2. Variable list of rules

As above, the policy consists of a list of predicates. Each can have arbitrary effects on how the file's data is stored with respect to the aspects covered by policies (see above). Here too, the predicates must be evaluated in a fixed order, and they are limited to logical AND. However, evaluation cannot terminate before reaching the end of the list. The default behaviour has to take effect before evaluation starts:

Default                     => No Object Storage
( suffix(".root") )         => OSD
( size > 1M )               => 2 sites
( size > 20M )              => 1 site, 2 stripes 

(this would have much the same effect as the example policy outlined above for DNF).
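The difference to DNF is that every rule is visited and later matches override earlier ones. A minimal Python sketch of that evaluation order follows. `RULES` and `evaluate_rules` are hypothetical names, "1M" is assumed to mean 2**20 bytes, and the initial values of 1 stripe and 1 site are an assumption, since the example only states "No Object Storage" as the default.

```python
M = 1024 * 1024  # assumption: "1M" means 2**20 bytes

# Ordered rules: (predicate, settings that a match overrides).
RULES = [
    (lambda name, size: name.endswith(".root"), {"osd": True}),
    (lambda name, size: size > 1 * M, {"sites": 2}),
    (lambda name, size: size > 20 * M, {"sites": 1, "stripes": 2}),
]


def evaluate_rules(name, size):
    """Apply every matching rule in order; later rules win."""
    # The default takes effect before evaluation starts.
    # 1 stripe / 1 site as initial values is an assumption.
    result = {"osd": False, "stripes": 1, "sites": 1}
    for predicate, overrides in RULES:
        if predicate(name, size):
            result.update(overrides)  # no early exit, unlike DNF
    return result


print(evaluate_rules("run.root", 5 * M))
# {'osd': True, 'stripes': 1, 'sites': 2}
print(evaluate_rules("run.root", 50 * M))
# {'osd': True, 'stripes': 2, 'sites': 1}
```

With the rules exactly as listed, a large file without the `.root` suffix picks up the site and stripe settings while the OSD flag stays at its default, so a real rule list would presumably also need a rule that switches on OSD by size.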

Discussion
In places, this might have benefits compared to DNF. In human readable form, it would become a sequence of "but if..."s.

2.2.3. Fixed list of predicates

A policy consists of a fixed number of rules, each built from an arbitrary number of atomic predicates that can be linked using AND, OR and NOT and parenthesized. Each rule corresponds to a certain piece of information about how a file's data is stored on OSDs:

OSD           = ( size > 1M or suffix(".root") )
Stripes 1     = true
Stripes 2     = size > 20M
Sites 1       = true
Sites 2       = size < 20M

This appears extremely complex though.

Discussion
Constructing expressions for arbitrary rules is difficult. There are invalid combinations of expressions (e.g., the sets of matches of "Stripes1" and "Stripes2" need to be disjoint, the matches to "OSD" must be a superset of all others etc.). Apart from printing the logical expressions in semi-mathematical form, it would be difficult to bring this into a human readable form.
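The consistency constraints can at least be checked mechanically for a given file. The Python sketch below encodes the five example expressions as written and reports which constraints a file violates. `RULES` and `check_consistency` are hypothetical names, and "1M" is assumed to mean 2**20 bytes.

```python
M = 1024 * 1024  # assumption: "1M" means 2**20 bytes

# The example expressions from above, one per fixed rule.
RULES = {
    "OSD": lambda name, size: size > 1 * M or name.endswith(".root"),
    "Stripes 1": lambda name, size: True,
    "Stripes 2": lambda name, size: size > 20 * M,
    "Sites 1": lambda name, size: True,
    "Sites 2": lambda name, size: size < 20 * M,
}


def check_consistency(name, size):
    """Return a list of violated constraints for one file."""
    match = {rule: expr(name, size) for rule, expr in RULES.items()}
    problems = []
    # Alternatives for the same aspect must be disjoint.
    if match["Stripes 1"] and match["Stripes 2"]:
        problems.append("Stripes 1 and Stripes 2 both match")
    if match["Sites 1"] and match["Sites 2"]:
        problems.append("Sites 1 and Sites 2 both match")
    # The matches of "OSD" must be a superset of all other matches.
    others = [rule for rule in match if rule != "OSD" and match[rule]]
    if others and not match["OSD"]:
        problems.append("rules match although OSD does not")
    return problems


print(check_consistency("big.dat", 50 * M))
# ['Stripes 1 and Stripes 2 both match']
```

Taken literally, the example expressions already violate the disjointness constraint, because `true` matches every file. This illustrates how easily invalid combinations arise in this notation.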

3. Technical aspects

3.1. Performance

3.2. Backwards compatibility

4. Notes on the code (changes)

The explanations are from vol/namei_ops.c. The new format is used as Linktable version 2, with the original format still being supported as version 1. /!\ Does the code need to support legacy link table format? Volumes are incompatible anyway.

4.2. Technical details on ubik databases

4.3. Debugging techniques