SGE with AFS HOWTO

To integrate AFS and SGE(EE) you need to install SGE with AFS support (I believe it is inst_sge -m -afs for the master). Then you have to use the hooks that SGE provides for this purpose, the set_token_cmd and get_token_cmd procedures.

These are very tricky, as you need to prolong an AFS token, which is normally not allowed/desired/possible.

To achieve this we use the following procedure: A server (arcd) is set up that uses Kerberos4 to authenticate user requests and tries to prolong AFS tokens that are sent to the server.

The AFS token is extracted from memory by a program called Gettoken. The token is then PGP-encoded and sent to the arcd server. There, another program called forge unpacks what was sent and prolongs the token. For this the arcd server needs access to the master AFS key, so that it can decode and re-encode the ticket. In the last step the server sends the token back to the client, which calls Settoken to put the new AFS token back into the process space where it normally resides.

If token prolongation is not required, the procedure is much simpler, as no server needs to be involved. The drawback is that the maximum job duration cannot exceed the token lifetime, which in our case is 25h. A job submitted with a token that is about to expire suffers as well, because only the remaining lifetime of that token is available to it.
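The lifetime constraint can be written down as a small check. This is only a sketch: the helper name job_fits_token and the idea of passing the token age and the job run time in seconds are assumptions made for illustration, not part of SGE or AFS. The 25h value is the token lifetime quoted above.

```shell
#!/bin/sh
# Hypothetical helper: does a job of a given length fit into the
# remaining token lifetime? All values are in seconds.
TOKEN_LIFETIME=$((25 * 3600))   # 25h, the token lifetime quoted above

# job_fits_token TOKEN_AGE JOB_RUNTIME
# Returns 0 (true) if the job would finish before the token expires.
job_fits_token() {
    token_age=$1
    job_runtime=$2
    remaining=$((TOKEN_LIFETIME - token_age))
    [ "$job_runtime" -le "$remaining" ]
}

# Example: token obtained 20h ago, job needs 6h, only 5h are left.
if job_fits_token $((20 * 3600)) $((6 * 3600)); then
    echo "job fits into the token lifetime"
else
    echo "token would expire before the job ends"
fi
```

With a fresh token the same 6h job would pass the check, which is why the age of the token at submission time matters.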

The simple mechanism can even be implemented with a few lines of Perl, once the AFS Perl module by Norbert Gruener is installed:

get_token_cmd:

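use strict;
use warnings;
use AFS::KTC_TOKEN;
use AFS::KTC_PRINCIPAL;
# look up the service principal of the first token in the token cache
my $index = 0;
my $service = AFS::KTC_PRINCIPAL->ListTokens($index);
my ($token, $user) = AFS::KTC_TOKEN->GetToken($service);
## TODO: store the file in a central place, where all SGE clients have
## TODO: access, make it readable only to the owner and sge_execd !!!
open(my $fh, '>', 'aaa') or die "cannot write token file: $!";
print $fh $token->string;
close($fh);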
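The TODO about file permissions can be sketched in shell as follows. The directory variable TOKEN_DIR, the helper name store_token and the file naming scheme are assumptions for illustration only. Note that inside AFS itself access is governed by directory ACLs rather than by the mode bits, so this sketch only helps for a location outside AFS.

```shell
#!/bin/sh
# Sketch: write a token file that only its owner can read.
# TOKEN_DIR and the "token.<id>" naming are hypothetical.
TOKEN_DIR=${TOKEN_DIR:-/tmp}

# store_token JOB_ID   (token data is read from stdin)
# Prints the name of the file that was written.
store_token() {
    token_file="$TOKEN_DIR/token.$1"
    (
        umask 077               # newly created files get mode 0600
        cat > "$token_file"
    ) || return 1
    chmod 600 "$token_file"     # also covers a file that existed before
    echo "$token_file"
}
```

A call such as `printf '%s' "$data" | store_token 42` would then leave the token in a file readable by the owner only. Giving sge_execd access as well depends on the account it runs under and is not covered here.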

set_token_cmd:

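use strict;
use warnings;
use AFS::KTC_TOKEN;
use AFS::KTC_PRINCIPAL;
use AFS::Cell qw(localcell);
## TODO: fetch the file from the central place defined in get_token_cmd
open(my $fh, '<', 'aaa') or die "cannot read token file: $!";
my $string = do { local $/; <$fh> };   # slurp the whole file
close($fh);
my $token   = AFS::KTC_TOKEN->FromString($string);
my $service = AFS::KTC_PRINCIPAL->new("afs","",localcell);
# the user name is passed as the first command line argument
my $user    = AFS::KTC_PRINCIPAL->new($ARGV[0]);
AFS::KTC_TOKEN->SetToken($service, $token, $user, 0);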

If you are interested in the full-blown solution, I can send you the sources of the programs mentioned above (arcd, Gettoken, ...). If you have AFS access you can even fetch them yourself from the AFS directory of the author (Rainer Toebbicke):

arc client and arcd server: /afs/cern.ch/user/r/rtb/public/arc.tar.gz
Gettoken etc.: /afs/cern.ch/user/r/rtb/public/afs_admin

Depending on what else you need I could also try to describe the whole SGE setup process we are using including the arc aware get_token_cmd and set_token_cmd procedures.

Setup of a gridengine master

pts creategroup grdserver
pts createuser 141.34.15.121 #ice1
pts createuser 141.34.15.122
pts createuser 141.34.15.171 #bear1
pts createuser 141.34.15.220 #linos
pts adduser 141.34.15.121 grdserver
...
/usr/afsws/etc/vos create -server fauna -partition vicepd -name grd.master -maxquota 200000
fs mkmount -dir /afs/.ifh.de/project/grd -vol grd.master
/usr/afsws/etc/vos release project
cd /afs/ifh.de/project/grd
fs sa . 141.34.15.121 all
fs sa . 141.34.15.122 all
fs sa . 141.34.15.171 all
fs sa . 141.34.15.220 all
cp -rp /usr1/GRD/default/common .
mkdir spool
cd spool
cp -rp /usr1/GRD/default/spool/qmaster .
cd /usr1/GRD/default
mv common common.old; ln -s /afs/ifh.de/project/grd/common .
cd spool
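The per-host commands above repeat for every machine, so they can be generated by a loop. The following is a sketch that only prints the commands instead of running them. The helper name print_host_setup is hypothetical, and it assumes that the elided lines add the remaining hosts to the group in the same way as the first one. The ACL is set on the directory by its full path rather than on "." after a cd.

```shell
#!/bin/sh
# Sketch: print (not execute) the per-host commands used above.
# print_host_setup is a hypothetical helper, not part of AFS or SGE.
GROUP=grdserver
DIR=/afs/ifh.de/project/grd

# print_host_setup HOST...
print_host_setup() {
    for host in "$@"; do
        echo "pts createuser $host"
        echo "pts adduser $host $GROUP"
        echo "fs sa $DIR $host all"
    done
}

print_host_setup 141.34.15.121 141.34.15.122 141.34.15.171 141.34.15.220
```

Printing the commands first makes it possible to review them before they are run with administrator rights.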

Setup of an arcx Server

To solve the problem of AFS token prolongation we now use a different scheme than the one described above. Instead of storing the AFS token, decoding its content on request and modifying the timestamp inside, we let SGE retrieve a new Kerberos5 ticket at job start and whenever necessary later on. To do this safely, SGE requests the ticket from a server (the arcx server) through the script set_token_cmd.
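When a new ticket is "necessary" is not spelled out here. One plausible rule is to request a new ticket once the remaining lifetime drops below a safety margin. The sketch below shows only that decision. The helper name needs_new_ticket, its arguments and the margin of one hour are assumptions for illustration and are not taken from the arcx setup.

```shell
#!/bin/sh
# Hypothetical check: is it time to request a fresh ticket?
# needs_new_ticket EXPIRY NOW MARGIN
#   EXPIRY and NOW are seconds since the epoch, MARGIN is in seconds.
# Returns 0 (true) if the ticket expires within MARGIN seconds.
needs_new_ticket() {
    expiry=$1
    now=$2
    margin=$3
    [ $((expiry - now)) -le "$margin" ]
}

# Example: the ticket expires in 10 minutes, the margin is one hour.
now=$(date +%s)
if needs_new_ticket $((now + 600)) "$now" 3600; then
    echo "request a new ticket"
fi
```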

To set up the arcx server, several steps are necessary:

  1. Installation of the required software
    • perl
    • Kerberos5 (heimdal preferred, as the AFS token is obtained automatically)
    • SASL (with the GSSAPI mechanism) for authentication
    • the perl module ARCv2 and the modules it depends on (SASL, ...)
  2. Preparation of the Kerberos and AFS environment
    • we assume in the following that the hostname of the arcx server is 'server1.dom.ain'
    • create an AFS user arc.server1.dom.ain
    • add this user to the pts group system:administrators (e.g. for access to AFS file space)
    • add this user to the list of trusted users on all file servers (e.g. to create volumes)

  for i in `vos listaddrs`; do
    bos adduser $i arc.server1.dom.ain
  done

  kadmin add -r arc/server1.dom.ain
  kadmin ext_keytab arc/server1.dom.ain
  3. Configure arcx and start the server
    • create the /etc/arcx directory and populate it with the configuration files described elsewhere. At a minimum you need arcxd.conf
  4. Try the arcx client

  arcx -h server1 whoami

SGEwithAFS (last edited 2010-09-06 11:04:59 by WaltrautNiepraschk)