
Status and Availability

|| i386 || Available ||
|| x86_64 || Available ||

  • Scientific Linux 5 for the i386 platform was released May 7, 2007, with x86_64 following May 15.
  • Since May 8, 2007, 32-bit SL5 is available for early adopters at DESY Zeuthen, with 64-bit following May 16.
  • A presentation in the technical seminar was given June 5, 2007. (Slides: attachment:SL5Z.odp / attachment:SL5Z.pdf)
  • Requests to update systems to both 32-bit and 64-bit SL5 are welcome.

    <!> Notice SL5 is quite young. Early users will undoubtedly discover problems not yet known.

    • Betas and release candidates have been in use on more than a dozen systems for months though, and we believe SL5 is very usable. It is running on the majority of PCs in the computing centre, the Myrinet and InfiniBand Clusters, and a number of servers.

  • Public Login systems for both platforms are available (see below). Problem reports for both are welcome.
  • An SL5 queue on the batch farm (64-bit only) is available (see [:Batch_System_Usage]). It now controls more than 100 of the fastest cores in the farm, with (90% of) 2 GB RAM per core. This is a substantial fraction of the farm's total computing power. As of July 4, 2007, batch jobs are scheduled in any free Slot regardless of the operating system on the worker node. Jobs can still request to be scheduled on SL3 (or SL5) nodes only.

Public Preview Systems

Two DNS aliases have been set up:

  • sl5.ifh.de (x86)

  • sl5-64.ifh.de (x86_64)

They point to (virtual) systems suitable for getting familiar with SL5, finding remaining problems and reporting them to uco, and testing or porting user software. These systems are not for production use.

The public PCs in 2L01 are running SL5 as well.

Improvements over SL3

SL5 users will benefit directly from these:

  • more recent versions of GNOME, KDE, TeX, and lots of other software
  • new applications, like scribus

  • better support for recent hardware
  • better support for hotpluggable storage
  • better interactive response during I/O (much better)

  • faster startup time for large applications
  • does not require more hardware resources than SL3 (except some more space in /)
    • especially on old PCs, SL5 is much more fun to work with than SL3
  • one user reports that the new gcc-4.1.1 compiles his code much faster than 3.2.3 on SL3
  • improved security (SELinux)

There are also many new features primarily of interest to administrators, like virtualization (Xen), configurable I/O schedulers, ionice(1), improved power management (really important for the farm these days), ...

General Points

AFS Sysname List

The AFS sysname list (the output of the fs sysname command) in Zeuthen is:

|| Platform || Sysname List ||
|| 32bit (i686) || i586_rhel50 i386_linux26 i586_rhel30 i586_linux24 i386_linux24 ||
|| 64bit (x86_64) || amd64_rhel50 amd64_linux26 amd64_rhel30 i586_rhel50 i386_linux26 i586_rhel30 i586_linux24 i386_linux24 ||
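The order of the list matters: when a path contains @sys, the AFS client tries the sysnames from left to right and uses the first one for which the path exists. The following sketch only imitates that lookup order with ordinary directory tests; the function name first_sys_dir is made up for this illustration, and this is not how the AFS client itself is implemented.

```shell
# Print the first existing directory <base>/<sysname>/<rest>,
# trying the sysnames in the order given (left to right).
# Returns non-zero if none of them exists.
first_sys_dir() {
    base=$1
    rest=$2
    shift 2
    for sys in "$@"; do
        if [ -d "$base/$sys/$rest" ]; then
            printf '%s\n' "$base/$sys/$rest"
            return 0
        fi
    done
    return 1
}

# Example for a 64-bit SL5 host (prints nothing if no directory matches):
# first_sys_dir /afs/ifh.de products amd64_rhel50 amd64_linux26 amd64_rhel30 \
#     i586_rhel50 i386_linux26 i586_rhel30 i586_linux24 i386_linux24
```

This also shows why a 64-bit SL5 system can fall back to an SL3 or 32-bit installation when no native one exists.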

Login Shells

We made a serious effort to make bash a supported login shell, but it's impossible.

There is no way to reliably give bash users a working environment identical to that of zsh and tcsh users, due to bash's limited functionality w.r.t. startup files processing.

We recommend using zsh, zsh or zsh as the login shell. Tcsh is available for those who insist. Bash is not, sorry.

Notice this does not prevent users from writing or using bash scripts in any way.

Language Support, UTF8

It was initially planned to introduce UTF8 as the default with SL4. Alas, it was found to cause too much trouble and to make things too incompatible with the rest of our environment. This hasn't changed with SL5. Also notice that a lot of software still has bugs when used under a UTF8 locale.

Hence the default for the LANG environment variable on SL5 systems is C, as on SL4, SL3 and DL5 before. Early SL4 systems had a default of en_US, because it makes certain GNOME applications behave more sensibly (in particular gnome-terminal), but it was found to cause other problems (among them, changes in the date format and sorting order).

Any user may change the personal default by creating a file ~/.i18n:

# recommended and default:
#LANG=C

# alternative:
#LANG=en_US

# I want to suffer
LANG=en_US.UTF-8

We do not install language support packages for languages other than US English, with the sole exception of dictionaries for spellchecking. Languages other than English are not supported for the user interface.

But: Typing and displaying non ASCII characters should work very well under a UTF-8 locale in GNOME and KDE applications. Here's an example of how to achieve this even under the default environment:

export LANG=en_US.UTF-8
exec gnome-terminal --hide-menubar --disable-factory --geometry=80x40 --window-with-profile=alpine

This script will open a new gnome-terminal window and execute the alpine mail client in it. The alpine profile in your gnome-terminal should of course have suitable settings (no scroll bar, maybe a certain font,...).

There are two methods for typing: Compose Character should work in most applications, and for most "European" characters. In addition, SCIM (smart common input method) is available in GNOME and KDE applications, and can be used to type other character sets, like Cyrillic, Chinese, Japanese, ...

Removable Media and Hotplug Storage Devices

  • Mount points are now in /media. They are created at mount time, not when the device is plugged in. No fstab entries are used, which is why the usual mount and umount commands cannot be used. The names of the mount points are taken from the file system's volume label if applicable. They are removed when the device is removed.

  • Manual mounting: If the automatic mounting of devices by GNOME/KDE is not used, gnome-mount can be called manually like this, using the appropriate device file:

    gnome-mount -d /dev/sda1
    gnome-umount -d /dev/sda1
  • Hotplug should work better than on SL3 or SL4, thanks to the more modern kernel and hotplug scripts. It is known to not work perfectly in all cases though.

    • Multiple partitions on USB storage devices should work now, at least if all partitions have a supported filesystem, and the partition table format is legal.
    • The new hotplug system does not assign ownership of the device to the user. Hence you can't partition the device or create filesystems, unlike on SL3.

    • The new hotplug system will clean up properly if a mounted device is removed, and it no longer causes problems to log out while a hotplug device is mounted.

      <!> It is still recommended to always umount all filesystems on a hotplug device before removing it physically, and before logging off from the console.

      Notice the GNOME desktop will automount filesystems on hotplug devices when they are connected. This feature can be turned off under System->Preferences->Removable Storage: attachment:g-v-p.png

  • Notice there is no firewire support

Backward Compatibility

Desktop/User Environment

  • As on SL4, the Display Manager is gdm, no longer kdm as on SL3 and before.

    • Because gdm is the default on SL.
    • Gdm will remember your preferred session without the need to move some configuration file, but it doesn't know about the preference set for kdm, hence the session type has to be chosen at least once.

  • HEPiX11 was dropped (already with SL4)

    • fvwm2 is no longer available.

    • Configuration in ~/.hepix is no longer used.

    • The Windows-Key can no longer be used to type German umlauts (äöÜ...).

    • Instead, the right Alt-Key now works as a Compose Character-Key: To input an ä, type [R-Alt], then ", then a (one after another, not simultaneously). This is slightly less convenient, but much more general: This method works for characters like ç ñ ø ô ë ...

  • The old HEPiX profiles were completely replaced

    • The replacement is HEPiX-like where not incompatible with today's defaults on vanilla linux systems.

  • KDE, GNOME, IceWM, WindowMaker are available as on SL3.

    • IceWM is the recommended window manager for older desktops with 256 or even 128 MB RAM

      • an enhanced default configuration for IceWM is provided

Printing

  • Print Service has been changed from LPRng to [:Printing_with_Cups: Cups] because LPRng is obsolete and Cups is better integrated in KDE, Gnome, OpenOffice and other tools.

Binaries

SL5 should be binary backward compatible with SL4. This means that executables built on and for SL4 should work on SL5. It does not mean that any executable that works on SL4 will work on SL5 as well: If it worked only due to legacy support before, it may no longer work on SL5.

In particular, binaries that depend on setting the environment variable $LD_ASSUME_KERNEL to a value lower than 2.6.9 won't work on SL5.

C++ ABI

There was no major change to the C++ ABI from the default compiler on SL3 to the one on SL5, only a few "ABI fixes". Most binaries built on SL3 should work fine with the shared libraries built on and for SL5. Example: Building the tests for ROOT-5.14 on SL4 and running them on SL5 works fine. One of the executables behaves slightly differently - but in this one, glibc detects a memory handling bug (see below).

Missing Shared Libraries

If your executable fails with an error message like this

error while loading shared libraries: libldap.so.2: cannot open shared object file: No such file or directory

this indicates a missing shared library. If such a shared library is available on SL3/4, we'll try to make it available on SL5 as well, so please report these cases. To find out whether a library is available on the older systems, log into an SL3/4 system and use rpm:

[sl3] ~ % rpm -q --whatprovides libldap.so.2
openldap-2.0.27-22.i386
[sl4] ~ % rpm -q --whatprovides libldap.so.2
compat-openldap-2.1.30-7.4E.i386

In this case, we created a compatibility package for SL5 from the SL4 one:

[sl5] ~ % rpm -q --whatprovides libldap.so.2 
compat-sl3-openldap-2.1.30-7.4E.i386

This will be possible in almost all cases.
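To collect all missing libraries of an executable at once, rather than one per failed start, the output of ldd can be filtered. The helper below is a small sketch; the function name missing_libs is made up for this page, and it assumes the usual ldd output format, in which unresolved libraries are reported as "=> not found".

```shell
# Read `ldd` output on standard input and print the names of
# all libraries that the dynamic linker could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage (the path is a placeholder for your own executable):
# ldd /path/to/executable | missing_libs
```

Each name printed can then be looked up on an SL3/4 system with rpm -q --whatprovides as shown above.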

Using Shared Libraries from the SL3 AFS Installation

Many /opt/products packages are installed in AFS space, to make it possible to reference them with a symbolic link at least on systems with smaller disks. In many cases, this allows using this software on SL5 even if it's not installed into /opt/products on this platform. For example, to use the shared libraries from the root64-5.12.00 build from SL3, one can export LD_LIBRARY_PATH=/afs/ifh.de/amd64_rhel30/products/root64/5.12.00/lib64 . The corresponding path for the SL3/32bit installation is /afs/ifh.de/i586_rhel30/products/.
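If only one program needs the older libraries, it may be preferable to set the path for that single command instead of exporting it for the whole session. A minimal sketch, assuming a helper named with_libpath (not an existing command) that prepends a directory to any LD_LIBRARY_PATH already set:

```shell
# Run a command with one extra directory prepended to LD_LIBRARY_PATH.
# The calling shell's environment is left unchanged.
with_libpath() {
    dir=$1
    shift
    env LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" "$@"
}

# Usage (my_analysis is a placeholder for your own executable):
# with_libpath /afs/ifh.de/amd64_rhel30/products/root64/5.12.00/lib64 ./my_analysis
```

Prepending gives the SL3 libraries priority over other entries in the path; append instead if they should only serve as a fallback.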

Executables failing with *** glibc detected *** error messages

If your executable does not work, but instead fails like this

 ~ % voms-proxy-init
Cannot find file or dir: /afs/ifh.de/user/w/wiesand/.glite/vomses
Your identity: /O=GermanGrid/OU=DESY/CN=Stephan Wiesand
Enter GRID pass phrase:
Creating proxy ............................................................ Done
*** glibc detected *** voms-proxy-init: munmap_chunk(): invalid pointer: 0xbf97bd02 ***
======= Backtrace: =========
/lib/libc.so.6(cfree+0x1bb)[0x47c216db]

and so on, that's a bug in the application's memory management. Starting with SL3, glibc began detecting such bugs and warning about them. Since SL4, processes exhibiting such bugs are terminated by default. With SL5, glibc detects more of these problems, hence this may affect applications that worked on older releases.

As a workaround, you can set the environment variable MALLOC_CHECK_ to 1, to keep glibc from terminating such processes:

{{{
~ % MALLOC_CHECK_=1 voms-proxy-init
malloc: using debugging hooks
Cannot find file or dir: /afs/ifh.de/user/w/wiesand/.glite/vomses
Your identity: /O=GermanGrid/OU=DESY/CN=Stephan Wiesand
Enter GRID pass phrase:
Creating proxy ................................................................................. Done
*** glibc detected *** voms-proxy-init: free(): invalid pointer: 0xbf900d01 ***
Your proxy is valid until Wed May 9 03:49:39 2007
}}}

Notice the 'malloc: using debugging hooks' message before each command. Since this also costs performance, do not add this environment variable to your profile.

Executables failing with "cannot restore segment prot after reloc"

SL5 is the first SL release exposing end users to SELinux - which was present on SL4, but much more permissive by default, except for the targeted daemons. For example, processes normally have permission to either execute a memory location or write to it, but not both (wherever possible and practical, anyway).

This restriction makes life hard for attackers trying to exploit bugs like buffer overflows in the software. It also prevents a few existing applications from being executed. Here's an example for a certain release of GRID User Interface Software:

.../gui-2.3/lcg/bin/lcg-cp: error while loading shared libraries: .../gui-2.3/lcg/lib/libgfal_pthr.so: cannot restore segment prot after reloc: Permission denied

There is some information on SELinux memory protection available in http://people.redhat.com/~drepper/selinux-mem.html , and more about text relocations in http://people.redhat.com/~drepper/textrelocs.html . In short, the author of these documents explains that executables (including shared libraries) exhibiting this error are incorrectly built:

  • "A text relocation is the result of a reference to an object with a variable address at runtime using an absolute addressing mode. The instruction encoding itself contains the address and therefore the executable text of the binary must be changed to contain the correct address when taking the actual load addresses at runtime into account.

    The result of a text relocation is that the binary text is written to. This means this page of the binary cannot be physically shared with other processes on the system (this is the goal of DSOs, aka shared libraries). It also means that the binary must have permission to change the access permissions for the memory page to include writing and then back to executing. This is a privileged operation when SELinux is enabled."

The solution is usually to recompile the source with -fPIC .

If you can't do that, the workaround is to apply a security label to the binary which will allow it to perform text relocation:

{{{
[sl5] % chcon -t textrel_shlib_t .../gui-2.3/lcg/lib/libgfal_pthr.so
}}}

<!> This only works on the local disk, not in NFS/AFS/panfs.
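Whether a shared library contains text relocations at all can be checked beforehand: readelf -d lists a TEXTREL entry in the dynamic section of affected files. The filter below is a sketch; the function name has_textrel is made up for this illustration.

```shell
# Read `readelf -d` output on standard input.
# Succeeds (exit status 0) if the dynamic section mentions TEXTREL.
has_textrel() {
    grep -q 'TEXTREL'
}

# Usage (the path is a placeholder for the library in question):
# readelf -d /path/to/library.so | has_textrel && echo "contains text relocations"
```

A library for which this check succeeds should be rebuilt with -fPIC where possible; relabelling remains the fallback.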

Source Code

  • GCC4 is much stricter than previous versions; some C and C++ code may need to be adapted.

  • The FORTRAN frontend of GCC4 is gfortran, no longer g77, and some code may need to be adapted.

    • the corresponding runtime library is libgfortran, no longer libg2c

Building Software

In general, it is more strictly necessary than before to compile anything that will be used in a shared object with -fPIC. In particular, if one receives this error from the linker:

{{{
/usr/bin/ld: xyz.o:
relocation R_X86_64_32 against `a local symbol' can not be used when
making a shared object; recompile with -fPIC
}}}

one should really do as told.
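As an illustration, the two build steps for a shared library look like this: every object is compiled with -fPIC, then the objects are linked with -shared. The function name build_shared_lib and all file names are placeholders, and the sketch assumes plain C sources whose names contain no whitespace.

```shell
# Build lib<name>.so from C sources; every object is compiled with -fPIC.
# Usage: build_shared_lib <name> <source.c>...
build_shared_lib() {
    name=$1
    shift
    objs=
    for src in "$@"; do
        obj=${src%.c}.o
        # -fPIC: generate position-independent code for use in a shared object
        gcc -fPIC -c "$src" -o "$obj" || return 1
        objs="$objs $obj"
    done
    # $objs is deliberately unquoted so that it splits into one argument per object
    gcc -shared -o "lib$name.so" $objs
}

# Example: build_shared_lib demo demo.c   # produces demo.o and libdemo.so
```

The same flag applies to g++, gfortran and to objects that end up in a static library which is later linked into a shared object.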

Applications

  • ghostscript options sometimes have a different syntax. This may bite you if you have configured special options in frontends like gv in the past. For example, -scale 1 is no longer accepted and this has to be changed to -scale=1.

Software

Browser

Firefox is the recommended web browser. We provide the java, flash and realplayer plugins.

Mail Client

Pine is the recommended e-mail client. It will however soon be superseded by alpine, and you may want to give that a try instead.

Compilers

GCC

  • The default compiler suite is GCC 4.1.1.
  • The compatibility release 3.4.6 (default on SL4, available on SL3) is installed as well.
    • Invoke as gcc34, g++34, g77

    <!> g77 is the FORTRAN frontend from the old compiler. The new frontend is gfortran .

Intel

Version 9.1 of the C, C++, and FORTRAN compilers is available. The 32-bit runtime environment is installed on 64-bit systems as well.

  • Invoke as icc, icpc, ifort

Portland Group

Version 7.0 of the PGI compiler is installed. The 32-bit runtime system is available on 64-bit hosts.

  • Invoke as cc, CC, f77, f90 after ini pgi

Java

Version 1.5 is the default and installed locally. It's 32-bit even on 64-bit systems, to make the browser plugin work.

Versions 1.4.2 and 1.6 are available in /opt/products, as well as a 64-bit Version of 1.5.

ROOT

We provide version 5.14.00 (now patch release f) built with GCC4. Older versions for older compilers could be provided on request.

As on SL3/4, the 64-bit build is installed in /opt/products/root64, the 32-bit one in /opt/products/root.

Known Problems

vi misbehaves in konsole & gnome-terminal if $TERM=vt100

If the environment variable TERM is set to vt100, some escape sequences produced by vi (or vim) are not processed correctly by konsole and gnome-terminal. Please do not set TERM in your dot files.
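To find out whether one of your startup files sets TERM, a search like the following may help. It is only a sketch: the function name find_term_settings is made up, and it looks for the common forms TERM=... (sh, zsh) and setenv TERM ... (tcsh) without understanding the shell syntax, so commented-out lines are reported as well.

```shell
# Print file name, line number and text of every line in the given
# files that assigns TERM. Variables such as COLORTERM are not matched.
find_term_settings() {
    # /dev/null is added so that grep always prints the file name
    grep -nE '(^|[^[:alnum:]_])(TERM=|setenv[[:space:]]+TERM[[:space:]])' "$@" /dev/null
}

# Usage (adapt the list to the startup files you actually have):
# find_term_settings ~/.zshrc ~/.zprofile ~/.tcshrc 2>/dev/null
```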

Software known not to work on SL5

  • Grid Software (LCG/Glite)

    • the older the release, the more problems there are in using it
    • some executables need the MALLOC_CHECK_ workaround and/or relabelling of shared libs as described above
    • commands gathering information generally work
    • commands that actually do something (copy, ...) do not
    • this is under investigation, but this software may not work properly before the project finally releases a native SL4 build

Software that is not yet available, but will be eventually

  • ICA/Wincenter
    • workaround: use rdesktop (in particular, try the winrdp command; you may well prefer it over the ICA Client afterwards anyway)

      • you'll find /usr1 and your AFS home directory in \\tsclient on the terminal server

      • winrdp is a simple script; create your personal copy and modify it to export other filesystems

PC Upgrades

Requirements

  • A supported PC (najade class or newer), 128 MB RAM, 20 GB disk
  • A root filesystem of at least 6 GB ( 8 GB for 64-bit)

    • example: this is sufficient for a 32-bit installation:

{{{
[host] ~ % df -H /
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/hda7   6.3G  4.9G  1.1G   82%   /
}}}

  • If your / is too small, the disk has to be repartitioned. This should only affect a few of the oldest najade class Systems.
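The size check can also be scripted. The sketch below compares the total size of a filesystem with a required number of GB, counting 1 GB as 10^9 bytes as df -H does; that the requirement above is meant in these units is an assumption, and the function name fs_at_least_gb is made up for this illustration.

```shell
# Succeed if the filesystem containing <path> has a total size
# of at least <gb> GB (1 GB = 10^9 bytes, as reported by df -H).
# Usage: fs_at_least_gb <path> <gb>
fs_at_least_gb() {
    # df -P -k prints one header line, then one line per filesystem,
    # with the total size in 1024-byte blocks in the second column.
    df -P -k "$1" | awk -v need="$2" '
        NR == 2 { ok = ($2 * 1024 >= need * 1e9) }
        END     { exit !ok }'
}

# Usage:
# fs_at_least_gb / 6 && echo "root filesystem is large enough for 32-bit SL5"
# fs_at_least_gb / 8 && echo "root filesystem is large enough for 64-bit SL5"
```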

Procedure

  1. The owner or group admin of the PC contacts uco by e-mail with a subject of SL5 upgrade, providing the following information:

    • the PC's hostname
    • whether or not the /usr1 partition should be preserved
    • date of the upgrade
    • whether the PC should be rebooted by DV or by the user
  2. DV prepares the upgrade on the required date.
  3. The next reboot starts the upgrade installation. This must happen the same day.

Checklist

To prepare the PC for the upgrade, please

  • make sure the PC and the monitor are switched on

  • unplug all USB devices (except a USB mouse)
  • remove all CDs, DVDs, Floppies

Don't Panic

If you watch the installation (there's no need to stay around, and we actually recommend doing something else), some observations may be a bit disturbing, but are completely harmless:

  • If your PC spends several minutes displaying "retrieving installation information" in the early installation phase, with a spinning cursor and the progress bar indicating that it should be finished, that's perfectly normal.

  • Lots of cryptic failure messages starting "audit:", in bunches - just ignore them.
  • Error messages saying that removing packages failed because they are not installed anyway. Again, please ignore.
  • Depending on the video hardware, the screen may go blank and stay like this for several minutes even if you press a key like Shift or Ctrl. This happens because the video card is confused after probing it.

Only if you see a clear, persistent error message on the screen indicating that installation was aborted and you should press "ok" to reboot, something went really wrong. Otherwise, please be patient. The one and only way to make an installation fail very reliably is to reboot the PC while it's in progress.

Installation Times

Time to upgrade an SL3/4 system to SL5 (from the start of the reboot until gdm is ready for login):

|| Generic Name || Model || CPU || RAM || Upgrade takes ||
|| pre-najade || Comptronic White Box || PIII 750 MHz || 128 MB || 2h30m ||
|| najade || Comptronic White Box || PIII 850 MHz || 128 MB || unknown ||
|| nereide || Comptronic White Box || P4 1.7 GHz || 256 MB || 1h30m ||
|| hyade || Dell Precision 350 || P4 2.4 GHz || 256 MB || 1h ||
|| dryade || Dell Precision 360 || P4 2.8 GHz || 512 MB || 45m ||
|| satyr (<60) || Dell Precision 370 || P4 3.2 GHz || 512 MB || 35m ||
|| satyr (>60) || Dell Precision 380 || Pentium D 2.8 GHz || 512 MB || 35m ||
|| oreade || Dell Precision 390 || Core2 Duo 2.13 GHz || 1 GB || 30m ||

Installing the 64bit flavour takes 10% longer.

Desktop Upgrade Record, Success/Failure Rates

This table covers only the upgrades of existing PCs in production use. New PCs delivered to the user with SL5 don't show up here. Nor do the test installations/upgrades we run frequently (several times a week) to make sure the process works flawlessly.

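|| Date || Total Systems Upgraded || Total Failures || Remarks ||
|| 2007-05-08 || 2 || 0 || ||
|| 2007-05-09 || 4 || 0 || ||
|| 2007-05-10 || 6 || 0 || ||
|| 2007-05-11 || 7 || 0 || ||
|| 2007-05-14 || 10 || 0 || ||
|| 2007-05-22 || 11 || 0 || ||
|| 2007-05-24 || 12 || 0 || ||
|| 2007-05-25 || 13 || 0 || ||
|| 2007-06-07 || 14 || 0 || the first 64-bit Desktop at DESY Zeuthen (and it's not a DV user) ||
|| 2007-06-20 || 15 || 0 || ||
|| 2007-06-21 || 16 || 0 || ||
|| 2007-06-23 || 17 || 0 || ||
|| 2007-06-26 || 18 || 0 || ||
|| 2007-06-29 || 20 || 0 || ||
|| 2007-07-02 || 24 || 0 || ||
|| 2007-07-03 || 26 || 0 || ||
|| 2007-07-04 || 28 || 0 || ||
|| 2007-07-05 || 30 || 0 || ||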

Fixed Problems

|| Problem || Date Reported || Date Solved || Solution ||
|| some binaries need ldap libraries compatible with SL3 || 2007-05-18 || 2007-05-24 || repackaged compat-openldap (i386 only yet) from SL4; on x86_64 also provide required openssl097b.i386 from 32bit branch ||
|| slow window redraws (turned out to affect only WindowMaker, and only with 16bit colours) || 2007-05-10 || 2007-05-11 || autoconfigure 24bit colour if video card has >= 8MB RAM (should be all) ||

SL5_User_Information (last edited 2017-05-23 11:28:28 by StephanWiesand)