
1. Overview

The High Energy Physics (HEP) SPEC benchmark is a set of test applications that stress the processor with operations and algorithms commonly used in applications from the physics community. It produces the SPEC result that is used, for example, to describe the resources provided to the Grid infrastructure for HEP applications. The HEPSPEC suite can also be used to compare different configurations and to detect latent bottlenecks and problems.

2. Interactive script for HEPSPEC runs

You can find the interactive script for running HEPSPEC benchmarks, both on local machines and in the SGE farm, in AFS under /afs/ifh.de/group/rz/HEPSPEC/interactiveHEPSPEC.sh. This script takes several parameters; their description can be obtained by running the script with the '-h' option:

host /afs/ifh.de/group/rz/HEPSPEC
10-05-12 13:55 # ./interactiveHEPSPEC.sh -h

Please specifiy the required arguments!
# ./genericSPEcrun.sh [compiler] [architecture] [run mode]
Valid values are:
        for [compiler]     : icc or gcc
        for [architecture] : 32 or 64
        for [mode]         : farm or local

HEPSPEC benchmarks can be run with Intel icc v. 11.0 or GNU GCC v. 4.1.2. Furthermore, one can choose between compiling and running the HEPSPEC benchmarks for the 32- or 64-bit architecture, and whether to run in a local environment or on one of the SGE farm hosts. The tests reported in the Results section were run with gcc for the 32-bit architecture, as prescribed by the HEP computing community.

2.1. Running HEPSPEC in the SGE Farm

<!> Currently, the batch job doesn't work in the farm because the compiler step exceeds the memory limit. Instead, use the script automaticHEPSPECondisabledfarmnode.sh on a disabled farm node. First make /batch writable by the user, then remove /etc/cron.hourly/regular-batch-cleanups on the node.
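The one-off node preparation described in the note can be sketched as follows. This is an illustrative sketch only: the commands are printed rather than executed so they can be reviewed first, and whether chmod or chown is the right way to make /batch writable depends on the node setup. Run the printed commands as root on the node itself.

```shell
#!/bin/sh
# Sketch of the one-off preparation of a disabled farm node, as
# described in the note above. Commands are echoed, not executed.
node_prep_commands() {
    echo 'chmod u+w /batch'
    echo 'rm /etc/cron.hourly/regular-batch-cleanups'
}
node_prep_commands
```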

Here is an example of submitting a HEPSPEC benchmark run to a host in the SGE Farm using icc as compiler and 32 bit architecture:

host /afs/ifh.de/group/rz/HEPSPEC
10-05-12 14:06 # ./interactiveHEPSPEC.sh icc 32 farm
Full path to the unmodified HEPSPEC distribution:
/afs/ifh.de/group/rz/HEPSPEC/2006-1.1
Setting SPECDIR= /afs/ifh.de/group/rz/HEPSPEC/2006-1.1

Full path to the CERN HEPSPEC .cfg file:
/afs/ifh.de/group/rz/HEPSPEC/spec2k6
Setting CERNCONFDIR= /afs/ifh.de/group/rz/HEPSPEC/spec2k6

Full path to folder for storing the result (must not be in the temporary folder)
/afs/ifh.de/group/rz/HEPSPEC/SPECresultsDESY
Setting RESULTDIR= /afs/ifh.de/group/rz/HEPSPEC/SPECresultsDESY

Submitting /afs/ifh.de/group/rz/HEPSPEC/farmHEPSPEC.sh
Your job 792245 ("farmHEPSPEC.sh") has been submitted

The "interactiveHEPSPEC" script submits a so-called subscript, farmHEPSPEC.sh, which has the following options embedded, according to the rules for running farm jobs found in Batch System Usage.

#(the cpu time for this job)
#$ -l h_cpu=09:00:00
#(the maximum memory usage of this job)
#$ -l h_rss=2G
#(Acquire disk space)
#$ -l tmpdir_size=4G
#(stderr and stdout are merged together to stdout)
#$ -j y
#(send mail on job's end and abort)
#$ -m bae
#(Execute the script from the current working directory)
#$ -cwd
#(specify project)
#$ -P yourgrouphere
#(parallelism level)
#$ -pe multicore 64
#(reserve slot for the whole job)
#$ -R y
  • It can also be submitted directly to the farm with the options in the following order: compiler, architecture, specdir, cernconfdir, resultsdir:

    qsub farmHEPSPEC.sh icc 32 /afs/ifh.de/group/rz/HEPSPEC/2006-1.1 /afs/ifh.de/group/rz/HEPSPEC/spec2k6 /afs/ifh.de/group/rz/HEPSPEC/SPECresultsDESY -l hostname=pizza00

2.2. Running HEPSPEC locally on a given machine

If you want to test a particular machine to which you have login access, you can do so by invoking the script with the 'local' parameter. In this case you will additionally be asked for the path where a temporary folder for copying and compiling the HEPSPEC benchmarks is created and used. Note that this path should point to a folder that can hold at least 4 GB of data; otherwise the HEPSPEC benchmark will exit with an error.
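A quick pre-flight check of the scratch directory can save a failed run. The following sketch is not part of the official scripts; it simply compares the free space of a candidate TEMPDIR against the 4 GB requirement stated above:

```shell
#!/bin/sh
# Pre-flight check (a sketch, not part of the official HEPSPEC
# scripts): verify a candidate TEMPDIR has at least 4 GB free.
NEED_KB=$((4 * 1024 * 1024))    # 4 GB expressed in kilobytes

free_kb() {
    # Available space of the filesystem holding $1, in kilobytes
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

TEMPDIR=${1:-/tmp}
if [ "$(free_kb "$TEMPDIR")" -ge "$NEED_KB" ]; then
    echo "$TEMPDIR: enough space for the HEPSPEC run"
else
    echo "$TEMPDIR: less than 4 GB free, choose another TEMPDIR" >&2
fi
```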

10-05-12 14:30 # ./interactiveHEPSPEC.sh gcc 32 local
Full path to the unmodified HEPSPEC distribution:
/afs/ifh.de/group/rz/HEPSPEC/2006-1.1
Setting SPECDIR= /afs/ifh.de/group/rz/HEPSPEC/2006-1.1

Full path to the CERN HEPSPEC .cfg file:
/afs/ifh.de/group/rz/HEPSPEC/SPEC/spec2k6
Setting CERNCONFDIR= /afs/ifh.de/group/rz/HEPSPEC/SPEC/spec2k6

Full path to temporary scratch folder for HEPSPEC copy and compile (about 4 GB):
(DON'T ENTER YOUR AFS HOME DIRECTORY)
/tmp
Setting TEMPDIR= /tmp

Full path to folder for storing the result (must not be in the temporary folder)
/afs/ifh.de/group/rz/HEPSPEC/SPECresultsDESY
Setting RESULTDIR= /afs/ifh.de/group/rz/HEPSPEC/SPECresultsDESY

***************************************************
Starting on host host.ifh.de at 20100512, argc=3
***************************************************
SPECDIR     = /afs/ifh.de/group/rz/HEPSPEC/2006-1.1
CERNCONFDIR = /afs/ifh.de/group/rz/HEPSPEC/SPEC/spec2k6
COMPILER    = gcc
ARCH        = 32
TEMPDIR     = /tmp
Hit any key to continue....
Using GNU C/C++ compiler suite
Logfile name is HEPSPEC_host.ifh.de_gcc_20100512_cerncfg.log
Copy the installation files to TEMPDIR

3. Results

The following table holds the results for running the HEP SPEC benchmarks with the default configuration file provided by CERN and run with the 'gcc', '32' and 'farm' parameters.

HEPSPEC Results @ DESY SGE and PAX Farm/Cluster for 32 bit

||hostname||Processor||Result||CPU Frequency [MHz]||L2+L3 Cache||Cores||Memory [GB]||Motherboard||
||westmere||Intel Xeon X5670||172.32||2930||12288+0||6||24|| ||
||pax0X||Intel Xeon X5560||120.11||2794||8192+0||8||16 (8 modules)||Dell 0H723K||
||photon||Intel Xeon E5540||129.96||2530||8192+0||16 (SMT on)||16 (8 modules)||Dell 0H723K||
||blade0X / blade1X||Intel Xeon E5345||59.01||2333||4096+0||8||16 (8 modules)||Dell 0H723K||
||blade2X||Intel Xeon 5160||58.20||3000||4096+0||4||16 (8 modules)||Dell 0H723K||
||blade4X||Intel Xeon 5440||72.95||2834||4096+0||8||16 (8 modules)||Dell 0H723K||
||blade5X / blade6X||Intel Xeon E5450||75.24||3000||6144+0||8||16 (8 modules)||Dell 0H723K||
||blade9X||Intel Xeon X5550||109.51||2661||8192+0||8||16 (8 modules)||Dell 0H723K||
||wgs0||Intel Xeon X5690 @ 3.47||246.13||3470||20480+0||6||48 (6 modules)|| ||
||pax9x||Intel Xeon E5-2660||320.61||2200||20480+0||8||24 (4 modules)|| ||
||wgs1||Intel Xeon E2640v3||370.12||2594||20480+0||8||64 (8 modules)|| ||
||pax10||Intel Xeon E2640v3||371.39||2954||20480+0||8||64 (8 modules)||Supermicro X10DRT-HIBF||
||trex01||Intel Xeon 5120||590.18||2200||19712+0||28 (SMT on)||384 (12 modules)||Dell 0N6JWX||
||trex01||Intel Xeon 5120||471.77||2200||19712+0||28 (SMT off)||384 (12 modules)||Dell 0N6JWX||
||qftquad19||Intel Xeon E5-2640v4||421.01||2400||25600+0||10||384 (24 modules)|| ||
||pax11||Intel Xeon E5-2697A v4||730.14||2600||20480+0||32 (SMT on)||128 (8 modules)||Supermicro X10DRT-HIBF||
||pizza00||AMD EPYC 7702P||1214.12||2000||16384+0||64 (SMT on)||512 (8 modules)||Dell 04F3CJ||
||pizza00||AMD EPYC 7702P||1035.29||2000||16384+0||64 (SMT off)||512 (8 modules)||Dell 04F3CJ||
||qftquad25||AMD EPYC 72F3||770.04||3700||32768+0||16 (SMT on)||2048 (32 modules)||Dell 024PW1||
||qftquad25||AMD EPYC 72F3||604.57||3700||32768+0||16 (SMT off)||2048 (32 modules)||Dell 024PW1||
||wgs41||AMD EPYC 7713P||1337.35||2000||32768+0||64 (SMT on)||256 (8 modules)||Dell 035YY8||
||wgs41||AMD EPYC 7713P||1217.7||2000||32768+0||64 (SMT off)||256 (8 modules)||Dell 035YY8||
||ampere01||AMD EPYC 7502||1548.65||2500||16384+0||64 (SMT on)||512 (16 modules)||Gigabyte G492-Z50-00||
||ampere01||AMD EPYC 7502||1252.67||2500||16384+0||64 (SMT off)||512 (16 modules)||Gigabyte G492-Z50-00||

This table summarizes the minimum, maximum and average results for the different processor types:

HEPSPEC Results @ DESY SGE and PAX Cluster for 32 bit

||Processor||Min Result||Max Result||Average||
||Intel Xeon X5560 (pax)||119.35||120.71||120.11||
||Intel Xeon X5550 (blade90-9f)||105.27||115.20||109.51||
||Intel Xeon E5450 (blade50-6f)||73.02||77.91||75.24||
||Intel Xeon E5440 (blade40-4f)||72.00||75.14||72.95||
||Intel Xeon E5345 (blade00-25)||57.67||61.22||59.01||
||AMD Opteron||28.93||28.93||28.93||
||Intel Xeon E5540 (photon)||129.96||129.96||129.96||
||Intel Xeon E5-2660 (pax9)||310.95||322.98||320.61||
||Intel Xeon E5-2640 v3 (pax10)||370.46||371.75||371.39||
||Intel Xeon 5120||469.84||473.33||471.77||
||Intel Xeon E5-2697A v4 (pax11)||721.31||738.05||730.14||
||AMD EPYC 7702P||1025.55||1050.23||1035.29||
||AMD EPYC 72F3||603.74||605.58||604.57||
||AMD EPYC 7713P||1213.2||1221.4||1217.7||
||AMD EPYC 7502||1233.65||1272.99||1252.67||
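The Min/Max/Average columns can be derived from a list of single-run results with a short awk pipeline like the one below. The three input numbers are only illustrative: the first and last match the X5560 min and max from the table, while the middle value is hypothetical.

```shell
# Derive min, max and average from individual run results.
# Input values are illustrative (the middle one is made up).
printf '119.35\n120.28\n120.71\n' | awk '
    NR == 1  { min = $1; max = $1 }
    $1 < min { min = $1 }
    $1 > max { max = $1 }
             { sum += $1 }
    END      { printf "min=%.2f max=%.2f avg=%.2f\n", min, max, sum / NR }'
# prints: min=119.35 max=120.71 avg=120.11
```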

The following table holds the results for running the HEP SPEC benchmarks with the default configuration file provided by CERN and run with the 'gcc', '64' and 'farm' parameters.

HEPSPEC Results @ DESY SGE and PAX Farm/Cluster for 64 bit

||hostname||Processor||Result||CPU Frequency [MHz]||L2+L3 Cache||Cores||Memory [GB]||Motherboard||
||westmere||Intel Xeon X5670||196.45||2930||12288+0||6||24|| ||
||pax0X||Intel Xeon X5560||132.59||2794||8192+0||8||16 (8 modules)||Dell 0H723K||
||photon||Intel Xeon E5540||151.78||2530||8192+0||16 (SMT on)||16 (8 modules)||Dell 0H723K||
||blade0X / blade1X||Intel Xeon E5345||59.84||2333||4096+0||8||16 (8 modules)||Dell 0H723K||
||blade2X||Intel Xeon 5160||--.--||3000||4096+0||4||16 (8 modules)||Dell 0H723K||
||blade4X||Intel Xeon 5440||72.74||2834||4096+0||8||16 (8 modules)||Dell 0H723K||
||blade5X / blade6X||Intel Xeon E5450||75.57||3000||6144+0||8||16 (8 modules)||Dell 0H723K||
||blade9X||Intel Xeon X5550||127.45||2661||8192+0||8||16 (8 modules)||Dell 0H723K||
||pax9x||Intel Xeon E5-2660||330.25||2200||20480+0||8||24 (4 modules)|| ||
||trex01||Intel Xeon 5120||666.1||2200||19712+0||28 (SMT on)||384 (12 modules)||Dell 0N6JWX||
||trex01||Intel Xeon 5120||538.12||2200||19712+0||28 (SMT off)||384 (12 modules)||Dell 0N6JWX||
||pax10||Intel Xeon E2640v3||443.73||2954||20480+0||8||64 (8 modules)||Supermicro X10DRT-HIBF||
||qftquad19||Intel Xeon E5-2640v4||482.00||2400||25600+0||10||384 (24 modules)|| ||
||pax11||Intel Xeon E5-2697A v4||808.17||2600||20480+0||32 (SMT on)||128 (8 modules)||Supermicro X10DRT-HIBF||
||pizza00||AMD EPYC 7702P||1346.32||2000||16384+0||64 (SMT on)||512 (8 modules)||Dell 04F3CJ||
||pizza00||AMD EPYC 7702P||1194.90||2000||16384+0||64 (SMT off)||512 (8 modules)||Dell 04F3CJ||
||qftquad25||AMD EPYC 72F3||1000.31||3700||32768+0||16 (SMT on)||2048 (32 modules)||Dell 024PW1||
||qftquad25||AMD EPYC 72F3||745.08||3700||32768+0||16 (SMT off)||2048 (32 modules)||Dell 024PW1||
||wgs41||AMD EPYC 7713P||1469.51||2000||32768+0||64 (SMT off)||256 (8 modules)||Dell 035YY8||
||wgs41||AMD EPYC 7713P||1557.59||2000||32768+0||64 (SMT off)||256 (8 modules)||Dell 035YY8||
||ampere01||AMD EPYC 7502||1803.05||2500||16384+0||64 (SMT on)||512 (16 modules)||Gigabyte G492-Z50-00||
||ampere01||AMD EPYC 7502||1471.76||2500||16384+0||64 (SMT off)||512 (16 modules)||Gigabyte G492-Z50-00||

As one can see from the tables above, there is a slight performance gain when running the benchmark in 64-bit mode: about 10.4% on the PAX machines (Xeon X5560), and about 16% on the SMT-enabled photon host as well as on the machines in the blade9 chassis. An interesting fact is that the machines blade0x through blade6x show similar results in both 32- and 64-bit mode. This could be because the processors Xeon E5345, Xeon 5160, Xeon 5440 and Xeon E5450 rely on the old CPU interconnect to main memory, namely the front-side bus (FSB), which operates at a fixed frequency, so the time for a 32-bit access to main memory is the same as for a 64-bit access.
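The quoted 10.4% gain can be reproduced directly from the two tables. The worked example below uses the Xeon X5560 (pax) results: 120.11 in the 32-bit table and 132.59 in the 64-bit table.

```shell
# Relative 64-bit gain for the Xeon X5560 (pax) rows:
# (64-bit result - 32-bit result) / 32-bit result, in percent.
awk 'BEGIN {
    r32 = 120.11; r64 = 132.59
    printf "gain = %.1f%%\n", 100 * (r64 - r32) / r32
}'
# prints: gain = 10.4%
```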

HEPSPEC (last edited 2022-04-11 07:43:22 by GötzWaschk)