#pragma section-numbers on

----
<>
----
= Overview =
The High Energy Physics (HEP) SPEC benchmark is a set of test applications that stress the processor with operations and algorithms commonly used in applications from the physics community. It provides the SPEC result that is used, for example, to describe the resources relevant to HEP applications that are provided to the Grid infrastructure. The HEPSPEC benchmark suite can also be used to assess different configurations and to detect latent bottlenecks and problems.

= Interactive script for HEPSPEC runs =
The interactive script for running HEPSPEC benchmarks, both on local machines and in the SGE farm, is located in AFS under /afs/ifh.de/group/rz/HEPSPEC/interactiveHEPSPEC.sh . The script takes several parameters; run it with the '-h' option to get their description:

{{{#!c
host /afs/ifh.de/group/rz/HEPSPEC 10-05-12 13:55 # ./interactiveHEPSPEC.sh -h
Please specifiy the required arguments!
# ./genericSPEcrun.sh [compiler] [architecture] [run mode]
Valid values are:
for [compiler] : icc or gcc
for [architecture] : 32 or 64
for [mode] : farm or local
}}}

HEPSPEC benchmarks can be run with Intel icc v. 11.0 or GNU GCC v. 4.1.2. In addition, one can choose whether to compile and run the benchmarks for the 32-bit or the 64-bit architecture, and whether to run them in a local environment or on hosts of the SGE farm. The tests reported in the Results section were run with gcc for the 32-bit architecture, as prescribed by the HEP computing community.

== Running HEPSPEC in the SGE Farm ==
Currently, the batch job does not work in the farm because the compilation step exceeds the memory limit. Instead, use the script {{{automaticHEPSPECondisabledfarmnode.sh}}} on a disabled farm node. First make /batch writable by the user, then remove /etc/cron.hourly/regular-batch-cleanups on the node.
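
The two preparation steps for a disabled farm node can be checked before starting a long benchmark run. The following is only a minimal sketch in Python, not part of the HEPSPEC scripts: the function name `preflight` is hypothetical, and the two paths are simply the ones named in the paragraph above.

```python
import os

# Paths as named in the text; pass other values to try the check elsewhere.
BATCH_DIR = "/batch"
CLEANUP_CRON = "/etc/cron.hourly/regular-batch-cleanups"


def preflight(batch_dir=BATCH_DIR, cleanup_cron=CLEANUP_CRON):
    """Return a list of unmet preconditions; an empty list means none found."""
    problems = []
    # Step 1 from the text: the batch directory must be writable by the user.
    if not os.path.isdir(batch_dir):
        problems.append(f"{batch_dir} does not exist or is not a directory")
    elif not os.access(batch_dir, os.W_OK | os.X_OK):
        problems.append(f"{batch_dir} is not writable by the current user")
    # Step 2 from the text: the hourly cleanup job must have been removed.
    if os.path.lexists(cleanup_cron):
        problems.append(f"{cleanup_cron} is still present")
    return problems


if __name__ == "__main__":
    for line in preflight() or ["both preconditions from the text are met"]:
        print(line)
```

Run on the node itself, the sketch prints one line per unmet precondition. It only inspects the file system and changes nothing.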
Here is an example of submitting a HEPSPEC benchmark run to a host in the SGE farm, using icc as the compiler and the 32-bit architecture:

{{{#!c
host /afs/ifh.de/group/rz/HEPSPEC 10-05-12 14:06 # ./interactiveHEPSPEC.sh icc 32 farm
Full path to the unmodified HEPSPEC distribution:
/afs/ifh.de/group/rz/HEPSPEC/2006-1.1
Setting SPECDIR= /afs/ifh.de/group/rz/HEPSPEC/2006-1.1
Full path to the CERN HEPSPEC .cfg file:
/afs/ifh.de/group/rz/HEPSPEC/spec2k6
Setting CERNCONFDIR= /afs/ifh.de/group/rz/HEPSPEC/spec2k6
Full path to folder for storing the result (must not be in the temporary folder)
/afs/ifh.de/group/rz/HEPSPEC/SPECresultsDESY
Setting RESULTDIR= /afs/ifh.de/group/rz/HEPSPEC/SPECresultsDESY
Submitting /afs/ifh.de/group/rz/HEPSPEC/farmHEPSPEC.sh
Your job 792245 ("farmHEPSPEC.sh") has been submitted
}}}

The "interactiveHEPSPEC" script submits a so-called subscript, farmHEPSPEC, which has the following options embedded, according to the rules for running farm jobs found in [[Batch_System_Usage|Batch System Usage]].

{{{#!c
#(the cpu time for this job)
#$ -l h_cpu=09:00:00
#(the maximum memory usage of this job)
#$ -l h_rss=2G
#(Acquire disk space)
#$ -l tmpdir_size=4G
#(stderr and stdout are merged together to stdout)
#$ -j y
#(send mail on job's begin, abort and end)
#$ -m bae
#(Execute the script from the current working directory)
#$ -cwd
#(specify project)
#$ -P yourgrouphere
#(parallelism level)
#$ -pe multicore 64
#(reserve slot for the whole job)
#$ -R y
}}}

The subscript can also be submitted directly to the farm, with the arguments in the following order: compiler, architecture, specdir, cernconfdir, resultsdir. Options for qsub itself, such as the target host, go before the script name, because everything after the script name is passed to the script as an argument:

{{{qsub -l hostname=pizza00 farmHEPSPEC.sh icc 32 /afs/ifh.de/group/rz/HEPSPEC/2006-1.1 /afs/ifh.de/group/rz/HEPSPEC/spec2k6 /afs/ifh.de/group/rz/HEPSPEC/SPECresultsDESY}}}

== Running HEPSPEC locally on a given machine ==
If you want to test a particular machine to which you have login access, invoke the script with the 'local' parameter.
In this case you will additionally be asked for the path under which a temporary folder is created and used for copying and compiling the HEPSPEC benchmarks. Note that this path must point to a folder that can hold at least 4 GB of data; otherwise the HEPSPEC benchmark will exit with an error.

{{{#!c
10-05-12 14:30 # ./interactiveHEPSPEC.sh gcc 32 local
Full path to the unmodified HEPSPEC distribution:
/afs/ifh.de/group/rz/HEPSPEC/2006-1.1
Setting SPECDIR= /afs/ifh.de/group/rz/HEPSPEC/2006-1.1
Full path to the CERN HEPSPEC .cfg file:
/afs/ifh.de/group/rz/HEPSPEC/SPEC/spec2k6
Setting CERNCONFDIR= /afs/ifh.de/group/rz/HEPSPEC/SPEC/spec2k6
Full path to temporary scratch folder for HEPSPEC copy and compile (about 4BG):
(DON'T ENTER YOUR AFS HOME DIRECTORY)
/tmp
Setting TEMPDIR= /tmp
Full path to folder for storing the result (must not be in the temporary folder)
/afs/ifh.de/group/rz/HEPSPEC/SPECresultsDESY
Setting RESULTDIR= /afs/ifh.de/group/rz/HEPSPEC/SPECresultsDESY
***************************************************
Starting on host host.ifh.de at 20100512, argc=3
***************************************************
SPECDIR = /afs/ifh.de/group/rz/HEPSPEC/2006-1.1
CERNCONFDIR = /afs/ifh.de/group/rz/HEPSPEC/SPEC/spec2k6
COMPILER = gcc
ARCH = 32
TEMPDIR = /tmp
Hit any key to continue....
Using GNU C/C++ compiler suite
Logfile name is HEPSPEC_host.ifh.de_gcc_20100512_cerncfg.log
Copy the installation files to TEMPDIR
}}}

= Results =
The following table holds the results of running the HEPSPEC benchmarks with the default configuration file provided by [[https://twiki.cern.ch/twiki/pub/FIOgroup/TsiBenchHEPSPEC/spec2k6-2.23.tar.gz|CERN]], run with the 'gcc', '32' and 'farm' parameters.

||||||||||||||||||'''HEPSPEC Results @ DESY SGE and PAX Farm/Cluster for 32 bit''' ||
||hostname ||Processor ||Result ||CPU Frequency [MHz] ||L2+L3 Cache [KB] ||Cores ||Memory [GB] ||Motherboard ||
||westmere ||Intel Xeon X5670 ||172.32 ||2930 ||12288+0 ||6 ||24 || ||
||pax0X ||Intel Xeon X5560 ||120.11 ||2794 ||8192+0 ||8 ||16 (8 modules) ||Dell 0H723K ||
||photon ||Intel Xeon E5540 ||129.96 ||2530 ||8192+0 ||16 (SMT on) ||16 (8 modules) ||Dell 0H723K ||
||blade0X / blade1X ||Intel Xeon E5345 ||59.01 ||2333 ||4096+0 ||8 ||16 (8 modules) ||Dell 0H723K ||
||blade2X ||Intel Xeon 5160 ||58.20 ||3000 ||4096+0 ||4 ||16 (8 modules) ||Dell 0H723K ||
||blade4X ||Intel Xeon E5440 ||72.95 ||2834 ||4096+0 ||8 ||16 (8 modules) ||Dell 0H723K ||
||blade5X / blade6X ||Intel Xeon E5450 ||75.24 ||3000 ||6144+0 ||8 ||16 (8 modules) ||Dell 0H723K ||
||blade9X ||Intel Xeon X5550 ||109.51 ||2661 ||8192+0 ||8 ||16 (8 modules) ||Dell 0H723K ||
||wgs0 ||Intel Xeon X5690 @ 3.47 ||246.13 ||3470 ||20480+0 ||6 ||48 (6 modules) || ||
||pax9x ||Intel Xeon E5-2660 ||320.61 ||2200 ||20480+0 ||8 ||24 (4 modules) || ||
||wgs1 ||Intel Xeon E5-2640 v3 ||370.12 ||2594 ||20480+0 ||8 ||64 (8 modules) || ||
||pax10 ||Intel Xeon E5-2640 v3 ||371.39 ||2954 ||20480+0 ||8 ||64 (8 modules) ||Supermicro X10DRT-HIBF ||
||trex01 ||Intel Xeon 5120 ||590.18 ||2200 ||19712+0 ||28 (SMT on) ||384 (12 modules) ||Dell 0N6JWX ||
||trex01 ||Intel Xeon 5120 ||471.77 ||2200 ||19712+0 ||28 (SMT off) ||384 (12 modules) ||Dell 0N6JWX ||
||qftquad19 ||Intel Xeon E5-2640 v4 ||421.01 ||2400 ||25600+0 ||10 ||384 (24 modules) || ||
||pax11 ||Intel Xeon E5-2697A v4 ||730.14 ||2600 ||20480+0 ||32 (SMT on) ||128 (8 modules) ||Supermicro X10DRT-HIBF ||
||pizza00 ||AMD EPYC 7702P ||1214.12 ||2000 ||16384+0 ||64 (SMT on) ||512 (8 modules) ||Dell 04F3CJ ||
||pizza00 ||AMD EPYC 7702P ||1035.29 ||2000 ||16384+0 ||64 (SMT off) ||512 (8 modules) ||Dell 04F3CJ ||
||qftquad25 ||AMD EPYC 72F3 ||770.04 ||3700 ||32768+0 ||16 (SMT on) ||2048 (32 modules) ||Dell 024PW1 ||
||qftquad25 ||AMD EPYC 72F3 ||604.57 ||3700 ||32768+0 ||16 (SMT off) ||2048 (32 modules) ||Dell 024PW1 ||
||wgs41 ||AMD EPYC 7713P ||1337.35 ||2000 ||32768+0 ||64 (SMT on) ||256 (8 modules) ||Dell 035YY8 ||
||wgs41 ||AMD EPYC 7713P ||1217.7 ||2000 ||32768+0 ||64 (SMT off) ||256 (8 modules) ||Dell 035YY8 ||
||ampere01 ||AMD EPYC 7502 ||1548.65 ||2500 ||16384+0 ||64 (SMT on) ||512 (16 modules) ||Gigabyte G492-Z50-00 ||
||ampere01 ||AMD EPYC 7502 ||1252.67 ||2500 ||16384+0 ||64 (SMT off) ||512 (16 modules) ||Gigabyte G492-Z50-00 ||

This table summarizes the minimum, maximum and average results for the different processor types:

||||||||||||||||||'''HEPSPEC Results @ DESY SGE and PAX Cluster for 32 bit''' ||
||Processor (hosts) ||Min Result ||Max Result ||Average ||
||Intel Xeon X5560 (pax) ||119.35 ||120.71 ||120.11 ||
||Intel Xeon X5550 (blade90-9f) ||105.27 ||115.20 ||109.51 ||
||Intel Xeon E5450 (blade50-6f) ||73.02 ||77.91 ||75.24 ||
||Intel Xeon E5440 (blade40-4f) ||72.00 ||75.14 ||72.95 ||
||Intel Xeon E5345 (blade00-25) ||57.67 ||61.22 ||59.01 ||
||AMD Opteron ||28.93 ||28.93 ||28.93 ||
||Intel Xeon E5540 (photon) ||129.96 ||129.96 ||129.96 ||
||Intel Xeon E5-2660 (pax9) ||310.95 ||322.98 ||320.61 ||
||Intel Xeon E5-2640 v3 (pax10) ||370.46 ||371.75 ||371.39 ||
||Intel Xeon 5120 ||469.84 ||473.33 ||471.77 ||
||Intel Xeon E5-2697A v4 (pax11) ||721.31 ||738.05 ||730.14 ||
||AMD EPYC 7702P ||1025.55 ||1050.23 ||1035.29 ||
||AMD EPYC 72F3 ||603.74 ||605.58 ||604.57 ||
||AMD EPYC 7713P ||1213.2 ||1221.4 ||1217.7 ||
||AMD EPYC 7502 ||1233.65 ||1272.99 ||1252.67 ||

The following table holds the results of running the HEPSPEC benchmarks with the default configuration file provided by [[https://twiki.cern.ch/twiki/pub/FIOgroup/TsiBenchHEPSPEC/spec2k6-2.23.tar.gz|CERN]], run with the 'gcc', '64' and 'farm' parameters.

||||||||||||||||||'''HEPSPEC Results @ DESY SGE and PAX Farm/Cluster for 64 bit''' ||
||hostname ||Processor ||Result ||CPU Frequency [MHz] ||L2+L3 Cache [KB] ||Cores ||Memory [GB] ||Motherboard ||
||westmere ||Intel Xeon X5670 ||196.45 ||2930 ||12288+0 ||6 ||24 || ||
||pax0X ||Intel Xeon X5560 ||132.59 ||2794 ||8192+0 ||8 ||16 (8 modules) ||Dell 0H723K ||
||photon ||Intel Xeon E5540 ||151.78 ||2530 ||8192+0 ||16 (SMT on) ||16 (8 modules) ||Dell 0H723K ||
||blade0X / blade1X ||Intel Xeon E5345 ||59.84 ||2333 ||4096+0 ||8 ||16 (8 modules) ||Dell 0H723K ||
||blade2X ||Intel Xeon 5160 ||--.-- ||3000 ||4096+0 ||4 ||16 (8 modules) ||Dell 0H723K ||
||blade4X ||Intel Xeon E5440 ||72.74 ||2834 ||4096+0 ||8 ||16 (8 modules) ||Dell 0H723K ||
||blade5X / blade6X ||Intel Xeon E5450 ||75.57 ||3000 ||6144+0 ||8 ||16 (8 modules) ||Dell 0H723K ||
||blade9X ||Intel Xeon X5550 ||127.45 ||2661 ||8192+0 ||8 ||16 (8 modules) ||Dell 0H723K ||
||pax9x ||Intel Xeon E5-2660 ||330.25 ||2200 ||20480+0 ||8 ||24 (4 modules) || ||
||trex01 ||Intel Xeon 5120 ||666.1 ||2200 ||19712+0 ||28 (SMT on) ||384 (12 modules) ||Dell 0N6JWX ||
||trex01 ||Intel Xeon 5120 ||538.12 ||2200 ||19712+0 ||28 (SMT off) ||384 (12 modules) ||Dell 0N6JWX ||
||pax10 ||Intel Xeon E5-2640 v3 ||443.73 ||2954 ||20480+0 ||8 ||64 (8 modules) ||Supermicro X10DRT-HIBF ||
||qftquad19 ||Intel Xeon E5-2640 v4 ||482.00 ||2400 ||25600+0 ||10 ||384 (24 modules) || ||
||pax11 ||Intel Xeon E5-2697A v4 ||808.17 ||2600 ||20480+0 ||32 (SMT on) ||128 (8 modules) ||Supermicro X10DRT-HIBF ||
||pizza00 ||AMD EPYC 7702P ||1346.32 ||2000 ||16384+0 ||64 (SMT on) ||512 (8 modules) ||Dell 04F3CJ ||
||pizza00 ||AMD EPYC 7702P ||1194.90 ||2000 ||16384+0 ||64 (SMT off) ||512 (8 modules) ||Dell 04F3CJ ||
||qftquad25 ||AMD EPYC 72F3 ||1000.31 ||3700 ||32768+0 ||16 (SMT on) ||2048 (32 modules) ||Dell 024PW1 ||
||qftquad25 ||AMD EPYC 72F3 ||745.08 ||3700 ||32768+0 ||16 (SMT off) ||2048 (32 modules) ||Dell 024PW1 ||
||wgs41 ||AMD EPYC 7713P ||1469.51 ||2000 ||32768+0 ||64 (SMT off) ||256 (8 modules) ||Dell 035YY8 ||
||wgs41 ||AMD EPYC 7713P ||1557.59 ||2000 ||32768+0 ||64 (SMT off) ||256 (8 modules) ||Dell 035YY8 ||
||ampere01 ||AMD EPYC 7502 ||1803.05 ||2500 ||16384+0 ||64 (SMT on) ||512 (16 modules) ||Gigabyte G492-Z50-00 ||
||ampere01 ||AMD EPYC 7502 ||1471.76 ||2500 ||16384+0 ||64 (SMT off) ||512 (16 modules) ||Gigabyte G492-Z50-00 ||

As one can see from the tables above, running the benchmark in 64-bit mode yields a performance gain of about 10.4% on the PAX machines (Xeon X5560) and of about 16% on the SMT-enabled photon host as well as on the machines in the blade9 chassis. Interestingly, the machines blade0x through blade6x show similar results in both 32-bit and 64-bit mode. This could be because the processors Intel Xeon E5345, Intel Xeon 5160, Intel Xeon E5440 and Intel Xeon E5450 rely on the older CPU interconnect to main memory, the front-side bus (FSB), which operates at a fixed frequency, so that a 64-bit access to main memory takes about as long as a 32-bit access.
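
The percentages quoted above follow directly from the two result tables. As a small worked example, the Python sketch below recomputes the relative gain for the hosts mentioned in the paragraph. The scores are copied from the 32-bit and 64-bit tables; the helper name `gain_percent` is introduced here only for illustration.

```python
# (32-bit result, 64-bit result) per host, copied from the tables above.
RESULTS = {
    "pax0X": (120.11, 132.59),
    "photon": (129.96, 151.78),
    "blade9X": (109.51, 127.45),
    "blade0X / blade1X": (59.01, 59.84),
    "blade4X": (72.95, 72.74),
    "blade5X / blade6X": (75.24, 75.57),
}


def gain_percent(score_32, score_64):
    """Relative change of the 64-bit score with respect to the 32-bit score."""
    return (score_64 - score_32) / score_32 * 100.0


if __name__ == "__main__":
    for host, (s32, s64) in RESULTS.items():
        print(f"{host:20s} {gain_percent(s32, s64):+6.1f} %")
```

With these inputs the gain comes out at roughly 10.4% for pax0X and between 16% and 17% for photon and blade9X, while the FSB-based blade hosts stay within about 1.5% of their 32-bit score.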