Symptoms

Hardware

What didn't help

Probable Cause

Bugs in Nehalem deep C-States.

Links:

The Citrix document is currently unavailable, but still in the Google cache. It discusses lockups of XenServer 5.6, which supports these C-States. It suggests that the cause is the following bugs in Nehalem and Westmere CPUs:

This may explain why we have only observed this problem on the M610: The W3503 in the workstations doesn't support Hyperthreading, and the RAM in our Westmere systems is probably (hopefully) not operating in "extended temperature range".

Possible Workarounds

The Citrix document recommends disabling C-States in the BIOS. According to the discussion in the BZ, this won't help because the SL6 kernel finds them anyway.

The KB article recommends using the kernel parameter intel_idle.max_cstate=2, which avoids C6; this is also the first recommendation in the BZ. One reporter in the BZ claims, though, that this reduces the hangs by 90% but does not eliminate them completely. The next recommendation is intel_idle.max_cstate=0 processor.max_cstate=1, which disables the Intel-specific code for entering deep C-states and limits the ACPI code to C1 (I believe; note that the nomenclature differs between "Intel C-states" and "ACPI C-states", and ACPI C3 seems to be Intel C6). Alas, the latter parameters make the idle system consume significantly more power:
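Whichever set of parameters is chosen, it is worth confirming after the reboot that they actually reached the kernel. The following is a minimal sketch, assuming /proc/cmdline is readable; has_kernel_param is a hypothetical helper name, not part of any existing tool:

```shell
#!/bin/sh
# Sketch: succeed if a parameter appears verbatim on the kernel command line.
# has_kernel_param is a hypothetical helper name chosen for this example.
has_kernel_param() {
    param=$1
    # Second argument is optional and defaults to the running kernel's command line.
    cmdline_file=${2:-/proc/cmdline}
    # Put every word on its own line, then look for an exact whole-line match,
    # so that "max_cstate=2" does not match "intel_idle.max_cstate=2".
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}
```

It could be called as `has_kernel_param intel_idle.max_cstate=2 && echo present`. The match is exact on purpose, so a parameter with a different value is reported as missing.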

[Image: C-States.gif]

Power consumption rises from 80W to more than 130W. With intel_idle.max_cstate=2 instead, it stays below 100W. For reference, the consumption under SL5 is about 115W.

Disabling Hyperthreading should be another workaround.

How to verify that Workarounds are applied

Intel driver, unrestricted:

# dmesg|grep idle
using mwait in idle threads.
intel_idle: MWAIT substates: 0x1120
intel_idle: v0.4 model 0x1A
intel_idle: lapic_timer_reliable_states 0x2
ACPI: acpi_idle yielding to intel_idle
cpuidle: using governor ladder
cpuidle: using governor menu

# grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
/sys/devices/system/cpu/cpu0/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE
/sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
/sys/devices/system/cpu/cpu0/cpuidle/state0/name:C0
/sys/devices/system/cpu/cpu0/cpuidle/state0/power:4294967295
/sys/devices/system/cpu/cpu0/cpuidle/state0/time:4483024
/sys/devices/system/cpu/cpu0/cpuidle/state0/usage:11242
/sys/devices/system/cpu/cpu0/cpuidle/state1/desc:MWAIT 0x00
/sys/devices/system/cpu/cpu0/cpuidle/state1/latency:3
/sys/devices/system/cpu/cpu0/cpuidle/state1/name:NHM-C1
/sys/devices/system/cpu/cpu0/cpuidle/state1/power:1000
/sys/devices/system/cpu/cpu0/cpuidle/state1/time:452411903
/sys/devices/system/cpu/cpu0/cpuidle/state1/usage:1346944
/sys/devices/system/cpu/cpu0/cpuidle/state2/desc:MWAIT 0x10
/sys/devices/system/cpu/cpu0/cpuidle/state2/latency:20
/sys/devices/system/cpu/cpu0/cpuidle/state2/name:NHM-C3
/sys/devices/system/cpu/cpu0/cpuidle/state2/power:500
/sys/devices/system/cpu/cpu0/cpuidle/state2/time:1797289080
/sys/devices/system/cpu/cpu0/cpuidle/state2/usage:2232648
/sys/devices/system/cpu/cpu0/cpuidle/state3/desc:MWAIT 0x20
/sys/devices/system/cpu/cpu0/cpuidle/state3/latency:200
/sys/devices/system/cpu/cpu0/cpuidle/state3/name:NHM-C6
/sys/devices/system/cpu/cpu0/cpuidle/state3/power:350
/sys/devices/system/cpu/cpu0/cpuidle/state3/time:541786973905
/sys/devices/system/cpu/cpu0/cpuidle/state3/usage:19614317

Intel driver, limited to C3:

# dmesg|grep idle
Command line: ro root=UUID=894f2e3a-62c0-4eee-93d5-21360859b6b4 rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us pci=bfsort crashkernel=auto intel_idle.max_cstate=2
Kernel command line: ro root=UUID=894f2e3a-62c0-4eee-93d5-21360859b6b4 rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us pci=bfsort crashkernel=129M@0M intel_idle.max_cstate=2
using mwait in idle threads.
intel_idle: MWAIT substates: 0x1120
intel_idle: v0.4 model 0x1A
intel_idle: lapic_timer_reliable_states 0x2
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
intel_idle: max_cstate 2 reached
ACPI: acpi_idle yielding to intel_idle
cpuidle: using governor ladder
cpuidle: using governor menu

# grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
/sys/devices/system/cpu/cpu0/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE
/sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
/sys/devices/system/cpu/cpu0/cpuidle/state0/name:C0
/sys/devices/system/cpu/cpu0/cpuidle/state0/power:4294967295
/sys/devices/system/cpu/cpu0/cpuidle/state0/time:55080
/sys/devices/system/cpu/cpu0/cpuidle/state0/usage:121
/sys/devices/system/cpu/cpu0/cpuidle/state1/desc:MWAIT 0x00
/sys/devices/system/cpu/cpu0/cpuidle/state1/latency:3
/sys/devices/system/cpu/cpu0/cpuidle/state1/name:NHM-C1
/sys/devices/system/cpu/cpu0/cpuidle/state1/power:1000
/sys/devices/system/cpu/cpu0/cpuidle/state1/time:8614997
/sys/devices/system/cpu/cpu0/cpuidle/state1/usage:79650
/sys/devices/system/cpu/cpu0/cpuidle/state2/desc:MWAIT 0x10
/sys/devices/system/cpu/cpu0/cpuidle/state2/latency:20
/sys/devices/system/cpu/cpu0/cpuidle/state2/name:NHM-C3
/sys/devices/system/cpu/cpu0/cpuidle/state2/power:500
/sys/devices/system/cpu/cpu0/cpuidle/state2/time:10529011005
/sys/devices/system/cpu/cpu0/cpuidle/state2/usage:458335

Notice NHM-C6 is absent.
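The same check can be scripted instead of reading the grep output by eye. This is a rough sketch that relies only on the sysfs layout shown above; list_cstates and c6_absent are hypothetical helper names, and matching on a name ending in "C6" is an assumption based on the NHM-C6 name in the listing:

```shell
#!/bin/sh
# Sketch: print the names of all idle states exposed for one CPU.
# list_cstates and c6_absent are hypothetical helper names.
list_cstates() {
    cpu=${1:-cpu0}
    # The sysfs root can be overridden, e.g. to run against a saved copy.
    sysroot=${2:-/sys/devices/system/cpu}
    for f in "$sysroot/$cpu"/cpuidle/state*/name; do
        # Skip the unexpanded pattern if no state directory exists.
        [ -r "$f" ] && cat "$f"
    done
    return 0
}

# Succeed only if none of the exposed state names ends in "C6".
c6_absent() {
    ! list_cstates "$@" | grep -q 'C6$'
}
```

For example, `c6_absent cpu0 && echo "C6 not offered on cpu0"`. Note that the check also succeeds when no cpuidle directory exists at all, so it is best combined with a look at the output of list_cstates.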

With the more restrictive parameters (intel_idle.max_cstate=0 processor.max_cstate=1):

Sep 25 14:07:24 blade8d kernel: Kernel command line: ro root=UUID=894f2e3a-62c0-4eee-93d5-21360859b6b4 rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us pci=bfsort crashkernel=129M@0M intel_idle.max_cstate=0 processor.max_cstate=1
Sep 25 14:07:24 blade8d kernel: using mwait in idle threads.
Sep 25 14:07:24 blade8d kernel: intel_idle: disabled
Sep 25 14:07:24 blade8d kernel: ACPI: acpi_idle registered with cpuidle
Sep 25 14:07:24 blade8d kernel: cpuidle: using governor ladder
Sep 25 14:07:24 blade8d kernel: cpuidle: using governor menu

I don't know how to verify the effectiveness of the processor.max_cstate=1 parameter directly, but the increased power consumption suggests it works.
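One possible check, not tried on the systems described here, is to look at the same cpuidle directory in sysfs while acpi_idle is active. The sketch below assumes that acpi_idle fills in the state directories just as intel_idle does and that the kernel provides /sys/devices/system/cpu/cpuidle/current_driver; idle_summary is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: show the active cpuidle driver and the states offered on cpu0.
# idle_summary is a hypothetical helper name; the existence of current_driver
# and of per-state entries under acpi_idle is an assumption.
idle_summary() {
    sysroot=${1:-/sys/devices/system/cpu}
    drv_file=$sysroot/cpuidle/current_driver
    if [ -r "$drv_file" ]; then
        echo "driver: $(cat "$drv_file")"
    else
        echo "driver: unknown"
    fi
    for d in "$sysroot"/cpu0/cpuidle/state*; do
        # Skip the unexpanded pattern if no state directory exists.
        [ -r "$d/name" ] || continue
        echo "${d##*/}: $(cat "$d/name") (usage $(cat "$d/usage" 2>/dev/null))"
    done
}
```

If the parameter works as intended, one would expect the driver to be reported as acpi_idle and no state deeper than C1 to be listed. Running idle_summary twice a few minutes apart would also show whether the usage counter of any deeper state still increases.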

SL6 Development/Nehalem Hangs (last edited 2011-09-26 19:24:28 by StephanWiesand)