Asus UX31E i915_enable_rc6=1 sudden shutdown solution
Like everyone else, I had trouble with the graphics RC6 sleep option, which manifested as a sudden death syndrome: the notebook simply shut down without warning. I took the obvious steps to remedy the situation, such as disabling the IOMMU, but unfortunately this led nowhere. I am using Gentoo Linux and tried both the 3.2.1 and 3.3_rc1 kernels, but the notebook kept shutting down under both of them. I therefore concentrated my efforts on the following problems:
- Temperature - my machine runs rather hot: 95 °C on the CPU under full load, which triggers CPU throttling and, of course, emits an MCE event. I therefore removed the dust filter, which sits beneath the venting ports on the bottom and which I found rather restrictive. This hardware hack improved heat exchange, but it had no effect on the maximum temperature. I believe it has no bearing on the rc6 shutdown problem, and I will reinstall the filter once I reach one week of uptime.
- Fine-tuning the kernel power management - I disabled CONFIG_INTEL_IPS (Drivers -> X86 platform specific device drivers) and also CONFIG_PM_DEVFREQ (Drivers). I may return to studying these options once sufficient stability is achieved.
- Analysing the dmesg output - it pays to read it closely. For example, although this laptop has only two physical CPU cores plus two Hyper-Threading siblings, CONFIG_NR_CPUS has to be set to 16; otherwise the CPU initialisation doesn't go smoothly. The dmesg output analysis also led me to the probable culprit of the sudden shutdown problem, which I think is a faulty DSDT.
- ACPI DSDT errors - I am running the latest BIOS (210), but updating to it had no impact on the shutdown problem. After seeing some error messages in dmesg concerning ACPI and the DSDT, I took the following steps: I dumped and decompiled the DSDT, then recompiled it using IASL. The compiler reported one error and ten or eleven warnings. The error was easy to fix, but one warning gave me a rather hard time. It concerned a missing memory address range for a hardware initialisation. At first I tried to figure out the correct range, but when I found out that the device in question is a plain old IDE controller, I commented out the entire initialisation block. Last but not least, I overrode the OSYS variable to 0x07D9 (Windows 2009); the kernel seems to like it more than 0x03E8 (Linux).
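After recompiling a DSDT it can be reassuring to check that the resulting binary is at least a well-formed ACPI table before building it into the kernel. The following is only a minimal sketch in Python, not part of the procedure described above: it reads the standard 36-byte ACPI table header and verifies that all bytes of the table sum to zero modulo 256. The function name inspect_acpi_table is hypothetical.

```python
import struct

# Standard 36-byte ACPI table header: signature, length, revision, checksum,
# OEM ID, OEM table ID, OEM revision, creator ID, creator revision.
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")


def inspect_acpi_table(blob: bytes) -> dict:
    """Parse the header of a compiled ACPI table and verify its checksum.

    Hypothetical helper for illustration; it checks the container only and
    says nothing about whether the AML inside is correct.
    """
    if len(blob) < ACPI_HEADER.size:
        raise ValueError("too short to be an ACPI table")
    (signature, length, revision, _checksum, oem_id, oem_table_id,
     _oem_revision, _creator_id, _creator_revision) = ACPI_HEADER.unpack_from(blob)
    return {
        "signature": signature.decode("ascii", "replace"),
        "length": length,
        "revision": revision,
        "oem_id": oem_id.decode("ascii", "replace").strip(),
        "oem_table_id": oem_table_id.decode("ascii", "replace").strip(),
        # The length field must match the actual size of the file.
        "length_ok": length == len(blob),
        # All bytes of a valid table add up to 0 modulo 256.
        "checksum_ok": sum(blob[:length]) % 256 == 0,
    }
```

To use it, read the compiled table in binary mode and pass the bytes in, for example inspect_acpi_table(open("dsdt.aml", "rb").read()), where dsdt.aml is an assumed file name. For a DSDT the signature field should read "DSDT".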
The conclusion is simple: with these tweaks I am into the 79th hour of uptime, and I am certainly not giving the laptop an easy time. During this period the computer went through many suspend-resume cycles, a lot of compiling, running a qemu-kvm virtual machine, browsing, DVD burning and watching assorted video files. I will be willing to call the computer stable after 168 hours (one week) of uptime.
I am using the following kernel line: kernel /vmlinuz root=/dev/sda2 i915.powersave=1 i915.semaphores=1 i915.i915_enable_rc6=1 quiet
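To double-check that the i915 options really reached the kernel, the command line can be split into its module parameters. The snippet below is a small Python sketch under my own assumptions: the function name parse_module_params is made up, and it only parses text; it does not query the driver.

```python
def parse_module_params(cmdline: str, module: str) -> dict:
    """Collect 'module.param=value' tokens from a kernel command line.

    Hypothetical helper: returns {param: value} for the given module and
    ignores every other token (root=, quiet, and so on).
    """
    prefix = module + "."
    params = {}
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            name, value = token[len(prefix):].split("=", 1)
            params[name] = value
    return params


# The kernel line quoted above, used here as sample input.
KERNEL_LINE = ("/vmlinuz root=/dev/sda2 i915.powersave=1 "
               "i915.semaphores=1 i915.i915_enable_rc6=1 quiet")

if parse_module_params(KERNEL_LINE, "i915").get("i915_enable_rc6") != "1":
    print("rc6 is not enabled on this command line")
```

On a running system the same function can be fed the contents of /proc/cmdline instead of the sample string.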
UPDATE: after a week-long uptime it seems more and more clear that the faulty DSDT is indeed responsible for the sudden death syndrome. I have already reinserted the dust filter, without any noticeable impact.
DOWNLOADS: fixed DSDT source (from BIOS 210): ux31e_dsdt.dsl and a binary blob to compile into the kernel: ux31e_dsdt.hex. As always, proceed at your own peril :-)
DOWNLOADS - update: for the lost souls with UX21E - fixed DSDT source (from BIOS 209): ux21e_dsdt.dsl and a binary blob to compile into the kernel: ux21e_dsdt.hex.
PS: I definitely recommend building your own kernel. My config is here. Beware: I am using KVM, which ordinary users should disable.
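If you build your own kernel, it is easy to lose track of the handful of options mentioned in this post. As a rough sketch only, the Python below scans the text of a kernel .config file and reports which of those options differ from the values I described (CONFIG_NR_CPUS=16, CONFIG_INTEL_IPS and CONFIG_PM_DEVFREQ disabled). The names read_kconfig, EXPECTED and mismatches are hypothetical, and treating an absent option as "n" is my assumption.

```python
def read_kconfig(text: str) -> dict:
    """Parse kernel .config text into {option: value}; unset options map to 'n'."""
    options = {}
    unset_suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(unset_suffix):
            # "# CONFIG_FOO is not set" means the option is disabled.
            options[line[2:-len(unset_suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


# Values taken from the tuning steps described in this post.
EXPECTED = {
    "CONFIG_NR_CPUS": "16",
    "CONFIG_INTEL_IPS": "n",
    "CONFIG_PM_DEVFREQ": "n",
}


def mismatches(options: dict, expected: dict = EXPECTED) -> dict:
    """Return {option: actual value} for every option that differs from expected.

    Assumption: an option that does not appear in the file counts as 'n'.
    """
    return {
        name: options.get(name, "n")
        for name, want in expected.items()
        if options.get(name, "n") != want
    }
```

For example, mismatches(read_kconfig(open(".config").read())) returns an empty dictionary when all three options match.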