Tentative fix/work-around for i915 GPU hangs

Some of you may have noticed the GPU hangs on Haswell Chromebooks in recent versions of your favourite distro. Well, this afternoon a man called Fabrice G. furnished me (via email) with what appears to be a fix. It’s rather a long addition to the kernel cmdline, and I’m not sure if *absolutely* all of it is needed, but it certainly appears to be doing the job on my HP Chromebook 14, in CentOS 7, with a 3.17 kernel.

Simply add the following to GRUB_CMDLINE_LINUX in /etc/default/grub:

drm.debug=0 drm.vblankoffdelay=1 i915.semaphores=0 i915.modeset=1 i915.use_mmio_flip=1 i915.powersave=1 i915.enable_ips=1 i915.disable_power_well=1 i915.enable_hangcheck=1 i915.enable_cmd_parser=1 i915.fastboot=0 i915.enable_ppgtt=1 i915.reset=0 i915.lvds_use_ssc=0 i915.enable_psr=0

then run:

su -c "grub2-mkconfig > /boot/grub2/grub.cfg"

in CentOS/Fedora, or:

update-grub

in Debian/Ubuntu/etc.
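
After rebooting, it is worth confirming that the parameters actually reached the kernel. The following is only a minimal sketch: the missing_params helper is a hypothetical name, not part of any distro tooling, and the two parameters passed to it are just examples taken from the line above.

```shell
#!/bin/sh
# missing_params is a hypothetical helper, not a standard tool.
# It prints every expected parameter that is absent from a command line.
# Usage: missing_params "<cmdline>" param1 param2 ...
missing_params() {
    cmdline=$1
    shift
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;               # present as a whole word
            *) printf '%s\n' "$param" ;;   # absent: report it
        esac
    done
}

# Check the running kernel, if /proc/cmdline is readable on this system.
if [ -r /proc/cmdline ]; then
    missing_params "$(cat /proc/cmdline)" i915.semaphores=0 i915.enable_psr=0
fi
```

No output means every parameter you listed is on the running kernel's command line.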

Enjoy!

22 Comments

  1. John,

    I’ve recently added this line to my kernel. I tried it on 3.17 linux-chromebook from the Arch Linux AUR, and on 3.16-1 of the same (with a custom tpm/tis patch that Scot Doyle developed to work with your ROMs).

    Both of them still seem to hang. In fact, whereas the hangs were infrequent before, with that line I am getting one about once a day. I can’t find anything relevant in the logs to help, though.

    If you know where to look I would be happy to start debugging it.


    Jesse

  2. I am using syslinux actually:

    but my kernel boot line is (usually):
    root=/dev/mapper/lvmpool-root cryptdevice=/dev/sda2:crypt ro

    so nothing fancy there
    when I used the kernel line you provided it was:
    root=/dev/mapper/lvmpool-root cryptdevice=/dev/sda2:crypt ro drm.debug=0 drm.vblankoffdelay=1 i915.semaphores=0 i915.modeset=1 i915.use_mmio_flip=1 i915.powersave=1 i915.enable_ips=1 i915.disable_power_well=1 i915.enable_hangcheck=1 i915.enable_cmd_parser=1 i915.fastboot=0 i915.enable_ppgtt=1 i915.reset=0 i915.lvds_use_ssc=0 i915.enable_psr=0

  3. It looks like I am. I’ll leave this line enabled and see if I can reproduce it, and then check journalctl, but for now /proc/cmdline contains what it should:

    BOOT_IMAGE=../vmlinuz-linux-chromebook root=/dev/mapper/lvmpool-root cryptdevice=/dev/sda2:crypt ro drm.debug=0 drm.vblankoffdelay=1 i915.semaphores=0 i915.modeset=1 i915.use_mmio_flip=1 i915.powersave=1 i915.enable_ips=1 i915.disable_power_well=1 i915.enable_hangcheck=1 i915.enable_cmd_parser=1 i915.fastboot=0 i915.enable_ppgtt=1 i915.reset=0 i915.lvds_use_ssc=0 i915.enable_psr=0 initrd=../initramfs-linux-chromebook.img

  4. John,

    I was wondering how you set up CentOS 7 on your Chromebook. I recently installed Scientific Linux on mine with updates, and was wondering how I should take off from there?

    Previously I was running Fedora 20 with the 3.15 kernel following advice from http://forums.fedora-fr.org/viewtopic.php?id=61252 with kernel-3.16 blacklisted from yum.conf. By the way mine is the Toshiba Chromebook Leon model.

    Also wondering about how to handle updates and repos. Do you use the kernel-ml from elrepo and then build kernel modules against that? Does your kernel keep ABI compatibility with el7 kernels? I ask because I may need to install zfs in the future and don’t want to risk shooting myself in the foot.

    Anyways, thanks for all the work you put into your writings. I don’t mind playing guinea pig for a while.

    1. Just a straight CentOS 7 install on ext4. The kernel is from kernel.org; I copied the existing CentOS config and enabled chromeos_laptop and i2c_designware_pci to get the mouse working.

      If you want my opinion, ZFS on Linux is even more risky than BTRFS, and BTRFS ate my USB data the weekend I was at the coreboot Hackathon …

  5. How does this GPU hang manifest itself? I’m running Fedora 20, started with the 3.15 kernel, then the 3.16 kernel, and finally running the 3.17 rawhide kernel which no longer requires any patching for the mouse. After applying the suspend fixes on the Arch wiki in relation to systemd and some sound fixes I have the Acer C720 working perfectly. I’ve never experienced a GPU hang and suspend works flawlessly and quickly. From my POV the Acer C720 is 100% fully supported after a few configuration file changes and the updated kernel. Nothing I know of doesn’t work as expected. After doing some SSD optimizations and setting up “tuned” for power management I get between 8 and 12 hours of battery life.

    My kernel command line is:

    BOOT_IMAGE=/vmlinuz-3.17.2-300.fc20.x86_64 root=UUID=ad66023d-0681-4a4b-8d13-a30fba904eee ro vconsole.font=latarcyrheb-sun16 rhgb quiet tpm_tis.interrupts=0 nmi_watchdog=0 elevator=noop LANG=en_US.UTF-8

    1. If you look at dmesg you’ll see messages about i915 render ring hangs, or words to that effect. It manifests itself by the screen hanging, except for the mouse pointer, for a few seconds at a time. Might only happen once a day, could happen a number of times, and be pretty annoying.
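
      A rough way to look for those messages, as a sketch only: the gpu_hang_lines helper is a hypothetical name, and the search pattern is an assumption about typical i915 hang wording, which varies between kernel versions. Reading dmesg may also need root on some systems.

      ```shell
      # gpu_hang_lines is a hypothetical helper: it filters kernel log
      # text on stdin for lines that look like i915 hang reports.
      # The pattern is a guess at common wording, so adjust it to taste.
      gpu_hang_lines() {
          grep -i -E 'gpu hang|stuck on .* ring|hangcheck'
      }

      # Scan the current kernel log; say so if nothing matched.
      dmesg 2>/dev/null | gpu_hang_lines || echo "no matching lines (or dmesg not readable)"
      ```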

  6. This was a stubborn bug. None of the proposed workarounds fixed it for me (Asus Chromebox/Ubuntu 14.04) until I tried the kernel that Hugh Greenberg linked to. I’m on day 2 with no hangs or crashes. Chrome is stable now.

    Thank you very much! And thanks John for all the work you’ve put into this.

  7. Any news on this? Did it make it into 3.18?

    I have a suspicion that some of the flags get reset on suspend. I don’t get any hangs when I first boot with the flags (removing the sillier ones) but later start noticing hangs again.

    I haven’t tried the new driver. Hence why I ask if it’s in 3.18. Or is there a patch against 3.18 available?

    1. The only other thing I have to add is that CentOS 7 has an older graphics stack than some of the latest distros, and perhaps that has something to do with it working for me and not so much for others.

  8. Hey guess what. I recently updated to Linux 4.0.0-trunk and the problem seems to be entirely gone. I haven’t had a freeze in over 24 hours now.

    I think the fix *was* supposed to be in 3.18 but I ran 3.19-trunk for a while and it was better but still had freezes. So there must have been further changes in 4.0.

    This, combined with non-hard-coded DPI assumptions in Chrome (aka “HiDPI support”), has made my Pixel suddenly so much better in the span of a few days.

    1. Progress is good. I’m also using Linux 4.0 with my CB5-571-39VM, and although it’s still early days, I haven’t noticed any issues beyond the weird “flicker” which is fixed by i915.enable_ips=0.

  9. Maybe you folks want to start using /etc/modprobe.d/i915.conf for all your tweaking needs? It is far more convenient than having to edit GRUB’s kernel line…
    HTH

    1. I think you’re overstating the case – “far more convenient” != 1 command less to type, but thanks for the tip. I don’t think this stuff is needed any more anyway.
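
For anyone who does want to try the modprobe.d route, the same settings can be written as a single "options" line with the "i915." prefix dropped. This is a minimal sketch only: cmdline_to_modprobe is a hypothetical helper name, and the two parameters are just examples taken from the cmdline in the post.

```shell
# cmdline_to_modprobe is a hypothetical helper: it turns
# "i915.name=value" arguments into one modprobe.d "options" line
# and ignores anything that is not an i915 parameter.
cmdline_to_modprobe() {
    opts=""
    for arg in "$@"; do
        case $arg in
            i915.*=*) opts="$opts ${arg#i915.}" ;;  # strip the "i915." prefix
        esac
    done
    if [ -n "$opts" ]; then
        printf 'options i915%s\n' "$opts"
    fi
}

cmdline_to_modprobe i915.semaphores=0 i915.enable_psr=0
# prints: options i915 semaphores=0 enable_psr=0
```

The printed line would go into /etc/modprobe.d/i915.conf (as root). If your distro loads i915 from the initramfs, the initramfs probably needs regenerating before the file takes effect.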
