Ubuntu: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [nvidia-smi:566]



Question:

I installed Xubuntu 16.04 (Xfce) and everything was fine: the boot time was about 10 seconds, but I could not play games. Then I installed the NVIDIA drivers (I have a GTX 950 and a 2560x1080 monitor), after which it would only boot through recovery mode (I just entered it and chose resume). So I made this modification to GRUB:

/etc/default/grub
GRUB_GFXPAYLOAD_LINUX="keep"
GRUB_GFXMODE="1920x1080x32"

After that it boots, but it takes a long time. The screen freezes at these lines:

     Starting File System Check on /dev/disk/by-uuid/d6e1f5fa-e0e8-455f-8888-268ce02d3a9d...
     Starting File System Check on /dev/disk/by-uuid/322B-518B...
[OK] Started File System Check Daemon to report status.
_

After about 25 seconds I hear a beep, then the boot resumes and the login screen shows up. I have no problems after that.

Below is the journalctl output:

-- Logs begin at Qua 2017-06-28 08:13:58 AMT, end at Qua 2017-06-28 17:17:01 AMT. --
Jun 28 08:13:58 wilhovisk systemd-journald[261]: Runtime journal (/run/log/journal/) is 8.0M, max 78.4M, 70.4M free.
Jun 28 08:13:58 wilhovisk kernel: Linux version 4.8.0-56-generic (buildd@lcy01-33) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #61~16.04.1-Ubuntu SMP Wed Jun 14 11:58:22 UTC 2017 (Ubuntu 4.8.0-56.61~16.04.1-generic 4.8.17)
Jun 28 08:13:58 wilhovisk kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-4.8.0-56-generic.efi.signed root=UUID=aadba17f-3ae3-489a-9278-7314f985f55c ro
Jun 28 08:13:58 wilhovisk kernel: KERNEL supported cpus:
Jun 28 08:13:58 wilhovisk kernel:   Intel GenuineIntel
Jun 28 08:13:58 wilhovisk kernel:   AMD AuthenticAMD
Jun 28 08:13:58 wilhovisk kernel:   Centaur CentaurHauls
Jun 28 08:13:58 wilhovisk kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Jun 28 08:13:58 wilhovisk kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Jun 28 08:13:58 wilhovisk kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Jun 28 08:13:58 wilhovisk kernel: x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
Jun 28 08:13:58 wilhovisk kernel: x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
Jun 28 08:13:58 wilhovisk kernel: x86/fpu: Using 'eager' FPU context switches.
Jun 28 08:13:58 wilhovisk kernel: e820: BIOS-provided physical RAM map:
[...]
Jun 28 08:13:58 wilhovisk kernel: NX (Execute Disable) protection: active
Jun 28 08:13:58 wilhovisk kernel: efi: EFI v2.31 by American Megatrends
Jun 28 08:13:58 wilhovisk kernel: efi:  ACPI 2.0=0xc8f9c000  ACPI=0xc8f9c000  SMBIOS=0xf04c0  MPS=0xfd450
Jun 28 08:13:58 wilhovisk kernel: SMBIOS 2.7 present.
Jun 28 08:13:58 wilhovisk kernel: DMI: Gigabyte Technology Co., Ltd. H97M-D3H/H97M-D3H, BIOS F7 08/03/2015
[...]
Jun 28 08:13:58 wilhovisk kernel: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.8.0-56-generic.efi.signed root=UUID=aadba17f-3ae3-489a-9278-7314f985f55c ro
Jun 28 08:13:58 wilhovisk kernel: PID hash table entries: 4096 (order: 3, 32768 bytes)
Jun 28 08:13:58 wilhovisk kernel: Calgary: detecting Calgary via BIOS EBDA area
Jun 28 08:13:58 wilhovisk kernel: Calgary: Unable to locate Rio Grande table in EBDA - bailing!
Jun 28 08:13:58 wilhovisk kernel: Memory: 7778412K/8250420K available (8829K kernel code, 1441K rwdata, 3836K rodata, 1552K init, 1296K bss, 472008K reserved, 0K cma-reserved)
Jun 28 08:13:58 wilhovisk kernel: SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
Jun 28 08:13:58 wilhovisk kernel: Hierarchical RCU implementation.
Jun 28 08:13:58 wilhovisk kernel:         Build-time adjustment of leaf fanout to 64.
Jun 28 08:13:58 wilhovisk kernel:         RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=4.
Jun 28 08:13:58 wilhovisk kernel: RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=4
Jun 28 08:13:58 wilhovisk kernel: NR_IRQS:33024 nr_irqs:456 16
Jun 28 08:13:58 wilhovisk kernel: Console: colour dummy device 80x25
Jun 28 08:13:58 wilhovisk kernel: console [tty0] enabled
Jun 28 08:13:58 wilhovisk kernel: clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484882848 ns
Jun 28 08:13:58 wilhovisk kernel: hpet clockevent registered
Jun 28 08:13:58 wilhovisk kernel: tsc: Fast TSC calibration using PIT
Jun 28 08:13:58 wilhovisk kernel: tsc: Detected 3092.965 MHz processor
Jun 28 08:13:58 wilhovisk kernel: Calibrating delay loop (skipped), value calculated using timer frequency.. 6185.93 BogoMIPS (lpj=12371860)
Jun 28 08:13:58 wilhovisk kernel: pid_max: default: 32768 minimum: 301
Jun 28 08:13:58 wilhovisk kernel: ACPI: Core revision 20160422
Jun 28 08:13:58 wilhovisk kernel: ACPI: 6 ACPI AML tables successfully acquired and loaded
Jun 28 08:13:58 wilhovisk kernel:
Jun 28 08:13:58 wilhovisk kernel: Security Framework initialized
Jun 28 08:13:58 wilhovisk kernel: Yama: becoming mindful.
Jun 28 08:13:58 wilhovisk kernel: AppArmor: AppArmor initialized
Jun 28 08:13:58 wilhovisk kernel: Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Jun 28 08:13:58 wilhovisk kernel: Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
Jun 28 08:13:58 wilhovisk kernel: Mount-cache hash table entries: 16384 (order: 5, 131072 bytes)
Jun 28 08:13:58 wilhovisk kernel: Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes)
Jun 28 08:13:58 wilhovisk kernel: CPU: Physical Processor ID: 0
Jun 28 08:13:58 wilhovisk kernel: CPU: Processor Core ID: 0
Jun 28 08:13:58 wilhovisk kernel: mce: CPU supports 9 MCE banks
Jun 28 08:13:58 wilhovisk kernel: CPU0: Thermal monitoring enabled (TM1)
Jun 28 08:13:58 wilhovisk kernel: process: using mwait in idle threads
Jun 28 08:13:58 wilhovisk kernel: Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 1024
Jun 28 08:13:58 wilhovisk kernel: Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 1024, 1GB 4
Jun 28 08:13:58 wilhovisk kernel: Freeing SMP alternatives memory: 32K (ffffffff8aeee000 - ffffffff8aef6000)
Jun 28 08:13:58 wilhovisk kernel: ftrace: allocating 33462 entries in 131 pages
Jun 28 08:13:58 wilhovisk kernel: smpboot: APIC(0) Converting physical 0 to logical package 0
Jun 28 08:13:58 wilhovisk kernel: smpboot: Max logical packages: 1
[...]
Jun 28 08:13:58 wilhovisk kernel: pci 0000:01:00.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
Jun 28 08:13:58 wilhovisk kernel: PCI: CLS 64 bytes, default 64
Jun 28 08:13:58 wilhovisk kernel: Unpacking initramfs...
Jun 28 08:13:58 wilhovisk kernel: Freeing initrd memory: 46720K (ffff9399b24b0000 - ffff9399b5250000)
Jun 28 08:13:58 wilhovisk kernel: PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Jun 28 08:13:58 wilhovisk kernel: software IO TLB [mem 0xb7cde000-0xbbcde000] (64MB) mapped at [ffff939a37cde000-ffff939a3bcddfff]
Jun 28 08:13:58 wilhovisk kernel: Scanning for low memory corruption every 60 seconds
Jun 28 08:13:58 wilhovisk kernel: futex hash table entries: 1024 (order: 4, 65536 bytes)
Jun 28 08:13:58 wilhovisk kernel: audit: initializing netlink subsys (disabled)
Jun 28 08:13:58 wilhovisk kernel: audit: type=2000 audit(1498652036.756:1): initialized
Jun 28 08:13:58 wilhovisk kernel: Initialise system trusted keyrings
Jun 28 08:13:58 wilhovisk kernel: workingset: timestamp_bits=40 max_order=21 bucket_order=0
Jun 28 08:13:58 wilhovisk kernel: zbud: loaded
Jun 28 08:13:58 wilhovisk kernel: squashfs: version 4.0 (2009/01/31) Phillip Lougher
Jun 28 08:13:58 wilhovisk kernel: fuse init (API version 7.25)
Jun 28 08:13:58 wilhovisk kernel: Allocating IMA blacklist keyring.
Jun 28 08:13:58 wilhovisk kernel: Key type asymmetric registered
Jun 28 08:13:58 wilhovisk kernel: Asymmetric key parser 'x509' registered
Jun 28 08:13:58 wilhovisk kernel: Block layer SCSI generic (bsg) driver version 0.4 loaded (major 248)
Jun 28 08:13:58 wilhovisk kernel: io scheduler noop registered
Jun 28 08:13:58 wilhovisk kernel: io scheduler deadline registered (default)
Jun 28 08:13:58 wilhovisk kernel: io scheduler cfq registered
Jun 28 08:13:58 wilhovisk kernel: pcieport 0000:00:1c.2: enabling device (0000 -> 0003)
Jun 28 08:13:58 wilhovisk kernel: pci_hotplug: PCI Hot Plug PCI Core version: 0.5
Jun 28 08:13:58 wilhovisk kernel: pciehp: PCI Express Hot Plug Controller Driver version: 0.4
Jun 28 08:13:58 wilhovisk kernel: efifb: probing for efifb
Jun 28 08:13:58 wilhovisk kernel: efifb: framebuffer at 0xf1000000, using 8128k, total 8128k
Jun 28 08:13:58 wilhovisk kernel: efifb: mode is 1920x1080x32, linelength=7680, pages=1
Jun 28 08:13:58 wilhovisk kernel: efifb: scrolling: redraw
Jun 28 08:13:58 wilhovisk kernel: efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
Jun 28 08:13:58 wilhovisk kernel: Console: switching to colour frame buffer device 240x67
Jun 28 08:13:58 wilhovisk kernel: fb0: EFI VGA frame buffer device
Jun 28 08:13:58 wilhovisk kernel: intel_idle: MWAIT substates: 0x42120
Jun 28 08:13:58 wilhovisk kernel: intel_idle: v0.4.1 model 0x3C
Jun 28 08:13:58 wilhovisk kernel: intel_idle: lapic_timer_reliable_states 0xffffffff
[...]
Jun 28 08:13:58 wilhovisk kernel: microcode: sig=0x306c3, pf=0x2, revision=0x1c
Jun 28 08:13:58 wilhovisk kernel: microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
[...]
Jun 28 08:13:58 wilhovisk kernel: nvidia: loading out-of-tree module taints kernel.
Jun 28 08:13:58 wilhovisk kernel: nvidia: module license 'NVIDIA' taints kernel.
[...]
Jun 28 08:13:58 wilhovisk kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
[...]
Jun 28 08:13:58 wilhovisk kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 246
Jun 28 08:13:58 wilhovisk kernel: vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
Jun 28 08:13:58 wilhovisk kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  381.22  Thu May  4 00:55:03 PDT 2017 (using threaded interrupts)
Jun 28 08:13:58 wilhovisk kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  381.22  Thu May  4 00:21:48 PDT 2017
Jun 28 08:13:58 wilhovisk kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[...]
Jun 28 08:13:59 wilhovisk systemd[1]: Found device SanDisk_SDSSDA120G silver-cloud.
Jun 28 08:13:59 wilhovisk ureadahead[264]: ureadahead:/home/wilhovisk/.cache/xfce4-notifyd-theme.rc: No such file or directory
Jun 28 08:13:59 wilhovisk kernel: input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card2/input22
Jun 28 08:13:59 wilhovisk kernel: input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card2/input23
Jun 28 08:13:59 wilhovisk kernel: input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card2/input24
Jun 28 08:13:59 wilhovisk kernel: input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card2/input25
Jun 28 08:13:59 wilhovisk kernel: clocksource: Switched to clocksource tsc
Jun 28 08:13:59 wilhovisk systemd[1]: Found device SanDisk_SDSSDA120G 1.
Jun 28 08:13:59 wilhovisk systemd[1]: Starting File System Check on /dev/disk/by-uuid/322B-518B...
Jun 28 08:13:59 wilhovisk systemd[1]: Starting File System Check on /dev/disk/by-uuid/d6e1f5fa-e0e8-455f-8888-268ce02d3a9d...
Jun 28 08:13:59 wilhovisk systemd[1]: Started File System Check Daemon to report status.
Jun 28 08:13:59 wilhovisk systemd-fsck[596]: silver-cloud: clean, 5480/603840 files, 1037656/2413056 blocks
Jun 28 08:13:59 wilhovisk systemd[1]: Started File System Check on /dev/disk/by-uuid/d6e1f5fa-e0e8-455f-8888-268ce02d3a9d.
Jun 28 08:13:59 wilhovisk systemd[1]: Mounting /media/wilhovisk/SSD-silver-cloud...
Jun 28 08:13:59 wilhovisk kernel: NVRM: Your system is not currently configured to drive a VGA console
Jun 28 08:13:59 wilhovisk kernel: NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
Jun 28 08:13:59 wilhovisk kernel: NVRM: requires the use of a text-mode VGA console. Use of other console
Jun 28 08:13:59 wilhovisk kernel: NVRM: drivers including, but not limited to, vesafb, may result in
Jun 28 08:13:59 wilhovisk kernel: NVRM: corruption and stability problems, and is not supported.
Jun 28 08:13:59 wilhovisk systemd-fsck[595]: fsck.fat 3.0.28 (2015-05-16)
Jun 28 08:13:59 wilhovisk systemd-fsck[595]: /dev/sda1: 8 files, 6890/286310 clusters
Jun 28 08:13:59 wilhovisk systemd[1]: Started File System Check on /dev/disk/by-uuid/322B-518B.
Jun 28 08:13:59 wilhovisk systemd[1]: Mounting /boot/efi...
Jun 28 08:14:25 wilhovisk kernel: EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: errors=remount-ro
Jun 28 08:14:25 wilhovisk kernel: usb 3-9: reset high-speed USB device number 3 using xhci_hcd
Jun 28 08:14:25 wilhovisk kernel: ieee80211 phy0: rt2x00_set_rt: Info - RT chipset 5392, rev 0223 detected
Jun 28 08:14:25 wilhovisk kernel: ieee80211 phy0: rt2x00_set_rf: Info - RF chipset 5372 detected
Jun 28 08:14:25 wilhovisk kernel: ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
Jun 28 08:14:25 wilhovisk kernel: usbcore: registered new interface driver rt2800usb
Jun 28 08:14:25 wilhovisk kernel: rt2800usb 3-9:1.0 wlx001a3fd2ebb4: renamed from wlan0
Jun 28 08:14:25 wilhovisk kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [nvidia-smi:555]
Jun 28 08:14:25 wilhovisk kernel: Modules linked in: arc4 rt2800usb rt2x00usb rt2800lib rt2x00lib mac80211 cfg80211 nvidia_uvm(POE) joydev input_leds intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw glue_helper ablk_helper cryptd intel_cstate snd_hda_codec_hdmi intel_rapl_perf snd_hda_codec_realtek snd_hda_codec_generic serio_raw snd_soc_rt5640 lpc_ich snd_hda_intel mei_me snd_hda_codec mei snd_soc_ssm4567 snd_soc_rl6231 snd_soc_core snd_hda_core shpchp snd_hwdep snd_compress ac97_bus snd_pcm_dmaengine snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd snd_soc_sst_acpi snd_soc_sst_match dw_dmac soundcore elan_i2c dw_dmac_core mac_hid i2c_designware_platform tpm_infineon spi_pxa2xx_platform
Jun 28 08:14:25 wilhovisk kernel:  i2c_designware_core 8250_dw acpi_pad parport_pc ppdev lp parport autofs4 dm_mirror dm_region_hash dm_log hid_generic usbhid i915 nvidia_drm(POE) nvidia_modeset(POE) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt nvidia(POE) fb_sys_fops psmouse ahci libahci drm video fjes i2c_hid sdhci_acpi hid sdhci
Jun 28 08:14:25 wilhovisk kernel: CPU: 0 PID: 555 Comm: nvidia-smi Tainted: P           OE   4.8.0-56-generic #61~16.04.1-Ubuntu
Jun 28 08:14:25 wilhovisk kernel: Hardware name: Gigabyte Technology Co., Ltd. H97M-D3H/H97M-D3H, BIOS F7 08/03/2015
Jun 28 08:14:25 wilhovisk kernel: task: ffff939b9b47d880 task.stack: ffff939ba1770000
Jun 28 08:14:25 wilhovisk kernel: RIP: 0010:[<ffffffffc043b9fc>]  [<ffffffffc043b9fc>] os_io_read_dword+0xc/0x10 [nvidia]
[...]

This is the output of systemd-analyze blame:

25.844s systemd-fsck@dev-disk-by\x2duuid-322B\x2d518B.service
 1.444s dev-sda2.device
  519ms NetworkManager-wait-online.service
  204ms systemd-fsck@dev-disk-by\x2duuid-d6e1f5fa\x2de0e8\x2d455f\x2d8888\x2d268ce02d3a9d.service
  165ms systemd-modules-load.service
  129ms accounts-daemon.service
  120ms keyboard-setup.service
  113ms networking.service
  110ms ModemManager.service
  104ms NetworkManager.service
   95ms grub-common.service
   94ms systemd-tmpfiles-setup-dev.service
   83ms systemd-journald.service
   78ms upower.service
   78ms gpu-manager.service
   74ms systemd-logind.service
   73ms apparmor.service
   59ms lightdm.service
   58ms ondemand.service
   52ms systemd-udev-trigger.service
   51ms thermald.service
   50ms speech-dispatcher.service
   50ms console-setup.service
   49ms irqbalance.service
   48ms apport.service
   46ms systemd-backlight@backlight:acpi_video0.service
   45ms resolvconf.service
   39ms lm-sensors.service
   35ms snapd.socket
   32ms ufw.service
   28ms sys-kernel-debug.mount
   27ms systemd-timesyncd.service
   25ms dev-hugepages.mount
   25ms dev-mqueue.mount
   23ms systemd-journal-flush.service
   22ms polkitd.service
   21ms systemd-udevd.service
   21ms systemd-rfkill.service
   19ms avahi-daemon.service
   19ms kmod-static-nodes.service
   16ms rsyslog.service
   16ms udisks2.service
   14ms systemd-tmpfiles-setup.service
   12ms user@108.service
   12ms alsa-restore.service
   12ms binfmt-support.service
   12ms plymouth-read-write.service
   12ms systemd-update-utmp.service
   11ms snapd.autoimport.service
   11ms boot-efi.mount
   10ms wpa_supplicant.service
   10ms user@1000.service
    9ms systemd-user-sessions.service
    8ms pppd-dns.service
    8ms systemd-sysctl.service
    8ms systemd-remount-fs.service
    7ms systemd-hostnamed.service
    5ms media-wilhovisk-SSD\x2dsilver\x2dcloud.mount
    5ms hddtemp.service
    4ms proc-sys-fs-binfmt_misc.mount
    3ms systemd-random-seed.service
    3ms sys-fs-fuse-connections.mount
    2ms setvtrgb.service
    2ms rtkit-daemon.service
    2ms plymouth-quit-wait.service
    2ms systemd-update-utmp-runlevel.service
    1ms nvidia-persistenced.service
    1ms rc-local.service

This is my fstab. I copied the entry for the root (/) and replaced "/" with "/media/wilhovisk/SSD-silver-cloud". Maybe there is too much whitespace after the mount point. Can I delete the spaces and press the Tab key once instead?

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda2 during installation
UUID=aadba17f-3ae3-489a-9278-7314f985f55c /               ext4    errors=remount-ro 0       1
# /boot/efi was on /dev/sda1 during installation
UUID=322B-518B  /boot/efi       vfat    umask=0077      0       2
# / was on /dev/sda2 during installation
UUID=d6e1f5fa-e0e8-455f-8888-268ce02d3a9d /media/wilhovisk/SSD-silver-cloud               ext4    errors=remount-ro 0       2
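
On the spacing question: fstab(5) says the fields on each line are separated by tabs or spaces, so the amount of whitespace after the mount point should not matter. A small sketch to illustrate this, using awk (which also splits on runs of spaces and tabs) on two copies of the same entry. The variable names and the `fields` helper are made up for this illustration:

```shell
#!/bin/sh
# Two versions of the same fstab entry: one padded with many spaces,
# one with a single tab between fields. Both are illustrative copies.
padded='UUID=d6e1f5fa-e0e8-455f-8888-268ce02d3a9d /media/wilhovisk/SSD-silver-cloud               ext4    errors=remount-ro 0       2'
tabbed=$(printf 'UUID=d6e1f5fa-e0e8-455f-8888-268ce02d3a9d\t/media/wilhovisk/SSD-silver-cloud\text4\terrors=remount-ro\t0\t2')

# Hypothetical helper: print the six fields joined by "|" for comparison.
fields() {
    printf '%s\n' "$1" | awk '{ print $1 "|" $2 "|" $3 "|" $4 "|" $5 "|" $6 }'
}

a=$(fields "$padded")
b=$(fields "$tabbed")

# Both spellings split into the same six fields.
[ "$a" = "$b" ] && echo "same fields"
```

So replacing the run of spaces with a single tab is a cosmetic change only, and it is unlikely to be related to the slow boot.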

How can this be solved? I can post other reports here; just tell me what would be useful to resolve this bug.

Thank you.


Solution:1

Here is what solved it for me: I saw this post on Linux Mint Forums and followed the advice of Laurent85:

At startup edit the Grub menu entry and append kernel parameter nomodeset or modprobe.blacklist=nouveau to existing parameters "quiet splash".

So I opened the file /etc/default/grub and changed the GRUB_CMDLINE_LINUX_DEFAULT line to this:

GRUB_CMDLINE_LINUX_DEFAULT="nomodeset"

Afterwards I ran sudo update-grub. That's it: the boot time is now 8 seconds, there is no beep, and there are no more "CPU stuck" messages in journalctl. I am so happy! :)
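
For anyone who prefers to script the edit, here is a minimal sketch of the same change, assuming a stock Ubuntu /etc/default/grub with a single GRUB_CMDLINE_LINUX_DEFAULT line. The helper name `set_grub_cmdline` is made up for this example, and the demo runs on a scratch file rather than the live config so it can be tried safely first:

```shell
#!/bin/sh
# set_grub_cmdline FILE VALUE
# Hypothetical helper: replace the GRUB_CMDLINE_LINUX_DEFAULT line in FILE
# with VALUE, leaving a FILE.bak copy of the original next to it.
# Assumes VALUE contains no "|" or "&" characters (they are special to sed).
set_grub_cmdline() {
    file=$1
    value=$2
    sed -i.bak "s|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT=\"$value\"|" "$file"
}

# Demo on a scratch copy that mimics the default Ubuntu setting.
tmp=$(mktemp)
printf '%s\n' 'GRUB_DEFAULT=0' 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"' > "$tmp"
set_grub_cmdline "$tmp" "nomodeset"
cat "$tmp"
```

On the real system the same edit would be applied to /etc/default/grub as root, followed by sudo update-grub as described above. After the next reboot, cat /proc/cmdline should list nomodeset among the kernel parameters.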

