Tuesday, January 1, 2013

2013 Off to a Shaky Start

Here's a bit of irony for you: I had just finished reading a blog post in which the author pooh-poohed the notion that the year 2013 would be unlucky due to the last two digits when my system crashed ... repeatedly. Maybe the triskaidekaphobics are onto something (or maybe the Mayans were just off by a week and change).

When I first installed Linux Mint 14 (Cinnamon), upgrading from Mint 11, my display seemed a bit dimmer than it had been. (My monitor is a Samsung SyncMaster 932GW; the video card is an NVIDIA GeForce 6150SE nForce 430.) My eyes adjusted to the difference, which I suspect was caused by the open-source Nouveau driver that ships with Mint. Until today, that had been my only issue with the Nouveau driver.

After reading the aforementioned blog post about 2013, I moved on to another post on the same blog, one containing a modest amount of graphics and nothing that, to my eye, would require any sort of hardware acceleration. As I scrolled down the post, my system suddenly appeared to reboot. I think it was the X server restarting rather than Linux itself, but I'm no expert on these things. The symptoms were a crash to a black screen, then the Mint log-in screen.

So I logged in again, and as soon as the password was submitted, I got a black screen (which is actually normal between log-in and desktop) and then the log-in screen again. Uh-oh! After logging in (again), the cycle repeated, only this time what I assume was the log-in screen was unreadable (the sort of colored "sleet" you get when there is a mismatch between monitor and display card with regard to scan frequency). I couldn't escape to a text console with Ctrl-Alt-F1, so I powered down the computer the old-fashioned way (the power switch), started it up again, and managed to log in successfully.

Poking around in the syslog file with the system log viewer (Menu > System Tools > System Log), I found the following at what I think was the point of the initial crash:

Jan  1 17:49:20 HomePC kernel: [15178.991600] [drm] nouveau 0000:00:0d.0: Failed to idle channel 1.
Jan  1 17:49:20 HomePC gnome-session[1620]: Gdk-WARNING: gnome-session: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.#012
Jan  1 17:49:20 HomePC mdm[1193]: WARNING: mdm_slave_xioerror_handler: Fatal X error - Restarting :0

Note the reference to nouveau having a problem, followed by a fatal X error. The next chunk of the log shows the same pattern, as my two log-in attempts led to more crashes:

Jan  1 17:49:21 HomePC acpid: 1 client rule loaded
Jan  1 17:49:35 HomePC kernel: [15193.814381] [TTM] Failed to expire sync object before buffer eviction
Jan  1 17:49:35 HomePC kernel: [15193.814443] [TTM] Failed to expire sync object before buffer eviction
Jan  1 17:49:35 HomePC kernel: [15193.817930] [TTM] Failed to expire sync object before buffer eviction
Jan  1 17:49:51 HomePC kernel: [15209.778576] [drm] nouveau 0000:00:0d.0: Failed to idle channel 1.
Jan  1 17:49:51 HomePC kernel: [15209.782509] [drm] nouveau 0000:00:0d.0: Setting dpms mode 3 on vga encoder (output 0)
Jan  1 17:49:51 HomePC gnome-session[5894]: Gdk-WARNING: gnome-session: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.#012
Jan  1 17:49:51 HomePC mdm[5858]: WARNING: mdm_slave_xioerror_handler: Fatal X error - Restarting :0
Jan  1 17:49:51 HomePC pulseaudio[6317]: [pulseaudio] client-conf-x11.c: xcb_connection_has_error() returned true
Jan  1 17:49:51 HomePC pulseaudio[6322]: [pulseaudio] client-conf-x11.c: xcb_connection_has_error() returned true
Jan  1 17:49:51 HomePC pulseaudio[6325]: [pulseaudio] pid.c: Stale PID file, overwriting.
Jan  1 17:49:51 HomePC pulseaudio[6325]: [pulseaudio] bluetooth-util.c: org.bluez.Manager.ListAdapters() failed: org.freedesktop.DBus.Error.AccessDenied: Rejected send message, 2 matched rules; type="method_call", sender=":1.111" (uid=1000 pid=6325 comm="/usr/bin/pulseaudio --start --log-target=syslog ") interface="org.bluez.Manager" member="ListAdapters" error name="(unset)" requested_reply="0" destination="org.bluez" (uid=0 pid=853 comm="/usr/sbin/bluetoothd ")
Jan  1 17:49:51 HomePC pulseaudio[6325]: [pulseaudio] server-lookup.c: Unable to contact D-Bus: org.freedesktop.DBus.Error.NoServer: Failed to connect to socket /tmp/dbus-z2bkq6PwPB: Connection refused
Jan  1 17:49:51 HomePC pulseaudio[6325]: [pulseaudio] main.c: Unable to contact D-Bus: org.freedesktop.DBus.Error.NoServer: Failed to connect to socket /tmp/dbus-z2bkq6PwPB: Connection refused
Jan  1 17:49:51 HomePC pulseaudio[6330]: [pulseaudio] pid.c: Daemon already running.
Jan  1 17:49:54 HomePC acpid: client 5871[0:0] has disconnected
Jan  1 17:49:54 HomePC acpid: client connected from 6332[0:0]
Jan  1 17:49:54 HomePC acpid: 1 client rule loaded
Jan  1 17:49:54 HomePC kernel: [15213.146184] [drm] nouveau 0000:00:0d.0: Failed to idle channel 4.
Jan  1 17:49:54 HomePC kernel: [15213.173003] [drm] nouveau 0000:00:0d.0: Setting dpms mode 0 on vga encoder (output 0)
Jan  1 17:49:54 HomePC kernel: [15213.173011] [drm] nouveau 0000:00:0d.0: Output VGA-1 is running on CRTC 0 using output A
Jan  1 17:50:18 HomePC kernel: [15237.313716] [drm] nouveau 0000:00:0d.0: Failed to idle channel 1.
Jan  1 17:50:18 HomePC mdm[6318]: WARNING: mdm_slave_xioerror_handler: Fatal X error - Restarting :0
Jan  1 17:50:21 HomePC kernel: [15240.366064] [drm] nouveau 0000:00:0d.0: Failed to idle channel 4.
Jan  1 17:50:21 HomePC acpid: client 6332[0:0] has disconnected
Jan  1 17:50:21 HomePC acpid: client connected from 6745[0:0]
Jan  1 17:50:21 HomePC acpid: 1 client rule loaded
Jan  1 17:50:28 HomePC kernel: [15247.081273] [drm] nouveau 0000:00:0d.0: Failed to idle channel 3.
Jan  1 17:50:41 HomePC kernel: Kernel logging (proc) stopped.
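For anyone who would rather not eyeball the log, here is a rough Python sketch that counts the three message types that stand out in the excerpts above. The sample lines are copied from my log; the pattern names and the summarize function are my own inventions for illustration, and on a real system you would feed it the lines of your syslog file (often /var/log/syslog, though that location is an assumption on my part).

```python
import re
from collections import Counter

# A few lines copied from the log excerpts in this post. To scan a real log,
# read the lines of your syslog file instead (path varies by distribution).
SAMPLE = """\
Jan  1 17:49:20 HomePC kernel: [15178.991600] [drm] nouveau 0000:00:0d.0: Failed to idle channel 1.
Jan  1 17:49:20 HomePC mdm[1193]: WARNING: mdm_slave_xioerror_handler: Fatal X error - Restarting :0
Jan  1 17:49:35 HomePC kernel: [15193.814381] [TTM] Failed to expire sync object before buffer eviction
Jan  1 17:49:51 HomePC kernel: [15209.778576] [drm] nouveau 0000:00:0d.0: Failed to idle channel 1.
Jan  1 17:49:51 HomePC mdm[5858]: WARNING: mdm_slave_xioerror_handler: Fatal X error - Restarting :0
Jan  1 17:49:54 HomePC kernel: [15213.146184] [drm] nouveau 0000:00:0d.0: Failed to idle channel 4.
"""

# Hypothetical labels mapped to the message fragments seen in the log.
PATTERNS = {
    "nouveau_idle": re.compile(r"nouveau .*Failed to idle channel \d+"),
    "ttm_evict": re.compile(r"\[TTM\] Failed to expire sync object"),
    "x_restart": re.compile(r"Fatal X error - Restarting"),
}


def summarize(lines):
    """Count how many lines match each pattern in PATTERNS."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    for name, count in sorted(summarize(SAMPLE.splitlines()).items()):
        print(name, count)
```

A tally like this won't tell you why the driver choked, but it does make it easy to see whether the nouveau complaints and the X restarts come in pairs.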

So I decided to switch to the proprietary NVIDIA driver. The process is easy, although the download is a bit time-consuming: Menu > Preferences > Software Sources, switch to the last tab (Additional Drivers), and select the NVIDIA binary driver (the proprietary and tested version, in my case). The system did not demand a reboot after installation was complete, but inxi -G produced output that did not specify the new driver, so I rebooted just to be safe. After the reboot, inxi -G reports "FAILED: nvidia", which is apparently a bug (it reported "FAILED: nouveau" before the driver change), but it also shows "GLX Version: 2.1.2 NVIDIA 304.43", indicating I've successfully switched to the NVIDIA driver.
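Since the inxi output was ambiguous, another way to cross-check which driver is in use is to look at the kernel's list of loaded modules in /proc/modules. The sketch below is only that, a sketch: the function name is made up, and I am assuming the proprietary module's name begins with "nvidia" (some packagings may add a suffix), which is why it matches on the prefix.

```python
import os


def loaded_gpu_modules(proc_modules_text):
    """Return sorted names of loaded modules that look like nouveau or NVIDIA.

    Each line of /proc/modules starts with the module name, so only the
    first whitespace-separated field of each line is examined.
    """
    names = [
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    ]
    # Assumption: the proprietary module's name starts with "nvidia".
    return sorted(n for n in names if n == "nouveau" or n.startswith("nvidia"))


if __name__ == "__main__":
    path = "/proc/modules"  # present on Linux systems
    if os.path.exists(path):
        with open(path) as handle:
            found = loaded_gpu_modules(handle.read())
        print(found if found else "no nouveau/nvidia module loaded")
    else:
        print(path, "not found; is this a Linux system?")
```

If the list shows an nvidia module and no nouveau, the switch presumably took, whatever inxi claims.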

We'll see if my decision to switch drivers proves rash. The Nouveau driver worked fine for months, and it's possible some other thing will blow up the NVIDIA driver.
