Monday, January 3, 2011

Dropped laptop and file system errors

So, I dropped my laptop yesterday. It was about a 3-foot drop. As much as I hate my magnetically attached power cable popping out every ten minutes, I really wish USB cables had something like it right now. I'll give Apple a tip of the hat, though. The metal case dents very well, but the machine was still running when I opened it up. Besides a nice new dent on the corner, it landed on the plugged-in USB cable, which probably broke the fall but also provided leverage to tweak the edge of the case. The USB port no longer seems to function, though it still holds a voltage. Considering it was rotated about 10-15 degrees off the plane of the mainboard, it is very possible that it was pulled out of the circuit. Time will tell if the FireWire and Ethernet connections are still intact. Below are some pictures of the damage.

USB, FireWire, Ethernet, DVI, in that order. Yes, I'm watching 24 on Netflix right now.

With thumb drive inserted as an illustration of the damage to the USB port.

The picture doesn't quite convey how ridiculously bent-up this looks.

Anyway, after the drop and subsequent freak-out, I closed the laptop and left it alone until yesterday evening. When I started using it again, I slowly realized that the root file system was mounted read-only. This happens when Ubuntu detects an error in the file system. Now, I cannot say that this has anything to do with the drop, especially considering my root file system has mysteriously been remounted read-only in the past. However, in the past I would typically restart the system, it would run fsck on the unclean partitions, and it would repair any errors. I tried this and, boom, stopped up in GRUB.
From above: BUG: unable to handle kernel paging request at 01bc0000. After this, the boot can't mount a bunch of stuff and craps out to the (initramfs) prompt.
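For anyone who wants to catch the read-only remount sooner than I did, here is a minimal sketch in Python. It assumes a Linux system that exposes mounts in /proc/mounts with the usual "device mountpoint fstype options ..." layout. The function names are my own, made up for illustration, and are not part of any Ubuntu tool.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains 'ro'.

    mounts_text is the content of /proc/mounts (assumed layout:
    device, mount point, fs type, comma-separated options, ...).
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip empty or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        # Match the exact option 'ro', not e.g. 'errors=remount-ro'.
        if "ro" in options:
            result.append(mount_point)
    return result


def root_is_read_only(path="/proc/mounts"):
    """Hypothetical helper: True if '/' is currently mounted read-only."""
    with open(path) as f:
        return "/" in read_only_mounts(f.read())
```

Running `root_is_read_only()` from a cron job or a shell prompt hook would be one way to notice the problem before half the applications on the desktop start failing to save.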

Around 18 months ago I set up a pretty good backup system so that I never have to worry about it. It performs autonomous, daily, incremental backups of my important stuff (see rdiff-backup). This means I don't have to worry about a long recovery or data loss if the disk is really pooched. What a relief. Let's not get ahead of ourselves, though; let's see if it is recoverable.

The OS X partition still boots, but you can't repair (or even read) ext4 from OS X. Okay, no biggie, I'll just boot off a live CD and repair the partition there. Uh-oh, I'm on vacation and didn't bring any live CDs. Okay, download Maverick and burn a new one. Let's boot it up. Shoot, I forgot: the CD drive in my MBPro has always been finicky about reading CDs that it didn't write (to the point that I bet it is worthless for most CD-Rs); it sees the disc as a blank CD. Try again, writing from the OS X partition of the MBPro. Uh-oh, that failed; it couldn't write the CD. Use the wife's laptop to write one; the drive sees that one as blank too. Now I have one coaster and two Maverick CDs but no way to boot from them.

Then I realized that I have a thumb drive, and booting from it is just about my only chance to recover the computer. I installed the Maverick image onto the thumb drive (which I had to do from Windows, as the Mac directions didn't work). Then I booted Maverick. A note here: I had to alter some kernel options (remove prevalence or whatever from the options) to make sure that it didn't try to mess with the screwy partitions, as that hung the boot. Then I tried to run fsck on the partition, but it said the device was busy. That's weird, as it wasn't even mounted. After a little Googling, I found that some people see similar behavior when using RAID (not sure why this would affect me). I found someone else who hit it in a situation like mine and was able to work around it using a different distribution, SLAX. So back to the Windows computer: download SLAX, write it to the thumb drive. Boot up, fsck, and it's fixed. Phew.

Another day in the life of a GNU/Linux user. Moral of the story: always carry bootable media if you hold your computer together with shoelaces and bubblegum like I do.

Update (01/14): The file system errors continued, sometimes so frequently that I had to repair it every day. I figured this must be detrimental to my data even though I hadn't noticed any problems yet. I decided instead to back up my entire home space, blow away everything, and reinstall (and upgrade while I'm at it; now at 10.10). Things have been working since... fingers crossed. I guess the worst-case scenario is that I am forced to purchase a new computer, maybe one that can last longer than an hour and a half unplugged.

Update (01/31): Still having issues. Twice my drive was remounted RO while I was using it (and not doing anything really odd). The first time I brought it up in SLAX and fixed the file system errors. I wonder if the fact that fsck does anything at all means, definitively, that I have lost some data somewhere. The backups showed no difference, but I don't back up every file. I also did a non-destructive, read-write surface scan on the entire 72 GiB partition (for me that took about 4.5 hours). It showed no bad blocks. A few days later (yesterday) I found my computer had remounted the root file system RO again. This time I backed up my entire home directory and deleted the partition, then repartitioned and reinstalled Ubuntu with three differences. First, I made sure to have parted align the partitions to the cylinder instead of the MiB (align to MiB is the default). It shouldn't matter, but it also got rid of the annoying tiny blocks of free space between my partitions. Second, I included a small 100 MiB partition that I can mount and save stuff to if the root file system is remounted RO. Lastly, I used ext3 for everything this time instead of the default ext4 file system. Not sure if it will matter, though I hear ext4 is more performant; I can always upgrade later. Anyway, fingers crossed again.
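To put the numbers from this update in perspective, here is a rough back-of-the-envelope sketch in Python. The 72 GiB and 4.5 hours come from the scan above. The cylinder size is an assumption on my part: it uses the conventional 255 heads x 63 sectors x 512 bytes geometry, which I have not verified for this disk. The names are made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Assumed legacy CHS geometry: 255 heads, 63 sectors per track, 512-byte sectors.
CYLINDER_BYTES = 255 * 63 * 512


def scan_rate_mib_per_s(size_gib, hours):
    """Apparent surface-scan rate: partition size divided by wall-clock time."""
    return (size_gib * GIB / MIB) / (hours * 3600)


def cylinder_size_mib():
    """Size of one cylinder in MiB under the assumed geometry."""
    return CYLINDER_BYTES / MIB


if __name__ == "__main__":
    print(f"scan rate: {scan_rate_mib_per_s(72, 4.5):.2f} MiB/s")
    print(f"one cylinder: {cylinder_size_mib():.2f} MiB")
```

The scan rate is only the apparent figure. A non-destructive read-write test touches each block several times, so the disk was actually moving a good deal more data than that number suggests.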
