Tuesday, October 16, 2007

Solaris 10 fsck

Today a POS E250 that runs some Sunray kiosks crapped out for no particular reason. I am working from home today, so I tried to SSH into the machine, with no luck. I next tried telnet, also no luck. Ping worked though.

Uh oh...hung system.

I had to get a Windows admin on-site to do a cold boot on the machine, which did not help. I was all set to drive into the office on my flex-day...again...and then I remembered the magic of RSC. E250s come with an RSC (Remote System Control) card and, further, have a neato little utility that shows you a GUI version of the front panel and also allows a command console, even when the server is screwed up or off. Newer versions of RSC are web-based.

Anyway, as I suspected, the idiot server was hung waiting for an fsck to be run on a particular slice. How to handle this? Well, first you will need to enter the root password to get into single-user mode. Then you will need to check the appropriate slice. In this case, the slice having difficulty was listed as:

/dev/dsk/c1t8d0s6

So, now we run fsck on that slice. There are two important points here. First, fsck requires the raw device (if the slice is larger than 2 GB), which means you would run it on:

/dev/rdsk/c1t8d0s6

The second point is the -y flag. If you do not use the -y flag, you will spend the rest of your life pressing 'y' to fix busted inodes. So, the command you want is:

fsck -F ufs -y /dev/rdsk/c1t8d0s6

or

fsck -F ufs -o f,p /dev/rdsk/c1t8d0s6

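Both run without prompting, though they are not quite identical. The -y flag answers yes to every question fsck asks. In the second form, f=force (check even if the filesystem is marked clean) and p=preen (non-interactive mode, which fixes routine problems and exits if something needs a human decision). The second method may be slightly safer because the -o options are specific to the filesystem type.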

The '-F' is optional and tells fsck what type of filesystem is in use. If you leave it out, fsck looks for the device in /etc/vfstab and otherwise falls back to the default filesystem type defined in /etc/default/fs. I use the -F out of habit and to be sure I get it right. YMMV.
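If you want to avoid fat-fingering the raw device name, here is a minimal sketch that derives it from the block device path and prints the fsck command for review. The helper name to_raw is my own invention, not a Solaris utility, and the script only echoes the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not a Solaris tool): map a block device path
# under /dev/dsk/ to its raw counterpart under /dev/rdsk/.
# Paths that do not start with /dev/dsk/ are passed through unchanged.
to_raw() {
    printf '%s\n' "$1" | sed 's|^/dev/dsk/|/dev/rdsk/|'
}

# The slice from this post; substitute whatever the console complains about.
slice=/dev/dsk/c1t8d0s6
raw=$(to_raw "$slice")

# Print the command so you can eyeball it before running it by hand.
echo "fsck -F ufs -y $raw"
```

Printing instead of executing is deliberate: on a box that is already in trouble, I would rather read the command once before committing to it.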

Once the fsck completes you can either reboot the machine or type exit to continue the boot process. Typing exit is especially useful if you have more than one filesystem that is botched, or if you are not sure of the status of the remaining filesystems. I usually use the exit feature to make sure I got everything and then do a graceful reboot to be sure.
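If you would rather look at the remaining filesystems yourself before typing exit, something like the following sketch could help. It assumes the standard /etc/vfstab layout (device to mount, device to fsck, mount point, FS type, and so on) and uses fsck's -m option, which checks without repairing. The function name list_ufs_raw is made up for illustration, and the script only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: print the "device to fsck" column (field 2) for
# every ufs entry in a vfstab-style file, skipping comment lines and
# entries with no fsck device ("-").
list_ufs_raw() {
    awk '$1 !~ /^#/ && $4 == "ufs" && $2 != "-" { print $2 }' "$1"
}

# Assumption: the standard Solaris location; override with VFSTAB=... if needed.
vfstab=${VFSTAB:-/etc/vfstab}

if [ -r "$vfstab" ]; then
    for dev in $(list_ufs_raw "$vfstab"); do
        # -m only checks whether the filesystem is fit to mount; it repairs nothing.
        echo "fsck -F ufs -m $dev"
    done
fi
```

Any slice that fails the -m check is a candidate for the full fsck command shown earlier.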

-TheDave
