Wednesday, July 25, 2007

NFS failure caused ClearCase error

Problem.
1. The command "cd /; ls -l" hung, while a plain "ls" worked.

2. In /var/adm/atria/log/albd_log, the "Daily View Space" job normally takes less than 1 minute. Now it hung.

2007-07-22T04:30:03-04 albd_server(411): Ok: Job 5 "Daily View Space" (10430) Started.
....
2007-07-23T04:30:03-04 albd_server(411): Warning: Job 5 "Daily View Space" is still running -- skipping scheduled execution.

Check with "cleartool schedule -edit -schedule". This confirmed that jobs were skipped.

3. The command "su" failed with a Segmentation Fault, while "su -" worked.
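
The skipped-job warning quoted under item 2 can also be picked out of albd_log with a few lines of script. The following is only a sketch in Python: the pattern is based on the two log lines quoted above, and the function name find_skipped_jobs and the sample list are made up for illustration.

```python
import re

# Pattern for the "still running" warning, modeled on the albd_log lines
# quoted in this post (assumption: other ClearCase releases may word it differently).
SKIP_RE = re.compile(
    r'^(?P<ts>\S+) albd_server\(\d+\): Warning: Job (?P<id>\d+) '
    r'"(?P<name>[^"]+)" is still running -- skipping scheduled execution\.'
)

def find_skipped_jobs(lines):
    """Return (timestamp, job_id, job_name) for every skipped-run warning."""
    skipped = []
    for line in lines:
        m = SKIP_RE.match(line.strip())
        if m:
            skipped.append((m.group("ts"), int(m.group("id")), m.group("name")))
    return skipped

# Sample input: the two lines quoted from albd_log above.
sample = [
    '2007-07-22T04:30:03-04 albd_server(411): Ok: Job 5 "Daily View Space" (10430) Started.',
    '2007-07-23T04:30:03-04 albd_server(411): Warning: Job 5 "Daily View Space" '
    'is still running -- skipping scheduled execution.',
]

for ts, job_id, name in find_skipped_jobs(sample):
    print(f"{ts}: job {job_id} ({name}) skipped")
```

To run it against the real log, pass an open file handle for /var/adm/atria/log/albd_log instead of the sample list.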


Solution.
Check the /var/adm/messages file.

Jul 24 12:05:43 dse-750-poc1 nfs: [ID 733954 kern.info] NOTICE: [NFS4][Server: 10.2.194.31][Mntpt: /mnt]NFS server 10.2.194.31 not responding; still trying
Jul 24 12:11:02 dse-750-poc1 last message repeated 1 time
Jul 24 12:12:12 dse-750-poc1 su: [ID 810491 auth.crit] 'su root' failed for qinl on /dev/pts/3
Jul 24 12:17:02 dse-750-poc1 nfs: [ID 733954 kern.info] NOTICE: [NFS4][Server: 10.2.194.31][Mntpt: /mnt]NFS server 10.2.194.31 not responding; still trying
Jul 24 12:25:02 dse-750-poc1 nfs: [ID 733954 kern.info] NOTICE: [NFS4][Server: 10.2.194.31][Mntpt: /mnt]NFS server 10.2.194.31 not responding; still trying
Jul 24 16:23:34 dse-750-poc1 nfs: [ID 733954 kern.info] NOTICE: [NFS4][Server: 10.2.194.31][Mntpt: /mnt]NFS server 10.2.194.31 not responding; still trying
...
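
When the messages file is long, it can help to summarize the "not responding" notices per server and mount point. This is a rough sketch in Python, assuming the notice format shown in the excerpt above; the function name count_nfs_timeouts and the sample list are illustrative only. Lines such as "last message repeated 1 time" are not counted.

```python
import re
from collections import Counter

# Matches the NFS notice as it appears in the excerpt above
# (assumption: the format may differ on other OS releases or NFS versions).
NFS_RE = re.compile(
    r'\[Server: (?P<server>[^\]]+)\]\[Mntpt: (?P<mntpt>[^\]]+)\]'
    r'NFS server \S+ not responding'
)

def count_nfs_timeouts(lines):
    """Count "not responding" notices per (server, mount point)."""
    counts = Counter()
    for line in lines:
        m = NFS_RE.search(line)
        if m:
            counts[(m.group("server"), m.group("mntpt"))] += 1
    return counts

# Sample input: three lines copied from the excerpt above.
sample = [
    "Jul 24 12:05:43 dse-750-poc1 nfs: [ID 733954 kern.info] NOTICE: "
    "[NFS4][Server: 10.2.194.31][Mntpt: /mnt]NFS server 10.2.194.31 not responding; still trying",
    "Jul 24 12:12:12 dse-750-poc1 su: [ID 810491 auth.crit] 'su root' failed for qinl on /dev/pts/3",
    "Jul 24 12:17:02 dse-750-poc1 nfs: [ID 733954 kern.info] NOTICE: "
    "[NFS4][Server: 10.2.194.31][Mntpt: /mnt]NFS server 10.2.194.31 not responding; still trying",
]

for (server, mntpt), n in count_nfs_timeouts(sample).items():
    print(f"{server} {mntpt}: {n} notice(s)")
```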


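Check the mount point /mnt. It was not possible to cd into it. This would also explain the hang in "ls -l" under /: unlike a plain "ls", it has to stat every entry, including /mnt.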
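
A dead NFS mount can hang the shell that touches it, so it is safer to probe the path from a child process with a timeout. Below is a minimal sketch in Python; the function name probe_path and the 5-second default are my own choices, not anything from ClearCase or Solaris.

```python
import subprocess

def probe_path(path, timeout=5.0):
    """Run `ls -ld` on path in a child process; return "ok", "error" or "hung"."""
    proc = subprocess.Popen(
        ["ls", "-ld", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        rc = proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Best-effort kill without waiting afterwards: a process blocked on a
        # hard NFS mount may not exit until the server comes back.
        proc.kill()
        return "hung"
    return "ok" if rc == 0 else "error"

# The root directory is local, so this should report "ok" on a healthy system.
print("/", probe_path("/"))
```

Calling probe_path("/mnt") on the affected machine would be expected to return "hung" while the NFS server is down, without blocking the calling script.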

Log on to the server 10.2.194.31 and restart the NFS daemon:
/etc/init.d/nfs.server start

After nfsd started, all of the problems went away.
