15.4. Incorrect directory content caching
A user of a Solaris NFS client reported having 
 intermittent problems
accessing files mounted from a non-Unix NFS server. The Solaris NFS
client 
tarsus was apparently able to list files
that had previously been removed by another NFS client, but was
unable to access the contents of the files. The files would
eventually disappear. The NFS client that initially removed the files
did not experience any problems and the user reported that the files
had indeed been removed from the server's directory. He
verified this by logging into the NFS server and listing the contents
of the exported directory.
We suspected the client 
tarsus was not
invalidating its cached information, and proceeded to try to
reproduce the problem while capturing the NFS packets to analyze the
network traffic:
[1] tarsus$ ls -l /net/inchun/export/folder
total 8
-rw-rw-rw-   1 labiaga  staff       2883 Apr 10 20:03 data1
-rw-rw-rw-   1 root     other         12 Apr 10 20:01 data2
[1] protium$ rm /net/inchun/export/folder/data2
[2] tarsus$ ls /net/inchun/export/folder
data1   data2
[3] tarsus$ ls -l /net/inchun/export/folder
/net/inchun/export/folder/data2: Stale NFS file handle
total 6
-rw-rw-rw-   1 labiaga  staff       2883 Apr 10 20:03 data1
The first directory listing on 
tarsus correctly
displayed the contents of the NFS directory
/net/inchun/export/folder before anything was
removed. The problems began after the NFS client
protium removed the file
data2. The second directory listing on
tarsus continued showing the recently removed
data2 file as part of the directory, although
the extended directory listing reported a "Stale NFS
filehandle" for 
data2.
This was a typical case of inconsistent caching of information by an
NFS client. Solaris NFS clients cache the directory content and
attribute information in memory at the time the directory contents
are first read from the NFS server. Subsequent client accesses to the
directory first validate the cached information, comparing the
directory's cached modification time to the modification time
reported by the server. A match in modification times indicates that
the directory has not been modified since the last time the client
read it, therefore it can safely use the cached data. On the other
hand, if the modification times are different, the NFS client purges
its cache, and issues a new NFS
Readdir
request to the server to obtain the updated directory contents and
attributes. Some non-Unix NFS servers are known for not updating the
modification time of directories when files are removed, leading to
directory caching problems. We used 
snoop to
capture the NFS packets between our client and server while the
problem was being reproduced. The analysis of the snoop output should
help us determine if we're running into this caching problem.
To facilitate the discussion, we list the 
snoop
packets preceded by the commands that generated them. This shows the
correlation between the NFS traffic and the Unix commands that
generate the traffic:
[1] tarsus $ ls -l /net/inchun/export/folder
total 8
-rw-rw-rw-   1 labiaga  staff       2883 Apr 10 20:03 data1
-rw-rw-rw-   1 root     other         12 Apr 10 20:01 data2
 7   0.00039    tarsus -> inchun    NFS C GETATTR2 FH=FA14
 8   0.00198    inchun -> tarsus    NFS R GETATTR2 OK
 9   0.00031    tarsus -> inchun    NFS C READDIR2 FH=FA14 Cookie=0
10   0.00220    inchun -> tarsus    NFS R READDIR2 OK 4 entries (No more)
11   0.00033    tarsus -> inchun    NFS C LOOKUP2 FH=FA14 data2
12   0.00000    inchun -> tarsus    NFS R LOOKUP2 OK FH=F8CD
13   0.00000    tarsus -> inchun    NFS C GETATTR2 FH=F8CD
14   0.00000    inchun -> tarsus    NFS R GETATTR2 OK
15   0.00035    tarsus -> inchun    NFS C LOOKUP2 FH=FA14 data1
16   0.00211    inchun -> tarsus    NFS R LOOKUP2 OK FH=F66F
17   0.00032    tarsus -> inchun    NFS C GETATTR2 FH=F66F
18   0.00191    inchun -> tarsus    NFS R GETATTR2 OK
Packets 7 and 8 contain the request and reply for attributes for the
/net/inchun/export/folder directory. The
attributes can be displayed by using the 
-v
directive:
Excerpt from: snoop -i /tmp/capture -p 7,8 -v
ETHER:  ----- Ether Header -----
ETHER:  
ETHER:  Packet 8 arrived at 20:45:9.75
...
NFS:  ----- Sun NFS -----
NFS:  
NFS:  Proc = 1 (Get file attributes)
NFS:  Status = 0 (OK)
NFS:  File type = 2 (Directory)
NFS:  Mode = 040777
NFS:   Type = Directory
NFS:   Setuid = 0, Setgid = 0, Sticky = 0
NFS:   Owner's permissions = rwx
NFS:   Group's permissions = rwx
NFS:   Other's permissions = rwx
NFS:  Link count = 2, UID = 0, GID = -2, Rdev = 0x0
NFS:  File size = 512, Block size = 512, No. of blocks = 1
NFS:  File system id = 7111, File id = 161
NFS:  Access time       = 11-Apr-00 12:50:18.000000 GMT
NFS:  Modification time = 11-Apr-00 12:50:18.000000 GMT
NFS:  Inode change time = 31-Jul-96 09:40:56.000000 GMT
Packet 8 shows the 
/net/inchun/export/folder
directory was last modified on April 11, 2000 at 12:50:18.000000 GMT.
tarsus caches this timestamp to later determine
when the cached directory contents need to be updated. Packet 9
contains the request made by 
tarsus for the
directory listing from 
inchun. Packet 10
contains 
inchun's reply with four entries
in the directory. A detailed view of the packets shows the four
directory entries: ".", "..",
"
data1", and
"
data2". The EOF indicator notifies
the client that all existing directory entries have been listed, and
there is no need to make another NFS 
Readdir
call:
Excerpt from: snoop -i /tmp/capture -p 9,10 -v
ETHER:  ----- Ether Header -----
ETHER:  
ETHER:  Packet 10 arrived at 20:45:9.74
...
NFS:  ----- Sun NFS -----
NFS:  
NFS:  Proc = 16 (Read from directory)
NFS:  Status = 0 (OK)
NFS:   File id  Cookie Name
NFS:       137   50171 .
NFS:       95    50496 ..
NFS:       199   51032 data1
NFS:       201   51706 data2
NFS:    4 entries
NFS:  EOF = 1
NFS:
The directory contents are cached by 
tarsus, so
that they may be reused in a future directory listing. The NFS
Lookup and NFS 
Getattr
requests, along with their corresponding replies in packets 11 thru
18, result from the long listing of the directory requested by
ls -l. An NFS 
Lookup
obtains the filehandle of a directory component. The NFS
Getattr requests the file attributes of the file
identified by the previously obtained filehandle.
NFS Version 2 filehandles are 32 bytes long. Instead of displaying a
long and cryptic 32-byte number, 
snoop generates
a shorthand version of the filehandle and displays it when invoked in
summary mode. This helps you associate filehandles with file objects
more easily. You can obtain the exact filehandle by displaying the
network packet in verbose mode by using the -v option. The packet 7
filehandle FH=FA14 is really:
Excerpt from: snoop -i /tmp/capture -p 7 -v
NFS: ----- Sun NFS -----
NFS:
NFS:  Proc = 1 (Get file attributes)
NFS:  File handle = [FA14]
NFS:   0204564F4C32000000000000000000000000A10000001C4DFF20A00000000000
Next, 
protium, a different NFS client comes into
the picture, and removes one file from the directory previously
cached by 
tarsus:
[1] protium $ rm /net/inchun/export/folder/data2
22   0.00000   protium -> inchun    NFS C GETATTR2 FH=FA14
23   0.00000    inchun -> protium   NFS R GETATTR2 OK
24   0.00000   protium -> inchun    NFS C REMOVE2 FH=FA14 data2
25   0.00182    inchun -> protium   NFS R REMOVE2 OK
Packets 22 and 23 update the cached attributes of the
/net/inchun/export/folder directory on
protium. Packet 24 contains the actual NFS
Remove request sent to
inchun, which in turn acknowledges the
successful removal of the file in packet 25.
tarsus then lists the directory in question, but
fails to detect that the contents of the directory have changed:
[2] tarsus $ ls /net/inchun/export/folder
data1   data2
39   0.00000    tarsus -> inchun    NFS C GETATTR2 FH=FA14
40   0.00101    inchun -> tarsus    NFS R GETATTR2 OK
This is where the problem begins. Notice that two NFS
Getattr network packets are generated as a
result of the directory listing but no 
Readdir
request. In this case, the client issues the NFS
Getattr operation to request the
directory's modification time:
Excerpt from: snoop -i /tmp/capture -p 39,40 -v
ETHER:  ----- Ether Header -----
ETHER:  
ETHER:  Packet 40 arrived at 20:45:10.88
...
NFS:  ----- Sun NFS -----
NFS:  
NFS:  Proc = 1 (Get file attributes)
NFS:  Status = 0 (OK)
NFS:  File type = 2 (Directory)
NFS:  Mode = 040777
NFS:   Type = Directory
NFS:   Setuid = 0, Setgid = 0, Sticky = 0
NFS:   Owner's permissions = rwx
NFS:   Group's permissions = rwx
NFS:   Other's permissions = rwx
NFS:  Link count = 2, UID = 0, GID = -2, Rdev = 0x0
NFS:  File size = 512, Block size = 512, No. of blocks = 1
NFS:  File system id = 7111, File id = 161
NFS:  Access time       = 11-Apr-00 12:50:18.000000 GMT
NFS:  Modification time = 11-Apr-00 12:50:18.000000 GMT
NFS:  Inode change time = 31-Jul-96 09:40:56.000000 GMT
The modification time of the directory is the same as the
modification time before the removal of the file!
tarsus compares the cached modification time of
the directory with the modification time just obtained from the
server, and determines that the cached directory contents are still
valid since the modification times are the same. The directory
listing is therefore satisfied from the cache instead of forcing the
NFS client to read the updated directory contents from the server.
This explains why the removed file continues to show up in the
directory listing:
[3] tarsus $ ls -l /net/inchun/export/folder
/net/inchun/export/folder/data2: Stale NFS file handle
total 6
-rw-rw-rw-   1 labiaga  staff       2883 Apr 10 20:03 data1
44   0.00000    tarsus -> inchun    NFS C GETATTR2 FH=FA14
45   0.00101    inchun -> tarsus    NFS R GETATTR2 OK
46   0.00032    tarsus -> inchun    NFS C GETATTR2 FH=F66F
47   0.00191    inchun -> tarsus    NFS R GETATTR2 OK
48   0.00032    tarsus -> inchun    NFS C GETATTR2 FH=F8CD
49   0.00214    inchun -> tarsus    NFS R GETATTR2 Stale NFS file handle
The directory attributes reported in packet 45 are the same as those
seen in packet 40, therefore 
tarsus assumes that
it can safely use the cached filehandles associated with the cached
entries of this directory. In packet 46, 
tarsus
requests the attributes of filehandle F66F, corresponding to the
data1 file. The server replies with the
attributes in packet 47. 
tarsus then proceeds to
request the attributes of filehandle F8CD, which corresponds to the
data2 file. The server replies with a
"Stale NFS filehandle" error because there is no file on
the server associated with the given filehandle. This problem would
never have occurred had the server updated the modification time
after removing the file causing 
tarsus to detect
that the directory had been changed.
Directory caching works nicely when the NFS server obeys Unix
directory semantics. Many non-Unix NFS servers provide such semantics
even if they have to submit themselves to interesting contortions.
Having said this, there is nothing in the NFS protocol specification
that requires the modification time of a directory to be updated when
a file is removed. You may therefore need to disable Solaris NFS
directory caching if you're running into problems interacting
with non-Unix servers. To permanently disable NFS directory caching,
add this line to 
/etc/system:
set nfs:nfs_disable_rddir_cache = 0x1
The Solaris kernel reads 
/etc/system at startup
and sets the value of 
nfs_disable_rddir_cache to
0x1 in the 
nfs kernel module. The change takes
effect only after reboot. Use 
adb to disable
caching during the current session, postponing the need to reboot.
You still need to set the tunable in 
/etc/system
to make the change permanent through reboots:
aqua# adb -w -k /dev/ksyms /dev/mem
physmem    3ac8
nfs_disable_rddir_cache/W1
nfs_disable_rddir_cache:    0x0        =    0x1
adb is an interactive assembly level debugger
that enables you to consult and modify the kernel's memory
contents. The 
-k directive instructs
adb to perform kernel memory mapping accessing
the kernel's memory via 
/dev/mem, and
obtaining the kernel's symbol table from
/dev/ksyms. The 
-w directive allows
you to modify the kernel memory contents. A word of caution:
adb is a power tool that will cause serious data
corruption and potential system panics
when misused.
  |   |   | 
| 15.3. Boot parameter confusion |   | 15.5. Incorrect mount point permissions |