index'd auth files, mkuser and duplicate group ids
- From: openstream rob <rob@xxxxxxxxxxxxxxxx>
- Date: Thu, 31 Jan 2008 14:00:55 -0800 (PST)
Hi
I've had a good look around and have not seen anything reported like
the problem I'm experiencing.
I'm developing an application environment where each server is
sharing all the passwd/group/user files from /etc/ and /etc/security
via symbolic links to a common GPFS filesystem.
Everything has been working well in development, however I've just
started to have a problem as I've introduced the use of mkpasswd to
index the group and passwd files (and associated files). As the base
auth files are replaced by links to a GPFS shared filesystem, I have
wrapped the mkpasswd command so that the created *.idx files are also
copied to the GPFS filesystem and local symlinks are made on each
server.
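
A minimal sketch of the copy-and-link step that such a wrapper performs,
written in Python purely for illustration. The function name
publish_indexes and the two directory arguments are hypothetical, and the
sketch assumes mkpasswd has already been run and has left regular *.idx
files in the local directory. It is not the actual wrapper script.

```python
import os
import shutil
from pathlib import Path


def publish_indexes(local_dir, shared_dir):
    """Copy each regular *.idx file in local_dir to shared_dir and
    replace the local file with a symlink to the shared copy.

    Returns the names of the index files that were published.
    Hypothetical helper; directory layout is an assumption.
    """
    local_dir, shared_dir = Path(local_dir), Path(shared_dir)
    published = []
    for idx in sorted(local_dir.glob("*.idx")):
        if idx.is_symlink() or not idx.is_file():
            continue  # already linked, or not a regular file
        target = shared_dir / idx.name
        staging = shared_dir / (idx.name + ".new")
        shutil.copy2(idx, staging)    # copy under a temporary name first
        os.replace(staging, target)   # then rename into place in one step
        tmp_link = idx.with_name(idx.name + ".lnk")
        tmp_link.symlink_to(target)   # build the link beside the file
        os.replace(tmp_link, idx)     # swap the link in for the local file
        published.append(idx.name)
    return published
```

Renaming into place, rather than copying over the live name, is meant to
keep a reader on another server from seeing a half-written index. Whether
that matters for the fault described below is not known.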
I've described the linking process in case someone raises an eyebrow
and says that's not going to work. However, to simplify the fault
environment I have reproduced the problem on a local server with NO
symlinks at all, just the regular files and indexes.
One more point before I describe the symptoms. As we will have several
thousand users, all needing to be members of the same groups, I have
made the group file so that GIDs are duplicated with different group
names. For example:
<snippet>
admin:!:200:
admin-100901:!:200:
admin-200101:!:200:
</snippet>
The total group file is 5500 lines.
This will allow me to split the load on each group file line so that
the max line length is avoided. (I'm concerned this will upset the
index mechanism.)
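
The splitting idea can be sketched as follows, in Python for illustration
only. The function split_group, the numeric suffix scheme and the
max_line_len parameter are all hypothetical; the real suffixes (such as
admin-100901) follow a different scheme, and the actual line length limit
is not stated here.

```python
def split_group(base_name, gid, members, max_line_len):
    """Spread members over several group lines that share one GID.

    The first line keeps base_name; later lines get a "-N" suffix
    (an illustrative naming scheme, not the one used in production).
    A single member is always accepted even if it exceeds the limit.
    """
    lines, current = [], []
    for member in members:
        name = base_name if not lines else f"{base_name}-{len(lines)}"
        candidate = f"{name}:!:{gid}:" + ",".join(current + [member])
        if current and len(candidate) > max_line_len:
            # Line is full: flush it and start the next alias of the GID.
            lines.append(f"{name}:!:{gid}:" + ",".join(current))
            current = [member]
        else:
            current.append(member)
    if current or not lines:
        name = base_name if not lines else f"{base_name}-{len(lines)}"
        lines.append(f"{name}:!:{gid}:" + ",".join(current))
    return lines
```

For example, split_group("admin", 200, ["u1", "u2", "u3"], 19) yields two
lines, "admin:!:200:u1,u2" and "admin-1:!:200:u3", both carrying GID 200.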
The problem is this:
After the indexes have been freshly made, all is well. If I do an ls -l
the output is as expected.
If I do a mkuser, all is still well.
If I then rmuser, the output from an "ls -l" will display uids and
gids, not the expected decoded names.
Bizarrely, this is not always consistent, as "SOME" names are decoded.
Most often the "root" name and "system" group are displayed as digits.
This also means "whoami" returns an error. (does this indicate more
than just a username decode problem?)
Mostly, if I run "li -l" (not ls -l), it will "bring back" the "user"
names (but not the groups), and "lsgroup ALL" will bring back the
group names!
Re-indexing always seems to sort the problem out, until the next
mkuser/rmuser.
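
A quick way to test for the symptom after each mkuser/rmuser, without
reading ls output by eye, is to ask for the names directly. The sketch
below uses Python's standard pwd and grp modules. The helper names are
hypothetical, and it is an assumption that these modules go through the
same lookup path as ls and whoami on this system.

```python
import grp
import pwd


def unresolved_ids(ids, lookup):
    """Return the ids for which lookup(id) raises KeyError."""
    missing = []
    for i in ids:
        try:
            lookup(i)
        except KeyError:
            missing.append(i)
    return missing


def check_root_and_system():
    """Report whether uid 0 and gid 0 still map to names.

    Empty lists mean both ids resolved; a non-empty list would match
    the "digits instead of names" symptom described above.
    """
    return {
        "uids": unresolved_ids([0], pwd.getpwuid),
        "gids": unresolved_ids([0], grp.getgrgid),
    }
```

Running check_root_and_system() in a loop around the mkuser and rmuser
calls would show at which step the lookups start failing.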
I am struggling to understand this scenario, but it looks like there
could be some row level locking on the passwd/group files or indexes
not being cleared? How does the locking work?
I've used truss to examine a "whoami" both when the system is normal
and exhibiting the fault. In the fault case, the seeks never appear to
find the correct file entries, whereas in the "normal" truss, I can
see the seek calls preceding a read of the root passwd file entry.
Does this give a clue?
The last clue is that I am running the exact same software on two
clusters. The only difference is that one cluster is made up of
p510s and the other of slightly faster (30%) p51As. On the faster
cluster I had to repeat the test time and time again before it
happened. (Previously I was hoping the problem was a build fault on
the slower cluster, where the fault occurs every time.)
With the large volume of groups in the group file, even with indexing,
some operations are quite slow. Could a timing issue cause this
fault?
As an experiment, I removed the group.id.idx file. To my surprise, the
response times were quicker on both servers. On both clusters, I
cannot repeat the fault when there is no group.id.idx. (so far!)
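
To illustrate why duplicate GIDs might matter to an index keyed on the
GID, the Python sketch below groups the names in a group file by GID. It
only shows that one GID maps to several names in this layout, so any
lookup that expects one entry per GID has to pick among them. It says
nothing about how group.id.idx is actually built, which is exactly what
is unknown here. The function names are hypothetical.

```python
from collections import defaultdict


def names_by_gid(group_lines):
    """Map each GID to the list of group names that use it.

    Expects lines in name:password:gid:members form; blank lines
    are skipped.
    """
    by_gid = defaultdict(list)
    for line in group_lines:
        line = line.strip()
        if not line:
            continue
        name, _password, gid, _members = line.split(":", 3)
        by_gid[int(gid)].append(name)
    return dict(by_gid)


def duplicate_gids(group_lines):
    """Return only the GIDs that are shared by more than one name."""
    return {
        gid: names
        for gid, names in names_by_gid(group_lines).items()
        if len(names) > 1
    }
```

Fed the three lines from the snippet above, duplicate_gids reports GID 200
with the names admin, admin-100901 and admin-200101.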
Questions:
Is it legitimate to rely on the group.id.idx file where there are
duplicate gids in the /etc/group file?
Is it legitimate to index all files except group.id.idx?
How does mkuser, for example, handle locking on the auth files and
indexes? Is it memory or file based locking?
Has anyone else had this issue?
Any feedback gratefully received.
Thanks
Rob