Re: mlock(2) for ordinary users
- From: Kostik Belousov <kostikbel@xxxxxxxxx>
- Date: Sat, 22 Jul 2006 18:16:31 +0300
On Sat, Jul 22, 2006 at 03:52:37PM +0100, Robert Watson wrote:
On Fri, 21 Jul 2006, Peter Jeremy wrote:
Currently mlock() and munlock() are restricted to the root user - which
prevents an ordinary user locking their process into RAM to the detriment
of the system as a whole. Whilst this is a valid concern, there are good
security reasons for allowing a user to lock small amounts of memory (a
few pages) to ensure that sensitive information (private keys, passwords,
etc.) doesn't wind up on swap devices.
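To make the use case concrete, here is a minimal sketch of pinning a single page for a secret. The helper names secret_page_alloc() and secret_page_free() are hypothetical; only sysconf(), posix_memalign(), mlock() and munlock() are real calls. On a system where mlock() is restricted to root, or where the request would exceed the locked-memory limit, the mlock() call is expected to fail and the helper returns NULL.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate one page-aligned page and try to pin it
 * in RAM so its contents cannot be paged out to swap.
 * Returns the buffer and stores its length in *len_out, or returns NULL
 * if allocation or mlock() fails (for example, for lack of privilege).
 */
void *secret_page_alloc(size_t *len_out)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    void *p = NULL;

    if (pagesz <= 0)
        return NULL;
    if (posix_memalign(&p, (size_t)pagesz, (size_t)pagesz) != 0)
        return NULL;
    if (mlock(p, (size_t)pagesz) != 0) {
        free(p);
        return NULL;
    }
    *len_out = (size_t)pagesz;
    return p;
}

/* Hypothetical helper: wipe the secret, unpin the page, release it. */
void secret_page_free(void *p, size_t len)
{
    volatile unsigned char *v = p;

    /* Volatile stores so the wipe is not optimised away. */
    for (size_t i = 0; i < len; i++)
        v[i] = 0;
    munlock(p, len);
    free(p);
}
```

A caller that gets NULL back has to decide whether to refuse to handle the secret or to fall back to unlocked memory; the sketch deliberately leaves that policy out.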
There is a resource limit for locked pages (RLIMIT_MEMLOCK) and, despite
the man page, a quick look at the code implies that it really is honoured.
Could someone with more VM-foo please confirm whether the last line of the
man page is still correct?
I would like to suggest that the suser() tests in mlock() and munlock() be
removed and the default RLIMIT_MEMLOCK is reduced from infinity to (say)
1. The only gotcha I can see is that lots of sysctl() functions use
RLIMIT_MEMLOCK via sysctl_wire_old_buffer() and vslock().
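For reference, the limit in question can be inspected and lowered from userland with getrlimit() and setrlimit(). The sketch below lowers only the soft limit of the calling process; cap_memlock_soft() is a hypothetical name, and this says nothing about how the kernel default would be changed.

```c
#include <sys/types.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: lower this process's soft RLIMIT_MEMLOCK to
 * `bytes`, capped at the current hard limit. The hard limit is left
 * untouched. Returns 0 on success, -1 on error.
 */
int cap_memlock_soft(rlim_t bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
        bytes = rl.rlim_max;
    rl.rlim_cur = bytes;
    return setrlimit(RLIMIT_MEMLOCK, &rl);
}
```

Whether the kernel then enforces the new value for mlock() is exactly the question raised above.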
I think I'd like to see the functionality you suggest -- i.e., the ability
to allocate pinned memory pages to unprivileged processes. However, I have
to wonder about whether this isn't already enabled for a reason -- in
particular, I have to wonder if it works at all. The whole idea of
resource limits is that you bill new use to a credential, and credit
reduced use to a similar credential. Probably, we're interested only in
memory pinned at the request of the process, not memory pinned by the
kernel on its behalf. The normal questions I'd try to answer about whether
it works currently are:
- When pages become locked on behalf of a credential, are they correctly
  billed to the credential?
- When pages become unlocked (or are released), are the credentials that
  requested the lock credited?
- What happens when the credential on a process changes between when memory
  is locked and unlocked?
- What happens if more than one credential requests that the same page of
  memory be locked and unlocked?
- Is locked memory properly credited back to the credential on process exit
  and other non-explicit unmapping points?
Note in particular that more than one credential can request that the same
page be locked -- if two processes map the same page from a file, or one is
a fork of the other and has inherited a shared mapping, we need to handle
that "correctly". And we need to handle cases like setuid -- as with other
resource limit implementations, the right credential needs to be credited.
In the case of socket limits, for example, we actually keep a reference to
the allocating credential in the struct socket so that when the socket is
freed, we can credit the resources back to the original credential, not to
the credential of whatever process last references the socket. Presumably
something similar would be required here, and a quick glance doesn't
suggest this is implemented.
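The bill-and-credit pattern described above for sockets can be sketched in a few lines of plain C. This is a userland toy under stated assumptions, not kernel code: struct ucred_sketch, struct wiring, wiring_charge() and wiring_credit() are all hypothetical names. The point it illustrates is that the record of a locked range remembers which credential was billed, so the credit goes back to that credential even if the process has changed credentials in the meantime.

```c
#include <stddef.h>

/* Toy stand-in for a credential with a locked-page account. */
struct ucred_sketch {
    unsigned refs;          /* reference count */
    size_t   locked_pages;  /* pages currently billed to this cred */
    size_t   limit_pages;   /* per-credential limit */
};

/* Toy record of one locked range; remembers who was billed. */
struct wiring {
    struct ucred_sketch *billed;
    size_t npages;
};

/* Bill npages to cr; fails with -1 if that would exceed the limit. */
int wiring_charge(struct wiring *w, struct ucred_sketch *cr, size_t npages)
{
    if (npages > cr->limit_pages - cr->locked_pages)
        return -1;
    cr->locked_pages += npages;
    cr->refs++;             /* the record holds a reference */
    w->billed = cr;
    w->npages = npages;
    return 0;
}

/* Credit the pages back to the credential that was originally billed. */
void wiring_credit(struct wiring *w)
{
    if (w->billed == NULL)
        return;
    w->billed->locked_pages -= w->npages;
    w->billed->refs--;
    w->billed = NULL;
    w->npages = 0;
}
```

The sketch does not address the shared-page case, where two credentials lock the same page; that would need one record per requester, or some policy for who pays.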
As far as I remember, RLIMIT_MEMLOCK is per-process rather than per-cred.
As a consequence, allowing mlock() for non-root users actually allows such a
user to lock value-of(RLIMIT_MEMLOCK) * value-of(RLIMIT_NPROC) in total.
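As a rough illustration of that product, the sketch below multiplies the two soft limits of the calling process. The function names are hypothetical, and the result is only the upper bound implied by the per-process argument above, not a measurement of what the kernel would actually permit.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: upper bound on locked memory for one user if
 * RLIMIT_MEMLOCK is enforced per process. Returns UINT64_MAX when
 * either limit is infinite or the product overflows.
 */
uint64_t worst_case_locked(rlim_t memlock, rlim_t nproc)
{
    uint64_t m, n;

    if (memlock == RLIM_INFINITY || nproc == RLIM_INFINITY)
        return UINT64_MAX;
    m = (uint64_t)memlock;
    n = (uint64_t)nproc;
    if (m != 0 && n > UINT64_MAX / m)
        return UINT64_MAX;
    return m * n;
}

/* Same bound, computed from the caller's current soft limits. */
uint64_t current_worst_case(void)
{
    struct rlimit ml, np;

    /* Treat a failed query as "unbounded" rather than guessing. */
    if (getrlimit(RLIMIT_MEMLOCK, &ml) != 0 ||
        getrlimit(RLIMIT_NPROC, &np) != 0)
        return UINT64_MAX;
    return worst_case_locked(ml.rlim_cur, np.rlim_cur);
}
```

For example, a 4096-byte limit with 100 processes allowed gives 409600 bytes, so even a one-page limit adds up once it is multiplied by the process limit.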
In fact, I had to answer these same questions when I implemented the
per-user swap limits. The design I ended up with was to add a reference to
the originating cred to vm_map_entry and vm_object (with somewhat
complicated logic to move the reference from the entry to the object on
occasion).