Re: VM images for FreeBSD
- From: Benjamin Kaduk <kaduk@xxxxxxx>
- Date: Thu, 3 Nov 2011 23:46:02 -0400 (EDT)
On Wed, 19 Oct 2011, Alexander Yerenkow wrote:
I'm currently working on creating images with a set of pre-installed packages.
I looked at project pkgng (candidate for replacing current pkg_* subsystem),
and also I have some thought about current packages/ports system.
1. pkg_add can be launched with parameter -p $PREFIX. So, my first thought
was: I create empty directory structure with mtree, and I'll install there
all required packages; after that I need only update this installation tree
(manually by pkg_delete $old pkg_add $new, or with some tool). But I cannot
specify to pkg_add relative root, instead of real one.
Let me show example:
PKG_DBDIR=/zpool0/testroot/var/db/pkg pkg_add -p /zpool0/testroot/usr/local
installs package, and in /zpool0/testroot/var/db/pkg/ubench-0.32/+CONTENTS
there will be such record:
I can't specify to pkg_add that it should treat /zpool0/testroot as root, as
I need (so record really should be @cwd /usr/local)
Instead, pkg_add allows me to chroot, which as you understand is not
good (all binaries/libraries required by pkg_* must exist in the specified
chroot; unfortunately I can't specify some empty dir and install there).
Why is that? Because there is a +INSTALL script in packages, in which the
package/port system allows executing any code/script written by the porter.
This is indeed a frustrating problem.
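As a rough illustration of the relative-root idea, here is a minimal Python sketch that post-processes a packing list so that `@cwd` records are expressed relative to a staging root. It assumes the recorded `@cwd` carries the staging prefix, as the description above implies; the function name `rebase_cwd` and the sample packing list are hypothetical, and this is not something pkg_add itself offers.

```python
from pathlib import PurePosixPath


def rebase_cwd(contents, root):
    """Rewrite '@cwd' lines so their paths are relative to `root` treated as '/'.

    Lines whose path does not live under `root` are left untouched.
    """
    root_path = PurePosixPath(root)
    out = []
    for line in contents.splitlines():
        if line.startswith("@cwd "):
            target = PurePosixPath(line[len("@cwd "):].strip())
            try:
                rel = target.relative_to(root_path)
            except ValueError:
                # Path is outside the staging root: keep the record as-is.
                out.append(line)
                continue
            rel_text = str(rel)
            line = "@cwd /" if rel_text == "." else "@cwd /" + rel_text
        out.append(line)
    return "\n".join(out)


# Illustrative packing list (not copied from a real package).
sample = "@name ubench-0.32\n@cwd /zpool0/testroot/usr/local\nbin/ubench"
print(rebase_cwd(sample, "/zpool0/testroot"))
```

This only fixes up the recorded prefix after the fact; it does nothing about +INSTALL scripts, which is the harder half of the problem.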
2. In the ports enhancements task list (I read it somewhere) there was one item:
Make packages non-executable (or something similar). To do this properly, we
must get rid of free-form post-install and post-deinstall scripts.
To do this, we need some deep analysis of what types of actions are
happening there, formalize them, and provide some way for porters to specify
all needed actions in the Makefile.
I downloaded all packages for 9-current i386, found all +INSTALL scripts,
and kinda categorized them, you can get all of them here:
To summarize my efforts:
I checked 21195 packages;
I found 880 install scripts;
3 scripts contain a plain "exit 0";
8 install scripts contain some perl code;
17 scripts contain some additional "install" commands;
70 scripts contain some chgrp/chown actions (which probably could be done
by specifying an mtree file?...)
75 contain uncategorized actions (printing a license, some interactive
questions, ghostscript actions, tex, fonts etc.)
161 scripts contain some file commands (ln / cp / mv, creating
backups, creating configs if they don't exist, etc.)
166 scripts contain useradd/groupadd commands (many similar constructions,
not too hard to move this to .mk, in pkgng group/users can be specified in
380 contain pear component registration (md5 -q * | uniq - produces
exactly one result, so all these scripts are really one, could be moved to
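
A survey of this kind could be approximated along the following lines. This is only a sketch: the keyword patterns and category names are hypothetical stand-ins for the manual categorization described above, and SHA-256 is used in place of md5 to group byte-identical scripts.

```python
import hashlib
import re
from collections import Counter, defaultdict

# Hypothetical patterns; the categories in the list above were assigned by hand.
CATEGORIES = [
    ("users/groups", re.compile(r"\b(pw\s+(user|group)add|useradd|groupadd)\b")),
    ("ownership", re.compile(r"\b(chown|chgrp)\b")),
    ("file commands", re.compile(r"\b(cp|mv|ln)\b")),
]


def classify(script):
    """Return the first category whose pattern matches, else 'uncategorized'."""
    for name, pattern in CATEGORIES:
        if pattern.search(script):
            return name
    return "uncategorized"


def survey(scripts):
    """Count scripts per category and group package names by script digest.

    `scripts` maps a package name to the text of its +INSTALL script.
    """
    counts = Counter()
    by_digest = defaultdict(list)
    for pkg, text in scripts.items():
        counts[classify(text)] += 1
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        by_digest[digest].append(pkg)
    return counts, by_digest
```

Digests with more than one package attached are candidates for a single shared action, as with the pear registration scripts.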
Thank you for doing this analysis/breakdown!
However, I worry that it may have missed @exec statements in pkg-plist files ... for example, net/openafs (which I maintain) runs kldxref in /boot/modules after installing a kernel module, which is needed in order for kldload to find the module. Now, this is clearly a case that a potential nonexecutable package framework could handle, checking for installed kernel modules and acting accordingly. However, having not done the survey of the sort you did for install scripts, it is an uneasy dangling unknown.
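
To start closing that gap, a survey of pkg-plist files could begin with something like the sketch below, which lists `@exec` and `@unexec` directives in a packing list. The function name `exec_directives` and the sample plist are hypothetical; the sample is illustrative and not copied from net/openafs.

```python
def exec_directives(plist_text):
    """Return (keyword, command) pairs for @exec/@unexec lines in a plist."""
    found = []
    for line in plist_text.splitlines():
        stripped = line.strip()
        for keyword in ("@exec", "@unexec"):
            if stripped == keyword or stripped.startswith(keyword + " "):
                found.append((keyword, stripped[len(keyword):].strip()))
    return found


# Illustrative plist fragment for a port that installs a kernel module.
sample = "\n".join([
    "boot/modules/example.ko",
    "@exec kldxref /boot/modules",
    "@unexec kldxref /boot/modules",
])
for keyword, command in exec_directives(sample):
    print(keyword, command)
```

Feeding the collected commands into the same sort of categorization used for +INSTALL scripts would show how many of them reduce to a handful of common actions.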
Why I'm interested in non-executable install of package (e.g. simple unpack
+ execute some typical actions based on package description):
- Unpacking hundreds of MB of packages takes several minutes (to mdconfig-ed
- Installation of these packages via pkg_add (they are downloaded from a local
ftp) took hours in my case (to mdconfig-ed filesystem)
This is quite a telling statistic :)
As you understand, to make an efficient image building system, I need to deal
with package installation without spending too many cpu/disk resources.
Ideally, I imagine all required packages extracted into their own
directories, like for ubench:
$X/packages/ubench/ (and here goes the whole directory structure which should be
copied to the new root)
plus separate info about new users/groups (maybe some additional
data is needed to make a package installed this way fully working).
There would certainly be additional data needed, e.g. for installing sample configuration files and copying to the real location, and removing both copies on uninstallation if the "real" file is unchanged from the sample. I'm sure there are others, too.
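
The sample-configuration case can be sketched as follows, to show how much logic even this one "simple" action carries. The function names are hypothetical, and the sketch assumes the sample copy is package-owned and therefore always removed on uninstallation, while the "real" file is removed only if it is byte-identical to the sample.

```python
import filecmp
import shutil
from pathlib import Path


def install_sample(sample, real):
    """Copy `sample` to `real` unless `real` already exists.

    Returns True if the real file was created.
    """
    if Path(real).exists():
        return False
    shutil.copyfile(sample, real)
    return True


def remove_config(sample, real):
    """Uninstall step: drop `real` only if unchanged, then drop `sample`.

    Returns True if the real file was removed.
    """
    sample, real = Path(sample), Path(real)
    removed = False
    # shallow=False forces a content comparison rather than a stat() check.
    if real.exists() and filecmp.cmp(sample, real, shallow=False):
        real.unlink()
        removed = True
    sample.unlink()
    return removed
```

A non-executable framework would need a declarative way to say "this file is a sample for that file" so the package tools can run the equivalent of these two steps themselves.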
So, is anyone working in this direction, or does anyone have comments?
I would be very hesitant to proceed in this direction without doing some investigation of other package-based systems.
For example, Debian packages are inherently binary-based; there is no real parallel to our ports framework. Yet if anything, I think that "executable packages" may be even more heavily used in Debian than in FreeBSD. In addition to the tarball of files to extract, the maintainer can also supply "maintainer scripts" which run before and/or after installation and/or uninstallation. (Not to mention the infrastructure components which implement things like diversions.) I have an incomplete survey of a biased sample of Debian(-style) packages in my slides at http://web.mit.edu/kaduk/Public/bsdcan-ports-talk-20110511.pdf , which shows that in addition to being used to manage users and groups, these maintainer scripts are also used to start/stop services, update gconf keys, the PAM stack, and more. It quickly becomes quite a pile of "additional data needed" (per the above) that I fear would be too much infrastructure to safely maintain in a non-executable package framework.
Another incredibly useful (though hopefully infrequently used) feature of maintainer scripts is the ability they give to recover from packaging errors. The first example that comes to mind is unfortunately not a very good one, but recently here at Athena we had a bug in our TeX configuration package which resulted in a dangling symlink from a broken diversion (which has no direct parallel in FreeBSD, making this a bad example). In any case, this packaging bug made the package uninstallable!
However, we could produce an updated version of the package which had a preinst that corrected for the previous packaging error, offering us a way out that did not require manual user intervention, which I feel is something that we should try very hard to avoid.
Because of this, I don't think that having it be impossible for a package to have a custom executable component is a realistic goal. (Which is not to say it is not a goal worth having.) However, it would probably be feasible to add pieces to our framework (e.g. USERS/GROUPS) that make it easier for packages to avoid executable components. If appropriately flagged, then a package could just be an "unpack this tarball" operation, possibly with a couple hooks (e.g. users/groups) from the packaging system.
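
One way to picture the "appropriately flagged" case is the sketch below. The `Manifest` fields and the `plan_install` function are hypothetical and do not reflect any existing pkg_* or pkgng format; the sketch only plans steps and executes nothing.

```python
from dataclasses import dataclass, field


@dataclass
class Manifest:
    """Hypothetical package description with declarative hooks."""
    name: str
    users: list = field(default_factory=list)
    groups: list = field(default_factory=list)
    has_scripts: bool = False  # True if the package ships free-form scripts


def plan_install(manifest):
    """Return the list of steps the package tools would perform."""
    if manifest.has_scripts:
        # Fall back to the full (executable) install path.
        return ["run full installer for " + manifest.name]
    # Groups first, since users may reference them.
    steps = ["create group " + g for g in manifest.groups]
    steps += ["create user " + u for u in manifest.users]
    steps.append("unpack tarball for " + manifest.name)
    return steps


print(plan_install(Manifest("ubench")))
print(plan_install(Manifest("exampled", users=["exampled"], groups=["exampled"])))
```

Packages that still need custom code keep the slow path, so the escape hatch for recovering from packaging errors stays available.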
3. Other "ports" ideas/thoughts.
I proposed a small enhancement to pkgng, but rather than in pkgng, it should be
implemented in the ports subsystem. It's about specifying abstract dependencies
and resolving them correctly:
Who can comment/elaborate on this? It shouldn't be very complicated,
since currently almost the same functionality is provided in .mk files (like
Interesting, though I don't think I'm really one to comment/elaborate on it. It does seem vaguely analogous to the concept of a (Debian) "virtual package", which can be depended on like a first-class package, but is actually "provide"d by any number of candidate packages.
I don't have a sense of how hard it would be to implement for us, though, and I have not had any time to look at pkgng at all. :(
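
For what it's worth, the virtual-package idea can be sketched in a few lines. The `resolve` function, the `provides` mapping, and the policy of preferring an already installed provider and otherwise taking the first one listed are all assumptions made for illustration; the package names in the example are placeholders.

```python
def resolve(dep, installed, provides):
    """Pick a concrete package that satisfies the dependency `dep`.

    `provides` maps a virtual name to an ordered list of candidate packages.
    A name with no entry in `provides` is treated as a concrete package.
    """
    candidates = provides.get(dep)
    if not candidates:
        return dep
    for candidate in candidates:
        if candidate in installed:
            # Something already installed satisfies the dependency.
            return candidate
    # Nothing installed yet: fall back to the first listed provider.
    return candidates[0]


provides = {"mail-transport-agent": ["postfix", "exim"]}
print(resolve("mail-transport-agent", {"exim"}, provides))
print(resolve("mail-transport-agent", set(), provides))
```

The hard part is presumably not this lookup but deciding where the `provides` data lives and how it interacts with version constraints.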
4. Where's the "right" place to discuss ports system? :)
Presumably freebsd-ports@xxxxxxxxxxx, though I alas do not regularly read there.