Re: Resuming from a crashdump
From: Steven Smith (sos22_at_cantab.net)
Date: Tue, 25 Jan 2005 21:45:43 +0000
To: Matthew Dillon <email@example.com>
> You basically would either have to make all device drivers support a new
> hibernation/restore API (because it is not really possible to restore
> a device driver based on a dump),
How much overlap is there likely to be between this and the sorts of
things you need in order to resume from power management modes?
> Also, if the machine has a lot of memory it could take longer to save
> and restore than to reboot from scratch. A typical laptop HD is
> ~30 MB/sec. If your laptop has 512MB then it would take 16 seconds
> to go into hibernation mode, and 16 seconds to come out of, plus BIOS
> and loader overhead.
*shrug* If the image you're saving is just sitting at a login prompt,
it probably doesn't buy you much, but once you've got a couple of dozen
xterms open it could easily take more than 30 seconds to restore all
of the state by hand.
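As a rough check on the figures quoted above, here is a back-of-envelope
sketch. The function name is made up for illustration, and it assumes a
purely sequential transfer with no compression, ignoring BIOS and loader
overhead:

```python
def transfer_seconds(ram_mb, disk_mb_per_s):
    """Time to stream ram_mb megabytes at disk_mb_per_s, in seconds.

    Assumes one sequential pass with no compression and no seek overhead.
    """
    if disk_mb_per_s <= 0:
        raise ValueError("disk throughput must be positive")
    return ram_mb / disk_mb_per_s


if __name__ == "__main__":
    # The quoted example: 512MB of RAM on a ~30 MB/sec laptop disk.
    print(f"{transfer_seconds(512, 30):.1f} s each way")
```

That comes out at roughly 17 seconds each way, in line with the quoted
estimate, and it scales linearly with the amount of memory saved.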
Also, have you ever looked at the live migration stuff Xen uses? The
aim here is to move a running operating system from one machine to
another with minimal downtime. Essentially, you just start copying
pages across willy nilly, keeping track of pages which get dirtied.
After every page has been copied, you go back over the list of dirty
pages, and just migrate them, and so on, until you stop making any
progress. At that point, you stop the guest operating system and copy
everything that's left in one big go, and start it going on the new
machine.
If you just send pages to disk rather than to another machine on the
network, then you should be able to suspend-to-disk an entire
operating system with minimal user-perceived downtime. One
possibility here would be to e.g. live suspend the machine every five
minutes or so, and guarantee the user never loses more than five
minutes of work.
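The copy loop described above can be sketched as a toy simulation. This
is not Xen's code, only an illustration under loud assumptions: the
guest is assumed to write to a fixed "hot" set of pages, the number of
writes in a round is assumed proportional to the number of pages sent
in that round, and every name and parameter here is hypothetical.

```python
import random


def precopy(num_pages, hot_pages, writes_per_page_sent, max_rounds=30, seed=0):
    """Simulate iterative pre-copy of a running guest.

    Returns (history, final_pages): history lists how many pages were
    sent in each live round, and final_pages is how many pages remain
    to be copied once the guest has been stopped.
    """
    rng = random.Random(seed)
    to_send = set(range(num_pages))  # first round: every page
    history = []
    for _ in range(max_rounds):
        sent = len(to_send)
        history.append(sent)
        # Assumption: the guest keeps running during the round, and the
        # longer the round takes, the more writes it manages to do.
        writes = int(sent * writes_per_page_sent)
        dirtied = {rng.randrange(hot_pages) for _ in range(writes)}
        progress = len(dirtied) < sent
        to_send = dirtied  # next round only re-sends what got dirtied
        if not progress or not to_send:
            break  # no longer shrinking (or nothing left): stop the guest
    return history, len(to_send)


if __name__ == "__main__":
    # 512MB of 4K pages, with a hypothetical 2048-page hot set.
    history, final_pages = precopy(131072, 2048, 0.1)
    print("pages sent per live round:", history)
    print("pages copied with the guest stopped:", final_pages)
```

The point of the sketch is that the downtime depends on the size of the
final dirty set rather than on total memory. Whether the target is
another machine or a local disk only changes where the pages are
written.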
> I think it would probably be more realistic to pursue a process
> save/restore rather than a kernel save/restore. The overhead is going
> to be the disk I/O anyway and that seems to be about the same either
> way (maybe less for a process restore), plus you can at least demand-load
> the process restore.
The problem with a process checkpoint is that it's then rather
difficult to get all of the inter-process stuff right. If you
checkpoint an entire OS, that comes for free.
-- 'Double-entry bookkeeping ....simple to adapt to modern computer methods by using positive or negative electric charges to signal whether an account should be debited or credited.' -- Accounting Theory and Practice, Glautier M.W.E