[20080823]
Trying out journaling
After NetBSD got journaling integrated into FFS recently, I've
built and installed -current, and had a look. In short: it works
just as expected. In other words: Yai! :-) :-) :-)
The wapbl(4) manpage gives more details.
To enable journaling, the system needs to run a kernel built with
"options WAPBL", which has been available in NetBSD-current since the end
of July 2008. A userland from a similar date is useful, as the mount(8)
command needs to know about the new "log" option. With the proper system
in place, it's pretty much a no-brainer:
- In /etc/fstab, enable logging for the file system(s) you need;
in my case it's just /:
/dev/wd0a / ffs rw,log 1 1
This is actually the only thing that needs to be done.
All the rest written here just explains things in a bit more
detail.
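To double-check which /etc/fstab entries actually request journaling, a small helper along these lines can be used. This is only a sketch: the function name log_enabled_fstab is made up for this example, and it assumes the usual whitespace-separated fstab layout with the mount options in the fourth field.

```shell
# Print the mount point of every fstab entry that carries the "log" option.
# Usage: log_enabled_fstab [file]    (defaults to /etc/fstab)
# NOTE: log_enabled_fstab is a hypothetical helper, not a NetBSD tool.
log_enabled_fstab() {
    awk '$1 !~ /^#/ && NF >= 4 {
        # Field 4 holds the comma-separated mount options, e.g. "rw,log".
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "log") { print $2; break }
    }' "${1:-/etc/fstab}"
}
```

With the fstab line shown above, calling log_enabled_fstab prints "/".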
- Note that journaling is not active on the file system(s)
at this point, so pressing the reset button for testing will
result in a file system check (fsck) - don't do it right now. :)
- Reboot the system. Nothing special will show up in the boot messages:
...
audio2 at pad0: half duplex
boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
Fri Aug 22 20:45:55 CEST 2008
swapctl: adding /dev/wd0b as swap device at priority 0
Starting file system checks:
/dev/rwd0a: file system is clean; not checking
Setting tty flags.
...
- Let's recall what happens here: after probing the hardware and
initializing device drivers (audio, ...), the kernel looks at disk
drives for a file system with a root partition (i.e. a disk with BSD
disklabel, "a" partition, and a known file system in it). It will use
the first root file system it finds, and mount it read-only.
As the above output is from a multi-user boot (not a single-user
boot), the kernel continues to run init(8), which in turn runs /etc/rc
(which then runs all of /etc/rc.d/* etc.).
The first steps of the boot process can be listed with the
rcorder(8) tool, just like /etc/rc does:
$ cd /etc/rc.d/
$ rcorder * | head
wdogctl
raidframe
cgd
ccd
swap1
fsck
root
...
Of the above scripts, raidframe, cgd and ccd configure additional
disk devices, wdogctl and swap1 are of minor interest here.
The two interesting scripts are "fsck" and "root":
"fsck" runs fsck(8), which in turn goes through the list of known
file systems in /etc/fstab, and checks for each file system if it
was unmounted cleanly last time. If not, the file system will be
checked, possibly repaired, and marked as clean. This is the much-hated,
time consuming process preventing a fast reboot when the system
crashed.
After ensuring all file systems are in a consistent state, the
"root" script mounts the root (/) file system read-write.
Following that, all other scripts run, create temporary files,
configure network devices, enable logins and whatnot. The important
part here is the order: the kernel first mounts the root file system
read-only, and writing is only enabled after the check.
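The ordering described above (fsck before root) can be verified from the rcorder(8) output with a small filter. This is a sketch under assumptions: the function name runs_before is invented for this example, and it expects one script name per line on stdin, exactly as in the listing above.

```shell
# Succeed (exit 0) if script $1 is listed before script $2 on stdin.
# NOTE: runs_before is a hypothetical helper for illustration only.
runs_before() {
    awk -v a="$1" -v b="$2" '
        $0 == a && !pa { pa = NR }   # remember first line naming script a
        $0 == b && !pb { pb = NR }   # remember first line naming script b
        END { exit !(pa && pb && pa < pb) }'
}

# Possible usage on a live system:
#   cd /etc/rc.d && rcorder * | runs_before fsck root && echo "fsck runs first"
```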
- As we have marked the root file system for journaling,
the log (journal) is created when
mounting the file system read-write.
For NetBSD, the log has only meta-data,
i.e. information on what changes were made to the file system's
management data structures like directories, link counts, etc.
No data blocks are journaled. This may not be 100% optimal from a user's
point of view, but it ensures that the file system is in a consistent state
with respect to meta-data.
- When the file system is mounted with journaling enabled, bad things
are welcome (well, sort of :-) to happen, and the system will handle
them gracefully: kernel panics, power failures, someone pressing the
reset button - everything that disrupts system operation and gets the
file system into an inconsistent state will be caught by replaying the
journal on the next boot.
Note that journaling will not help with user/admin errors,
like when you accidentally remove a file!
- After the system went down in flames -- for research purposes and
better predictability, let's
assume we've pressed the reset button -- with the file system
in an unclean state, this will be displayed on the next boot:
...
audio2 at pad0: half duplex
boot device: wd0
root on wd0a dumps on wd0b
/: replaying log to memory
root file system type: ffs
Fri Aug 22 20:49:55 CEST 2008
swapctl: adding /dev/wd0b as swap device at priority 0
Starting file system checks:
/dev/rwd0a: file system is journaled; not checking
/: replaying log to disk
Setting tty flags.
...
- After finding the root file system, the kernel first recognizes
the journal, and assumes that the system crashed. The system doesn't
know what's up with the disk so far, so it won't go and alter the disk
by writing the changes from the log onto the disk. Instead, those
changes are replayed to memory only. This leaves
the disk as-is, but the in-memory view of the file system will
be consistent.
fsck then recognizes the file system as journaled, and
won't touch it, assuming that the log caught all problems. Mounting
the file system in the next step finally replays the changes
in the journal onto the disk, and finally sets it into a consistent
state permanently. After that, the regular boot process can proceed
as usual.
Please note that the messages "/: replaying log to memory/disk"
are printed by the kernel, as it's the kernel that runs all the
file system code.
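To find out after the fact whether a boot had to replay a journal, the kernel messages can be searched for the replay line. The following is only a sketch: the function name replayed_filesystems is made up, and the message text is taken from the boot output shown above, so it may differ on other versions.

```shell
# Print each file system for which the kernel reported a log replay to disk.
# Reads boot/kernel messages on stdin.
# NOTE: replayed_filesystems is a hypothetical helper; the matched message
# format is assumed from the boot output quoted in this article.
replayed_filesystems() {
    sed -n 's/^\(.*\): replaying log to disk$/\1/p'
}

# Possible usage on a live system:
#   dmesg | replayed_filesystems
```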
- When the system is up and running, the mount(8) command can be
used to determine if logging is enabled or not:
# mount
/dev/wd0a on / type ffs (log, local)
The "log" here in the mount options indicates that
journaling is enabled.
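For scripts that want to check this automatically, the mount(8) output can be filtered for the "log" option. A minimal sketch, assuming the output format shown above and mount points without spaces; the function name journaled_mounts is invented for this example:

```shell
# Print the mount point of every mounted file system that lists "log"
# among its options. Reads mount(8) output on stdin.
# NOTE: journaled_mounts is a hypothetical helper for illustration only.
journaled_mounts() {
    awk '{
        opts = $0
        sub(/^.*\(/, "", opts)       # drop everything up to the "("
        sub(/\).*$/, "", opts)       # drop the ")" and anything after it
        n = split(opts, o, /, */)    # options are separated by ", "
        for (i = 1; i <= n; i++)
            if (o[i] == "log") { print $3; break }
    }'
}

# Possible usage on a live system:
#   mount | journaled_mounts
```

Fed the mount line shown above, this prints "/".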
First impressions of journaling are pretty good. The fact that
the journal needs no further maintenance is nice, and that it's placed
inside the file system by default and doesn't need extra space is very
nice, too. People who want to keep the log after a partition for
some reason can do so, and can also specify a maximum journal size.
The end-user impact of this is that lengthy file system checks
are (hopefully :-) a thing of the past now!
[Tags: ffs, wapbl]