This text was published in the March 2002 issue of DaemonNews magazine.


Open Source Hackers' Guide Through The Galaxy
A Tour through the NetBSD Source Tree
Part I - Userland
Hubert Feyrer, January 2002

NetBSD is one of the major Open Source operating systems on this planet. As such, the full source code is available via various methods like FTP, SUP, rsync and anonymous CVS, and of course from various vendors selling it on CD, usually accompanying the NetBSD operating system itself. If you unpack the NetBSD source, it extracts to several hundred megabytes of source code located in /usr/src. This includes the full sources for the userland including compilers, the X Window System and build instructions for 3rd party software from the NetBSD Packages Collection as well as - of course - the NetBSD kernel itself.

This first part of the article series gives an overview of the userland parts of the NetBSD source tree. The second part will give an overview of the libraries available to application programmers, and the third part will give in-depth information on the kernel.

All the files and directories discussed here are located under /usr/src, so we will not repeat that prefix every time. If we refer to the "games" dir, for example, you can find it in "/usr/src/games".

Now let's see what there is in /usr/src!

Makefile:
About the only file that you will find in /usr/src is a Makefile. This Makefile contains instructions on how to build the source tree and install a working system from it. Two interesting targets are "build" and "release". The former compiles all sources and libraries and installs them on the system, while "make release" also creates distribution archives like those found in a full NetBSD release.

Some variables influence the build. The one most worth noting here is DESTDIR, which allows building and installing the new system in a different directory. This is useful for testing, e.g. inside a chroot environment, to see whether the build works in general, or for later installation by copying files to the live system. The other interesting variable is RELEASEDIR, which tells "make release" where to put the install sets.

There are a few other targets and variables; they are all documented at the top of the Makefile.
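As a rough sketch of how the two targets and the two variables fit together, the following script only prints the make(1) invocations instead of running them, so it can be tried on any system. The directory values are hypothetical, and passing the variables on the command line (rather than via the environment) is a choice made here for illustration only.

```shell
#!/bin/sh
# Dry-run sketch: print the make(1) invocations instead of executing them.
# All three directory values are hypothetical examples, not defaults.
SRCDIR=/usr/src
DESTDIR=/usr/tmp/destdir        # assumed scratch directory for the new system
RELEASEDIR=/usr/tmp/release     # assumed location for the install sets

# Compile everything and install it below $DESTDIR instead of the live system.
build_cmd="cd $SRCDIR && make build DESTDIR=$DESTDIR"

# Additionally create the distribution archives and put them in $RELEASEDIR.
release_cmd="cd $SRCDIR && make release DESTDIR=$DESTDIR RELEASEDIR=$RELEASEDIR"

echo "$build_cmd"
echo "$release_cmd"
```

To actually run a build, the printed lines would be executed in a shell with sufficient privileges; check the comments at the top of the Makefile first for the variables your source tree expects.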

include:
This directory contains the system's general interface definitions and APIs in the form of C header files, which will be installed into /usr/include when building the system.

Note that most headers belonging to various libraries etc. are not located here, but in the same directory as the library. If you want to install all header files that are available in the source tree, that can be done by running "make includes" in the /usr/src directory.

bin:
This directory has all the sources for the system's /bin directory, e.g. cp, cat, sh, rm etc. Each utility has its own separate subdirectory in src/bin, and each of these subdirectories has a Makefile that is responsible for building the utility.

As usual for binaries in /bin, the Makefile for each utility makes sure the binary is linked statically for maximum availability. This is configured for all programs in bin/Makefile.inc, which is read by all the programs' Makefiles (via the BSD Makefile maze).

sbin:
Similar to the src/bin directory, this directory contains commands targeted at system administration, like disklabel, dmesg, dump, mount, ifconfig and many others.

Again, there's one subdirectory for each program, and the Makefiles contained in there make sure the programs are linked statically, for disaster cases where one doesn't want to rely on shared libraries, shared lib loaders, etc. As in bin, there's a Makefile.inc file that determines some settings that apply to all programs in here, e.g. the destination directory (/sbin) or the fact that the programs should be linked statically.

usr.bin, usr.sbin:
Similar to src/bin and src/sbin, these two directories contain sources for programs that end up in the /usr/bin and /usr/sbin directories. As above, there's one directory per utility, and general settings can be found in the Makefile.inc files.

Some more complex programs in these directories that come with several accompanying tools, e.g. ssh, lint or vi, have one directory and further subdirectories for the separate tools, e.g.:

usr.bin/ssh
usr.bin/ssh/libssh
usr.bin/ssh/scp
usr.bin/ssh/sftp
usr.bin/ssh/sftp-server
usr.bin/ssh/ssh
usr.bin/ssh/ssh-add
usr.bin/ssh/ssh-agent
usr.bin/ssh/ssh-keygen
usr.bin/ssh/ssh-keyscan
usr.bin/ssh/sshd

The Makefile in the usr.bin/ssh directory will descend into the various subdirectories and build and install each tool.

libexec:
Organized in a manner much like the *bin directories mentioned before, this directory contains sources for programs that are not intended to be called directly by users, but that are usually called from other programs, e.g. network daemons started from inetd(8), the LFS file system cleaner, several mail-related utilities or the shared library loader, ld.so.

As NetBSD supports two execution formats, a.out and ELF, there is a separate directory for each shared library loader: libexec/ld.aout_so for a.out and libexec/ld.elf_so for ELF.

dist:
NetBSD comes with a great wealth of tools and utilities, some of which were inherited from 4.4BSD, some were added by the NetBSD Project, and others come from various 3rd parties. Often, programs from 3rd parties don't follow the NetBSD directory layout (one directory for each program) or build system (see /usr/share/mk/bsd.*.mk). There are several approaches to this problem.

The easiest way to bring a program into shape for NetBSD is to manually modify the sources provided by 3rd party vendors to fit into the NetBSD scheme, then import them into CVS. The problem is that this causes lots of trouble and grief when updating to later versions, as all NetBSD changes have to be merged in again. While CVS can help here, this is still a pain. A slightly automated approach is to use xxx2netbsd scripts (xxx is the 3rd party program in question), which take an unmodified source tree, merge in any NetBSD changes and then import the result into the NetBSD CVS repository. Programs where this is used can be found in usr.bin/file, usr.bin/less and usr.sbin/tcpdump.

The third approach, used by most major applications in NetBSD today, is to import the applications into a separate directory tree, without adjusting them to the NetBSD build scheme or operating system. In a second step, patches are committed to get the program going under NetBSD, and so-called "reachover" Makefiles are installed in the program's directories in one of the src/*bin directories. These reachover Makefiles contain the usual NetBSD-based build instructions. The important point is that they take the files from the original distribution, without reorganizing the file hierarchy.

The src/dist directory is one of the directories used in NetBSD to store unmodified sources from 3rd parties. The sources kept here follow a BSD copyright and have no other problems that would make it desirable to separate them from the rest of the sources - see the "gnu/dist" and "crypto/dist" dirs below for more information on that. Programs that are stored here include am-utils (the amd automounter), bind, dhcp, ipf and ntp. The reachover Makefiles can then be found in the usual directories under the src/*bin hierarchies.

Let's take BIND as an example. While the distribution is in src/dist/bind, the Makefiles used to build it are in src/usr.sbin/bind/*:

usr.sbin/bind/dig/Makefile
usr.sbin/bind/doc/bog/Makefile
usr.sbin/bind/doc/Makefile
usr.sbin/bind/Makefile
usr.sbin/bind/dnskeygen/Makefile
usr.sbin/bind/dnsquery/Makefile
usr.sbin/bind/host/Makefile
usr.sbin/bind/lib/Makefile
usr.sbin/bind/named/Makefile
usr.sbin/bind/named-bootconf/Makefile
usr.sbin/bind/named-xfer/Makefile
usr.sbin/bind/ndc/Makefile
usr.sbin/bind/nslookup/Makefile
usr.sbin/bind/nsupdate/Makefile
usr.sbin/bind/reload/Makefile
usr.sbin/bind/restart/Makefile

Looking e.g. at the src/usr.sbin/bind/nslookup/Makefile, here is the part that's responsible for pulling in the sources from the src/dist/bind directory:

.include "../Makefile.inc"
.PATH: ${BIND_DIST_DIR}/bin/nslookup \
  ${BIND_DIST_DIR}/man

The part from the src/usr.sbin/bind/Makefile.inc that's responsible for setting BIND_DIST_DIR as appropriate for all the BIND-related tools is:

BIND_DIST_DIR=  ${.CURDIR}/../../../dist/bind

With the nslookup Makefile residing in src/usr.sbin/bind/nslookup, appending ../../../dist/bind results in src/usr.sbin/bind/nslookup/../../../dist/bind, which is the same as src/dist/bind, and voila, there are our BIND sources! :-)
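The path arithmetic above can be checked without a NetBSD source tree at hand. The following sketch creates a scratch directory containing only the two directories named in the text, then lets the shell resolve the ".." components. The scratch tree and the variable names are assumptions made for this example; only the relative path expression is taken from Makefile.inc.

```shell
#!/bin/sh
# Hypothetical scratch tree that mirrors only the directories named in the text.
top=$(mktemp -d) || exit 1
mkdir -p "$top/src/usr.sbin/bind/nslookup" "$top/src/dist/bind"
top_real=$(cd "$top" && pwd -P)

# .CURDIR as make(1) would see it when run in the nslookup directory.
curdir="$top_real/src/usr.sbin/bind/nslookup"

# Same expression as in Makefile.inc: ${.CURDIR}/../../../dist/bind
bind_dist_dir="$curdir/../../../dist/bind"

# Let the shell resolve the ".." components, then strip the scratch prefix.
resolved=$(cd "$bind_dist_dir" && pwd -P)
relative=${resolved#"$top_real"/}

echo "resolves to: $relative"
rm -rf "$top"
```

Each ".." climbs one level: from nslookup to bind, from bind to usr.sbin, and from usr.sbin to src, which is why three of them are needed before descending into dist/bind.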

gnu:
This directory separates things from the rest of the source tree that are distributed under the GNU General Public License (GPL) or similar licenses. These licenses require people who make modifications to the code to make these modifications public too, which in turn is often not an option for companies that use NetBSD e.g. in embedded applications. The code is kept in one dir to make it easy to identify and, in the worst case, to leave out.

The directory structure here is similar to the one in the main "src" dir, consisting mostly of a distribution archive and reachover Makefiles that then access these sources:

Makefile:
This Makefile just descends into all the directories that contain NetBSD build instructions (i.e. not into "dist"), to build and install programs and documentation.

dist:
This directory contains unpacked sources of various programs that will be used via reachover Makefiles. Programs included here are "normal" userland programs like bc, diffutils, gawk, grep, texinfo, various support libraries like libiberty and the whole toolchain consisting of gcc, binutils, gdb, and the C++ libraries libio and libstdc++.

The programs are stored in their original distribution-provided directory layout, and we will not describe them further here. Please check the programs' documentation if needed.

lib:
This directory contains several subdirectories, one for each library that is built from the sources in the src/gnu/dist directory. The libraries are built using reachover Makefiles which use the NetBSD set of make(1) rules to build libraries, <bsd.lib.mk>.

libexec:
This directory contains "only" Ian Taylor's UUCP, fixed to accommodate the NetBSD build structure. Each library and program belonging to UUCP has its own separate directory under src/gnu/libexec/uucp. In contrast to many other packages from src/gnu, reachover Makefiles are not used here. Instead, the sources were reorganized to fit into the NetBSD scheme of one directory per program/library.

usr.bin:
This directory contains bc, binutils, cpio, dc, diff, egcs (the version of gcc shipped with NetBSD 1.5.x!), gas, gdb, grep, groff, gzip, rcs, sdiff, send-pr, sort, (GNU) tar and texinfo. Most programs here are converted to have distribution sources in src/gnu/dist, but some still have their own sources stored in the NetBSD layout.

usr.sbin:
There are two major directories here: postfix and sendmail. Each of them has the full sources of the corresponding Mail Transport Agent (MTA) whipped into shape for NetBSD. The various programs that come with the MTAs are built in separate subdirectories, as usual.

crypto:
Similar to code with certain copyright restrictions, there is code that is critical in other regards. For many years, the US export restrictions applied to every country outside the USA, which made it very difficult to include cryptographic code with the NetBSD distribution. This in turn led to some hassles like maintaining two crypto archives, one for domestic USA use and one for international use, but fortunately this is now a thing of the past. There are still countries that fall under export restrictions for cryptographic technology, and this needs to be addressed. To make it easy to prepare a NetBSD distribution for them, and to split out crypto code from the NetBSD source tree if there ever is a need in the future for other reasons, it was decided to keep crypto-related sources in their own subdirectory, src/crypto.

The src/crypto directory doesn't contain any infrastructure to build programs itself. It only contains sources that are used from a number of the "normal" places in the NetBSD source tree, i.e. from src/usr.bin, etc.

dist:
This directory contains the unpacked distributions for
  • KTH Heimdal Kerberos (heimdal)
  • MIT Kerberos (krb4)
  • OpenSSL (openssl)
  • OpenSSH (ssh)

games:
Back to normal sources for some entertainment! This directory contains sources for various command-line and curses-based games as well as a number of more or less useful utilities that didn't make it into other places of the source tree - most of the layout here is from historic BSD sources.

Besides a number of games, there are also several more or less useful programs stored here, including:

banner:
Used by the printing system to print ... banners. :)

dm:
This program allows restricted execution of programs based on login time, system load or login terminal.

fortune:
print a random, hopefully interesting, adage

pom:
display phase of the moon - useful for selecting software completion target dates and predicting managerial behavior!

rain, worms:
ASCII eye candy

wtf:
tries to explain acronyms, with fallback to standard unix manual pages.

distrib:
This directory contains all the procedures and data that are used when creating a release, i.e. architecture specific code to create install media, tools used by the install routines, and release documentation.

Interesting directories here are:

alpha, amiga, arm32, atari, bebos, hp300, hpcmips, i386, mac68k, macppc, mvme68k, news68k, newsmips, pc532, pmax, sparc, sparc64, sun3, vax, x68k:
The procedures in here create install media images, usually various floppy images or miniroot filesystems.

miniroot:
Ports using miniroot filesystems to install the system use the code from this directory for the common installation routines. Miniroot-based installation is now deprecated in favour of sysinst, see below.

notes:
General and port-specific installation instructions used to create the INSTALL.* files that come with each NetBSD release. Each port has specific files in its own subdirectory; common text used for all ports (introduction, ...) is available in the "common" directory. The files are in *roff format, and the various output formats (HTML, PostScript, ASCII) are derived from that.

sets:
This directory has the set lists that define what files and directories belong to each of the install sets (base, comp, text, etc.), as well as a few scripts to help create and maintain the lists. The set lists are in separate directories under sets/lists/<set>. For each set there are several types of files:

  • machine independent list ("mi")
  • CPU-architecture dependent files ("ad.{mips,m68k,powerpc,sh3}")
  • machine dependent files for each port ("md.*")
  • files depending on the availability of shared libraries and execution format ("shl.{elf,mi}")
  • files that were present in previous releases and that are no longer used in the latest release, and thus can (and will be, by sysinst) be removed ("obsolete.*")
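The naming scheme in the list above can be summarized as a small lookup. This is only an illustrative sketch: the function name classify_setlist is made up for this example and is not one of the scripts shipped in the sets directory, and the sample file names in the loop are merely plausible instances of the patterns.

```shell
#!/bin/sh
# Map a set-list file name to the category described in the list above.
# classify_setlist is a hypothetical helper, not part of the NetBSD tree.
classify_setlist() {
    case "$1" in
        mi)         echo "machine independent" ;;
        ad.*)       echo "CPU-architecture dependent" ;;
        md.*)       echo "machine dependent (per port)" ;;
        shl.*)      echo "shared library / execution format dependent" ;;
        obsolete.*) echo "obsolete, can be removed" ;;
        *)          echo "unknown" ;;
    esac
}

# Sample names chosen to match each pattern once.
for f in mi ad.m68k md.i386 shl.elf obsolete.mi; do
    printf '%-12s %s\n' "$f" "$(classify_setlist "$f")"
done
```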

utils:
There is a wealth of install tools in this directory, all optimized for little space usage. Most important is the sysinst utility, but there are also special versions of programs like "sh", "ftp" etc. that come with a reduced feature set so as not to waste space on install media.

etc:
Here are the system config files that will be installed in various places of a new system. Interesting files are:

regress:
regression tests for libraries, kernel features, etc.

share:
stuff for /usr/share; note that most manpages are not stored in here but in the same place as the utility they document

lib:
library source - see part II of this article series

sys:
kernel source - see part III of this article series

Two things that are not located under /usr/src but that are available in separate source tar-files are the sources of the X Window System used on all the NetBSD platforms as well as the NetBSD Packages Collection:

xsrc:
This directory has XFree 3 in xsrc/xc, and XFree 4 in xsrc/xfree/xc. While the former has many NetBSD-specific changes for non-i386 ports (amiga, sparc, ...), the latter needs to be used with most modern PC graphics cards. Currently, only XFree 3 is built and included with releases; XFree 4 snapshots are made manually as of this writing.

To build XFree 4 instead of XFree 3, the variable USE_XF86_4 needs to be set in /etc/mk.conf.
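A quick way to see which of the two trees a given configuration selects is to look for the variable in mk.conf. The sketch below works on a scratch file so that the real /etc/mk.conf is left alone. The value "yes" is an assumption, since the text only says that the variable needs to be set.

```shell
#!/bin/sh
# Work on a scratch file so the real /etc/mk.conf is not touched.
mkconf=$(mktemp) || exit 1

# Assumed value: the article only states that the variable must be set.
echo 'USE_XF86_4=yes' >> "$mkconf"

# Report which tree is selected, based on whether the variable is assigned.
if grep -q '^USE_XF86_4[[:space:]]*=' "$mkconf"; then
    xf86_choice="XFree 4"
else
    xf86_choice="XFree 3"
fi

echo "xsrc build would use: $xf86_choice"
rm -f "$mkconf"
```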

pkgsrc:
The NetBSD Packages Collection is organized into several directories that group programs by function. Each functional group contains the packages from that group, which can be installed using "make install". The infrastructure of the Packages System is in the *.mk files in the pkgsrc/mk directory. See pkgsrc/Packages.txt for more information.

This first part of our tour through the NetBSD source tree outlined the userland code, distribution facilities as well as random odds and ends. The next part of the tour will give an overview of all the libraries available for application programs.
(c) Copyright 20020110 Hubert Feyrer
$Id: tour-de-source-1userland.html,v 1.1 2002/01/21 00:45:28 feyrer Exp $