This document was published on O'Reilly's ONLamp site.

TTYs and X Windows - Unix User Interaction now and then
- An Introduction of Concepts for Novice Users -
Hubert Feyrer, January 2001

Abstract: This text introduces the setup of ancient Unix systems, with special emphasis on the terminal subsystem. Based on that, application support for multiple terminal types is discussed before we switch over to the X Window System. We show its basic concepts, and why it still does not render the traditional Unix concepts outdated.

Introduction

Starting with a description of the hardware setup the Unix operating system was designed for, the article moves on to the support that the system offers applications for talking to widely different hardware, and to how information about that hardware can be determined and set. Then a jump is made to the modern world, explaining how the X Window System works in comparison to the terminal-based I/O of traditional Unix systems and introducing the basic concepts of X. After that, terminal emulators, a much-used class of applications, are examined to show that the ancient Unix terminal system is still in use even within the world of modern desktop systems.

Examples in this text use the NetBSD multiplatform operating system, but the concepts are common to all Unix systems, ranging from the PDS/Cadmus on which the author first came into contact with Unix to modern Unix versions like NetBSD, Solaris and Linux.

Ancient Unix Systems

* Early support of input devices

[Image #1]

* How a terminal works

* Command sets

[Image #2]

* Handling many different terminals in applications

[Image #3]

Today there are two libraries available for doing these mappings of terminal attributes. One is the terminfo library found on many System V and Linux systems; the other is the BSD-originated termcap library, which stores all the translation information in a single file, usually /usr/share/misc/termcap. Be sure to have a look!
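To picture what such a mapping looks like, here is a minimal sketch in Python. The table is written by hand for illustration and is not read from the real termcap or terminfo database; the names CAPABILITIES and clear_sequence are made up for this example. The two entries hold the "clear screen" codes of a DEC VT100 and a Lear Siegler ADM-3A, which shows how the same operation needs different bytes on different terminals.

```python
# Hand-written excerpt of a capability table, for illustration only.
# Real applications query termcap or terminfo instead of hardcoding this.
CAPABILITIES = {
    "vt100": {"clear": b"\x1b[H\x1b[2J"},  # ESC [ H (home cursor), ESC [ 2 J (erase display)
    "adm3a": {"clear": b"\x1a"},           # a single Ctrl-Z clears the ADM-3A screen
}


def clear_sequence(term_type):
    """Return the bytes that clear the screen on the given terminal type."""
    try:
        return CAPABILITIES[term_type]["clear"]
    except KeyError:
        raise LookupError(f"unknown terminal type: {term_type}") from None
```

An application written against such a table only ever asks for "clear" and never needs to know which terminal is attached.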

* $TERM and ttys

[Image #4]

This is from a NetBSD 1.5/sparc64 system, with ttya and ttyb being the machine's two built-in serial ports, and the "console" line being used for all system input and output. The first column specifies the terminal line, and the third column, "type", gives the type of the terminal: "vt220" for everything but the system console here. Have a look at your ttys(5) manpage for all the details.
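The column layout can be made concrete with a small parser. This is only a sketch: the sample text below is a made-up excerpt in the style of /etc/ttys (it is not the file shown in the image), and it assumes the simple "name, quoted getty command, type, flags" layout described in ttys(5). The names SAMPLE_TTYS and parse_ttys are invented for this example.

```python
import shlex

# Hypothetical excerpt in the style of /etc/ttys; values are for illustration.
SAMPLE_TTYS = '''\
# name  getty command                  type   status
console "/usr/libexec/getty std.9600"  sun    on secure
ttya    "/usr/libexec/getty std.9600"  vt220  on secure
ttyb    "/usr/libexec/getty std.9600"  vt220  off secure
'''


def parse_ttys(text):
    """Map each terminal line name to its getty command, type and flags."""
    entries = {}
    for raw in text.splitlines():
        # shlex keeps the quoted getty command together and drops '#' comments.
        fields = shlex.split(raw, comments=True)
        if len(fields) < 3:
            continue  # blank, comment-only or incomplete line
        name, getty, term_type, *flags = fields
        entries[name] = {"getty": getty, "type": term_type, "flags": flags}
    return entries
```

With the sample above, parse_ttys(SAMPLE_TTYS)["ttya"]["type"] yields "vt220", which is the value that ends up describing the terminal to applications.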

The terminal type from column 3 is passed through the login process to the user's login shell, and from there to any applications that the user starts. The applications then use termcap to look up the specific command sequences they need. When someone logs in from a terminal hooked up to ttya, it is assumed that a VT220 (or compatible) terminal is used.
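As a rough sketch of such a lookup, the following Python function asks the system's terminal database for one capability. Note the assumptions: Python's standard curses module wraps the terminfo interface rather than termcap, so this shows the terminfo flavour of the same idea; the function name lookup_capability is made up; and whether an entry such as "vt100" is installed depends on the system.

```python
import curses
import os


def lookup_capability(capability, term_type=None):
    """Return the byte sequence for a terminfo capability, or None.

    term_type defaults to the TERM environment variable. As far as I know,
    CPython's curses module only acts on the first successful setupterm()
    call in a process, so one process should query one terminal type.
    """
    term_type = term_type or os.environ.get("TERM", "dumb")
    try:
        curses.setupterm(term=term_type, fd=1)
    except curses.error:
        return None  # no database entry for this terminal type
    return curses.tigetstr(capability)
```

Calling lookup_capability("clear") returns whatever the database lists for the current TERM, and None if the terminal type is unknown or has no such capability.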

The information on the terminal type is kept in the TERM environment variable, which can be set to a different value if the user needs to do so for some reason. In addition, the terminal line one is logged in on can be determined with the tty(1) command:

[Image #5]

In the above example, I'm logged in on the ttya serial port. I do not have a (hardware) terminal, but use another computer that's connected to ttya via a serial line, and that has a terminal emulation program running. The terminal program talks to the Sun over the serial interface, displays the characters it sends, and interprets any special command sequences. As I know from my terminal program's handbook, it does not emulate a vt220 properly but works fine for vt100 commands. The TERM variable is set here to reflect this in order to make applications behave properly.
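The same two pieces of information, terminal type and terminal line, are also available from within a program. The following is a minimal Python sketch; describe_terminal is a made-up name, and the "not a tty" fallback simply mirrors what tty(1) prints when its input is not a terminal.

```python
import os


def describe_terminal(fd=0):
    """Return (terminal type, terminal line) for the given file descriptor."""
    term_type = os.environ.get("TERM", "unknown")
    try:
        line = os.ttyname(fd)   # a device path below /dev
    except OSError:
        line = "not a tty"      # fd is a pipe, a file, or closed
    return term_type, line
```

File descriptor 0 is standard input, which is also what tty(1) looks at.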

Back to modern life

* Console Drivers & Virtual Consoles

Most popular PC operating systems further offer a mechanism called "virtual consoles", which provides more than one console and lets the user sitting in front of the machine switch between them, usually by pressing Control+Alt and a function key (F1, ...). The system handles these as separate lines, and each of them must be configured in /etc/ttys.

The NetBSD operating system, which uses wscons as the console driver on many ports, offers a way to configure which terminal type it should emulate, with "vt100" being used on most ports, and the sun3, sparc and sparc64 ports using "sun".

* Graphics

Let's go back in time a bit again for this. When terminal vendors added new capabilities to their hardware, one of the new features was to not only display 80x24 textual information, but also bitmap data of some sort. Of course the command sequences for this were as non-standard as they could be. There was no application-level standard like terminfo or termcap that applications could use to make device independent use of these graphic terminals' capabilities.

Time passed. Machines were built in which the main unit and the gfx hardware were connected by something faster than a serial line, and at the same time it became common to network several main (computer) units into local area networks for data exchange and communication.

A group of researchers at MIT was exploring modern computer capabilities, and came up with a graphical user interface built from parts similar to those of the classic setup of computers and terminals. It included an application process that issued commands like "clear screen" or "open window". The commands were transported to a "presentation server" that decoded them and did whatever steps were necessary to e.g. clear a bitmap or draw a rectangular area with a border. The "presentation server" usually got its commands from a local or remote computer system, and it knew how to talk to the gfx board (for output) and to keyboard and mouse (for input). Events generated by the user operating mouse and keyboard were sent back to the application process for interpretation and action.

[Image #6]
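The division of labour described above can be sketched in a few lines of Python. This is a toy model only: it runs in a single process, the command names and the class PresentationServer are invented for this example, and it has nothing to do with the real wire protocol. It merely shows commands flowing one way and input events flowing back.

```python
from collections import deque


class PresentationServer:
    """Toy stand-in for the server side: owns the 'display' and the input devices."""

    def __init__(self):
        self.windows = {}      # window id -> list of things drawn in it
        self.events = deque()  # input events waiting for the application

    def handle(self, command, *args):
        """Decode one command from the application and act on it."""
        if command == "open_window":
            (window_id,) = args
            self.windows[window_id] = []
        elif command == "draw_text":
            window_id, text = args
            self.windows[window_id].append(text)
        elif command == "clear":
            (window_id,) = args
            self.windows[window_id].clear()
        else:
            raise ValueError(f"unknown command: {command}")

    def key_pressed(self, window_id, key):
        """Called by the 'hardware' side: queue an event for the application."""
        self.events.append(("key", window_id, key))

    def next_event(self):
        """Hand the oldest pending event to the application, or None."""
        return self.events.popleft() if self.events else None
```

An application would call handle("open_window", 1) and handle("draw_text", 1, "hello"), and then poll next_event() to learn what the user typed.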

* Calling Names

The window system described above is the X Window System. The application process is usually known as the "X client", and the "presentation server" handling device access is the "X server". Communication between X client and server is done using the X Window System Protocol, which is based on the TCP/IP stack today.

Using Unix' capabilities to run multiple processes, one machine running X clients can issue display requests to one or more X servers running either on the local machine using the local gfx board, keyboard and mouse, or using an X server that's running on a remote machine, and that thus uses a remote machine's gfx card, mouse and keyboard.

And just as vendors started to sell VT220 compatible ASCII terminals when that was mostly a standard, there were things called "X terminals". They were equivalent to their text-only cousins, which only know how to handle input and output, without any capability to run actual application code themselves. Usually, X terminals consisted of a monitor, keyboard and mouse, and a small box that ran (only) the X server process. The X terminal's X server knew how to talk to keyboard, mouse and graphics card, and passed all input and output to an X client connecting to the X server.

An alternative to X terminals are today's Unix workstations, which have computer, gfx and I/O hardware all built into e.g. a PC or a traditional workstation machine. The workstation then runs its local instance of Unix, including the X server process to talk to the hardware and a number of X client applications, all in one case.

* An X Terminal-Emulation Application - xterm

But... can we really? Let's see how communication with the Unix system works today.

Most (real :-) interaction with the operating system today still happens via a command line interface, which is used to type commands, start processes, and view their output, either via X or on the command line environment that the command was started from. That "command line environment" is usually a shell running either on the system's console, one of its virtual consoles as described above, or a terminal window on the X desktop.

If the system console or virtual consoles are used, it's quite obvious that the traditional terminal handling is still needed, as it is the base for input and output done by all commands. This applies to line-oriented programs like ls(1) or cat(1) as well as to so-called "screen oriented" programs like the "vi"sual editor vi(1), the top(1) process monitor, etc., which still use system libraries like termcap or terminfo to address the console's capabilities.

But we just agreed that no one needs consoles any more, and that we have the X Window System to interact with the system. There, we open up a terminal window, using programs like xterm, or one of its newer cousins like the KDE "konsole" or GNOME's "gnome-terminal". The programs' names make it quite obvious what they do, and if we take a closer look at how things work, it will be even more obvious where they got their names from.

In the following example, we will use "xterm", but the others work in much the same way. When you are using the X Window System and start "xterm", a rectangular window opens, with a "shell" command line interpreter running in it. You can type commands into the shell window, and output from the shell commands will be displayed in the xterm window.

In more detail, the X server takes the keyboard events and sends them to the xterm process. The xterm process has a shell process connected to it. The ASCII representation of the keys pressed is sent from the xterm process to the shell process, just as happens with the system console, virtual consoles and ancient, dumb serial hardware terminals. The shell process reads its input via its standard input, processes the commands given, and sends output back to the xterm via the standard output channel.

What happens if you run a program that needs to e.g. clear the screen or move the cursor to a certain position? Exactly the same as in the cases described above: the application knows which terminal type it is connected to, uses the termcap or terminfo library to query that terminal's capabilities, and then sends the necessary escape sequences to the xterm. The xterm process in turn recognizes the command sequences. It does not simply echo the characters of a command sequence by asking the X server to display them, but does what the command sequence says - a "clear screen" sequence is sent to the X server as a command to draw a plain white rectangle, and another small black rectangle in the upper left corner, representing the cursor.

[Image #7]
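What a terminal emulator does with the bytes it receives can be sketched as follows. This is a toy, not xterm's actual code: the class ToyTerminal and its methods are invented for this example, it keeps a character grid in memory instead of talking to an X server, and it understands only two VT100-style sequences ("erase display" and "position cursor"). Everything else is treated as text or ignored.

```python
import re

# Matches ESC [ parameters final-letter; only a tiny subset is acted upon.
CSI = re.compile(rb"\x1b\[([0-9;]*)([A-Za-z])")


class ToyTerminal:
    def __init__(self, rows=24, cols=80):
        self.rows, self.cols = rows, cols
        self.row = self.col = 0
        self.clear()

    def clear(self):
        self.screen = [[" "] * self.cols for _ in range(self.rows)]

    def feed(self, data):
        """Consume bytes from the application, as an emulator reads them from its shell."""
        pos = 0
        while pos < len(data):
            match = CSI.match(data, pos)
            if match:
                self._command(match.group(1), match.group(2))
                pos = match.end()
            else:
                self._put(chr(data[pos]))
                pos += 1

    def _command(self, params, final):
        if final == b"J" and params == b"2":   # erase the entire display
            self.clear()
        elif final == b"H":                    # cursor position, 1-based, default 1;1
            parts = [int(p) if p else 1 for p in params.split(b";")] if params else []
            parts += [1] * (2 - len(parts))
            self.row = min(max(parts[0], 1), self.rows) - 1
            self.col = min(max(parts[1], 1), self.cols) - 1
        # any other sequence is silently ignored in this sketch

    def _put(self, char):
        if char == "\r":
            self.col = 0
        elif char == "\n":
            self.row = min(self.row + 1, self.rows - 1)
        else:
            self.screen[self.row][self.col] = char
            self.col = min(self.col + 1, self.cols - 1)

    def line(self, row):
        """Return one screen row as text, without trailing blanks."""
        return "".join(self.screen[row]).rstrip()
```

Feeding it b"\x1b[2J\x1b[H" followed by some text wipes the grid and writes the text starting in the upper left corner, which is the in-memory equivalent of the white rectangle and cursor described above.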

Two details are worth another look here. The first is the communication between the xterm and the shell process. This is done via Unix' ordinary terminal input/output routines. No special mechanism was invented in the X Window System for user interaction between the X server and an application running in a command line environment; the xterm program "translates" between the two environments. This means that the terminal handling that the Unix operating system has had for more than 30 years now is still useful even in modern environments.
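The connection between a terminal emulator and its shell is made with a pseudo terminal. The following Python sketch plays the emulator's part in the simplest possible way: it starts a command on a pseudo terminal and collects what the command writes. The function name run_on_pty is made up, the sketch assumes a Unix system with pseudo terminal support, and it does no input forwarding or escape sequence handling at all.

```python
import os
import pty
import subprocess
import sys


def run_on_pty(argv):
    """Run a command with a pseudo terminal as stdin/stdout/stderr and
    return everything it wrote, as seen from the emulator's side."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd)
    os.close(slave_fd)       # the child keeps its own copy
    output = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:      # Linux reports the closed far end as an error
            break
        if not data:         # other systems report it as an empty read
            break
        output.append(data)
    proc.wait()
    os.close(master_fd)
    return b"".join(output)
```

Running run_on_pty([sys.executable, "-c", "import os; print(os.isatty(1))"]) shows that the child believes it is talking to a terminal, although only a pair of file descriptors exists.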

The other detail that is interesting here is the interaction between the xterm process and the X server. The xterm process is just like any other graphical application: it sends drawing requests to the X server - a big white filled rectangle, a small black filled rectangle, commands to move parts of the on-screen image around (i.e. to scroll) - and of course it uses the X font mechanism to select a certain font and display characters from it on the X "presentation" server. Although the xterm terminal program does not provide special command sequences to set e.g. the font of the terminal, xterm allows changing the font it asks the X server to use, and a number of other things. To try this, hold down the control key and press one of the three mouse buttons at the same time - you'll see that the popup menu of ctrl+right mouse allows selecting the font that the xterm process asks the X server for.

The Circle Closes

It is the simplicity and usefulness of the basic Unix ideas that allowed the system to grow and gain maturity over all these years.

I'd like to close this rather lengthy overview by mentioning that I left some things like networking out deliberately - the (pseudo) terminals that are used in communication between xterm processes and their shells are also involved in network communication for things like rsh and ssh, and to pass data between different machines as if the data source and destination were processes on the same machine. Also, the description of the X Window System left out *many* of the concepts of the window system. Window managers and session management are only two of many things that come to mind, but going into detail on these is something left for another time. For now, I hope you had as much fun reading this as I had writing. Enjoy!