?he 'POPSYS''Page %'
?fo 'Steve Hardy'- % -'15th February 1977'
.ce2
HOW THE POP11 SYSTEM WORKS
.br
==========================
.sp2
.ce2
Keywords
.br
--------
.sp
Compiler, interpreter, LISP, machine code, syntax analysis,
lexical analysis, list cell, garbage collection.
.sp2
This handout points out those areas of computer science necessary for
an understanding (even slight) of how the
POP11 system works.
.sp2
We can divide an understanding of how the POP11 system works
into two parts. Firstly, how is a POP11 program translated into
a language the machine can understand, and secondly, how are
the important features of POP11 (in particular, list structures)
implemented?
.pg
An important phase in the translation process is the
formation of an intermediate representation of the program that 
is called a 'parse tree'. This parse tree is in many ways analogous
to a
LISP
program and so an understanding of
LISP 
itself is useful. There is then a second phase of translation
from 
LISP
to machine code. To completely understand the second phase of
translation, the reader must understand something of machine code
itself.
.sp2
The two principal data structures of POP11 (and
LISP)
are "the word" and "the list". We need to know how these can be
represented in the PDP11's memory; this, of course, requires
an understanding of the PDP11's memory.
.sp2
A third important aspect of understanding POP11 is the 
UNIX
system, for it is via 
UNIX
that POP11 programs communicate with the 'world'.
.sp2
.ce2
Translation Into LISP
.br
---------------------
.sp
Translation of a high level language like POP11 into 
a
LISP-like language is usually called syntax analysis. This topic
is described in most books on compiling techniques and is usually
dependent on an understanding of formal grammars (in particular
"context-free grammars").
.sp
A related topic is 'lexical analysis', that is, the conversion of
characters (e.g. 'C', 'A', 'T') into symbols (e.g. "CAT").
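.sp
The grouping of characters into symbols can be sketched in a few lines.
The sketch below is written in Python purely for illustration; it is not
the POP11 lexical analyser, and its rule (runs of letters and digits form
one symbol, any other character is a symbol on its own) is a simplifying
assumption. The name lex is hypothetical.

```python
def lex(text):
    """Group a string of characters into symbols."""
    symbols = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1                          # spaces only separate symbols
        elif ch.isalnum():
            start = i
            while i < len(text) and text[i].isalnum():
                i += 1
            symbols.append(text[start:i])   # 'C','A','T' becomes "CAT"
        else:
            symbols.append(ch)              # punctuation is a symbol by itself
            i += 1
    return symbols

print(lex("HD(LIST) + SUM(TL(LIST))"))
```

The output is a list of symbols such as "HD", "(" and "LIST", which is
what the syntax analysis phase then works on.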
.sp2
.tp 10
.ce2
LISP
.br
----
.sp
No actual compiler would produce
LISP 
as an intermediate representation of a program. However, the parse
trees produced will usually be very like
LISP
and as
LISP
is well documented 
I will use it as an analogy for a parse tree. There are many manuals
on
LISP
itself.
.tp7
.sp
Here is an example of a POP11 program:
 	: FUNCTION SUM(LIST);
 	:	IF	LIST = []
 	:	THEN	0
 	:	ELSE	HD(LIST) + SUM(TL(LIST))
 	:	CLOSE
 	: END;
.br
Here is the same program in
LISP:
 	: (DEFINE (SUM LIST)
 	:	(COND ((NULL LIST) 0)
 	:		(T (PLUS (HD LIST)
 	:			(SUM (TL LIST))))))
.br
Notice that in LISP, expressions are represented by lists.
The HD of such a list is the name of a function to be applied and the TL
its arguments.
Notice, also, that the arguments of
COND
are 'condition - consequent' pairs.
.sp
An important feature of 
LISP 
is that it can be 
"interpreted" by a 
LISP
interpreter.
Therefore, it is useful if a reader understands something
of the way
LISP
is interpreted in a
LISP
system.
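.sp
To give a flavour of what an interpreter does, here is a minimal sketch
in Python (chosen for illustration only). LISP expressions are held as
nested Python lists, and the function evaluate applies the rule that the
HD of a list names a function and the TL holds its arguments. The names
evaluate and GLOBALS, and the way a user-defined function is stored as a
pair (parameter names, body), are assumptions of this sketch and not a
description of any real LISP system.

```python
def evaluate(expr, env):
    """Evaluate a LISP-like expression held as nested Python lists."""
    if isinstance(expr, int):
        return expr                        # a number stands for itself
    if isinstance(expr, str):
        return env[expr]                   # a name stands for its value
    head, args = expr[0], expr[1:]         # HD: the function, TL: its arguments
    if head == "COND":                     # 'condition - consequent' pairs
        for condition, consequent in args:
            if evaluate(condition, env):
                return evaluate(consequent, env)
        return None
    function = env[head]
    values = [evaluate(a, env) for a in args]
    if callable(function):
        return function(*values)           # a built-in function
    params, body = function                # a user-defined function
    local = dict(env)
    local.update(zip(params, values))
    return evaluate(body, local)

GLOBALS = {
    "T": True,
    "NULL": lambda l: l == [],
    "HD": lambda l: l[0],
    "TL": lambda l: l[1:],
    "PLUS": lambda a, b: a + b,
}

# (DEFINE (SUM LIST) (COND ((NULL LIST) 0) (T (PLUS (HD LIST) (SUM (TL LIST))))))
GLOBALS["SUM"] = (["LIST"],
                  ["COND", [["NULL", "LIST"], 0],
                           ["T", ["PLUS", ["HD", "LIST"],
                                          ["SUM", ["TL", "LIST"]]]]])

print(evaluate(["SUM", "NUMBERS"], dict(GLOBALS, NUMBERS=[1, 2, 3])))
```

Notice how much work is repeated every time SUM is called: the
interpreter re-examines the same lists on each call.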
.sp2
.ce2
Code Generation
.br
---------------
.sp
One of the disadvantages of interpreting code is that one usually
requires a complicated program, written in a simple language (usually
machine code) to do the interpreting. 
Furthermore, interpreting is inevitably slower than direct
execution by the machine itself. A tremendous increase in
efficiency can be achieved by translating 
LISP
into machine code. When carried out by a compiler, this process
is called "code generation".
.sp
Here is an example of a LISP expression:
.br
 	: (PLUS	(HD LIST)
 	:	(SUM (TL LIST)))
.br
and here is its translation into a simple machine code:
 	: PUSH LIST
 	: CALL HD
 	: PUSH LIST
 	: CALL TL
 	: CALL SUM
 	: CALL PLUS
.br
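.sp
The translation just shown follows a simple rule: first generate code
for each argument, then generate a call of the function. Here is a
minimal sketch of that rule in Python (used only for illustration), with
the expression held as nested lists. The name generate is hypothetical,
and a real code generator has a great deal more to do.

```python
def generate(expr):
    """Translate a nested-list expression into simple stack instructions."""
    if not isinstance(expr, list):
        return ["PUSH " + str(expr)]       # a variable or literal is pushed
    code = []
    for argument in expr[1:]:              # first, code for each argument
        code += generate(argument)
    code.append("CALL " + expr[0])         # then call the function itself
    return code

for instruction in generate(["PLUS", ["HD", "LIST"], ["SUM", ["TL", "LIST"]]]):
    print(instruction)
```

Run on the expression above, this prints the same six instructions as
the hand translation.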
.sp2
.tp 10
.ce2
Machine Code
.br
------------
.sp
Machine code is a language, specific to each machine, which can be
understood by the hardware of the machine itself. That is, the
interpreter for machine code 
.ul
is
the machine. Perhaps surprisingly,
most modern computers have a very similar sort of language as their
machine code. Each instruction in a machine code program will usually
perform some very simple operation, for example:-
.in+5
.ti-5
(1)\ \ Putting the value of a variable on to a stack.
.br
.ti-5
(2)\ \ Putting a literal (e.g. a number) on to a stack.
.ti-5
(3)\ \ Taking the top item from the stack and putting
it into a variable.
.ti-5
(4)\ \ Calling a sub-routine.
.ti-5
(5)\ \ Returning from a sub-routine.
.ti-5
(6)\ \ A jump instruction which breaks the normal
sequential flow of control.
.ti-5
(7)\ \ A "conditional jump instruction". All programming languages
require the ability to conditionally execute code, and this is one
way of providing it. Unlike the plain jump instruction, the
conditional jump is only taken (its label is only "gone to") if some
'condition' is met (e.g. the top element of the stack
being
TRUE).
.in-5
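.sp
The seven kinds of instruction can be seen working together in the
following sketch of an imaginary stack machine, written in Python for
illustration. The instruction names (PUSH, PUSHQ, POP, CALL, RETURN,
GOTO, IFSO, ADD) are invented for this sketch and are not the PDP11's
instruction set; as a further simplification, any non-zero number counts
as TRUE.

```python
def run(program, variables):
    """Execute a list of (operation, operand) pairs on a simple stack machine."""
    stack, returns, pc = [], [], 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1                            # normal sequential flow of control
        if op == "PUSH":                   # (1) value of a variable
            stack.append(variables[arg])
        elif op == "PUSHQ":                # (2) a literal
            stack.append(arg)
        elif op == "POP":                  # (3) top of stack into a variable
            variables[arg] = stack.pop()
        elif op == "CALL":                 # (4) remember where to come back to
            returns.append(pc)
            pc = arg
        elif op == "RETURN":               # (5) go back to the caller
            pc = returns.pop()
        elif op == "GOTO":                 # (6) unconditional jump
            pc = arg
        elif op == "IFSO":                 # (7) jump only if top of stack is true
            if stack.pop():
                pc = arg
        elif op == "ADD":                  # replace top two items by their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return variables

# Adds N + (N-1) + ... + 1 into TOTAL, assuming N starts at 1 or more.
PROGRAM = [
    ("GOTO", 6),                           # 0: skip over the sub-routine
    ("PUSH", "TOTAL"), ("PUSH", "N"),      # 1-2: sub-routine: TOTAL + N
    ("ADD", None), ("POP", "TOTAL"),       # 3-4:   ... stored in TOTAL
    ("RETURN", None),                      # 5
    ("PUSHQ", 0), ("POP", "TOTAL"),        # 6-7: TOTAL starts at 0
    ("CALL", 1),                           # 8: call the sub-routine
    ("PUSH", "N"), ("PUSHQ", -1),          # 9-10: N - 1
    ("ADD", None), ("POP", "N"),           # 11-12: ... stored in N
    ("PUSH", "N"), ("IFSO", 8),            # 13-14: loop while N is non-zero
]

print(run(PROGRAM, {"N": 3}))
```

Each instruction does very little; the interesting behaviour (a loop
that calls a sub-routine) comes entirely from how they are combined.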
.sp2
.ce2
Machine Memory
.br
--------------
.sp
Most computers have a very simple memory structure: a large
number of numbered "words", each of which
can hold a number. This number can be interpreted
in any way we wish, for example as an instruction to the hardware,
as a character (e.g. A = 1, B = 2 etc.), as the address of
another word in the machine's memory ("a pointer") or, of course,
simply as a number.
.sp2
For a more detailed explanation of how we can represent POP11
objects such as words and lists in this sort of memory, see the
WAL
demo and the
BOXES
demo.
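.sp
As a taste of the idea, here is a minimal sketch in Python (for
illustration only) of a list held in a memory of numbered words. Each
list cell occupies two adjacent words: the first holds the head and the
second holds a pointer to the rest of the list. The layout, the use of
address 0 for the empty list and the names memory, cons and to_python
are all assumptions of this sketch; the actual POP11 representation is
what the demos describe.

```python
memory = [0] * 16      # a tiny memory of numbered words
free = 2               # next unused address; address 0 stands for the empty list

def cons(head, tail):
    """Build a list cell: one word for the head, one for a pointer to the tail."""
    global free
    address = free
    memory[address] = head
    memory[address + 1] = tail
    free += 2
    return address

def to_python(address):
    """Follow the tail pointers from cell to cell, collecting the heads."""
    items = []
    while address != 0:
        items.append(memory[address])
        address = memory[address + 1]
    return items

lst = cons(1, cons(2, cons(3, 0)))     # the list [1 2 3]
print(lst, memory[:8], to_python(lst))
```

Note that the same word may hold a number (a head) or an address (a
pointer); nothing in the memory itself says which. A real system must
have some way of telling the two apart.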
.sp2
.ce2
Garbage Collection
.br
------------------
.sp
The computer only has a finite memory (in our case about 60,000
words). As the POP11 program is running it is creating various
data structures (e.g. lists, words, functions, etc.).
These data structures are represented using some of the machine's
memory. A problem arises - what happens when the POP11
system has used all of the memory allocated to it by the operating
system?
Now, many data structures are only temporary: they are created,
used for a while and then no longer wanted. (For example, when a function
is re-defined, the data structure used to represent the old
definition is no longer wanted.) POP11 contains a set of
functions collectively called the "garbage collector" which attempts
to work out which of the data structures currently existing are
no longer wanted (in practice, those which the running program can
no longer reach). These unwanted data structures are then made
available for re-use.
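.sp
One common way of doing this is called "mark and sweep": mark every
data structure that can still be reached from the program's variables,
then sweep up everything left unmarked. The sketch below, in Python for
illustration, does this for a small pool of list cells. It is not a
description of the POP11 garbage collector, whose method is not given
here; the names heads, tails, free_cells, cons and collect are
hypothetical, and heads are assumed to hold only numbers, so only tail
pointers need following.

```python
NIL = -1
heads = [None] * 6                 # head field of each list cell
tails = [NIL] * 6                  # tail field: index of the next cell, or NIL
free_cells = list(range(6))        # cells available for use

def cons(head, tail):
    """Take a cell from the free pool and fill it in."""
    cell = free_cells.pop()
    heads[cell], tails[cell] = head, tail
    return cell

def collect(roots):
    """Mark every cell reachable from the roots, then free all the others."""
    marked = set()
    for cell in roots:
        while cell != NIL and cell not in marked:
            marked.add(cell)
            cell = tails[cell]
    free_cells.clear()
    free_cells.extend(c for c in range(len(heads)) if c not in marked)
    return len(free_cells)

old = cons(1, cons(2, NIL))                # a list that is later abandoned
new = cons(3, cons(4, cons(5, NIL)))       # a list still held in a variable
print(collect([new]))                      # number of cells now free
```

Here only the list held in new is treated as reachable, so the two
cells of the abandoned list go back into the free pool along with the
one cell that was never used.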
.sp2
.tp 10
.ce2
The Operating System
.br
--------------------
.sp
There is a set of programs collectively called 'the operating
system' which exists to help programs make the best possible use
of the computer. It has two important roles:-
.in+5
(1)\ \ The computer would be under-used if only one person
could use it at a time. When that one person was sat thinking,
the machine would be idle. To prevent this happening, the 
operating system allows many people to use the machine on a
"time sharing" basis. First one user has her program
running and then, whilst she is thinking, another user's
program can run. If more than one program is runnable at any
given moment, the operating system gives them equal shares
of the computer's time. The operating system also protects users
from one another (and indeed, protects itself from them), so that
an error made by one user does not affect the others. For this reason,
the operating system clearly defines what sort of access each user
can have to the machine (for example, how much of the computer's
memory they are allowed to use).
.br
(2)\ \ Almost all programs want to communicate with the world in some
way, for example by reading input from the teletype or sending output
to the teletype or accessing the discs. These tasks are very fiddly
and, moreover, they are essentially the same no matter what program
the user wants to run. For this reason, instructions on how to
work the individual devices are made part of the operating system
(thus preventing unnecessary duplication of code).
.in-5
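.sp
The "equal shares" idea in (1) can be sketched as a queue of runnable
programs, each given one short slice of time in turn. The sketch is in
Python for illustration; the function time_share, the user names and the
units of work are invented, and a real operating system must also deal
with programs that are waiting (for example, while the user is
thinking), which this sketch ignores.

```python
from collections import deque

def time_share(jobs, slice_length):
    """Run jobs in turn, giving each at most slice_length units per turn."""
    queue = deque(jobs.items())            # (user, units of work still to do)
    order = []
    while queue:
        user, remaining = queue.popleft()
        order.append(user)                 # this user's program runs for a slice
        remaining -= slice_length
        if remaining > 0:
            queue.append((user, remaining))    # not finished: back of the queue
    return order

print(time_share({"ann": 3, "bob": 1, "cat": 2}, 1))
```

The result lists whose program ran in each successive slice; no program
has to wait for another to finish completely before it gets a turn.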
