May 1994

* Notes
Here are some notes on Tornado.  These are in no particular order.
Take with a grain of salt.  Your mileage may vary.  Don't try this at
home.  And don't call me.

I've put this file together to try to capture some of the implicit
assumptions behind the Tornado architecture.  I also want to try to
sketch out a useful framework.  Some of the ideas behind the
architecture may seem a little bizarre, so I've added a reading list
in case you want to trace some of these ideas to their sources.
There's a section on cross-platform tools that mostly serves to
support my view that there's more to this than pretty dialogs.  And
finally, I want to talk about some of the issues around adopting one
of the commercial packages we've discussed because it's ``just like
Tornado'' --- feel free to ignore that discussion...

* Assumptions
Here I try to expose some of the underlying assumptions.  Caution,
attitude ahead...

** The Direct-Manipulation Metaphor?
In 1984 we (programmers, at least) got a whack on the sides of our
collective heads with the introduction of the Macintosh and its
direct-manipulation gospel.  The elegant object-action metaphor told
us that our users no longer needed to walk up and down menu trees or
remember cryptic command lines.  Just select an object and then choose
an action from the friendly menu bar.  This worked out OK for many
programs where the appearance, meaning, and behavior of an object
could be easily intuited.  But a Mac interface does not necessarily
present this elegant object-action metaphor; it breaks down quite
readily when objects become complex or abstract.  Consider Helix (the
database where everything is done with graphical templates and a
visual data-flow programming language), 4D (not much better than
Helix, but at least the code looks like code), and the multiple nested
dialogs found in Microsoft's Mac products (the dialogs are really just
a menu tree in disguise).

Direct manipulation is supposed to be good because there are no modes
(well, except for those nested dialogs) and modes are supposed to be
inherently evil.  The main problem with modes, though, is really that
they have the *potential* to introduce unexpected behavior for a given
action.  Fortunately, command-line and menu-tree interfaces have no
monopoly on unexpected behavior, and I'm not just talking about modes.
There are only two problems with direct manipulation.  One is that,
given the state of the art, it's relatively hard to create programs
that support the direct manipulation metaphor.  The user interface is
always difficult, and the new possibilities engendered by direct
manipulation only magnify the difficulty.  The other problem with
direct manipulation is that it's frequently not obvious how to present
an object on the screen.  Rather than describing an object using text
(a relatively simple task), you have to find a way to not only *draw*
some representation of the object, but also to see all the details in
a limited space and find a way to manipulate all of the features in a
way that seems obvious to an initially clueless user.  All of this
just adds more complexity --- see the first problem.

So we put person-years of effort into developing the perfect
interface, test it with ``representative'' users (``Yuh, OK, sure,
fine.''), do all the supporting code, alpha test, beta test (``Yuh,
OK, sure, fine.''), and deliver.  What's the first question from the
users?  Right: ``How do I do X, for X not in the interface.'' And the
response: ``You can't get there from here.''  If we've done our job
really well, we can provide some of the X in the next release.  Just
as likely we've painted ourselves into a corner, either by making it
too hard to code X, or by forcing a user viewpoint that precludes
doing X because it will confuse and befuddle the users who don't want
to do X.  Game over, man...  One defense is to provide multiple views
through different programs (a nice trick, if you can avoid coding
yourself into a corner --- this is why we have CRISP).  The other
defense is to ask the user to do a little more work, and maybe give up
the ``point and drool'' interface.

Gee.  Give up the Mac interface?  But our users *need* and *want* the
graphics.  Right.  They want presentation graphics to show pretty
network diagrams to their customers.  This is a minor (but important)
part of the total job.  All the rest can be based on simple forms:
equipment lists, connections, templates, traffic, configurations, bill
of materials, orders, etc.  The forms don't even have to be graphical
--- take a look at some of the forms capabilities provided by Emacs
packages.  Simple text with obvious delimiters can be just as
effective as fancy graphical forms, and much easier to program ---
functions like lookup, replication, deletion, printing, etc. can be
handled by a text editor.  And the fancy graphics?  Take the text
input and produce graphical output.  Do layout based upon attributes
like geographical location, tweak it using a second pass to avoid
overlaps, and be done with it.  How much value do we add by tweaking
fonts, styles, colors, graphics, and positions on a case-by-case basis?
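
Just to make the point concrete, here's a sketch of what one of those
text forms might look like (the field names and values are made up,
not a proposal for the real templates):

    === EQUIPMENT ===
    Name:       mux-03
    Type:       M13 multiplexer
    Location:   Denver CO-01
    Cards:      28
    === END ===

A text editor (or a few lines of code) can find, copy, fill in, and
print records like this without ever knowing what a dialog box is.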

** Don't Cross the Streams

If you think about the work that needs to be done in terms of data
flows, then you understand the idea of streams.  Most of the so-called
programming methodologies just put a pretty face on this idea.  Of
course, that doesn't stop big companies from paying consultants to
parade the latest darling before their bedazzled eyes...

You can make streams even more useful if you make sure they never
remember anything.  Give a process an input, get an output.  Same
input again, same output.  We're playing fast and loose by confusing
streams and processes, but this is deliberate.  If it helps, think of
a process as a stream that does something along the way.  The catch is
that its output can _only_ depend upon the input.  This rubs a lot of
people (particularly those who don't have to write programs for a
living) the wrong way --- ``But we have to store all of our data
_somewhere_, right?''  Right.  Just not in the stream, OK?
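
Here's a tiny C sketch of the difference (the names and numbers are
invented for illustration).  The first function is a well-behaved
process: same input, same output, nothing remembered.  The second one
hides a store inside itself, which is exactly the kind of thing that
makes behavior unpredictable:

    #include <stdio.h>

    /* A process: output depends only on the input. */
    static double circuit_cost(int miles, double rate_per_mile)
    {
        return miles * rate_per_mile;
    }

    /* Not a process: it remembers, so the same input gives a
       different answer on the next call. */
    static double running_cost(int miles, double rate_per_mile)
    {
        static double total = 0.0;   /* a store hiding in the stream */
        total += miles * rate_per_mile;
        return total;
    }

    int main(void)
    {
        printf("%.2f\n", circuit_cost(10, 1.50));  /* always 15.00  */
        printf("%.2f\n", running_cost(10, 1.50));  /* 15.00...      */
        printf("%.2f\n", running_cost(10, 1.50));  /* ...then 30.00 */
        return 0;
    }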

So if you have streams, and streams can't remember anything, then you
need *stores* to hold the data.  This is pretty obvious --- give the
data a place to rest when it's not being processed through a stream.
You can even give the data a name and put a little index tab on it.
Just don't process the data while it's in the store.

Finally, the astute non-programmer will ask, ``Where does the data
come from, and how can I watch it?''  Since this is a family program,
we can't show any pictures.  We can mention input and output (you all
know what that is, right?) and insist that these functions neither
process nor store.

All of this is kindergarten stuff (see the remark about consultants,
above), but it's a necessary prelude.  Sorry.  OK, no I'm not...
We've got input, processing, storage, and output --- think of them as
the four code groups.

Who cares?  All programs do these things.  How could they not?  Right.
But then, everything in the universe is made of Spam --- it's the
arrangement of the Spam molecules that makes things interesting.  Take
any program apart and you'll find little bits of program text from the
four code groups.  The problem with most programs is that these little
bits are all squished together like Spam in a can --- yucky, boring,
slimy, and just plain gross.

The reason we have the four code groups (Quick!  You in the back!
What are the four code groups?) is to help us understand programs (and
to give the consultants something to sell, of course).  That's why
these high-priced methodologies make a point of keeping input, output,
processing, and storage all separate from each other.  The reason we
end up with code that looks like Spam in a can is that the methodology
gurus (many of whom haven't actually written a program in years)
only tell us to separate these important concepts as we _design_ our
programs.  After that there's a lot of handwaving while the consultant
says, ``And then the design becomes code'' (sometimes known as,
``Assume a miracle'').  This is where we usually screw things up, not
because we're stupid, but because the consultant must be right since
we paid all that money...  Don't _we_ get a miracle, too?

OK, now to the point (yes, there _is_ a point to all of this).  If you
structure your _code_ the same way you structure your _design_ (or for
those who don't _do_ designs: the way you're _supposed_ to do designs
--- go back a few paragraphs and start over) you'll be a happier,
well-adjusted person, the sun will shine and the birds will sing,
global warming will cease to be a problem, and you'll get lucky
tonight.  Not only that, but your code will be easier to read,
understand, maintain, and modify.  But you'll still be a programmer
--- even miracles have limits.

Remember 1984?  The Macintosh?  Event-driven programming?  This gave
us a gentle hint that we needed to separate the four code groups, but
we were all too dense to see it.  Don't feel too bad, though --- even
the folks who dropped the hint didn't quite get it.  That's why Apple
is desperately trying to teach us all how to have our programs talk to
themselves using AppleEvents --- separate the I/O from the processing
and you too can be the first weenie on the block to have a
``scriptable, recordable, tinkerable, AppleScript-savvy'' program.
Well, two years have gone by since the heralds first sang this
revelation from the ivory towers of 1 Infinite Loop in Cupertino, and
there's still a stunning lack of ``scriptable, recordable, tinkerable,
AppleScript-savvy'' programs --- I guess we weren't the only ones who
didn't get it right.  (Note: Microsoft wasn't quite so dense --- if
you look at the Windows API you'll see that messages have a loftier
status than in the Mac OS.)

The point?  Why does the CRISP model separate Interaction (played by
Processing), Storage (as itself), and Presentation (played by I/O),
and *require* that all the components talk to each other using
Representation over Communication (a.k.a. stream)?  It's a little more
work, but it guarantees that you can make your code mirror your system
design rather than a can of Spam.
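
As a rough illustration (not a spec --- the ``name=value'' record
format here is invented), an Interaction component can be nothing more
than a filter: a textual representation comes in on one stream, a
textual representation goes out on another, and Presentation and
Storage live in separate programs at the other ends of the pipes:

    #include <stdio.h>

    int main(void)
    {
        char line[256], name[128], value[128];

        while (fgets(line, sizeof line, stdin)) {
            if (sscanf(line, "%127[^=]=%127[^\n]", name, value) == 2)
                printf("%s=%s\n", name, value);  /* transform or pass through */
            else
                fprintf(stderr, "bad record: %s", line);
        }
        return 0;
    }

Glue it together with something like `store | interact | present' and
the code really does mirror the design: three components, and nothing
crossing the boundaries except a representation on a stream.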

* Framework
I've started a separate file, Tornado/frame/frame.text, with some of
my notes for an implementation framework.  Defining a framework is a
necessary step between system architecture and individual tools, since
it forces decisions on many of the points left intentionally open in
the architecture.

* Readings
This section lists some reading material that may or may not
illuminate some of the thinking behind Tornado.  Everything on this
list is here because I see a connection to Tornado, which I'll try to
describe if it's not obvious.  Maybe you'll see the connection.  At
worst, it may just be interesting reading.

** Compilers: Principles, Techniques, and Tools
Aho, Sethi & Ullman, Addison-Wesley, 1986

This is one of the most comprehensive books on compiler development.
Compilers are interesting because they're fairly complex pieces of
software, but their design and development is practically a science.
It's useful to look at how certain features have been abstracted to
the point where tools can be defined --- things like parser generators
and compiler-compilers.  The interaction between the syntax of a
language and its translation into semantic actions is also
instructive.  This is a good way to start thinking about ``meta''
without getting too confused.

** The Little Lisper
?, Addison-Wesley

This is probably the best place to start to learn Lisp.  The entire
book alternates explanations with exercises, just like those
``programmed learning'' texts that came and went (without replacing
teachers) during the '60s.  You'll develop a solid understanding of
the fundamentals with this book, then you can move on to a text with a
bit more depth.

** Common Lispcraft
Wilensky, W.W. Norton, 1986

This is probably the easiest text on Lisp after the Little Lisper.
It covers many of the features of Common Lisp without getting too deep
or too technical.

** Lisp, 3rd ed
Winston & Horn, Addison-Wesley, 1989

Here's a meaty tome covering a lot of Common Lisp in detail, with many
``lifelike'' applications as examples.  Audience: serious weenies.

** Anatomy of Lisp
Allen, McGraw Hill, 1978

Here's a detailed peek under the hood.  This is worth the effort, as
it has a lot of good stuff about implementing Lisp languages, and
really helps drive home the point about name vs. value and dynamic
data structures.

** Common Lisp: The Language, 2nd ed
Steele, Digital Press, 1990

This is the de facto reference for implementations of Common Lisp in
1994.  The Common Lisp ANSI standard was submitted for its second
public review in April; if all goes well there will be an ANSI
standard Common Lisp later in the year.  The draft standard document
is available via anonymous ftp from parcftp.xerox.com.

** Object Oriented Programming in Common Lisp
Keene, Addison-Wesley, 1989

This will help you understand how to program using the Common Lisp
Object System (CLOS).

** The Art of the Metaobject Protocol
Kiczales, des Rivieres, and Bobrow, MIT Press, 1991

Once you understand the Common Lisp Object System (CLOS), this book
will give you a detailed look behind the scenes.  This book is
valuable for its treatment of meta.  Source code is available via
anonymous ftp.

** On Lisp
Graham, Prentice Hall, 1994

This is the only book to give a thorough treatment of using Lisp to
build layers of problem-oriented languages.  The idea is that you
adapt Lisp to your problem, rather than coding a solution to your
problem directly in Lisp.  This goes deeply into proper use of macros.

** Structure and Interpretation of Computer Programs
Abelson and Sussman, McGraw Hill, 1985

This is supposedly the freshman computer science text at MIT.  It's one
of the best introductory books I've seen, with a strong emphasis on
appropriate abstractions.  Even though this is an introductory book,
it covers advanced topics like parsing, databases, matching, and
compilers.  The native language is Scheme.  Source code is available
on disk or via anonymous ftp.

** Artificial Intelligence, 3rd ed
Winston, Addison-Wesley

Not much code here (although source code is available on disk or via
ftp).  However, there are a lot of good examples of representing and
processing ``knowledge.''  Highly recommended.

** Artificial Intelligence Programming, 2nd ed
Charniak, Riesbeck, and McDermott, Lawrence Erlbaum Assoc., 1987

This has a lot of practical Lisp code for knowledge representation and
search.

** The AI Workbench: Babylon
Christaller et al, Academic Press, 1992

This is a description of an expert system workbench that mixes frames,
objects, rules, logic, and other common knowledge representation
techniques.  Source code and a Macintosh executable are available on
disk (with the book, usually) or by anonymous ftp.

** Paradigms of Artificial Intelligence Programming
Norvig, Morgan Kaufmann, 1992

If you ever actually write code in Lisp, this book should be
_required_ reading.  This is the only book that covers the details of
efficient (as opposed to simply useful or correct) Lisp programming.
Source code is available on disk or via ftp.

** Algorithms, 2nd ed
Sedgewick, Addison-Wesley

A very useful collection of algorithms for everything from sorting and
searching to geometric algorithms and dynamic programming.  Get this!

** Dictionary of Computing, 3rd ed
Oxford University Press, 1991

This is a very good technical dictionary of computer science terms.

** The New Hacker's Dictionary, 2nd ed
Raymond, MIT Press, 1993

This is a dictionary of computer terms from the hacker subculture of
computing.  A lot of this stuff is in common use.  Here's a sample:

:creationism: n. The (false) belief that large, innovative software
   designs can be completely specified in advance and then painlessly
   magicked out of the void by the normal efforts of a team of
   normally talented programmers.  In fact, experience has shown
   repeatedly that good designs arise only from evolutionary,
   exploratory interaction between one (or at most a small handful of)
   exceptionally able designer(s) and an active user population ---
   and that the first try at a big new idea is always wrong.
   Unfortunately, because these truths don't fit the planning models
   beloved of {management}, they are generally ignored.

** LaTeX: A Document Preparation System
Lamport, Addison-Wesley, 1986

This will help with OzTeX.  All the documents I've done are in LaTeX,
not TeX.

** The TeXbook: Volume A
Knuth, Addison-Wesley, 1986

This is much thicker than the LaTeX book, and you won't need it nearly
as often, but it's good to have it ready at hand.

** GNU Emacs Manual, 7th ed, Version 18
Free Software Foundation, 1992

The user's manual for the version of Emacs on the Mac.

** GNU Emacs Lisp Reference Manual, Ed 1.05, Version 18
Free Software Foundation, 1992

The programmer's manual for the version of Emacs on the Mac.

* Tools
There's more to doing cross-platform development than putting pretty
dialogs on a screen while honoring native look and feel.  The
discussion here is on two planes: development and delivery.

** Languages
C is the way to go.  It's widely available, with multiple
implementations on every platform (even the Mac, which has always been
starved for development tools).  There's an ANSI standard definition
of C.  The standard has a lot more to say than just the syntax of the
language.  If you hope to have any chance of porting your C code
across platforms, or even across compilers on the same platform,
someone had better *study* the ANSI C standard and understand the
pitfalls that get in the way of writing portable code.  At the very
least, this person should advise the rest of the group regarding the
issues.  An even better approach would be to develop a set of coding
standards to be checked during code reviews (you will be doing code
reviews --- remember SEI).  You may be able to ``cheap out'' of this
by finding a good book on writing portable ANSI C; the book should
offer you a checklist for doing it right.  I haven't looked at C books
since '91, and don't know whether such a book exists.
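
For flavor, here are a couple of the classic pitfalls such a standard
(or book) would warn about --- just a sketch, not the checklist:

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* int may be only 16 bits; use long (or check <limits.h>)
           when the value can be large. */
        long big = 100000L;

        /* plain char may be signed or unsigned; say which one you
           mean when it matters. */
        unsigned char byte = 0xFF;

        /* order of evaluation between sequence points is unspecified;
           never modify and read the same object in one expression
           (i = i++ is undefined). */
        int i = 0;
        i = i + 1;

        printf("%ld %u %d\n", big, (unsigned)byte, i);
        return 0;
    }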

C++ is still a nightmare.  The only good news is that you can compile
an ANSI C program using a C++ compiler.  Once you venture into writing
real C++ code, you run the risk of writing code that is either
non-portable or broken in subtle ways.  There is no formal spec for
C++; vendors still disagree about how to interpret the language's
intended semantics because some of what Stroustrup has written is
ambiguous.  The other problem is that he keeps adding
features.  Not the kind of foundation upon which to build a system; if
you adopt some wizzy new feature that later changes, you could have a
lot of work just tracking updates to the informal language definition.
That said, you'll probably end up using C++ because everyone else
does.  The proper defense is just like the one you'll use to develop
portable C code, except that now you'll need a C++ expert to assess
the risks and define an acceptable subset.  From what I've read (note:
no hands-on experience supports this) exceptions, templates, automatic
constructors and destructors, and maybe some kinds of inheritance can
be troublesome.  Be careful.

Lisp really does have a lot to offer, except commercial success.  If
you were going to do all your development for Unix workstations, I'd
push Lisp very hard.  But on PCs and Macs, Lisp is a tiny niche market
and vendors come and go pretty quickly.  Apple has announced that it
will not actively pursue a port of Macintosh Common Lisp to the
PowerPC Macs unless it can get someone else to do the work.  A comparable
situation ensued when Apple bought Coral Software (the original
developers of Macintosh Common Lisp) and put Coral's excellent Logo
environment (targeted to the then-healthy educational market) on the
shelf; Logo didn't resurface until about three years later, and has
gained little more than cosmetic improvements at the hands of its new
owner.

Aside from the style of development provided by the typical Lisp
environment, I think Lisp has three things to offer: (1) the duality
of code and data, (2) an extensive set of built-in data structures and
operations, and (3) automatic management of dynamic storage.  I don't
think you can get (1) except with Lisp, but (2) and (3) may be
obtainable by other means.  See the Resources section.

** Interfaces
For dealing with character terminals, you can do full-screen stuff (as
opposed to a glass TTY) using the curses library from Unix.  You can
get free implementations of curses.  The nice thing about curses is
that it hides the details of device control from the program; say what
you want drawn (in characters, of course) on the screen and curses
figures out an efficient way to send control and printing characters
to update the screen (often by redrawing only changed portions).
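
A minimal example (the text is invented; the calls are the standard
curses ones) looks something like this --- note that the program only
says what it wants on the screen, never how to get it there:

    #include <curses.h>

    int main(void)
    {
        initscr();                    /* take over the terminal       */
        cbreak();                     /* keys arrive without Enter    */
        noecho();                     /* don't echo keystrokes        */

        mvprintw(2, 4, "Equipment: mux-03");
        mvprintw(3, 4, "Press any key to quit");
        refresh();                    /* let curses update the screen */

        getch();
        endwin();                     /* put the terminal back        */
        return 0;
    }

Link against the curses library (typically `cc prog.c -lcurses') and
it should behave the same on any terminal with a termcap/terminfo
entry.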

For dealing with graphic terminals, X is the closest you'll get to a
cross-platform standard.  X runs a display *server* on your local
machine; the program you interact with through that server is called
a *client* (whether or not it runs on a remote machine).
available for the Mac from several commercial vendors, but there are
no freeware implementations that I've heard of.  X servers are
available both freely and commercially for PCs and workstations; X can
use gobs of RAM in some implementations, so you should try to find a
shared library implementation.  X is not very effective over low-speed
(<56K bps) comm lines.  The look and feel of an X server is created by
its widgets --- Motif is the most common, but there are others.  I'm
not sure about the role of widgets in delivering portable systems, but
there are plenty of books on X if you're interested.

** Unix Clones
If you deliver code on a PC, you may want to consider using a Unix OS
as opposed to DOS, Windows, or NT.  The only good reasons to do this
are cost (free) and capability (arguably better).  Unix has preemptive
multitasking, security, networking, scripting, virtual memory, GUI
(X), and lots of development tools.  You can get all of this for free
to run on a 386 or better PC (including Pentium).  There are also
commercial versions of 386 Unix such as Sun Solaris.

** NextStep
Another 386 OS, this is a port of the OS from Steve Jobs's famous NeXT
computer (NeXT no longer manufactures hardware).  This is a different
approach from all the others; the GUI is Display Postscript, the OS is
built on top of Mach and supposedly has an API based completely on
objects, the native language is not C but an also-ran called Objective
C, and so on...  If you want to break with tradition, this is the way
to go.  If you want to have a choice of people who can program the
system, stick with the mainstream systems (Mac, Unix, DOS, Windows ---
not necessarily in that order).

** TeX
The lowest common denominator is ASCII art --- art that's created by
putting monospaced characters at the proper positions on a page to
convey the impression of boxes and lines.  Simple and effective, but
ASCII art is so... 70s.  Then there's Postscript.  Postscript is
Adobe's page description language.  There are a lot of Postscript
printers in the world, but there are a lot of printers that aren't, too.  You
could require all your users to own Postscript printers, but you risk
alienating the small users who use the system only occasionally and
print even less --- a cheap Postscript printer is still a large
fraction of the cost of a cheap computer.  And forget about
platform-specific graphics (PICT, BMP, GIF, etc.) unless you want to
be doing conversions all the time (you don't, because this irritates
the user and usually sacrifices image fidelity).

TeX to the rescue!  TeX is a document preparation language designed by
Donald Knuth to print the fourth through eighth volumes of his magnum
opus on computer science.  (TeX has been frozen since 1985; volume 3
appeared in the early 70s --- I guess book schedules are like software
schedules.)  TeX has a lot of features that suggest a page description
language in terms of degree of control, but it's really oriented
toward describing entire documents rather than individual pages.  TeX
is free, it's stable (no changes since '85), it runs on almost every
computer and OS, and can be used with any printer that does graphics
(Postscript can be used, but is not required).  Consider producing
printed documents (orders, bill of material, rack layouts, etc.) by
generating TeX code and automatically sending the code to a TeX
process for printing.  This will save you a lot of work trying to come
up with a platform-independent way of doing printing.
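
A sketch of the idea (the items, file names, and the `tex' invocation
are placeholders; a real version would walk the store instead of
hard-coding rows):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        FILE *f = fopen("bom.tex", "w");
        if (!f) return 1;

        /* Emit a trivial plain TeX document: one aligned table, then \bye. */
        fputs("\\noindent Bill of Materials\\par\\medskip\n", f);
        fputs("\\halign{#\\hfil&\\quad\\hfil#\\cr\n", f);
        fputs("Item&Qty\\cr\n", f);
        fputs("M13 multiplexer&4\\cr\n", f);
        fputs("DS3 cable&28\\cr}\n", f);
        fputs("\\bye\n", f);
        fclose(f);

        /* Hand the result to TeX; shipping the DVI file to a printer
           is a separate (platform-specific, but small) step. */
        return system("tex bom.tex");
    }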

** Emacs
Emacs is cool.  It can edit your text.  Don't like the way it works?
You can write an extension or rebind keys to do your bidding.  Emacs
can even slice bread --- of course, you'd have to write an extension.

Emacs can not only edit text, it can manipulate it in sophisticated ways.
There are a lot of powerful text-manipulation primitives built in to
Emacs and accessible through its extension language.  File I/O and
full-screen display are gimmes, as are sophisticated text matching and
replacement facilities.  The user's manual alone is several hundred
pages long, and the extension-language manual runs about eight hundred
pages.  If you don't like paper manuals, there's a built-in help
system.  On most systems you can call out to the OS from inside Emacs;
you can use this to do processing or communications.  People have
written databases, forms systems, timecard systems, hypertext
browsers, compilers, email systems, outliners, appointment calendars,
revision control systems, games, and other things using Emacs'
extension language.

Emacs is free.  Source code is available, also for free.  Emacs runs
under Unix, DOS, Windows, OS/2, Macintosh System 7, VMS, and a whole
bunch of other systems you couldn't care less about.  The manuals and
lots of add-ons are also freely available in electronic form.

** Documentation
Getting documentation to the end-user is always a challenge.  Put it
online, they won't read it.  Print it, they won't read it.  Read it to
them, they won't remember.  Not much you can do about any of that...

There are some problems you can fix, though.  One is to make the
online documentation match the printed documentation.  Unless you want
to solve this problem multiple times (hint: expensive tools like Adobe
Acrobat give you manuals that you can read online on a couple of
different kinds of computers, _not_ online help), you'll need
something that really works across platforms.  This means plain
vanilla text (no, Virginia, there is no viable cross-platform display
graphics standard).  Of course, you'd rather not have all your manuals
in this cheesy monospaced font...

Enter TeXInfo.  This is a language and a couple of programs.  The
language lets you write documentation that will look good in plain
text on lowest-common-denominator screens, and at the same time (with
the assistance of the TeXInfo translator and the TeX program described
elsewhere) produce beautiful typeset manuals.  The nice
part is that the source text is the same; you don't have to maintain
two separate versions.  The other program, called Info, lets you
browse the online help screens using text menus and hypertext
reference links.  All this is free, available in source form; you
could easily make Info a part of your HI for online help.
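
To give you a feel for it, a TeXInfo source file is ordinary text
sprinkled with @-commands; something like this (node and chapter names
invented) feeds both the Info browser and TeX:

    \input texinfo
    @setfilename tornado.info
    @settitle Tornado Notes

    @node Top
    @top Tornado

    One source file, two outputs: TeX for a typeset manual, the
    TeXInfo translator for Info screens.

    @menu
    * Overview::   What Tornado is about.
    @end menu

    @node Overview
    @chapter Overview
    Plain text here, with @emph{emphasis} where it matters.

    @bye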

* Other considerations
This section is a catch-all for ideas that haven't yet found their way
into a Tornado task of some kind.

** Layered Architectures
... has talked about additional architectural layers on top of
CRISP.  One layer provides a framework for managing changes in the
operational state --- rules, data, access, etc. --- of the system.
There are other layers; ... can give you details.

While on the subject of layered architectures, I should note that ...
has observed that Tornado is a ``system architecture'' that
can effectively sit on top of a ``technical architecture'' such as
VITAL.