Chapter 28 - Practical Techniques for Programming

In this chapter, we'll learn some brief yet useful guidelines for Lisp style, followed by practical advice on tradeoffs among debugging, performance, and readability.

Elements of Lisp style

The art of Lisp style is simpler and more fruitful than in most other languages. Lisp's simple, consistent syntax eliminates the need for the rules of style that plague more complicated languages. And the direct, standardized availability of complex functionality within Lisp helps to drive down the size of programs, thereby providing improved readability through brevity. (In my experience, a Lisp program may range in size from 5 percent to 20 percent of an equivalent C++ program.)

Furthermore, the universal availability of Lisp-aware program editing tools -- such as Emacs or its equivalent built into many Lisp IDEs -- means that you can let the computer handle the details of indentation that are so important to the ability of a person to comprehend the structure of a Lisp program. I've said this before, but it bears repeating: you should not be programming in Lisp without the aid of a Lisp-aware editor.

So, if we don't have to worry about conventions for spelling, capitalization, indentation, and other such mundane details, then what remains for us to discuss as elements of Lisp style? How about things that are truly important? Lisp programming style is about the choice of proper abstractions, and about communicating not just with the compiler, but with other people who will eventually read your program. (For that matter, good style will help you read your own program some months or years in the future.)

Property lists are handy for small (very small) ad-hoc databases

A long, long time ago the capabilities of a typical Lisp implementation were very much less than what you'll find in any Common Lisp system today. After all, Lisp has been around for over forty years since John McCarthy first invented the notations (see Chapter 34). Strangely, when Lisp is taught at all in computer science curricula, it is taught using a circa-1965 view of the state of Lisp implementations: interpreted execution, limited data structures, and no real application beyond the manipulation of symbols.

Unfortunately, authors and publishers of Lisp textbooks did little to help correct these misperceptions, ignoring Common Lisp (and indeed, many of its recent forebears) in highly-recommened Lisp textbooks published as recently as 1989.

In the bad old days -- when Lisp didn't have arrays, vectors, hash tables, structures, or CLOS -- programmers learned to rely heavily on property lists as an important mechanism for structuring data. You'll still find -- in bookstores and on the shelves of college libraries -- Lisp and AI books that recommend the use of property lists as the underlying basis for looking up values identified by a symbolic key.

A property list is a list of alternating keys and values. For example: the list (SEX MALE PARENTS (BOB JANE) OCCUPATION MUSICIAN) establishes these relations:

relation   value
---------- ----------
sex        male
parents    (bob jane)
occupation musician
When you attach these relations to a symbol, say JAMES, then you have a way to get information related to the symbol JAMES, namely the properties having the names SEX, PARENTS, and OCCUPATION.

In Common Lisp you can retrieve a symbol's property using (GET symbol property-name), and set a property using (SETF (GET symbol property-name) property-value).

While property lists were useful substitutes for more capable data structures in ancient Lisp implementations, they find few uses in modern Lisp programs. One problem is efficiency. Every time you ask Lisp to retrieve a property value, it must locate the symbol's property list and then search the list for a matching key. If you have five or six properties on a symbol, the search may or may not be faster than using a hash table; the exact crossover point will depend upon your particular Lisp implementation. The other problem is that properties are shared among all parts of your program. It's not too difficult to imagine two programmers using a property named COLOR in two different ways in two parts of the same program. Imagine their surprise when they discover the conflict...

At any rate, you should become familiar with all of the capabilities of Common Lisp, and learn to recognize that information about older Lisp implementation which still haunts us through the mists of history.

Declarations help the compiler, sometimes

Common Lisp defines the following declarations:

declares a variable to have dynamic (not lexical) scope
instructs the compiler how to weight the relative importance of
declares that the programmer expects the lifetime of a function or variable to end when control leaves the enclosing form
declares that a variable will always have values of a given type
declares that a function should expect arguments of specified types, and that the function will return values or given types
declares that a variable is not referenced
declares that a variable may not be referenced
declares that the programmer would like a function to be compiled as inline code
declares that the programmer does not want a function to be compiled as inline code

Of these, only the first and last must be implemented. The rest are advisory; depending upon the implementation, the compiler may or may not honor the given advice. If you've programmed in other languages, you may find it strange that most of Lisp's declarations are advisory rather than mandatory. So let's dig a bit deeper and see what this really means to you.

Lisp by default must have the capability to determine the type of every variable at runtime. This is not to say that a sufficiently smart compiler can't infer at compile time that a variable will always be of a particular type and generate code that does not need to check types at run time. However, an actual "sufficiently smart compiler" remains an elusive creature, much like the Yeti, Bigfoot and the Loch Ness Monster.

Declarations allow the programmer to pass metainformation to the compiler. This is not part of the program, but rather information about the program. Declarations can help the compiler to generate better code by providing information about the programmer's intent.

For example, if you declare that a variable will always be a FIXNUM (an integer value that fits in a single machine word) then the compiler can emit code to load that variable directly into a register in preparation for the next operation. If you declare the result of the operation to also be a FIXNUM, then the compiler can generate code to perform the operation and store the result using simple machine instructions without first checking the actual type of the value. Given such declarations, a good Lisp compiler can generate code comparable to a low-level language in which operations and types in the language map directly onto the underlying machine.

But there's a risk. If you declare certain types, and the compiler emits code that optimizes the program according to your declarations, and the program then contradicts those declarations by providing a value of a different type at runtime, then bad things will happen. Tell the compiler to expect two numbers to add, then pass it a number and a symbol, and all bets are off.

Fortunately, the declarations that guide the compiler are themselves moderated by the OPTIMIZE declaration. The OPTIMIZE declaration lets you instruct the compiler about the relative importance of certain properties of the program. You can specify the relative importance of the SPEED, SPACE, and SIZE of the generated code. You can specify whether you'd like to allow the compiler to spend extra time doing a better job, or to emphasize COMPILATION-SPEED. You can specify the importance of being able to DEBUG your program, which may cause the compiler to produce code that is simpler or interacts well with the debugger.

Values range from 0 to 3 for the OPTIMIZE declarations, with 0 meaning "totally unimportant" and 3 meaning "most important". The default value is 1, meaning "of normal importance". Bear in mind that for something to be relatively more important, something else must be less important; it won't give the compiler any useful guidance to specify values of 3 for all of the OPTIMIZE declarations.

Of all the OPTIMIZE declarations, the most important is SAFETY, since this affects the amount of trust the compiler is willing to extend to your type declarations. A high value of SAFETY generally compels the compiler to check the type of every value that it can't absolutely determine at compile time. Lower SAFETY values put increasing weight upon your abilities as a programmer to guarantee that type declarations are correct, array bounds are always in range, etc.

The exact effect of declarations (with the exception of SPECIAL and NOTINLINE) varies among Lisp implementations; consult your reference manual for details.


Although not required by the Common Lisp standard, almost all implementations require that you load code from a file. (The one exception that I know of is the Venue Medley environment, which normally saves the entire Lisp world when you end a session. Medley also keeps track of new definitions created in the listener and allows you to save just those definitions to a file.)

In a file-based Lisp environment, you'll normally add definitions to a file of source code. One reason for so doing is to periodically save your work; unless you're debugging FFI code or running buggy Lisp code with a low optimization value for safety, your Lisp environment will almost never crash. However, other disasters can happen -- another application could crash and bring down the system in an unprotected OS such as the Mac OS or Windows, the power could fail, or your cat could walk across the keyboard when you leave to refill your coffee.

As your program gets larger, you may find that it's useful to reload an entire source file after making a series of changes. Most Lisp environments also let you evaluate one definition at a time in any open window. This is quite useful because you can edit, then recompile, one definition at a time. But sometimes you'll forget, and then it's easier to just reload the entire file than to spend time figuring out which definition you might have forgotten to recompile after last changing its definition.

But you may also be in the midst of debugging your program when you'd like to reload its source code. If your program uses any global variables to keep track of its state, you really don't want to reinitialize these in the midst of your debugging session. So, how do you handle this? You could put definitions of your program's state variables in a separate file, but that increases your mental workload and increases debugging time by splitting clearly interrelated parts of your program into two separate files. (I know that this is an accepted practice in many programming languages, but it really does increase the amount of work you do as a programmer. Imagine how much less pleasurable reading a novel would be if the novel was delivered as a set of pamphlets, one per character, and you had to follow page references to get to the next part of the dialog.)

Fortunately, Lisp has grown through decades of real-world programming experience, and has a very simple mechanism to handle whether variables get reinitialized or not when you load a file. You use DEFVAR to declare variables with values that need to be initialized only once. To declare a variable with an initial value that gets set every time its defining form is evaluated, use DEFPARAMETER.

One final note: as a matter of form, you should name global variables with a leading and trailing asterisk, as in *MY-ADDRESS*. Think of this convention as a courtesy to those who will maintain your code at some future date.

Define constants with DEFCONSTANT

You should define global constants using DEFCONSTANT. From the viewpoint of reading a Lisp program, the distinction between DEFPARAMETER and DEFCONSTANT is that the value defined by DEFPARAMETER could concievably be altered by the user after the program is compiled, but a DEFCONSTANT value will never change. A good Lisp compiler will take advantage of DEFCONSTANT declarations to perform classical optimizations such as constant folding or compiling immediate load instructions.

Fewer Lisp programmers follow a naming convention for constants. The one I use puts a leading and trailing plus sign on the name of the constant, as in +RTS-OPCODE+.

Know when (not) to use the compiler

Most Lisp systems include both an interpreter and a compiler; when both are available, you'll normally find that it's easier to debug interpreted code. Consult your vendor's documentation to learn how to switch between the interpreter and the compiler.

Of course, when performance is important, you'll want to run your code compiled once it's debugged. But see the earlier cautions about running buggy code with low safety settings.

When you're writing Lisp code to run on multiple platforms, it's safest to assume that code will run interpreted unless you call COMPILE or COMPILE-FILE. For this reason, you should develop the practice of writing (or using) a system definition procedure that first loads all of your Lisp source files, then compiles them, then loads the compiled files. This is usually overkill, but it's a very safe, conservative approach. With suitable source code organization and proper use of EVAL-WHEN you can reduce the number of source files that must first be loaded; the main idea is to ensure that all macros are defined before compiling code that uses the macros, but there are other possible situations that can depend upon the current state of the Lisp world.

Speed vs. ability to debug

Interpreted programs are easier to debug because it's easier for the debugger to access the actual source code at the point of an error. Once you've compiled your program, the debugger typically has less source information available; you may find yourself puzzling over a transformed version of the source code or grovelling through assembly-language instructions to find the source of the error. Fortunately, the need for such low-level debugging will be rare if you follow some simple advice:

  1. Keep high SAFETY optimizations on untested code.
  2. If an interpreter is available to you, use it until your code is working well.
  3. If you have a compile-only environment, use lower optimization settings for SPEED and higher settings for DEBUG.

Once your code is running well, then you should compile it and adjust the optimization declarations for performance. If you find that simply compiling your program provides adequate performance, leave it alone. If the performance of the compiled program falls far below your expectations, first improve the algorithm; optimization declarations typically have a fractional impact upon performance.

Efficiency: spotting it, testing it

The first rule of efficiency in any programming language is to start with an efficient algorithm. It's a little harder to spot inefficiencies in a Lisp program because the underlying operations don't usually map directly onto a hardware instruction. But with a certain amount of knowledge and practice, you should be able to tell why the following four programs have radically different resource requirements.

These four programs return the sum of a list of numbers, but do it in different ways. In each case, we test the program with the TIME form, which reports run time and memory allocation. Each program is tested twice, once with a list of ten thousand elements, then again with one hundred thousand.

;; Runtime increases as the square of the number of elements 
? (defun sum-list-bad-1 (list)
    (let ((result 0))
      (dotimes (i (length list))
        (incf result (elt list i)))
? (let ((list (make-list 10000 :initial-element 1)))
    (time (sum-list-bad-1 list)))
(SUM-LIST-BAD-1 LIST) took 2,199 milliseconds (2.199 seconds) to run.
Of that, 102 milliseconds (0.102 seconds) were spent in The Cooperative Multitasking Experience.
 16 bytes of memory allocated.
? (let ((list (make-list 100000 :initial-element 1)))
    (time (sum-list-bad-1 list)))
(SUM-LIST-BAD-1 LIST) took 336,650 milliseconds (336.650 seconds) to run.
Of that, 15,680 milliseconds (15.680 seconds) were spent in The Cooperative Multitasking Experience.
 2,704 bytes of memory allocated.

;; Recursive version works when compiler does tail-call optimization 
? (defun sum-list-bad-2 (list)
    (labels ((do-sum (rest-list sum)
               (if (null rest-list)
                 (do-sum (rest rest-list) (+ sum (first rest-list))))))
      (do-sum list 0)))
? (let ((list (make-list 10000 :initial-element 1)))
    (time (sum-list-bad-2 list)))
(SUM-LIST-BAD-2 LIST) took 2 milliseconds (0.002 seconds) to run.
? (let ((list (make-list 100000 :initial-element 1)))
    (time (sum-list-bad-2 list)))
(SUM-LIST-BAD-2 LIST) took 21 milliseconds (0.021 seconds) to run.

;; The recursive version can fail w/o tail-call optimization 
? (defun sum-list-bad-3 (list)
    (declare (optimize (debug 3)))  ; disable tail-call optimization 
    (labels ((do-sum (rest-list sum)
               (if (null rest-list)
                 (do-sum (rest rest-list) (+ sum (first rest-list))))))
      (do-sum list 0)))
? (let ((list (make-list 10000 :initial-element 1)))
    (time (sum-list-bad-3 list)))
> Error: Stack overflow on control stack.

;; The iterative version is not as elegant, but it's fast! 
? (defun sum-list-good (list)
    (let ((sum 0))
      (do ((list list (rest list)))
          ((endp list) sum)
        (incf sum (first list)))))
? (let ((list (make-list 10000 :initial-element 1)))
    (time (sum-list-good list)))
(SUM-LIST-GOOD LIST) took 1 milliseconds (0.001 seconds) to run.
? (let ((list (make-list 100000 :initial-element 1)))
    (time (sum-list-good list)))
(SUM-LIST-GOOD LIST) took 10 milliseconds (0.010 seconds) to run.

The first version, SUM-LIST-BAD-1, harbors a hidden inefficiency: (ELT LIST I) must search LIST from the beginning for each value of I. In other words, ELT must examine one element when I is 1, two elements when I is 2, and so on. For a list of length N, ELT will examine almost N-squared elements. Have a look at the runtimes for 1,000 and 10,000 elements.

The second version is coded using an awareness of how lists are accessed; the helper function DO-SUM calls itself recursively with the tail of the list it is given. In SUM-LIST-BAD-2, the runtime increases linearly with the length of the input list. So why is this a bad example?

DO-SUM calls itself as the last form it executes; this is known as tail recursion. Some compilers can compile tail recursion as a jump instruction instead of a function call; this eliminates the growth of the control (function return) stack that would otherwise occur in a recursive call. However, the Common Lisp standard does not require that tail calls be optimized.

The third version shows what can happen when the compiler does not optimize tail recursion. The compiler in my Lisp system disables tail recursion optimizations when the DEBUG optimization is set to 3. Without tail recursion optimization, SUM-LIST-BAD-3 consumes a function call frame for each recursive call, and the program fails -- exhausting stack space -- before reaching the end of the test list.

The final version, SUM-LIST-GOOD, uses iteration instead of recursion for its control loop, and walks down the input list element by element. It runs slightly faster than SUM-LIST-BAD-2 and doesn't fail if the compiler doesn't support tail recursion optimization.

Recognizing inefficiency, profiling; performance vs. readability

Avoidance is the best defense against inefficiency. Use appropriate data structures and control techniques. When you're not sure, put together a small test program and time it with a variety of inputs.

Every Common Lisp implementation will have the TIME macro that we used to show the differences in the SUM-LIST-xxx functions. You can use this to examine and tune small portions of a program.

Once you have assembled a larger program, you may need to find the bottleneck that causes unexpectedly low performance. For this, you'll need a profiler. A profiler watches the execution of your whole program and generates a report to show where the program spends its time. Some profilers also report on memory allocation. A profiler is not a standard part of Lisp, but most vendors provide one. Consult your vendor's documentation.

Lisp provides abstractions that help you solve problems. You'll find that you don't have to make a tradeoff between readability and performance; an efficient Lisp program is usually one that is written using the most appropriate abstractions and operations to solve a given problem.

Contents | Cover
Chapter 27 | Chapter 28 | Chapter 29

Copyright © 1995-2001, David B. Lamkins
All Rights Reserved Worldwide

This book may not be reproduced without the written consent of its author. Online distribution is restricted to the author's site.