Programming Considered Harmful

Erann Gat
Position paper presented at the Feyerabend Workshop, OOPSLA 2001

Welcome, Reddit readers!

Please note that this paper was written specifically to spark discussion in a workshop focused on "reinventing computing." Its content should be taken with a small grain of salt.

Where we are

The PC revolution was born to a large extent out of the frustration of a group of young people with the "high priesthood" of machine operators who controlled access to mainframe computers. Considering the power of that vision, it has been astonishingly short-lived. It was born with the availability of cheap personal computers around 1975, and it began to die with the introduction of the IBM PC less than ten years later. To be sure, users have had physical control over their machines for longer than that, but logical control has steadily eroded. Today, even physical control is being lost with the rise of client-server architectures. Using a computer today is, for the average consumer, not so different from using an IBM 3270 terminal connected to a mainframe thirty years ago. The graphics are flashier and the price is lower, but what you can do with it is still controlled to a large extent by the high priests, the programmers. And just as in 1970, the high priests are not free to do as they wish either. Their fates are controlled by corporations: IBM in 1970; Microsoft, Sun, and Cisco in 2001. For those in whom the spirit of the original microcomputer revolution lives on, this is a Bad Thing.

On this dark horizon there are some significant beacons of hope. The Internet somehow managed to grow into a mass-market phenomenon without any corporation gaining a significant measure of control over it (yet). Linux and Perl have had perceptible impacts on the computing landscape. And while I would not hold up Perl as a shining example of what computing ought to be, it does demonstrate that one individual's (or one small group's) vision of what computing, or some aspect of it, ought to be can succeed.

Nonetheless, none of these efforts addresses what I see as the fundamental problem, which is the wide gulf that separates programmers from users. As long as this gulf exists life will be miserable. The reason is simple: the money comes from users, so the users' desires trump the programmers'. Unfortunately, the producers of shoddy software have managed to hoodwink users into believing that software is *inherently* unreliable and brittle. On this view, having to reboot your machine three times a day is like getting the oil changed on your car: annoying, but necessary maintenance. In such a world one has only three choices: join the producers of shoddy software, become a consumer of shoddy software, or find a way to make a living that doesn't involve computers.

It may be that producing good software is just so much harder than producing shoddy software that the market has made a rational decision in opting for shoddy software because good software would cost so much more. In fact, the opposite is true: good software is almost always cheaper than shoddy software. For example, Linux, which is vastly more reliable than Windows, is free. Why doesn't the market flock to it? Because Linux (and unix in general) appeals to programmers, not users. Only people on the programmer side of the programmer-user divide understand enough about software to recognize the advantages of Linux. But these people don't drive the market; the users do.

There have been efforts in the past to "bring programming to the masses" (e.g. Logo) that have failed, prompting many to conclude that programming is just inherently difficult, and it is this inherent difficulty that leads to the programmer-user divide. Certainly some aspects of programming are difficult. It is probably unreasonable to expect the average user to grok the Y combinator. On the other hand, there have been some accidental successes at bringing programming to the masses. HTML, for example, is a programming language (not a general-purpose language to be sure, but a programming language nonetheless) that has proven accessible to many non-programmers, despite (or perhaps because of) not having been designed for that purpose.
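
To make the point concrete, here is the Y combinator (in its applicative-order form) written out in Common Lisp. This is offered purely as a sketch of how opaque some programming concepts are; nothing else in this paper depends on it:

    ;; The applicative-order Y combinator: a function that manufactures
    ;; recursive functions out of non-recursive ones.  Few users (and
    ;; not all programmers) could be expected to grok this.
    (defun y (f)
      ((lambda (x) (funcall f (lambda (n) (funcall (funcall x x) n))))
       (lambda (x) (funcall f (lambda (n) (funcall (funcall x x) n))))))

    ;; Factorial, defined without any explicit recursion:
    (funcall (y (lambda (self)
                  (lambda (n)
                    (if (zerop n) 1 (* n (funcall self (1- n)))))))
             5)
    ;; => 120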

Where we should be

I envision a world where the programmer-user divide does not exist (or at least is not so prominent, or perhaps is more of a continuum than a divide). In such a world, the people who drive the market, the ones we now call "(l)users", would be to some extent involved in the business we now call "programming". As a result, they would have enough understanding to drive the market towards higher levels of quality than we have today. As a side effect, power would once again be wrested from the high priests and placed in the hands of "ordinary people". The world would be a better place.

How we got here

It's all the mathematicians' fault. What we call computers today are descended from machines that were invented by mathematicians to compute, that is, to solve math problems. Computers are largely not used for computing any more, but we are left with some unfortunate aspects of this legacy.

Solving a math problem is an activity with a distinctive structure. For example, parts of the process are mechanizable and can be performed by a machine. Moreover, it is not necessary to build a new machine for each math problem one wishes to solve. It is possible to build a universal machine that can in principle solve any solvable math problem (for some suitable definition of solvable).

Another feature of solving math problems is that at some point the process generates a *result*, at which point the process *ends*. The economic value of solving math problems generally lies in the result, and not in the process. (There are exceptions, such as when one is solving math problems for entertainment.)

It turns out that the sort of machine one ends up building to solve math problems is also good for other things, like editing documents or playing video games. The structure of these activities is fundamentally different from the structure of math problem solving. The economic value of video game playing or document editing lies in the process, not the result. In the case of video game playing there is no result. In the case of document processing there is arguably a result (the document), but the value of computer-assisted document processing lies in the fact that the process of creating the document is more efficient using a computer than using other tools like typewriters and carbon paper. The computer contributes little to the result, the content of the document (the exception being things like spelling correction, and even there it is not clear that the computer is making a net contribution).

The problem is that the methods we have developed for creating software, while they work reasonably well for results-oriented computations, don't work nearly as well for process-oriented ones. The fundamental problem is that our methods make an unwarranted assumption, namely, that the process of creating software, like the process of computing, *ends* and generates a *result* called a *program*. The fact that parts of the software creation process can themselves be couched as math problems tends to reinforce this view.

How to get there

I claim that it is a mistake to conflate the concept of "software" with that of "program." Software is necessary. Computers won't do anything useful without it. Programs, on the other hand, are dispensable. It is possible to create software without writing programs. To oversimplify: if we could eliminate programs then we would have no more need of programmers. No programmers, no programmer-user divide.

There are existing systems that take tentative steps towards a program-free world. Users of Lisp and Smalltalk create software by "interacting with an environment" rather than "writing a program." LabVIEW users create software by dragging and dropping graphical icons. The concept of "a Lisp program" or "a LabVIEW program" is not as crisply defined as "a C program." But I think we need to go much further than just putting a graphical or interactive window dressing on business-as-usual.
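
For readers who have never used such an environment, here is a hypothetical sketch of a session at a Common Lisp listener. The running image is modified incrementally, and at no point is there a "finished program":

    ;; A hypothetical listener session (prompts shown as CL-USER>):
    CL-USER> (defun greet (name) (format nil "Hello, ~a." name))
    GREET
    CL-USER> (greet "world")
    "Hello, world."
    CL-USER> (defun greet (name) (format nil "Hello there, ~a!" name))
    GREET                              ; redefined in the live image
    CL-USER> (greet "world")
    "Hello there, world!"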

"Program" manifests itself not only in the creation of software but also, and much more significantly, in its delivery. Delivering software is nowadays synonymous with delivering executable code, and more often than not, delivering a complete self-contained application. Programmers are trained to believe that getting the algorithm to work is not enough. You have to put the user interface on it before the program is "done". As a result, an enormous amount of programmer time (I conjecture that it is actually a majority of programmer's time, though I don't have any data) is wasted continually and laboriously reinventing the same things over and over again: window-based UI's and (bad) command-line parsers.

It doesn't have to be this way. Lisp programmers, for example, rarely write command-line parsers because they can simply express commands as S-expressions and let the Lisp reader parse them. Moreover, Lisp programmers rarely have to learn new command-line syntaxes, for the same reason: S-expressions are a universal command-line syntax. Everything can be expressed as an S-expression.
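
As a minimal sketch of what this means in practice (the COPY-FILE command below is invented for illustration), the standard READ function turns a textual command directly into structured data:

    ;; The Lisp reader as a universal command-line parser.  COPY-FILE
    ;; is a hypothetical command, not a standard function.
    (with-input-from-string (s "(copy-file \"a.txt\" \"b.txt\" :if-exists :overwrite)")
      (read s))
    ;; => (COPY-FILE "a.txt" "b.txt" :IF-EXISTS :OVERWRITE)
    ;; The result is an ordinary list, ready to be interpreted; no
    ;; grammar was defined and no parser was written.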

Contrast this state of affairs with the unix command line syntax (or lack thereof). Because there is no universal syntax, the unix world is a Babel of dozens of different syntactic conventions. For example, sometimes the '-' sign is an operator (subtraction), sometimes it is a keyword indicator (the conventional command syntax), sometimes it indicates a function call (in conditionals in some shells), sometimes it denotes a range (e.g. in some regular expression syntaxes), etc. etc. etc. In Lisp '-' has no special syntactic meaning. It is merely a constituent character, like the letters of the alphabet.

(NOTE: S-expressions have been around long enough that it is clear that they will not bring about the synthesis I seek. I cite them merely as an existence proof that vast simplification of the current state of affairs is possible.)

The situation on the user side of things is not much different. Every application has a different set of user interface conventions. Things have gotten so complicated that people have started to take it for granted that they have to be "trained" to use a computer. Again, it doesn't have to be this way. In the early days of the Macintosh there were programs that required no training (and no documentation!) to use.

Today's applications require training because they have vastly more features than something like MacDraw did. Users have been conditioned to think that features are a good thing. A spreadsheet that has a spelling checker seems better than a spreadsheet that doesn't. I disagree. A spreadsheet shouldn't *have* a spelling checker. It should be able to somehow *interface* to a spelling checker, but I should be able to use that *same* spelling checker to check the spelling in the documents that I edit using my word processor, which shouldn't *have* a spelling checker either. For that matter, I should be able to use my word processor to edit the text inside a cell of my spreadsheet. I should be able to use my spreadsheet to update the numbers in my word processor. (Microsoft OLE actually lets you do some of these sorts of things, but the granularity is much coarser than what I envision.)
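
As a sketch of what such an interface might look like (every class and function name here is invented for illustration), the spelling checker can be written against a generic notion of "text" rather than against any particular application:

    ;; One toy spelling checker shared by two applications through a
    ;; generic TEXT-OF interface.  All names are hypothetical.
    (defclass spreadsheet-cell ()
      ((contents :initarg :contents :accessor cell-contents)))

    (defclass document ()
      ((text :initarg :text :accessor document-text)))

    (defgeneric text-of (thing)
      (:documentation "Return THING's editable text as a string."))
    (defmethod text-of ((cell spreadsheet-cell)) (cell-contents cell))
    (defmethod text-of ((doc document)) (document-text doc))

    (defun words (string)
      "Split STRING on spaces (a toy tokenizer)."
      (loop for start = 0 then (1+ end)
            for end = (position #\Space string :start start)
            collect (subseq string start end)
            while end))

    (defun check-spelling (thing dictionary)
      "List the words in THING's text that are not in DICTIONARY."
      (remove-if (lambda (word) (member word dictionary :test #'string-equal))
                 (words (text-of thing))))

    ;; The same checker serves the spreadsheet and the word processor:
    (check-spelling (make-instance 'document :text "helo world")
                    '("hello" "world"))
    ;; => ("helo")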

So, to sum up: today we live in a world where software is organized as different kinds of programs. There are operating systems, applications, middleware, and compilers, but they are all programs. They tend to be big and to encompass a lot of functionality, and making two of them work together to perform a task neither one can do alone is often a major chore. Programs are hard to write *and* hard to use, and the skills required to create a program are different from the skills required to use one.

More details on where we should be

In my utopia software is organized as modules (an arbitrary term chosen to be non-evocative). There are lots of different kinds of modules. They are all small (in terms of the engineering effort that went into creating them), and by themselves they do very little. But they adhere to a set of conventions for specifying their interfaces that makes it relatively easy to assemble modules to perform complex tasks. It is analogous to using pipes to compose commands in unix, but the kinds of interactions allowed across module interfaces are much richer than a simple unidirectional byte stream.
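
A hedged sketch of such a convention (all the names below are invented): each module exposes named ports, and an interaction is a bidirectional request/response exchange rather than a one-way stream of bytes:

    ;; Hypothetical module-interface convention: named ports with
    ;; bidirectional request/response interactions.
    (defclass module ()
      ((name  :initarg :name  :reader module-name)
       (ports :initform (make-hash-table) :reader module-ports)))

    (defun add-port (module port-name handler)
      "Expose PORT-NAME on MODULE; HANDLER is called with each request."
      (setf (gethash port-name (module-ports module)) handler))

    (defun ask (module port-name &rest arguments)
      "Send a request through a port and receive an answer back --
    something a unix pipe cannot do."
      (apply (gethash port-name (module-ports module)) arguments))

    ;; Assembling a tiny system: a spelling module any client can query.
    (defvar *speller* (make-instance 'module :name 'spelling))
    (add-port *speller* 'known-word-p
              (lambda (word)
                (and (member word '("hello" "world") :test #'string-equal) t)))
    (ask *speller* 'known-word-p "helo")   ; => NIL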

Creating software (which is the same thing as "using a computer") consists of assembling modules to do the things you want to do. For example, one can take a module for a simple but inefficient compiler and connect it to a module that performs a certain class of optimization to obtain a more efficient compiler. Of course, what is called a compiler in my world is a different beast than what is called a compiler in the real world. They both produce machine code as output, but in my world the input to a compiler is not a programming language. (There are no programming languages because there are no programs.) There are various ways of specifying the behavior of modules, some of which are textual, but the "parsers" for these are of course modules as well and are therefore interchangeable. (This is very similar to Charles Simonyi's ideas for intentional programming, though Simonyi still wants to use these techniques for generating programs.)
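
A toy sketch of that compiler example (NAIVE-COMPILE and FOLD-CONSTANTS are invented stand-ins, not real compiler technology):

    ;; Composing a "compiler" out of two modules.
    (defun fold-constants (expression)
      "Optimizer module: evaluate an all-constant arithmetic
    expression (top level only, for brevity)."
      (if (and (consp expression)
               (member (first expression) '(+ - * /))
               (every #'numberp (rest expression)))
          (apply (first expression) (rest expression))
          expression))

    (defun naive-compile (expression)
      "A stub standing in for a simple, inefficient code generator."
      (list :code-for expression))

    (defun compose-compiler (compile-fn optimize-fn)
      "Connect an optimizer module in front of a compiler module."
      (lambda (expression)
        (funcall compile-fn (funcall optimize-fn expression))))

    (funcall (compose-compiler #'naive-compile #'fold-constants) '(+ 1 2))
    ;; => (:CODE-FOR 3)

Neither module was modified to produce the combination; the composition itself is the engineering act.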

One of the big conceptual differences between manipulating modules and writing programs is in the relationship between size and generality. Small programs (in terms of source code size) tend to run faster than larger ones, and have less capability. By contrast, a "small" module (that is, a module with a small amount of human input, the modular equivalent of "source code") tends to run slowly but have a lot of capability. In order to produce a module that runs fast you have to give the module compiler additional information, like type declarations. Engineering effort when writing programs tends to center on adding features and generality. By contrast, engineering effort in module design focuses on *removing* features and generality in order to gain efficiency. Thus, in the programming world, by default you get fast programs that don't work well. In the module world, by default you get slower programs that do work well. By and large the fact that modules run less efficiently than programs isn't noticeable, since machines are so fast. But what is noticeable is that people have become accustomed to software that is robust and easy to understand. If someone overoptimizes a module by removing a necessary feature, computer users reject that module as unacceptable. Likewise, if someone produces a module that has too many features and is therefore too complex, that module also fails.
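
Common Lisp already exhibits this trade-off in miniature: code is general and safe by default, and you add declarations to trade generality for speed. A small sketch:

    ;; General by default: SUM accepts any numbers (integers, ratios,
    ;; floats, bignums) at some cost in speed.
    (defun sum (numbers)
      (reduce #'+ numbers))

    ;; Specialized for speed: the declarations let the compiler assume
    ;; fixnums, trading away generality (and overflow checking) for
    ;; efficiency.
    (defun sum-fixnums (numbers)
      (declare (optimize (speed 3) (safety 0)))
      (let ((total 0))
        (declare (type fixnum total))
        (dolist (n numbers total)
          (declare (type fixnum n))
          (setf total (the fixnum (+ total n))))))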

A few other common concepts have gone by the wayside. For example, there are no more "files". Every interaction one has with a computer changes the state of some module, and by default modules know how to render themselves non-volatile by storing themselves in what we would call a database. By default, modules come with sophisticated access control, authentication, and revision control functionality (implemented by other modules). Thus, a given module knows who created it, who owns it, who is allowed to access it, who is allowed to modify it, and how to back out changes. Of course, it is possible to create modules without these capabilities, which would give you something close to what we would call an "in-memory object". (One could also make a module that behaved like a "file" by removing the authentication and revision control features, but you'd be considered a little backward if you did that.)
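
A hedged sketch of what those defaults might look like (all names are invented, and PERSIST is left as a stub): provenance and change history live in a base class that every module inherits, rather than being reimplemented in each application.

    ;; Hypothetical defaults for modules: provenance, ownership, and a
    ;; change history that supports backing out changes.
    (defclass persistent-module ()
      ((creator :initarg :creator :reader creator)
       (owner   :initarg :owner   :accessor owner)
       (history :initform '()     :accessor history)))

    (defgeneric persist (module)
      (:documentation "Store MODULE in the underlying database (stub)."))

    (defmethod (setf owner) :before (new-owner (m persistent-module))
      ;; Revision control by default: record every ownership change so
      ;; that it can later be backed out.
      (declare (ignore new-owner))
      (push (list :owner-was (owner m)) (history m)))

Evaluating (setf (owner m) 'alice) would then both change the owner and log the previous one; a module with these methods stripped away is, in effect, a plain in-memory object.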