The Hundred-Year Language

2

April 2003

3

(This essay is derived from a keynote talk at PyCon 2003.)

4

It's hard to predict what life will be like in a hundred years.

5

There are only a few things we can say with certainty.

6

We know that everyone will drive flying cars, that zoning laws will be relaxed to allow buildings hundreds of stories tall, that it will be dark most of the time, and that women will all be trained in the martial arts.

7

Here I want to zoom in on one detail of this picture.

8

What kind of programming language will they use to write the software controlling those flying cars?

9

This is worth thinking about not so much because we'll actually get to use these languages as because, if we're lucky, we'll use languages on the path from this point to that.

10

I think that, like species, languages will form evolutionary trees, with dead-ends branching off all over.

11

We can see this happening already.

12

Cobol, for all its sometime popularity, does not seem to have any intellectual descendants.

13

It is an evolutionary dead-end-- a Neanderthal language.

14

I predict a similar fate for Java.

15

People sometimes send me mail saying, "How can you say that Java won't turn out to be a successful language?

16

It's already a successful language."

17

And I admit that it is, if you measure success by shelf space taken up by books on it (particularly individual books on it), or by the number of undergrads who believe they have to learn it to get a job.

18

When I say Java won't turn out to be a successful language, I mean something more specific: that Java will turn out to be an evolutionary dead-end, like Cobol.

19

This is just a guess.

20

I may be wrong.

21

My point here is not to dis Java, but to raise the issue of evolutionary trees and get people asking, where on the tree is language X?

22

The reason to ask this question isn't just so that our ghosts can say, in a hundred years, I told you so.

23

It's because staying close to the main branches is a useful heuristic for finding languages that will be good to program in now.

24

At any given time, you're probably happiest on the main branches of an evolutionary tree.

25

Even when there were still plenty of Neanderthals, it must have sucked to be one.

26

The Cro-Magnons would have been constantly coming over and beating you up and stealing your food.

27

The reason I want to know what languages will be like in a hundred years is so that I know what branch of the tree to bet on now.

2–9

It's hard to predict life in a hundred years, but here I'll zoom in on one detail: what programming language will they use? This matters not because we'll use it, but because, if we're lucky, we'll use languages on the path to it.

10–18

Like species, languages will form evolutionary trees with dead-ends branching off all over. Cobol, for all its popularity, has no intellectual descendants; it's a Neanderthal language. I predict the same for Java.

19–23

This is just a guess. My point isn't to dis Java but to get people asking, where on the tree is language X? Staying close to the main branches is a useful heuristic for finding languages good to program in now.

24–26

At any given time you're probably happiest on the main branches. Even when there were plenty of Neanderthals, it must have sucked to be one: the Cro-Magnons would have been constantly coming over and beating you up and stealing your food.

27

The reason I want to know what languages will be like in a hundred years is so that I know what branch of the tree to bet on now.

2–27

Like species, programming languages form evolutionary trees with dead-ends branching off everywhere. Knowing which branches lead somewhere is a useful heuristic for picking a language to use now.

29

The evolution of languages differs from the evolution of species because branches can converge.

30

The Fortran branch, for example, seems to be merging with the descendants of Algol.

31

In theory this is possible for species too, but it's not likely to have happened to any species bigger than a cell.

32

Convergence is more likely for languages partly because the space of possibilities is smaller, and partly because mutations are not random.

33

Language designers deliberately incorporate ideas from other languages.

34

It's especially useful for language designers to think about where the evolution of programming languages is likely to lead, because they can steer accordingly.

35

In that case, "stay on a main branch" becomes more than a way to choose a good language.

36

It becomes a heuristic for making the right decisions about language design.

37

Any programming language can be divided into two parts: some set of fundamental operators that play the role of axioms, and the rest of the language, which could in principle be written in terms of these fundamental operators.

38

I think the fundamental operators are the most important factor in a language's long term survival.

39

The rest you can change.

40

It's like the rule that in buying a house you should consider location first of all.

41

Everything else you can fix later, but you can't fix the location.

42

I think it's important not just that the axioms be well chosen, but that there be few of them.

43

Mathematicians have always felt this way about axioms-- the fewer, the better-- and I think they're onto something.

44

At the very least, it has to be a useful exercise to look closely at the core of a language to see if there are any axioms that could be weeded out.

45

I've found in my long career as a slob that cruft breeds cruft, and I've seen this happen in software as well as under beds and in the corners of rooms.

46

I have a hunch that the main branches of the evolutionary tree pass through the languages that have the smallest, cleanest cores.

47

The more of a language you can write in itself, the better.
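A rough sketch of that split, in Python rather than in any language discussed here. The choice of `cons`, `car`, `cdr` and an empty list as the "axioms" is an assumption made for illustration; the point is only that everything below the second comment is written in terms of the core and nothing else.

```python
# Assumed "axioms" for this sketch: an empty list plus cons, car, cdr.
# Lists are nested pairs; None stands for the empty list.
EMPTY = None

def cons(head, tail):
    return (head, tail)

def car(pair):
    return pair[0]

def cdr(pair):
    return pair[1]

# The rest of the "language" is written in terms of the core alone.
def length(xs):
    return 0 if xs is EMPTY else 1 + length(cdr(xs))

def mapcar(f, xs):
    return EMPTY if xs is EMPTY else cons(f(car(xs)), mapcar(f, cdr(xs)))

def append(xs, ys):
    return ys if xs is EMPTY else cons(car(xs), append(cdr(xs), ys))

def to_python(xs):
    """Convert to a built-in list, only so the result can be printed."""
    out = []
    while xs is not EMPTY:
        out.append(car(xs))
        xs = cdr(xs)
    return out

nums = cons(1, cons(2, cons(3, EMPTY)))
print(length(nums))                              # 3
print(to_python(mapcar(lambda n: n * n, nums)))  # [1, 4, 9]
print(to_python(append(nums, nums)))             # [1, 2, 3, 1, 2, 3]
```

Adding `length` or `append` changed nothing about the core, which is the sense in which the rest of a language can be changed later.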

29–36

Language evolution differs from the evolution of species in that branches can converge (Fortran seems to be merging with the descendants of Algol), partly because designers deliberately incorporate ideas from other languages. So they can steer, and "stay on a main branch" becomes a heuristic for good design decisions.

37–41

Any language divides into two parts: fundamental operators that play the role of axioms, and the rest, which could in principle be written in terms of them. The operators are the most important factor in long-term survival; the rest you can change. It's like buying a house: everything else you can fix later, but you can't fix the location.

42–45

It's important not just that the axioms be well chosen, but that there be few of them, as mathematicians have always felt. In my long career as a slob I've found that cruft breeds cruft.

46–47

I have a hunch that the main branches of the evolutionary tree pass through the languages that have the smallest, cleanest cores. The more of a language you can write in itself, the better.

29–47

Branches converge because designers borrow ideas deliberately. A language divides into fundamental operators that act as axioms and the rest; the main branches pass through the smallest, cleanest cores.

49

Of course, I'm making a big assumption in even asking what programming languages will be like in a hundred years.

50

Will we even be writing programs in a hundred years?

51

Won't we just tell computers what we want them to do?

52

There hasn't been a lot of progress in that department so far.

53

My guess is that a hundred years from now people will still tell computers what to do using programs we would recognize as such.

54

There may be tasks that we solve now by writing programs and which in a hundred years you won't have to write programs to solve, but I think there will still be a good deal of programming of the type that we do today.

55

It may seem presumptuous to think anyone can predict what any technology will look like in a hundred years.

56

But remember that we already have almost fifty years of history behind us.

57

Looking forward a hundred years is a graspable idea when we consider how slowly languages have evolved in the past fifty.

58

Languages evolve slowly because they're not really technologies.

59

Languages are notation.

60

A program is a formal description of the problem you want a computer to solve for you.

61

So the rate of evolution in programming languages is more like the rate of evolution in mathematical notation than, say, transportation or communications.

62

Mathematical notation does evolve, but not with the giant leaps you see in technology.

49–54

I'm assuming we'll even be writing programs—won't we just tell computers what we want? There's been little progress there, so my guess is people will still tell computers what to do using programs much like the ones we write today.

58–62

Looking a century out is graspable because languages evolve slowly: they aren't really technologies, they're notation. A program is a formal description of a problem, so programming languages evolve more like mathematical notation, which changes without the giant leaps you see in technology.

49–62

Will we even write programs in a hundred years? Probably yes, ones we'd recognize. Languages evolve slowly because they're notation, not technology, so a century out is graspable.

64

Whatever computers are made of in a hundred years, it seems safe to predict they will be much faster than they are now.

65

If Moore's Law continues to put out, they will be 74 quintillion (73,786,976,294,838,206,464) times faster.

66

That's kind of hard to imagine.

67

And indeed, the most likely prediction in the speed department may be that Moore's Law will stop working.

68

Anything that is supposed to double every eighteen months seems likely to run up against some kind of fundamental limit eventually.
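The 74 quintillion figure can be checked with a few lines of arithmetic. This is a sketch that assumes one doubling every eighteen months and counts only the doublings completed within a hundred years; the function name `speedup` is made up for the example.

```python
def speedup(years: int, months_per_doubling: int = 18) -> int:
    """Speed multiple after `years`, counting only completed doublings."""
    doublings = (years * 12) // months_per_doubling  # 1200 // 18 == 66
    return 2 ** doublings

print(f"{speedup(100):,}")  # 73,786,976,294,838,206,464
```

Sixty-six doublings give 2 to the 66th power, which is the number quoted above.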

69

But I have no trouble believing that computers will be very much faster.

70

Even if they only end up being a paltry million times faster, that should change the ground rules for programming languages substantially.

71

Among other things, there will be more room for what would now be considered slow languages, meaning languages that don't yield very efficient code.

72

And yet some applications will still demand speed.

73

Some of the problems we want to solve with computers are created by computers; for example, the rate at which you have to process video images depends on the rate at which another computer can generate them.

74

And there is another class of problems which inherently have an unlimited capacity to soak up cycles: image rendering, cryptography, simulations.

75

If some applications can be increasingly inefficient while others continue to demand all the speed the hardware can deliver, faster computers will mean that languages have to cover an ever wider range of efficiencies.

76

We've seen this happening already.

77

Current implementations of some popular new languages are shockingly wasteful by the standards of previous decades.

78

This isn't just something that happens with programming languages.

79

It's a general historical trend.

80

As technologies improve, each generation can do things that the previous generation would have considered wasteful.

81

People thirty years ago would be astonished at how casually we make long distance phone calls.

82

People a hundred years ago would be even more astonished that a package would one day travel from Boston to New York via Memphis.

83

I can already tell you what's going to happen to all those extra cycles that faster hardware is going to give us in the next hundred years.

84

They're nearly all going to be wasted.

64–71

Whatever computers are made of, they'll be much faster—maybe 74 quintillion times under Moore's Law, though that may well stop. But even a paltry million times would change the ground rules: more room for what we'd now consider slow languages.

72–76

And yet some applications will still demand speed—video, image rendering, cryptography, simulations soak up unlimited cycles. So languages have to cover an ever wider range of efficiencies, as they already do: implementations of some popular new languages are shockingly wasteful by the standards of previous decades.

77–81

This is a general historical trend: each generation can do things the previous one would have considered wasteful, like making long distance calls casually, or sending a package from Boston to New York via Memphis.

82–84

I can already tell you what's going to happen to all those extra cycles that faster hardware is going to give us. They're nearly all going to be wasted.

64–84

Computers will be vastly faster, leaving room for "slow" languages even as some applications still demand all the speed there is. Each generation looks wasteful to the last, so most of the extra cycles will be wasted.

86

I learned to program when computer power was scarce.

87

I can remember taking all the spaces out of my Basic programs so they would fit into the memory of a 4K TRS-80.

88

The thought of all this stupendously inefficient software burning up cycles doing the same thing over and over seems kind of gross to me.

89

But I think my intuitions here are wrong.

90

I'm like someone who grew up poor, and can't bear to spend money even for something important, like going to the doctor.

91

Some kinds of waste really are disgusting.

92

SUVs, for example, would arguably be gross even if they ran on a fuel which would never run out and generated no pollution.

93

SUVs are gross because they're the solution to a gross problem. (How to make minivans look more masculine.)

94

But not all waste is bad.

95

Now that we have the infrastructure to support it, counting the minutes of your long-distance calls starts to seem niggling.

96

If you have the resources, it's more elegant to think of all phone calls as one kind of thing, no matter where the other person is.

97

There's good waste, and bad waste.

98

I'm interested in good waste-- the kind where, by spending more, we can get simpler designs.

99

How will we take advantage of the opportunities to waste cycles that we'll get from new, faster hardware?

100

The desire for speed is so deeply engrained in us, with our puny computers, that it will take a conscious effort to overcome it.

101

In language design, we should be consciously seeking out situations where we can trade efficiency for even the smallest increase in convenience.

86–90

I learned to program when power was scarce, so inefficient software seems gross to me. But my intuitions are wrong. I'm like someone who grew up poor and can't bear to spend money even for something important, like going to the doctor.

91–96

Some waste really is disgusting—SUVs would be gross even running clean, because they solve a gross problem: making minivans look masculine. But not all waste is bad. Once you have the infrastructure, it's more elegant to treat all phone calls as one kind of thing than to count the minutes.

97–101

I'm interested in good waste, where by spending more we get simpler designs. The desire for speed is so engrained that overcoming it takes conscious effort: in design we should seek situations where we can trade efficiency for even the smallest increase in convenience.

86–101

My poverty-trained horror of wasted cycles is wrong. There's good waste and bad waste; I want the good kind, where spending more buys simpler designs.

103

Most data structures exist because of speed.

104

For example, many languages today have both strings and lists.

105

Semantically, strings are more or less a subset of lists in which the elements are characters.

106

So why do you need a separate data type?

107

You don't, really.

108

Strings only exist for efficiency.

109

But it's lame to clutter up the semantics of the language with hacks to make programs run faster.

110

Having strings in a language seems to be a case of premature optimization.

111

If we think of the core of a language as a set of axioms, surely it's gross to have additional axioms that add no expressive power, simply for the sake of efficiency.

112

Efficiency is important, but I don't think that's the right way to get it.

113

The right way to solve that problem, I think, is to separate the meaning of a program from the implementation details.

114

Instead of having both lists and strings, have just lists, with some way to give the compiler optimization advice that will allow it to lay out strings as contiguous bytes if necessary.
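One way to picture this, sketched in Python. The function `packed` is a hypothetical stand-in for "optimization advice": no real compiler interface is implied, and it only handles characters whose code points fit in one byte. The function `starts_with` is written once, against the meaning (a sequence of elements), and works unchanged on either layout.

```python
def starts_with(xs, prefix):
    """True if the sequence xs begins with prefix; written once, for any list."""
    return len(prefix) <= len(xs) and all(a == b for a, b in zip(xs, prefix))

# Version 1: a "string" is just a list whose elements are characters.
word = list("hundred")
print(starts_with(word, list("hun")))  # True

# Hypothetical optimization advice: ask for a contiguous byte layout.
# The meaning is unchanged; only the representation differs.
def packed(chars):
    return bytes(ord(c) for c in chars)  # assumes code points below 256

print(starts_with(packed(word), packed(list("hun"))))  # True
```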

115

Since speed doesn't matter in most of a program, you won't ordinarily need to bother with this sort of micromanagement.

116

This will be more and more true as computers get faster.

117

Saying less about implementation should also make programs more flexible.

118

Specifications change while a program is being written, and this is not only inevitable, but desirable.

119

The word "essay" comes from the French verb "essayer", which means "to try".

120

An essay, in the original sense, is something you write to try to figure something out.

121

This happens in software too.

122

I think some of the best programs were essays, in the sense that the authors didn't know when they started exactly what they were trying to write.

123

Lisp hackers already know about the value of being flexible with data structures.

124

We tend to write the first version of a program so that it does everything with lists.

125

These initial versions can be so shockingly inefficient that it takes a conscious effort not to think about what they're doing, just as, for me at least, eating a steak requires a conscious effort not to think where it came from.

126

What programmers in a hundred years will be looking for, most of all, is a language where you can throw together an unbelievably inefficient version 1 of a program with the least possible effort.

127

At least, that's how we'd describe it in present-day terms. What they'll say is that they want a language that's easy to program in.

128

Inefficient software isn't gross.

129

What's gross is a language that makes programmers do needless work.

130

Wasting programmer time is the true inefficiency, not wasting machine time.

131

This will become ever more clear as computers get faster.

103–110

Most data structures exist because of speed. Many languages have both strings and lists, though strings are more or less a subset of lists whose elements are characters. They only exist for efficiency, and it's lame to clutter the semantics with hacks to make programs run faster: a case of premature optimization.

111–114

It's gross to add axioms that buy no expressive power, just for efficiency. The right way is to separate a program's meaning from its implementation: have just lists, with some way to give the compiler advice that lets it lay out strings as contiguous bytes if necessary.

115–118

Since speed doesn't matter in most of a program, you won't ordinarily bother with this, and saying less about implementation makes programs more flexible. Specifications change while a program is being written, and this is not only inevitable but desirable.

123–127

Lisp hackers already write the first version of a program so it does everything with lists. What programmers in a hundred years will want, most of all, is a language where you can throw together an unbelievably inefficient version 1 with the least possible effort—a language that's easy to program in.

128–131

Inefficient software isn't gross. What's gross is a language that makes programmers do needless work. Wasting programmer time is the true inefficiency, not wasting machine time. This will become ever more clear as computers get faster.

103–131

Most data structures exist for speed. Separate a program's meaning from its implementation, give the compiler optimization advice, and the true inefficiency turns out to be wasting programmer time, not machine time.

133

I think getting rid of strings is already something we could bear to think about.

134

We did it in Arc, and it seems to be a win; some operations that would be awkward to describe as regular expressions can be described easily as recursive functions.
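As an illustration of the kind of operation meant here, consider checking that parentheses are properly nested, a textbook case that a classical regular expression cannot express. The sketch below is in Python, not Arc, and is not claimed to be an example from Arc itself; it simply treats the text as a list of characters and recurses over it.

```python
def balanced(chars, depth=0):
    """True if the parentheses in a list of characters are properly nested."""
    if not chars:
        return depth == 0
    head, rest = chars[0], chars[1:]
    if head == "(":
        return balanced(rest, depth + 1)
    if head == ")":
        return depth > 0 and balanced(rest, depth - 1)
    return balanced(rest, depth)

print(balanced(list("(a (b c) d)")))  # True
print(balanced(list("(a (b c d)")))   # False
```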

135

How far will this flattening of data structures go?

136

I can think of possibilities that shock even me, with my conscientiously broadened mind.

137

Will we get rid of arrays, for example?

138

After all, they're just a subset of hash tables where the keys are vectors of integers.

139

Will we replace hash tables themselves with lists?
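Both steps of this flattening can be sketched in a few lines of Python. The names `grid`, `table` and `lookup` are invented for the example, and the list-backed table is deliberately the naive version: a linear search through pairs.

```python
# A 2x3 "array" stored as a hash table keyed by vectors (tuples) of integers.
grid = {(row, col): row * 3 + col for row in range(2) for col in range(3)}
print(grid[(1, 2)])  # 5

# A "hash table" stored as a list of (key, value) pairs, searched linearly.
def lookup(pairs, key, default=None):
    for k, v in pairs:
        if k == key:
            return v
    return default

table = list(grid.items())
print(lookup(table, (1, 2)))  # 5
print(lookup(table, (9, 9)))  # None
```

The program's meaning is the same at each level; what changes is how much work each lookup costs.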

140

There are more shocking prospects even than that.

141

The Lisp that McCarthy described in 1960, for example, didn't have numbers.

142

Logically, you don't need to have a separate notion of numbers, because you can represent them as lists: the integer n could be represented as a list of n elements.

143

You can do math this way.

144

It's just unbearably inefficient.
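A small sketch of arithmetic done this way, in Python. What the list elements are is irrelevant, so `None` is used as an arbitrary choice; `to_list` and `to_int` exist only to get numbers in and out for display.

```python
def to_list(n):
    """Represent the non-negative integer n as a list of n elements."""
    return [None] * n

def to_int(xs):
    return len(xs)

def add(xs, ys):
    return xs + ys  # addition is concatenation

def multiply(xs, ys):
    product = []
    for _ in xs:  # one copy of ys for each element of xs
        product = add(product, ys)
    return product

print(to_int(add(to_list(3), to_list(4))))       # 7
print(to_int(multiply(to_list(3), to_list(4))))  # 12
```

Multiplying two numbers takes time and space proportional to their product, which is where the inefficiency comes from.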

145

No one actually proposed implementing numbers as lists in practice.

146

In fact, McCarthy's 1960 paper was not, at the time, intended to be implemented at all.

147

It was a theoretical exercise, an attempt to create a more elegant alternative to the Turing Machine.

148

When someone did, unexpectedly, take this paper and translate it into a working Lisp interpreter, numbers certainly weren't represented as lists; they were represented in binary, as in every other language.

149

Could a programming language go so far as to get rid of numbers as a fundamental data type?

150

I ask this less as a serious question than as a way to play chicken with the future.

151

It's like the hypothetical case of an irresistible force meeting an immovable object-- here, an unimaginably inefficient implementation meeting unimaginably great resources.

152

I don't see why not.

153

The future is pretty long.

154

If there's something we can do to decrease the number of axioms in the core language, that would seem to be the side to bet on as t approaches infinity.

155

If the idea still seems unbearable in a hundred years, maybe it won't in a thousand.

156

Just to be clear about this, I'm not proposing that all numerical calculations would actually be carried out using lists.

157

I'm proposing that the core language, prior to any additional notations about implementation, be defined this way.

158

In practice any program that wanted to do any amount of math would probably represent numbers in binary, but this would be an optimization, not part of the core language semantics.

133–134

Getting rid of strings is already something we could bear to think about. We did it in Arc, and it seems a win; some operations awkward to describe as regular expressions can be described easily as recursive functions.

135–139

How far will this flattening go? Will we get rid of arrays, just a subset of hash tables where the keys are vectors of integers? Will we replace hash tables themselves with lists?

140–144

There are more shocking prospects. The Lisp McCarthy described in 1960 didn't have numbers—you can represent the integer n as a list of n elements and do math that way, just unbearably inefficiently.

145–148

McCarthy's 1960 paper wasn't meant to be implemented at all; it was a theoretical exercise, a more elegant alternative to the Turing Machine. When someone unexpectedly translated it into a working interpreter, numbers were represented in binary, as in every other language.

149–155

Could a language get rid of numbers as a fundamental type? I ask as a way to play chicken with the future. I don't see why not. The future is pretty long. If we can decrease the axioms in the core, that's the side to bet on as t approaches infinity.

156–158

To be clear, I'm proposing only that the core language be defined this way. In practice any program doing math would represent numbers in binary, but as an optimization, not as part of the core semantics.

133–158

We dropped strings in Arc and it was a win. The flattening could go shockingly far—arrays, hash tables, even numbers—because as resources grow unboundedly, fewer axioms is the side to bet on.

160

Another way to burn up cycles is to have many layers of software between the application and the hardware.

161

This too is a trend we see happening already: many recent languages are compiled into byte code.

162

Bill Woods once told me that, as a rule of thumb, each layer of interpretation costs a factor of 10 in speed.

163

This extra cost buys you flexibility.

164

The very first version of Arc was an extreme case of this sort of multi-level slowness, with corresponding benefits.

165

It was a classic "metacircular" interpreter written on top of Common Lisp, with a definite family resemblance to the eval function defined in McCarthy's original Lisp paper.

166

The whole thing was only a couple hundred lines of code, so it was very easy to understand and change.
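To give a feel for how small an evaluator of this general shape can be, here is a sketch in Python. It is not Arc's code and, being written in Python rather than Lisp, it is not metacircular; the representation (nested Python lists for expressions, strings for symbols) and the handful of built-ins in `GLOBAL` are assumptions made for the example.

```python
def evaluate(expr, env):
    """Evaluate a Lisp-like expression written as nested Python lists."""
    if isinstance(expr, str):       # a symbol: look it up
        return env[expr]
    if not isinstance(expr, list):  # a literal such as a number
        return expr
    op, *args = expr
    if op == "quote":
        return args[0]
    if op == "if":
        test, then, otherwise = args
        return evaluate(then if evaluate(test, env) else otherwise, env)
    if op == "lambda":
        params, body = args
        return lambda *vals: evaluate(body, {**env, **dict(zip(params, vals))})
    fn = evaluate(op, env)          # anything else is a function call
    return fn(*[evaluate(a, env) for a in args])

GLOBAL = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    "<": lambda a, b: a < b,
    "car": lambda xs: xs[0],
    "cdr": lambda xs: xs[1:],
    "cons": lambda x, xs: [x] + xs,
}

square = ["lambda", ["n"], ["*", "n", "n"]]
print(evaluate([square, 7], GLOBAL))                              # 49
print(evaluate(["car", ["cdr", ["quote", [1, 2, 3]]]], GLOBAL))   # 2
```

Changing the language means editing a few branches of one function, which is the flexibility the extra layer buys.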

167

The Common Lisp we used, CLisp, itself runs on top of a byte code interpreter.

168

So here we had two levels of interpretation, one of them (the top one) shockingly inefficient, and the language was usable.

169

Barely usable, I admit, but usable.

170

Writing software as multiple layers is a powerful technique even within applications.

171

Bottom-up programming means writing a program as a series of layers, each of which serves as a language for the one above.

172

This approach tends to yield smaller, more flexible programs. It's also the best route to that holy grail, reusability.

173

A language is by definition reusable.

174

The more of your application you can push down into a language for writing that type of application, the more of your software will be reusable.
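A toy illustration of layering, in Python. The three layers and all the names in them (`pad`, `row`, `table`, `inventory_report`) are invented for the example; the thing to notice is that each layer uses only the vocabulary of the one below, and that the lower two know nothing about inventories.

```python
# Layer 1: a small vocabulary for laying out text.
def pad(text, width):
    return text.ljust(width)

def row(cells, widths):
    return " | ".join(pad(str(c), w) for c, w in zip(cells, widths)).rstrip()

# Layer 2: tables, written in terms of layer 1.
def table(header, rows):
    everything = [header] + rows
    widths = [max(len(str(r[i])) for r in everything) for i in range(len(header))]
    rule = "-+-".join("-" * w for w in widths)
    return "\n".join([row(header, widths), rule] + [row(r, widths) for r in rows])

# Layer 3: the application, which reads like a statement of the problem.
def inventory_report(items):
    return table(["item", "count"], [[name, count] for name, count in items])

print(inventory_report([("bolts", 120), ("nuts", 95)]))
```

Only the top function is specific to this application; the other two could serve any program that prints tables.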

175

Somehow the idea of reusability got attached to object-oriented programming in the 1980s, and no amount of evidence to the contrary seems to be able to shake it free.

176

But although some object-oriented software is reusable, what makes it reusable is its bottom-upness, not its object-orientedness.

177

Consider libraries: they're reusable because they're language, whether they're written in an object-oriented style or not.

178

I don't predict the demise of object-oriented programming, by the way.

179

Though I don't think it has much to offer good programmers, except in certain specialized domains, it is irresistible to large organizations.

180

Object-oriented programming offers a sustainable way to write spaghetti code.

181

It lets you accrete programs as a series of patches.

182

Large organizations always tend to develop software this way, and I expect this to be as true in a hundred years as it is today.

160–164

Another way to burn cycles is many layers of software between application and hardware. Bill Woods once told me each layer of interpretation costs a factor of 10 in speed. This extra cost buys you flexibility.

165–170

The first version of Arc was an extreme case: a "metacircular" interpreter on top of Common Lisp, which itself runs on a byte code interpreter. So we had two levels of interpretation, the top one shockingly inefficient, and the language was usable. Barely, I admit, but usable.

171–175

Writing software as layers is powerful even within applications. Bottom-up programming means writing a program as layers, each a language for the one above. The more of your application you push down into a language for that type of application, the more is reusable.

176–178

Reusability got attached to object-oriented programming in the 1980s, and no evidence seems able to shake it free. But what makes some object-oriented software reusable is its bottom-upness, not its object-orientedness. Libraries are reusable because they're language.

179–182

I don't predict the demise of object-oriented programming. Though it has little to offer good programmers, it's irresistible to large organizations: it offers a sustainable way to write spaghetti code, letting you accrete programs as a series of patches.

160–182

Another way to spend cycles is layers of interpretation, which buy flexibility. Bottom-up programming, writing a program as layers each a language for the one above, is the real source of reusability, not object-orientation.

184

As long as we're talking about the future, we had better talk about parallel computation, because that's where this idea seems to live.

185

That is, no matter when you're talking, parallel computation seems to be something that is going to happen in the future.

186

Will the future ever catch up with it?

187

People have been talking about parallel computation as something imminent for at least 20 years, and it hasn't affected programming practice much so far.

188

Or hasn't it?

189

Already chip designers have to think about it, and so must people trying to write systems software on multi-cpu computers.

190

The real question is, how far up the ladder of abstraction will parallelism go?

191

In a hundred years will it affect even application programmers?

192

Or will it be something that compiler writers think about, but which is usually invisible in the source code of applications?

193

One thing that does seem likely is that most opportunities for parallelism will be wasted.

194

This is a special case of my more general prediction that most of the extra computer power we're given will go to waste.

195

I expect that, as with the stupendous speed of the underlying hardware, parallelism will be something that is available if you ask for it explicitly, but ordinarily not used.

196

This implies that the kind of parallelism we have in a hundred years will not, except in special applications, be massive parallelism.

197

I expect for ordinary programmers it will be more like being able to fork off processes that all end up running in parallel.

198

And this will, like asking for specific implementations of data structures, be something that you do fairly late in the life of a program, when you try to optimize it.

199

Version 1s will ordinarily ignore any advantages to be got from parallel computation, just as they will ignore advantages to be got from specific representations of data.
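A sketch of what asking for parallelism late might look like, using Python's standard `concurrent.futures` module. The task (`word_count`) is a placeholder. Threads are used here to keep the example self-contained; `ProcessPoolExecutor` from the same module offers the same `map` interface and is the closer analogue of forking off processes.

```python
from concurrent.futures import ThreadPoolExecutor

def word_count(text):
    return len(text.split())

def count_all(texts):
    """Version 1: plain and sequential."""
    return [word_count(t) for t in texts]

def count_all_parallel(texts, workers=4):
    """Later optimization: same results, work handed to a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(word_count, texts))

texts = ["a b c", "d e", "f"]
print(count_all(texts))           # [3, 2, 1]
print(count_all_parallel(texts))  # [3, 2, 1]
```

The second function is added without touching the first, and nothing in the rest of the program needs to know which one is in use.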

200

Except in special kinds of applications, parallelism won't pervade the programs that are written in a hundred years.

201

It would be premature optimization if it did.

184–189

We'd better talk about parallel computation, because no matter when you're talking, it's always going to happen in the future. People have called it imminent for at least 20 years. Or has it arrived? Already chip designers and people writing systems software on multi-cpu computers must think about it.

190–192

The real question is how far up the ladder of abstraction parallelism will go. Will it affect even application programmers, or stay something compiler writers handle, invisible in the source?

193–199

Most opportunities for parallelism will be wasted, a special case of my prediction that most extra computer power will go to waste. I expect parallelism to be available if you ask for it explicitly but ordinarily unused: not massive parallelism, except in special applications, but something more like forking off processes, done late, when you optimize.

200–201

Except in special applications, parallelism won't pervade the programs written in a hundred years. It would be premature optimization if it did.

184–201

Parallel computation has always seemed imminent. Like the extra speed, most of its opportunities will be wasted; for ordinary programmers it'll mean forking processes late in optimization, not massive parallelism.

203

How many programming languages will there be in a hundred years?

204

There seem to be a huge number of new programming languages lately.

205

Part of the reason is that faster hardware has allowed programmers to make different tradeoffs between speed and convenience, depending on the application.

206

If this is a real trend, the hardware we'll have in a hundred years should only increase it.

207

And yet there may be only a few widely-used languages in a hundred years.

208

Part of the reason I say this is optimism: it seems that, if you did a really good job, you could make a language that was ideal for writing a slow version 1, and yet with the right optimization advice to the compiler, would also yield very fast code when necessary.

209

So, since I'm optimistic, I'm going to predict that despite the huge gap they'll have between acceptable and maximal efficiency, programmers in a hundred years will have languages that can span most of it.

210

As this gap widens, profilers will become increasingly important.

211

Little attention is paid to profiling now.

212

Many people still seem to believe that the way to get fast applications is to write compilers that generate fast code.

213

As the gap between acceptable and maximal performance widens, it will become increasingly clear that the way to get fast applications is to have a good guide from one to the other.
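As a rough illustration of a profiler acting as the guide from acceptable to maximal performance, here is a small sketch using Python's standard `cProfile` and `pstats` modules. The functions `slow_unique` and `profile_report` are hypothetical names invented for this example; the idea is only that you write the obvious slow version first and let measurement tell you where to spend effort.

```python
import cProfile
import io
import pstats


def slow_unique(items):
    """Version 1: quadratic de-duplication, written for clarity, not speed."""
    seen = []
    for item in items:
        if item not in seen:  # linear scan of a list on every iteration
            seen.append(item)
    return seen


def profile_report(func, *args, top=5):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()
```

Calling `profile_report(slow_unique, data)` and reading the report shows which calls dominate the running time, which is the information you need before deciding what, if anything, to rewrite.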

214

When I say there may only be a few languages, I'm not including domain-specific "little languages".

215

I think such embedded languages are a great idea, and I expect them to proliferate.

216

But I expect them to be written as thin enough skins that users can see the general-purpose language underneath.
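One way to picture a "thin skin" is a tiny query language embedded in a general-purpose one. The sketch below is in Python and entirely hypothetical (`Pred`, `Field`, `select`, and the sample data are made up for illustration). Each expression in the little language is just a wrapped Python callable, so a user can drop down to an ordinary lambda whenever the skin runs out.

```python
class Pred:
    """A predicate in the little language: a wrapped Python callable."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, record):
        return self.fn(record)

    def __and__(self, other):
        # 'other' may be another Pred or any plain callable.
        return Pred(lambda r: bool(self(r) and other(r)))

    def __or__(self, other):
        return Pred(lambda r: bool(self(r) or other(r)))


class Field:
    """Names a key in a record; comparisons build predicates."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        return Pred(lambda r: r[self.name] > value)

    def __eq__(self, value):
        return Pred(lambda r: r[self.name] == value)


def select(records, pred):
    """Return the records for which the predicate holds."""
    return [r for r in records if pred(r)]


PEOPLE = [
    {"name": "Ada", "age": 36, "lang": "lisp"},
    {"name": "Bob", "age": 25, "lang": "cobol"},
    {"name": "Cy", "age": 41, "lang": "cobol"},
]

# A query written purely in the little language.
query = (Field("age") > 30) & (Field("lang") == "lisp")

# The same language mixed with a plain lambda from the host language.
mixed = (Field("age") > 30) & (lambda r: r["name"].startswith("C"))
```

Because the skin is only operator overloading over ordinary functions, nothing stops a user from reading `Pred` and seeing the general-purpose language underneath.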

203–206

How many languages will there be? There seem to be a huge number of new ones lately, partly because faster hardware lets programmers make different tradeoffs between speed and convenience depending on the application. If that is a real trend, future hardware should only strengthen it.

207–209

And yet there may be only a few widely-used languages. If you did a really good job, you could make one ideal for a slow version 1 that, with the right optimization advice, also yields very fast code—spanning most of the gap between acceptable and maximal efficiency.

210–213

As the gap widens, profilers become important. Many still believe fast applications come from compilers that generate fast code; it will become clear that what you need is a good guide from acceptable to maximal performance.

214–216

I'm not counting domain-specific "little languages." Embedded languages are a great idea and I expect them to proliferate, but written as skins thin enough that users can see the general-purpose language underneath.

203–216

Faster hardware spawns many new languages, yet a good enough language could serve both for a slow version 1 and for very fast code, so a few may dominate. As the gap between acceptable and maximal efficiency widens, profilers, the guide from one to the other, become important.

218

Who will design the languages of the future?

219

One of the most exciting trends in the last ten years has been the rise of open-source languages like Perl, Python, and Ruby.

220

Language design is being taken over by hackers.

221

The results so far are messy, but encouraging.

222

There are some stunningly novel ideas in Perl, for example.

223

Many are stunningly bad, but that's always true of ambitious efforts.

224

At its current rate of mutation, God knows what Perl might evolve into in a hundred years.

225

It's not true that those who can't do, teach (some of the best hackers I know are professors), but it is true that there are a lot of things that those who teach can't do. Research imposes constraining caste restrictions.

226

In any academic field there are topics that are ok to work on and others that aren't.

227

Unfortunately the distinction between acceptable and forbidden topics is usually based on how intellectual the work sounds when described in research papers, rather than how important it is for getting good results.

228

The extreme case is probably literature; people studying literature rarely say anything that would be of the slightest use to those producing it.

229

Though the situation is better in the sciences, the overlap between the kind of work you're allowed to do and the kind of work that yields good languages is distressingly small. (Olin Shivers has grumbled eloquently about this.)

230

For example, types seem to be an inexhaustible source of research papers, despite the fact that static typing seems to preclude true macros-- without which, in my opinion, no language is worth using.

231

The trend is not merely toward languages being developed as open-source projects rather than "research", but toward languages being designed by the application programmers who need to use them, rather than by compiler writers.

232

This seems a good trend and I expect it to continue.

218–224

Who will design the languages of the future? One of the most exciting trends has been the rise of open-source languages like Perl, Python, and Ruby. Language design is being taken over by hackers. The results are messy but encouraging—Perl has stunningly novel ideas, many stunningly bad, but that's always true of ambitious efforts.

225–228

It's not true that those who can't do, teach (some of the best hackers I know are professors), but there's a lot that those who teach can't do. Research imposes caste restrictions: which topics are okay turns on how intellectual the work sounds in papers rather than how important it is for results. The extreme case is literature, where scholars rarely say anything of use to those producing it.

229–232

The sciences are better, but the overlap between work you're allowed to do and work that yields good languages is distressingly small. Types are an inexhaustible source of papers, even though static typing seems to preclude true macros, without which, in my opinion, no language is worth using. So the good trend is toward languages designed by the application programmers who need them, rather than by compiler writers.

218–232

Open-source languages like Perl, Python, and Ruby mean hackers are taking over language design. Academia rewards work that sounds intellectual over work that gets results, so the good trend is languages designed by the programmers who use them.

234

Unlike physics in a hundred years, which is almost necessarily impossible to predict, I think it may be possible in principle to design a language now that would appeal to users in a hundred years.

235

One way to design a language is to just write down the program you'd like to be able to write, regardless of whether there is a compiler that can translate it or hardware that can run it.

236

When you do this you can assume unlimited resources.

237

It seems like we ought to be able to imagine unlimited resources as well today as in a hundred years.

238

What program would one like to write?

239

Whatever is least work.

240

Except not quite: whatever would be least work if your ideas about programming weren't already influenced by the languages you're currently used to.

241

Such influence can be so pervasive that it takes a great effort to overcome it.

242

You'd think it would be obvious to creatures as lazy as us how to express a program with the least effort.

243

In fact, our ideas about what's possible tend to be so limited by whatever language we think in that easier formulations of programs seem very surprising.

244

They're something you have to discover, not something you naturally sink into.

245

One helpful trick here is to use the length of the program as an approximation for how much work it is to write.

246

Not the length in characters, of course, but the length in distinct syntactic elements-- basically, the size of the parse tree.

247

It may not be quite true that the shortest program is the least work to write, but it's close enough that you're better off aiming for the solid target of brevity than the fuzzy, nearby one of least work.

248

Then the algorithm for language design becomes: look at a program and ask, is there any way to write this that's shorter?
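The measure can be tried out on an existing language. The sketch below uses Python's standard `ast` module to count nodes in a parse tree; `tree_size` and the two sample programs are hypothetical names for illustration, and skipping Python's Load/Store context markers is my own assumption about what should count as a syntactic element.

```python
import ast


def tree_size(source):
    """Count AST nodes in a piece of Python source, a rough proxy for length.

    Load/Store context markers are skipped: they are bookkeeping in
    Python's AST rather than something the programmer writes.
    """
    tree = ast.parse(source)
    return sum(
        1 for node in ast.walk(tree) if not isinstance(node, ast.expr_context)
    )


LOOP_VERSION = """
total = 0
for x in xs:
    total = total + x
"""

SHORT_VERSION = "total = sum(xs)"
```

Here `tree_size(SHORT_VERSION)` comes out smaller than `tree_size(LOOP_VERSION)`, while renaming variables to longer names leaves the count unchanged, which is why the measure is elements rather than characters.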

249

In practice, writing programs in an imaginary hundred-year language will work to varying degrees depending on how close you are to the core.

250

Sort routines you can write now.

251

But it would be hard to predict now what kinds of libraries might be needed in a hundred years.

252

Presumably many libraries will be for domains that don't even exist yet.

253

If SETI@home works, for example, we'll need libraries for communicating with aliens.

254

Unless of course they are sufficiently advanced that they already communicate in XML.

255

At the other extreme, I think you might be able to design the core language today.

256

In fact, some might argue that it was already mostly designed in 1958.

257

If the hundred-year language were available today, would we want to program in it?

258

One way to answer this question is to look back.

259

If present-day programming languages had been available in 1960, would anyone have wanted to use them?

260

In some ways, the answer is no. Languages today assume infrastructure that didn't exist in 1960.

261

For example, a language in which indentation is significant, like Python, would not work very well on printer terminals.

262

But putting such problems aside-- assuming, for example, that programs were all just written on paper-- would programmers of the 1960s have liked writing programs in the languages we use now?

263

I think so.

264

Some of the less imaginative ones, who had artifacts of early languages built into their ideas of what a program was, might have had trouble. (How can you manipulate data without doing pointer arithmetic?

265

How can you implement flow charts without gotos?)

266

But I think the smartest programmers would have had no trouble making the most of present-day languages, if they'd had them.

267

If we had the hundred-year language now, it would at least make a great pseudocode.

268

What about using it to write software?

269

Since the hundred-year language will need to generate fast code for some applications, presumably it could generate code efficient enough to run acceptably well on our hardware.

270

We might have to give more optimization advice than users in a hundred years, but it still might be a net win.

271

Now we have two ideas that, if you combine them, suggest interesting possibilities: (1) the hundred-year language could, in principle, be designed today, and (2) such a language, if it existed, might be good to program in today.

272

When you see these ideas laid out like that, it's hard not to think, why not try writing the hundred-year language now?

273

When you're working on language design, I think it is good to have such a target and to keep it consciously in mind.

274

When you learn to drive, one of the principles they teach you is to align the car not by lining up the hood with the stripes painted on the road, but by aiming at some point in the distance.

275

Even if all you care about is what happens in the next ten feet, this is the right answer.

276

I think we can and should do the same thing with programming languages.

234–237

Unlike physics a hundred years out, it may be possible to design a language now that would appeal to users in a hundred years. One way is to write down the program you'd like to write, regardless of compiler or hardware, assuming unlimited resources—which we can imagine as well today as then.

238–244

What program would one like to write? Whatever would be least work if your ideas weren't already influenced by the languages you're used to. That influence is so pervasive that easier formulations seem surprising; you have to discover them.

245–248

One trick is to use the length of a program as an approximation for the work: not the length in characters, but in distinct syntactic elements, the size of the parse tree. So the algorithm for language design becomes: look at a program and ask, is there any way to write this shorter?

249–256

This works to varying degrees depending on how close you are to the core. Sort routines you can write now, but it's hard to predict what libraries will be needed in a hundred years. At the other extreme, you might design the core today; some argue it was already mostly designed in 1958.

257–263

Would we want to program in the hundred-year language today? Look back: if present-day languages had been available in 1960, would programmers have wanted them? In some ways no, since they assume infrastructure that didn't exist. But assuming programs were just written on paper, I think so.

264–266

The less imaginative ones might have struggled, but the smartest would have had no trouble making the most of present-day languages.

267–270

The hundred-year language would at least make great pseudocode. And since it needs to generate fast code for some applications, it could presumably run acceptably on our hardware—we'd give more optimization advice, but it might still be a net win.

271–276

So: the hundred-year language could in principle be designed today, and might be good to program in today. Why not try writing it now? When you learn to drive, one principle is to aim the car not at the stripes on the road but at some point in the distance. Even if all you care about is the next ten feet, this is the right answer—and we should do the same with programming languages.

234–276

Unlike physics, the hundred-year language may be designable today: write the program you'd want, using brevity as a proxy for least work. Such a language would make great pseudocode, so why not try writing it now, aiming at a point in the distance?

278

Notes

279

I believe Lisp Machine Lisp was the first language to embody the principle that declarations (except those of dynamic variables) were merely optimization advice, and would not change the meaning of a correct program.

280

Common Lisp seems to have been the first to state this explicitly.

281

Thanks to Trevor Blackwell, Robert Morris, and Dan Giffin for reading drafts of this, and to Guido van Rossum, Jeremy Hylton, and the rest of the Python crew for inviting me to speak at PyCon.

279–280

I believe Lisp Machine Lisp was the first language to embody the principle that declarations (except those of dynamic variables) were merely optimization advice and would not change the meaning of a correct program; Common Lisp seems to have been the first to state this explicitly.

281

Thanks to Trevor Blackwell, Robert Morris, and Dan Giffin for reading drafts, and to Guido van Rossum, Jeremy Hylton, and the rest of the Python crew for inviting me to speak at PyCon.

278–281

A note on the precedent for treating declarations as mere optimization advice, plus thanks to readers and the Python crew who hosted the talk.