
Alan Kay on “What Made APL Programming So Revolutionary?”


APL stands for “A Programming Language”, the title of Ken Iverson’s 1962 book about what was initially called “Iverson Notation”. Part of the reason for the “notation” label was that it was used extensively for a number of years as a notation before it was implemented as “APL\360” on the IBM System/360 series of mainframes.

Ken Iverson was essentially a mathematician, though one with a physics background as well. He trained under Howard Aiken at Harvard, in close proximity to the various computers designed and built there, and received his PhD in Applied Math with a thesis on how to deal with very large sparse matrices.

He started to use mathematical tools to describe computations and computers, and soon found these to be lacking. This led to a number of inventions very much in the spirit of mathematics that allowed many more structures and operations to be easily defined and “notated”, many by “functional projection”.

One of the most interesting things about “Iverson Notation” at this stage was that not having an implementation greatly helped (in my opinion) what he was trying to do at the descriptive level: there were no worries about whether this or that could be implemented at the time, or whether there would eventually be enough computing capacity, in speed or space, to implement the notation.

It was in this form that I and many of the other grad students of the mid-60s learned “Iverson”. My first CS course was from the legendary and wonderful Nick Metropolis, the main architect and builder of the Los Alamos computers, especially the “Maniac” series. Nick liked “Iverson”, and used it extensively for both hardware and software descriptions. A year or so later, Bob Barton in his notorious first course in “Systems Design”, required us to “get and read and learn Iverson’s book”.

To motivate what Ken Iverson decided to do, it is worth looking at the history of Maxwell’s Equations: four ideas (reducible to two, or even one) that will fit on a T-shirt. However, Maxwell’s own rendition was not in the form we are familiar with; it was expressed as 20 partial differential equations in x, y, z coordinates.

[Image: Maxwell’s equations as originally written, 20 component equations in x, y, z coordinates]

This is not a great T-shirt!

Helmholtz and especially Oliver Heaviside did a fair amount of work using the definitional possibilities of mathematics to hide the coordinate systems and the details of the PDEs behind vectors and “operators” (Div, Curl, Gradient … “and all that”).

[Image: Maxwell’s equations in compact vector notation]

A terrific T-shirt!
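
Since the image may not survive, here is the standard compact vector form (in SI units; this rendering is mine, reconstructed from well-known physics rather than from the lost figure):

```latex
\nabla \cdot  \mathbf{E} = \rho / \varepsilon_0
\nabla \cdot  \mathbf{B} = 0
\nabla \times \mathbf{E} = -\,\partial \mathbf{B} / \partial t
\nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \mu_0 \varepsilon_0 \, \partial \mathbf{E} / \partial t
```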

You can think of the operators “gradient” ∇, “divergence” ∇•, and “curl” ∇× as “meta”: they act a bit like macros that rewrite functions into a more complex “decorated” form.
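
To make the “macro” analogy concrete: for a scalar field f and a vector field F, the operators expand into coordinate-level detail as follows (a standard expansion, added here for illustration):

```latex
\nabla f = \left( \tfrac{\partial f}{\partial x},\ \tfrac{\partial f}{\partial y},\ \tfrac{\partial f}{\partial z} \right)
\qquad
\nabla \cdot \mathbf{F} = \tfrac{\partial F_x}{\partial x} + \tfrac{\partial F_y}{\partial y} + \tfrac{\partial F_z}{\partial z}
\qquad
\nabla \times \mathbf{F} = \left( \tfrac{\partial F_z}{\partial y} - \tfrac{\partial F_y}{\partial z},\ \tfrac{\partial F_x}{\partial z} - \tfrac{\partial F_z}{\partial x},\ \tfrac{\partial F_y}{\partial x} - \tfrac{\partial F_x}{\partial y} \right)
```

Each one takes a function and hands back a “decorated” function built from its partial derivatives, which is what makes them “meta”.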

The basic idea here is to get “whole ideas into one eyeful” by inventing notations and processes that can do this, which in turn requires readers to learn the new notations fluently enough that there is a net benefit rather than just more noise.

When this is done well, the new “meta-stuff” becomes generally useful (like the grad, div, curl “and all that” above). An example in APL is the operator “.”, the generalized inner product, which can take any APL functions as arguments. For example, what we think of as “matrix multiplication” is +.× in APL (see “inner product” in APL).
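
A minimal sketch of what that generalization buys (modern Dyalog APL syntax, which I am assuming here; the session is an illustration, not from the original article):

```apl
⍝ Build two small matrices with reshape (⍴) and index generator (⍳)
A ← 2 3⍴⍳6        ⍝ 2×3 matrix:  1 2 3 / 4 5 6
B ← 3 2⍴⍳6        ⍝ 3×2 matrix:  1 2 / 3 4 / 5 6
A +.× B           ⍝ matrix product (× combines, + reduces): 22 28 / 49 64
A ∧.= ⍉A          ⍝ same “.” operator, different functions: a boolean
                  ⍝ table of which rows of A match which rows of A
```

Because “.” accepts any pair of dyadic functions, +.× is just one point in a whole family of inner products (∨.∧, ⌈.+, and so on).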

People who learn math are quite willing to do this learning and gain the necessary fluency — but there’s considerable evidence that most computer folks are not at all willing to do lots of training in special tools that would make a difference in “being professional”.†

This has led to the idea that APL is not readable. In fact, it is both very readable and very compact. This is not to say that a face lift wouldn’t help — the standard notation for APL was derived to fit on an IBM Selectric golf ball typewriter terminal, and could be greatly improved today.

The second interesting idea in APL is “projection”. This is much more relatable today, in an era of “map/reduce”, than it was in the 60s or 70s, even though one could also write a good “mapping” function in Lisp (and it too was an “operator”, because it could take a function as one of its arguments). In the early 70s, Unix happened, and Doug McIlroy invented “pipes” programming, which allows “data” to be passed through a series of “functions” to be reformulated.

However, the big uses and extreme ranges of this way of programming were explored earliest and most extensively in “Iverson Notation”, and to a slightly lesser extent in the actual language “APL”.

Attaining fluency in APL as one of three or so main ways to think about programming “is good for one’s mind”. As in the later map/reduce, one “sends” a structure in parallel through a cascade of shaping functions and then a cascade of trimming and extracting functions to finally get a result. (One must suppress one’s imagination of just how big some of the intermediate structures might be getting … this is also good for one’s mind!)
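
Here is a tiny worked example of that cascade style (again Dyalog APL, my own illustration): summing the squares of the odd numbers in 1..10.

```apl
v ← ⍳10            ⍝ send in the whole structure at once: 1 2 3 ... 10
odd ← (2|v)/v      ⍝ trim/extract: keep the odd ones → 1 3 5 7 9
sq ← odd × odd     ⍝ shape: square each element in parallel → 1 9 25 49 81
+/ sq              ⍝ reduce: sum the results → 165
```

Note the intermediate arrays (v, odd, sq) materialized along the way; these are exactly the structures whose size one must suppress one's imagination about.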

There is real clarity to be gained for both writers and readers of APL.

A number of us in our research group at Parc liked APL quite a bit, and it was clear that much more could be done using the polymorphic operations and extension features of Smalltalk (only a few of these experiments emerged publicly in the 80s). But imagine gazillions of objects provided with “events, suggestions and hints”, etc.

As always, time has moved on (and programming language ideas move much slower — and programmers move almost not at all).

There are several modern APL-like languages today — such as J and K — but I would criticize them as being too much like the classic APL. It is possible to extract what is really great from APL and use it in new language designs without being so tied to the past. This would be a great project for some grad students of today: what does the APL-perspective mean today, and what kind of great programming language could be inspired by it? ††

† This seems rather like the disinclination of so many pop culture musicians to learn to read and write music notation, despite the tremendous advantages for doing so in many areas — and in fact what seems to be a disinclination in much of our culture for learning to fluently read and write the written form of their own language. It’s not that you can’t do art in “oral modes”, but that the possibilities for art are so expanded when literacy is added.

†† As an example, a looser, more versatile version of this kind of programming can be done using dataflow between processes that are themselves made from projective mappings, and this can yield a very useful and beautiful language. This is what Dan Amelang and some of his colleagues did to make the Nile Language, which was especially aimed at “graphical mathematics and rendering”. In the STEPS project of some years ago, this allowed virtually all of 2.5D “personal computer” graphics (rendering, compositing, filtering, curves, fills, masks, etc.) to be defined and run in real time in under 500 lines of code, replacing an estimated 50,000 to 100,000 lines of C++. Because of the dataflow and the independence of the mappings, it could be set up to use as many cores as were available to run the code. (And so forth.)

500 lines of code is only about 10 pages and it can be shown as an “eyeful” on a desktop screen:

[Image: the ~500 lines of Nile graphics code shown as a single screenful]

This is partially low-hanging fruit, since mathematics underlies computer graphics at all levels. The kinds of ideas that APL first brought to light make “runnable mathematics” possible (and when it is possible, it is as wonderful as it gets!)

