One might think, in these times when fanaticism about AI is rampant, that an area of significant interest would be programs which can read and write other programs (metaprogramming, for short). This is not the case. Instead we have language models written in Python, a language with little faculty in that realm. This lack of interest, or possibly unawareness, goes back much further than the recent trends in information technology. A look at history shows this has been the prevailing case since the dawn of programming itself. I'll go over some of that history and then compare a very simple program in Python and Scheme: 1 + 1.
As the need for abstraction above assembly language arose, two key languages came into being in the late 1950s: Fortran and Lisp. These two can be said to be the main influences on most programming languages today.
Fortran (an abbreviation of "formula translator") was a solution to a corporate business need. IBM needed a higher-level language to make programming its mainframe computers, specifically the IBM 704, easier. In 1957 John Backus and his team at IBM delivered the first Fortran compiler, written in assembly language. What they had made was a statically typed, imperative, statement-oriented language.
Lisp (an abbreviation of "list processing") was a solution to an academic need. John McCarthy at the Massachusetts Institute of Technology wanted to create an AI programming language, also for the IBM 704. McCarthy coined the term "artificial intelligence" in 1955 in "A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence", and the 1956 workshop it proposed is considered the birthplace of AI as an academic discipline. Influenced by the lambda calculus, McCarthy, along with Steve Russell, worked on Lisp in the late 1950s; the first Lisp compiler followed in 1962, and it was written in Lisp itself. Meta! What they had created was a dynamically typed, functional language which is recursive and expression-oriented.
In the decades following, many new languages were created. Most followed Fortran's statement-oriented design, and many adopted dynamic and functional traits from Lisp. Few, though, kept what gives Lisp its metaprogramming power: the symbolic expression.
This is one key to beginning to understand the realm of programming versus that of metaprogramming.
Fortran is statement-oriented: you tell the computer what actions to perform. In Fortran, a program opens with program, you write statements such as x = 0 (assigning 0 to x), and you close with end program. This seems logical, simple even, and is an extremely familiar way of thinking for programmers.
Lisp is expression-oriented: you tell the computer a computation that produces a value. In Lisp, a program is formed from symbolic expressions (s-expressions for short). An s-expression is a notation for parenthesized lists, and lists within lists (trees). Lists are used not only for data but for the program itself. A program in Lisp could be (* 2 3), multiplying 2 by 3. Lisp uses prefix notation, where the operator (procedure/function) comes before the operands (parameters). Data looks like (1 2 3), a list of 1, 2, and 3. Using s-expressions, the syntax for programs and for data is the same. Lists themselves are built from something called cons pairs, or cons cells, each a linking, a constructing, of two things. When the second half of one pair references the next pair, you get a linked list. Voila, magic!
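The cons-pair construction can be sketched in plain Python. This is an illustration only, not how Python's own lists work: the function names mirror Lisp's cons, car, and cdr, and nil stands in for the empty list.

```python
# A minimal sketch of cons pairs in Python (illustrative names).
# A cons pair links two things; chaining pairs through their second
# slot yields a linked list, just as Lisp builds its lists.

def cons(a, b):
    return (a, b)

def car(pair):
    return pair[0]   # the first half of the pair

def cdr(pair):
    return pair[1]   # the second half of the pair

nil = None  # stand-in for the empty list

# The list (1 2 3) as chained cons pairs: (1 . (2 . (3 . nil)))
lst = cons(1, cons(2, cons(3, nil)))

assert car(lst) == 1
assert car(cdr(lst)) == 2
assert car(cdr(cdr(lst))) == 3
assert cdr(cdr(cdr(lst))) is nil
```

The chain of second halves is the whole trick: each pair points at the rest of the list, and the structure falls out of nothing but pairs.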
Lisp keeps the source code of a program in the format computers need: syntax trees. For a computer to make sense of the words written, they must be broken down into a hierarchical structure before execution. When you use Lisp your programs are lists, and lists of lists, which form a tree structure. Source code is uniform and ready to go. It is because of this that programs can be easily read and used by other programs.
Most languages forgo this uniform structure and are written more like English, and in doing so they lose full metaprogramming ability. To be executed, their source code must first be converted into an abstract syntax tree. Because the source is not itself a tree, it has to be interpreted as one after the fact, abstractly. Programmers should ask themselves: why do we not write programs in a uniform structure that computers understand?
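Python itself makes this after-the-fact conversion visible through its standard library ast module. A small sketch:

```python
# Python source is flat text; ast.parse rebuilds it as a tree after
# the fact. The dump below reveals a BinOp node with two operands,
# structure the source text itself never had.
import ast

tree = ast.parse("1 + 1", mode="eval")
print(ast.dump(tree))
```

The printed dump shows a BinOp node wrapping an Add operator and two constants: a tree recovered from text, rather than a tree to begin with.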
Notice how data formats like CSV, HTML, JSON, XML, etc, etc are lists and lists of lists (trees). Or tables in databases, entity-attribute-value data modeling, etc. Something to think about. When one starts seeing the tree structures in source it is hard to unsee it.
The vast majority of programmers, since high-level programming became popular, have not programmed in Lisp. I don't believe it is a dislike of s-expressions. It is a misunderstanding brought on by the flow of capital dictating what jobs, and what languages, were available. Writing in Lisp is not harder than writing in other languages. I'd argue it is 100x easier, because you are working in a language with uniform structure rather than an English-like language whose complexity makes parsing, by both machine and human, a huge effort. All while losing the metaprogramming powers in the process.
Python, a language from the 1990s, can be seen as a blend of Fortran and Lisp. It makes heavy use of statements, like Fortran, with added flexibility from Lisp such as dynamic typing and some functional programming. Though, of course, it does not use symbolic expressions.
Scheme is a dialect of Lisp from the 1970s, often cited as the first language to faithfully realize the lambda calculus through lexical scoping. It is extremely simple and has itself branched into many implementations, Guile being among the most popular through its endorsement by GNU and its role as the language of the Guix packaging system.
So, 1 + 1. How does each language fare?
In Python this program is 1 + 1. Great.
In Scheme this program is (+ 1 1). Different. Because we've learned about symbolic expressions we know why it looks this way. Most of us were taught math using infix notation, where the operator + sits between the operands 1 and 1. But infix notation is ambiguous, hence the need for order of operations in math. With prefix notation there is no such need: the order of evaluation results from the notation itself.
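To see that prefix notation needs no precedence rules, here is a minimal sketch of a prefix evaluator in Python, with programs modeled as nested lists standing in for s-expressions. The representation and names are assumptions for illustration, not part of any real Scheme.

```python
# A toy evaluator for prefix expressions written as nested lists,
# e.g. ["+", 1, ["*", 2, 3]] standing in for (+ 1 (* 2 3)).
# No precedence table exists anywhere: nesting alone fixes the order.
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(expr):
    if not isinstance(expr, list):
        return expr                      # a bare number is its own value
    op, *args = expr                     # operator first, operands after
    return OPS[op](*(evaluate(a) for a in args))

assert evaluate(["+", 1, 1]) == 2
assert evaluate(["+", 1, ["*", 2, 3]]) == 7   # nesting decides: * before +
```

The evaluator never consults a precedence table; the tree shape of the expression is the order of operations.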
Now let's try something most programmers have not thought of. Let's say we want our program to be able to read our program.
Starting with Python, the canonical way is to turn the program into the string "1 + 1". We can use the eval function, which takes a string, to evaluate our program. But by converting our program into a string, our program is no longer our program. It is a string, a sequence of characters that looks like a program, but it is just a string. All type information is lost; all structure is lost. It isn't a Python program, it is a Python string. This is the mile-wide difference. To manipulate it would be string manipulation or regex. Not the same.
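A short sketch of that round-trip, using only built-in Python:

```python
# The program as a string: eval can run it, but the string has no
# program structure, only characters.
prog = "1 + 1"
print(eval(prog))                   # prints 2

# Any "manipulation" is character surgery, not program surgery:
doubled = prog.replace("1", "2")    # now the string "2 + 2"
print(eval(doubled))                # prints 4
```

The replace call knows nothing about operands or operators; it would just as happily mangle a "1" inside a variable name. That is the cost of working on text instead of structure.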
In Scheme we can read our program using one simple procedure: quote. We quote our program with (quote (+ 1 1)), or the shorthand '(+ 1 1), which gives the result (+ 1 1), exactly our program. All structure is intact, types, everything. Nothing has been added or taken away.
Something so simple holds a universe of difference in programming paradigm. Because our program is a symbolic expression we have the ability to read programs, manipulate programs, output programs to be used by other programs, change a program while it is running, etc, etc.
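As a hedged sketch of what that unlocks, here is a program that rewrites another program. Quoted s-expressions are modeled as nested Python lists, an illustrative stand-in for Scheme, and constant_fold is an invented name for a classic transformation: replacing computations on known numbers with their results.

```python
# Metaprogramming over s-expression-shaped data: because a program
# is ordinary nested-list data, another program can walk it and
# rewrite it, structure intact, no string surgery involved.

def constant_fold(expr):
    """Replace (+ a b ...) with its value when all operands are numbers."""
    if not isinstance(expr, list):
        return expr
    expr = [constant_fold(e) for e in expr]      # fold subexpressions first
    if expr[0] == "+" and all(isinstance(e, (int, float)) for e in expr[1:]):
        return sum(expr[1:])
    return expr

# Our quoted program (+ 1 1) is rewritten into the program 2:
assert constant_fold(["+", 1, 1]) == 2
# Folding works inside larger programs, leaving the rest untouched:
assert constant_fold(["*", ["+", 1, 1], "x"]) == ["*", 2, "x"]
```

The transformer treats the program exactly as it would any other list, which is the whole point: with s-expressions there is no other kind of thing a program could be.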
The AI hype train is here at full speed. For better or for worse it continues chugging along. But in regards to its technical underpinnings, most have missed the boat.